Introducing WellSaid Labs’ HINTS

Crafting authentic vocal performances via interpolable in-context cues

Audio by Paige L. using WellSaid Labs

This post is from the WellSaid Research team, exploring breakthroughs and thought leadership within audio foundation model technology.


Today, we announce a breakthrough in generative modeling for speech synthesis: HINTS (Highly Intuitive Naturally Tailored Speech). This work from the WellSaid Labs team introduces a novel generative model architecture that combines state-of-the-art neural text-to-speech (TTS) with contextual annotations, enabling a new level of artistic direction over synthetic voice output.


WellSaid Labs Tackles Complex Pronunciation with Oxford Languages

Audio by Joe F. using WellSaid Labs

The State of AI Pronunciation 

Among countless AI-driven innovations, text-to-speech (TTS) technology stands out as a versatile tool that is changing how we interact with content, from advertising and corporate training to educational modules and audiobooks. AI voiceovers can bring a company’s message to life and establish a voice behind the brand, making it more relatable and memorable. Whether in an advertisement, presentation, video, or other media content, a voiceover adds clarity, impact, and emotional resonance, ultimately helping to strengthen the connection with the target audience.


WellSaid Labs’ Approach to Pronunciation: Your guide to Respellings

Audio by Owen C. using WellSaid Labs

No one sums up the English language better than David Burge: “Yes, English can be weird. It can be understood through tough thorough thought, though.”

Let’s be honest: English is weird. Sometimes it seems as if there are more exceptions than rules. And even when the rules begin to feel like second nature, you can still be handed a new word to say aloud and have no clue where to begin.


Defining Naturalness as Primary Driver for Synthetic Voice Quality at WellSaid Labs

Audio by Tilda C. using WellSaid Labs

The prospect of machines mimicking human speech so well that our minds are unable to discern the difference seems straight out of a sci-fi novel. But for us at WellSaid Labs, it’s just another day at the office. 


Audio Foundation Models (AFM), More Than a Marketing Stunt

Audio by Paula R. using WellSaid Labs

Ever watched a movie without its score? Or imagined a game without its defining sound effects? Audio is an unsung hero, subtly crafting experiences that evoke emotion, tell stories, and define entire genres. Now, imagine a world where this symphony of sounds, from the deepest bass notes to the gentlest whispers, could be synthesized by a single generative AI model. 

Welcome to the world of Audio Foundation Models (AFM). 🌎🗣️
