Beyond Traditional Voice Solutions: Five9’s Journey with WellSaid Labs

Five9, a leader in cloud contact center solutions, teamed up with WellSaid Labs to vastly enhance their voice interactions. By leveraging WellSaid Labs’ innovative AI text-to-speech technology, Five9 transformed their approach, offering efficient and lifelike customer-agent interactions without the time and cost of hiring voice actors.

How Five9 Benefits from WellSaid

Building the Future of Customer Interaction: Five9's WellSaid-powered pivot

With a history rooted in offering innovative solutions, Five9 saw the potential of AI text-to-speech (TTS) technology to redefine customer-agent interactions. Enter WellSaid Labs: the leader in synthetic voice.

In this partnership story, you’ll discover how companies like Five9 can harness voice AI to offer efficient and lifelike customer experiences, and how WellSaid Labs can support them through every phase of that transformation.

Introducing Five9, and their big challenge

Five9 is a leading cloud contact center solutions provider, streamlining and optimizing customer interactions. Their suite of tools has long given businesses innovative ways to enhance their customer experiences, and it grew further with the acquisition of Inference Solutions, a startup that specialized in Intelligent Virtual Agents (IVAs) for self-service in contact centers.

Despite being a leader in the industry, Five9 was keen to find more advanced ways to offer voice solutions to their clients. They wanted to move beyond the traditional method of hiring voice actors, which was both time-consuming and costly. The ideal solution would be a tool that generated high-quality, lifelike voice prompts almost instantly.

From acquisitions to WellSaid discovery

“Kumar and I both worked at a startup company named Inference Solutions before joining Five9 due to its acquisition,” Santosh Kulkarni, Conversational AI expert at Five9, began. He described how Five9 originally sourced its IVA from Inference Solutions. Once the team discovered the exceptional TTS quality offered by WellSaid Labs, a pivotal change was on the horizon.

“With WellSaid, users can instantly generate lifelike voice prompts, bypassing the tedious process of hiring actors,” Santosh added, emphasizing the novelty.

Kumar Murthy, Senior Product Manager on the Five9 team, came across a news article featuring WellSaid Labs. He was so impressed by the lifelike quality of a book excerpt read aloud with WellSaid’s voice technology that he shared it with his colleagues, including Santosh. Recognizing the potential, Five9 was keen to explore an integration with WellSaid Labs, especially given how strikingly human its text-to-speech sounded.

Santosh delved into the nuances of implementing the WellSaid technology. “Once we decided to integrate with WellSaid Labs, we faced the challenge of how to present this new feature.” While traditional TTS synthesized voice during a call, WellSaid Labs’ technology made it possible to generate audio files almost instantly, ahead of the call. This gave birth to the “virtual voice over” concept.

As we inquired about the integration process, Santosh elaborated on their interactions with the WellSaid team. “We had several calls with the WellSaid Labs team, and they were all very helpful,” he recalled.

"We saw that the text to speech quality was, you know, far superior [...] It was just like, actual real people."
Santosh Kulkarni
Conversational AI expert at Five9

Five9's leap into dynamic TTS integration

Five9 successfully integrated WellSaid Labs’ solution into their new IVA product. Though most of their customer deployments remain on the older version, the team is optimistic about broad adoption of the new feature. As the dynamics of the contact center change and virtual agents handle more calls than ever, TTS plays an increasingly vital role. The potential for TTS in areas such as conversational self-service is immense, and Five9 is poised to explore these opportunities.

When asked about the future implications of AI voice technology, Santosh responded with enthusiasm, foreseeing its application not just in virtual-agent scenarios but for live agents as well. The trend points to a shift in which virtual agents handle more customer requests, which makes advanced TTS solutions all the more important.

The discussion ventured into the intricacies of dynamic TTS and its latency concerns. “We have static TTS… and then we have dynamic TTS where you are generating the text. So speech synthesis in real time during the call,” Santosh explained.

When asked about the desired latency, Santosh and Kumar emphasized its significance: “I can tell you it’ll be 10 times more of what we do now.”
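The difference between the static and dynamic TTS that Santosh describes can be illustrated with a short sketch. This is only an illustration under stated assumptions: the names fake_synthesize and PromptPlayer are hypothetical, and the synthesis function is a local stand-in, not the WellSaid Labs or Five9 API. Static prompts are rendered once before any call takes place, while dynamic prompts are rendered during the call, which is why synthesis latency matters most for the dynamic case.

```python
from typing import Callable, Dict


def fake_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a TTS engine; returns placeholder 'audio' bytes."""
    return ("AUDIO:" + text).encode("utf-8")


class PromptPlayer:
    """Illustrative only: contrasts pre-rendered (static) and in-call (dynamic) TTS."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        self._synthesize = synthesize
        self._static_audio: Dict[str, bytes] = {}
        self.synth_calls = 0  # counts how often the engine is invoked

    def _render(self, text: str) -> bytes:
        self.synth_calls += 1
        return self._synthesize(text)

    def prerender(self, prompts: Dict[str, str]) -> None:
        """Static TTS: render fixed prompts once, at design time."""
        for name, text in prompts.items():
            self._static_audio[name] = self._render(text)

    def play_static(self, name: str) -> bytes:
        """During a call: return stored audio without invoking the engine."""
        return self._static_audio[name]

    def play_dynamic(self, template: str, **values: str) -> bytes:
        """Dynamic TTS: the text is only known mid-call, so synthesis happens now."""
        return self._render(template.format(**values))


player = PromptPlayer(fake_synthesize)
player.prerender({"greeting": "Thanks for calling."})
greeting_audio = player.play_static("greeting")
balance_audio = player.play_dynamic("Your balance is {amount}.", amount="$42")
```

In this sketch, playing the greeting costs no synthesis time during the call, whereas the balance prompt cannot be prepared in advance because its text depends on the caller.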

Pioneering AI-driven customer interactions with Five9

With a clear vision for AI voice technology, Santosh highlighted its immense potential for Five9. “Text to speech becomes really important there,” he noted, foreseeing a shift in customer-agent dynamics. Santosh ended on an anticipatory note, suggesting that AI’s role in conversational self-service might soon dominate the contact center scene.


The collaboration between Five9 and WellSaid Labs showcases the evolving nature of customer interactions and the role technology plays in enhancing these experiences. By integrating WellSaid Labs’ cutting-edge text-to-speech functionality, Five9 has positioned itself at the forefront of this foundational shift, ready to redefine the future of customer service.

Indistinguishable from the original voice recordings

Equipped with a custom Voice Avatar made from his own voice, Chris was curious how the new training modules would be received. 

To date, EIA has built approximately 400 hours of online NDT training, 100 of those using the custom Voice Avatar. They have yet to receive a single question regarding the consistency of their course narration.

“Not a single person that I haven’t told has ever thought it wasn’t me, which is a crazy bar for WellSaid to have cleared. With half a dozen subject matter experts reviewing it, nobody has questioned the narration.”

It’s one thing for people taking eLearning courses not to notice a difference in a voice they have heard over a few weeks or months. But Chris has also sent snippets of audio from his avatar to his family and others close to him, and even they can’t tell the difference.

“I sent my family, like eight people, messages in my avatar’s voice. In a blind test, none of them heard any inconsistencies to know it wasn’t me.”

WellSaid exceeds expectations

Before finding WellSaid, voice production was 60% of Chris’s course development time for an eLearning chapter. He needed three hours to create one hour of voiceover. Chris had to spend hours in a recording studio, and even then, the results could be inconsistent. 

He had to go out of his way to create scripts that minimized future updates, because he knew those would be a nightmare to get right.

“We haven’t even started to realize the ROI on updates to our narration. As our subject matter experts correct and update our material, we will have it right there and be able to edit it in a few clicks.”

Now, with his custom Voice Avatar, anyone on his team can create narration in his voice while he handles the many challenges of running a growing business. What used to take three and a half hours now takes 90 minutes.

“WellSaid Labs has been a game changer. I could not be more thrilled with the return on my investment already.”