Voice is a fundamental communication medium we have as humans, yet voice production simply does not scale. Talent can only speak 3-4 hours a day, and editing recorded voice is not an iterative process. This is restrictive and costly for creatives who need to add voice narration to their stories.
But what exactly do we mean by “scaling voice”?
There are two scenarios that clarify our meaning:
A solo producer whose story has multiple characters, each requiring their own unique voice.
A team of multiple content creators who need to produce content with the same character or voice.
In summary, scaling voice narration is about the one-to-many and many-to-one relationships between creatives and the voices needed for the story.
When we talk about scale, what we’re really talking about is how technology can augment our ability to accomplish a task. At times, it means replacing us so we can free up time and reallocate it to more valuable work. With voice, striking this balance is complicated. Can we even replace creative assets like voice? In my judgement, not fully. But there are many scenarios in which technology can become the voice actor’s stunt double. It is this perspective that opens up the notion of scaling something as human, and intimate, as voice.
When voice becomes scalable, the solo producer becomes empowered to embrace her story’s characters. With voice narration available at her fingertips, she can clearly convey the depth and layers of her story arc. Take conventional audiobooks, for example, in which one voice narrates the many dimensions of a story. Now, take the same story and imagine it being told in a more engaging and interactive way. Imagine a format in which every character comes to life and speaks to us in his or her own voice. That level of connection between the audience and cast is what we seek.
On the opposite side of the production line, we have the collective creative team. The goal remains the same: to tell the best story possible. How can one of the many writers involved better pitch what the protagonist should say in reaction to a key inflection point of the story? Do we battle through endless debates, or can we simply act out the scene in the character’s voice? Beyond adding new tools to “prototype” a story, evolving voice into a scalable creative tool has a significant impact on workflow management and production costs.
But how do we go about making this vision real?
It all starts with real people. The talent is the core.
We want to capture her essence when she’s portraying the character. We want to make her voice, persona, and emotions an instrument writers can use, telling their stories with her gusto. Here is where synthetic voices play a fascinating role in augmenting human capacity, supplementing the creative process with an on-par replica of a human voice and the working capacity of a machine. From a producer’s point of view, the workflow gets leaner. From the talent’s perspective, opportunities open up. It is still her voice, her effort, her work. The only difference is that she now has the freedom to create much more because, thanks to the proper use of artificial intelligence, her capacity has been expanded.
Credits
Music by bensound