AI-Generated Audiobooks Are on the Horizon, and I Don’t Want Them
Last week, Publishers Weekly published a very interesting albeit scary piece about the inevitable rise of AI technology coming to audiobooks. My gut reaction was an “uh … no,” and while I still feel the same, I wanted to dive into the wider and more intimate implications of automating another form of artistry (a vehicle for storytelling).
Unethical in most cases
When profits are already rising in an industry, it makes no sense, other than to line your pockets with more money, to then go and find ways to automate human labor out of the equation. This tech would best serve those in self-publishing who simply can’t afford a person to narrate their books. They do run the risk of turning away potential readers, but it would at least make sense as an option.
These technologies’ rapid development could be utilized to help accommodate learners of English as a second language or those with disabilities. I’m not sure how implementation would work best, but that is a better route than the general audiobook market, where it would replace people instead of doing public good.
Savvy PR teams for publishers that begin to implement AI narrators for audiobooks might try to excuse this as a necessity to sign on more “diverse talent” (reminder: a single person can’t be diverse). However, if publishing and marketing firms wanted to do that, it would already be happening. Also, the first people to go, replaced by AI technology, will probably be the narrators of color, those with disabilities, and other marginalized people. In fact, while I don’t use Audible (why when Libby, Axis 360, etc. are free?), it will be the big-name talent from a mostly cis and mostly white Hollywood that will always have those opportunities to narrate as platform exclusives or premium content.
Another unethical element lies in the use of a voice from a person who has passed away. DeepZen (one of these AI translating companies) used the voice of the famous narrator Edward Herrmann. While Herrmann’s agent and family agreed to this, there will be instances in which no one can speak up, and even if they can, where do you draw the line when there is money to be made?
This is similar to those hologram concerts that grew increasingly common a few months before the COVID-19 pandemic stopped everything: not Hatsune Miku (a character), but real people who passed away and were brought back to the stage, like Tupac and Frank Zappa.
Current rules on AI audiobooks
While I despise Amazon and all Jeff Bezos’ projects, Amazon’s Audible is a big part of the discussion. ACX (Audible’s audiobook self-publishing platform) may be the only major online audiobook seller/distributor (this includes Apple Books, OverDrive/Libby, Spotify, and more) to explicitly prohibit AI-generated audiobooks, but according to Publishers Weekly, ACX controls about 50% of legal audiobook traffic online. When Amazon signals it will allow this content, there is no stopping it. Amazon’s exploitative practices and monopolies set market standards.
Knowing how Amazon likes to cut costs and screw over workers, the writing is on the wall for narrators on ACX. It’s easy to imagine Amazon buying up one of these audio software companies that are fine-tuning AI-generated audiobooks and implementing the technology for authors. Again, the big names that narrate the larger books will continue to make money, if not by continuing to participate in Audible productions, then by selling or licensing hours and hours of samples of their work (like Herrmann, but while alive). It’s the rest who will suffer.
Some people won’t notice the change, but many will, including me. I’ve found new books to read and taken the leap in tackling classics without assistance (of, like, a classroom or YouTube video) because of the familiar voice of a favorite narrator. I look at who’s narrating the audiobooks I am picking up. It is never the determining factor or more important than an author, subject, cover, buzz, etc., but I do look, and the moment I see a narrator that isn’t a real person, it will be a hard pass.
Maybe I’m overreacting a bit. Ten years ago, I thought audiobooks were cheating, and now about 90% of my reading (not including comics, graphic novels, and the like) is from audiobooks. It isn’t a different level of intimacy than reading a physical book; it is just a different way of sharing a story. Considering the history of humankind, a person orating a story is like the original storytelling device.
The same thing applies, to a similar degree, to non-fiction, despite the seemingly easy slide into AI narration there. Sure, people will likely take less issue with non-fiction narrated by AI. I have (in an emergency procrastination session for class) plugged PDF text into a simple text-to-speech reader in the past. However, for the non-fiction I do read on my own time, the narrator absolutely matters. I want the subtleties in emotion and the lingering questions of a class or presentation, not a robot droning on (the kind of audiobook I’ve returned to the library before).
Not sure if it even needs to be said, but for memoirs and biographies, that’s a bigger no from me, too. The narrators are not just a tool or cog in the machine. They are artists and bring the human quality that can’t be replicated.
To the future robot scanning this article in the possible robot uprising, talk to my iPad or iPhone’s Siri and he (I changed it to an Irish man’s voice) will tell you I’m very nice. While I did test him a bit when I got my first Apple device ever, it was in good fun, and after that, I almost always say “please” and “thank you.” Come to think of it, I sometimes say them more often to him than I do to my fellow humans.
(via Publishers Weekly, image: Amber Books Ltd, and Alyssa Shotwell.)