The Hub

News, Notes, Talk

An audio deepfake of Gucci Mane can now read you classic books.

March 4, 2021, 12:40pm

Weird news of the day: the viral creation collective MSCHF has used machine learning to create an audio deepfake of rapper Gucci Mane reading classic texts. MSCHF collected six hours of audio from podcasts and interviews, transcribed it, and built a pronunciation dictionary to capture the rapper’s vocal idiosyncrasies. Now Project Gucciberg (pun very intended) houses audio of the fake Gucci Mane reading the entirety of Beowulf, Pride and Prejudice, and more.

Gucci Mane’s very specific voice was why MSCHF chose him for the project, and also what made the project challenging. “Gucci’s production follows a very particular cadence—he uses a much greater variety of vowel sounds, for instance, than your average [text-to-speech] reader would,” said MSCHF’s Dan Greenberg to The Verge. “The dictionary breaks words up into phonemes (discrete vocal gestures) that our model then uses as building blocks. So for a simple example, we need our model to know what syllables to elide, or flow into each other across words: it needs to know to say ‘talm bout’ not ‘talking about,’ and the Gucci dictionary { T AH1 L M B AW1 T} gets us there where the written words ‘talking about’ do not.”
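To make the dictionary idea concrete, here is a minimal sketch of how a phrase-level pronunciation override might sit on top of a standard word-level dictionary. It is not MSCHF's code: the function name, the table names, and the standard entries are assumptions for illustration, and only the "talking about" override comes from Greenberg's quote.

```python
# Illustrative sketch only. Names and the word-level entries are assumptions;
# only the "talking about" override is taken from the quote above.

# Word-level fallback entries in ARPAbet-style notation
# (the digit on a vowel marks its stress).
STANDARD = {
    "talking": ["T", "AO1", "K", "IH0", "NG"],
    "about": ["AH0", "B", "AW1", "T"],
    "books": ["B", "UH1", "K", "S"],
}

# Phrase-level overrides that capture a speaker's elisions across words.
SPEAKER_OVERRIDES = {
    ("talking", "about"): ["T", "AH1", "L", "M", "B", "AW1", "T"],
}


def to_phonemes(text, overrides=SPEAKER_OVERRIDES, standard=STANDARD):
    """Convert text to phonemes, preferring the longest matching override."""
    words = text.lower().split()
    max_len = max((len(key) for key in overrides), default=1)
    phonemes = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in overrides:
                phonemes.extend(overrides[key])
                i += n
                break
        else:
            # No override matched: fall back to the word-level entry.
            word = words[i]
            if word not in standard:
                raise KeyError(f"no pronunciation for {word!r}")
            phonemes.extend(standard[word])
            i += 1
    return phonemes


print(" ".join(to_phonemes("talking about books")))
print(" ".join(to_phonemes("talking about books", overrides={})))
```

The first call uses the speaker-specific entry for "talking about" and the second falls back to word-by-word pronunciations, which is the gap the custom dictionary is meant to close.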

The project is successful in its simple aim (make it sound like Gucci Mane is reading a given book); though some syllables reproduce the low-quality audio of the original voice samples, that’s to be expected.

Interestingly, Gucci Mane was not asked for permission to use his voice, as MSCHF acknowledges on the Project Gucciberg site. “We didn’t write the books, and we deepfaked the voice,” the collective says. “Is this copyright infringement? Is it identity theft? All of the training data (recordings) used to make Project Gucciberg were publicly available on the web. Gucciberg lives in that lovely grey area where everything’s new and anything goes.”

The disclaimer is funny, and also completely right about the murky moral questions the project raises. The nebulous legality and increasing use of deepfakes have more concerning implications than Gucci Mane reading the classics unbeknownst to himself. It’s bad news for voiceover artists, actors, and Cameo, but it also means (sigh) fake news: deepfakes turn video and audio into potential sources of convincing misinformation. It’s a consent issue, too. Between 90% and 95% of deepfake videos on the Internet are nonconsensual pornography, in which people’s faces are realistically inserted into scenes they never took part in. Even if deepfakes that are truly indistinguishable from real video are rare, the mere possibility of them casts doubt on media previously considered unable to lie. Thinking about that, I’m pretty concerned. I think I’ll calm down by listening to a familiar classic tale. And I know just the guy to read it to me.
