The China Project recently spotlighted Taylor Swift’s amazing fluency in speaking Mandarin Chinese. It is remarkable to think she was able to gain this level of proficiency while attending to her billion-dollar business empire. Except that she can’t actually speak Chinese that well: the videos of her, and others, doing so in the China Project piece were created by a sophisticated piece of deepfake AI software that is impressively adept at mimicking a speaker’s natural voice and lip movements.
HeyGen’s “Video Translate” tool, which created the Swift deepfakes, is capable of translating footage into 14 different languages — including Mandarin, Hindi, and Arabic — and can clone the speaker’s voice and sync the person’s lips in an “authentic speaking style,” according to the company’s website.
On Weibo, fans of the tool praised its exciting potential to improve dubbing in foreign films, since it can match the movements of an actor’s mouth to their translated speech in Chinese. Others pointed out that it could be an AI-powered way to help Chinese e-commerce brands, which have found it difficult to reach global audiences due to a lack of bilingual livestreamers.
Others raised the obvious concerns about the technology being used as a propaganda tool, or to generate false content with which to accuse people of all sorts of wrongdoing. The article reports that by moving to the US,
“…HeyGen is no longer subject to China’s deepfake rules, which went into effect in January. As one of the first governments to regulate hyper-realistic, AI-generated media, Beijing requires companies to obtain consent from individuals whose likenesses are being manipulated; deepfakes need to be labeled as such on the internet, and can’t be used for purposes deemed harmful (vaguely defined) to national security or the economy.”
Similar rules, requiring detectable markers or disclaimers indicating that AI-generated content like this is fake, are being contemplated in the US. An executive order along those lines was issued today.
But as the article suggests, tools like these could be a boon for arts organizations seeking to increase accessibility, especially if they can work in real time to provide captioning and translations for performances, concerts, and lectures that may not have a formal script, or to translate notes and commentary in galleries (or whatever the tour guide is saying).
People may be more open to watching foreign films (or anime, as I suggest in the title of this post) if the dubbing looks and sounds convincing. Though it will probably be bad news for the voice actors who currently make a living doing anime dubs.