A new application that promises to be the “Photoshop of speech” is raising ethical and security concerns.
Adobe unveiled Project Voco last week. The software makes it possible to take an audio recording and rapidly alter it to include words and phrases the original speaker never uttered, in what sounds like their voice.
One expert warned that the tech could further undermine trust in journalism.
Another said it could pose a security threat.
However, the US software firm says it is taking action to address such risks.
Voice manipulation
At a live demo in San Diego on Thursday, Adobe took a digitised recording of a man saying “and I kissed my dogs and my wife” and changed it to say “and I kissed Jordan three times”.
The edit took seconds and simply involved the operator overtyping a transcript of the speech and then pressing a button to create the synthesised voice track.
“We have already revolutionised photo editing. Now it’s time for us to do the audio stuff,” said Adobe’s Zeyu Jin, to the applause of his audience.
He added that to make the process possible, the software needed to be provided with about 20 minutes' worth of a person's speech.
Dr Eddy Borges Rey – a lecturer in media and technology at the University of Stirling – was horrified by the development.
“It seems that Adobe’s programmers were swept along with the excitement of creating something as innovative as a voice manipulator, and ignored the ethical dilemmas brought up by its potential misuse,” he told the BBC.
“Inadvertently, in its quest to create software to manipulate digital media, Adobe has [already] drastically changed the way we engage with evidential material such as photographs.
“This makes it hard for lawyers, journalists, and other professionals who use digital media as evidence.
“In the same way that Adobe’s Photoshop has faced legal backlash after the continued misuse of the application by advertisers, Voco, if released commercially, will follow its predecessor with similar consequences.”
ID checks
The risks extend beyond people being fooled into thinking others said something they did not.
Banks and other businesses have started using voiceprint checks to verify customers are who they say they are when they phone in.
One cybersecurity researcher said the companies involved had long anticipated something like Adobe’s invention.
“The technology is new but its underlying principles have been understood for some time,” said Dr Steven Murdoch from University College London.
“Biometric companies say their products would not be tricked by this, because the things they are looking for are not the same things that humans look for when identifying people.
“But the only way to find out is to test them, and it will be some time before we know the answer.”
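The voiceprint checks described above typically work by comparing a fixed-length numerical "embedding" of the caller's voice against one recorded at enrolment, and accepting the call if the two are similar enough. The sketch below is a minimal illustration of that matching step, assuming cosine similarity and a hand-picked threshold; the embedding values and the threshold are hypothetical, and real systems derive embeddings from acoustic models rather than storing raw numbers like these.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify_caller(enrolled, candidate, threshold=0.85):
    """Accept the caller if their voiceprint is close enough to the enrolled one.

    The threshold is illustrative; real systems tune it against false-accept
    and false-reject rates.
    """
    return cosine_similarity(enrolled, candidate) >= threshold

# Hypothetical three-dimensional embeddings for illustration only.
enrolled = [0.9, 0.1, 0.4]        # stored when the customer enrolled
same_caller = [0.85, 0.15, 0.38]  # a later call from the same person
impostor = [0.1, 0.9, 0.2]        # a different voice
```

Whether a synthesised voice such as Voco's would land above or below that threshold is exactly the open question Dr Murdoch raises: the embedding captures acoustic properties that may differ from the cues humans rely on.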
Watermark checks
Google’s DeepMind division showed off a rival voice-mimicking system called WaveNet in September.
But at the time, it suggested that the task needed too much processing power to find its way into a consumer product in the near future.
For its part, Adobe has talked of its customers using Voco to fix podcast and audiobook recordings without having to rebook presenters or voiceover artists.
But a spokeswoman stressed that this did not mean its release was imminent.
“[It] may or may not be released as a product or product feature,” she told the BBC.
“No ship date has been announced.”
In the meantime, Adobe said it was researching ways to detect use of its software.
“Think about watermarking detection,” Mr Jin said at the demo, referring to a method used to hide identifiers in images and other media.
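Adobe has not described how such a detector would work. As a rough illustration of the general idea of hiding an identifier in media, the toy sketch below embeds a bit pattern in the least significant bits of 16-bit audio samples, one of the simplest watermarking schemes; the function names and the approach are assumptions for illustration, not Voco's actual method, and production watermarks are far more robust to re-encoding and editing.

```python
def embed_watermark(samples, watermark_bits):
    """Hide watermark bits in the least significant bits of audio samples.

    Each sample's lowest bit is overwritten with one watermark bit; the change
    is inaudible because it alters the amplitude by at most one step.
    """
    marked = list(samples)
    for i, bit in enumerate(watermark_bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked

def extract_watermark(samples, length):
    """Recover the first `length` hidden bits by reading each sample's lowest bit."""
    return [s & 1 for s in samples[:length]]

# Example: hide the pattern 1,0,1,1 in a short run of samples.
audio = [1000, 1001, 1002, 1003, 1004]
marked = embed_watermark(audio, [1, 0, 1, 1])
recovered = extract_watermark(marked, 4)  # -> [1, 0, 1, 1]
```

A detector built this way would check generated audio for a known identifier pattern; the hard part, which this toy version ignores, is making the mark survive compression and further editing.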
Source: BBC