Ariane Bernard, INMA’s smart data initiative lead and the author of a new report on generative AI’s threats and opportunities, has an interesting slant on the new technology.
And in recounting the battles record labels are already fighting, she points to lessons for the news industry, starting with Drake and The Weeknd.
“There’s a great lil’ bop ‘from’ the two artists making the rounds on TikTok,” she says, “and – to use a technical term – it slaps.”
But while a sample of it could earlier be heard in a tweet, copyright laws “seem to have caught up to it” just before it was published, and the AI-generated song has been taken down from the platform.
The twist, she says, is that the song was generated by AI.
“UMG, the record label behind Drake and The Weeknd, has not enjoyed the work to say the least, fighting on two fronts: working to block streaming services from scraping the songs of its artists, as well as blocking the AI-generated tunes from being distributed.
“It’s been fairly successful on that last goal – the song from TikTok creator Ghostwriter977 was successfully removed from YouTube, Spotify, the original TikTok, but you can still find it if you search for the name of the tune, ‘Heart on my Sleeve’, because the internet never forgets,” she says.
During the three months she spent researching the report into generative AI, Bernard says the legal and intellectual ramifications were “one of the most interesting, but also complicated, parts”.
“In the matter of blocking AI-generated tunes, for example, there is a question of authorship, which is rather complex to dissect: Neither the artists nor the composers of the songs on which the AI trained are the authors of the AI’s synthetic work. And the financial interests who back these artists – UMG in this case – aren’t rights holders to the synthetic tunes either.
“Yet there is a question of what I could best call ‘filiation’, which really is not addressed in our current intellectual property concepts simply because there were no real grounds for it until now.”
She says that – until this age of artificial intelligence – someone was an author or they weren’t. “If they used any form of non-human tools to aid in creation, these systems or their originators didn’t get credit (musical instrument makers don’t get credit, recording equipment companies don’t get credit, and the AI doesn’t have copyright either).
“And if someone inspired a piece of work, or even gave the idea for the work, they have no IP claim to it (ideas, famously, cannot be copyrighted or patented).
“So this takes us to the question of how the scraped data on which the AI trained can be considered a part of the output – this filiation question.
“Does the training data lead to having copyrights into the output?
“That’s about the only place where there may be some actual enforceable claims, and lawsuits like the one Getty Images brought against Stable Diffusion for training on its licensed content are making the claim that AI training goes beyond Fair Use, even though there is a degree of transformation involved in the synthetic output.”
Much of the output from AI-created sound does not fall under current copyright laws. The rest of the output, like the voice being used, is not copyrightable in the current state of the law. She says there are some provisions where the law blocks the use of a well-known voice generated by AI, “but they have to do with how it is used rather than by virtue of the voice itself being copyrightable.”
In 2020, Jay Z tried to block deep fakes mimicking his voice but found he couldn’t: YouTube reinstated the videos he tried to get banned.
“And lest you think the problem would go away for UMG if they were successful in preventing AI makers from scraping the production of their artists, it would probably only be successful at slowing down the AI’s learning. There are plenty of places for an AI to ‘learn’ the voice of Drake which wouldn’t require using licensed material. Drake gives interviews, he speaks on his social media. The AI can learn hip hop from lots of different non-licensable places and can ‘learn’ Drake’s voice from lots of different places where UMG has no copyright to defend.”
“ChatGPT and similar tools commit a highly sophisticated form of plagiarism,” said Jenna Burrell, the director of research at Data & Society, in an op-ed in Tech Policy Press. “The bigger concern is how ChatGPT concentrates wealth for its owners off of copyrighted work. It’s not clear if the current state of copyright law is up to the challenge of tools like it, which treat the Internet as a free source of training data.”
“In the news media, we, too, have strong authors whose distinctive voices are part of the final product we put out. And voice may be ‘vocal’, but even writing style can be distinctive enough. As news organisations are often known for distinctive signatures, long-time readers can often recognise their creative styles.
“As far as audio, there are plenty of synthetic voices available to get your content out in automated ways, but audio branding is, of course, a very real thing.”
She says many news brands, like NPR's Radiolab, are instantly recognisable by frequent listeners, meaning their audio is part of the brand itself.
“What should be interesting for the news media is looking at how the broader world of content creators and their rights holders – whether that’s UMG or Hollywood studios – take on AI, because their assets usually have more individual longevity in the market than everyday news has.
“In news, the totality of our catalogues (our archive) has value, but very few assets have long-standing value in individual distribution. Some of your articles continue to do well in search for years, but most articles are a flash-in-the-pan. We therefore tend to think of the value of our archive mostly through the prism of global B2B licensing deals rather than end-user single-item distribution.
“The assets of a movie studio or record label’s catalogue continue to have individual distribution value for much longer than news does, so there are really two battles up ahead for these media rights holders: picking up a fight with AI companies on the matter of the training data and with distributors of the synthetic material on the matter of infringement.”
But she says news media could, in the end, also face a similar challenge if an AI decided to create content in the ‘tone and voice’ of a well-known publication – and do so day in and day out. “At the moment, the topic of scraping is the one that occupies the news media more so than the distribution angle, but the fight of record labels may tell us something about our future, too.”
INMA has just published its second report on generative AI this year, ‘News Media at the Dawn of Generative AI’, concluding that opportunities outweigh threats for publishers.
The 91-page report written by Ariane Bernard explores:
The foundations for generative AI.
Legislation, regulation, and the law.
Generative AI’s core issues for news publishers.
How media can practically use generative AI now.
The report is available for free to INMA members and may be purchased by non-members here.
Among key learnings in the report:
Why AI is blossoming now in the public consciousness.
What generative AI does well and today’s blind spots.
Governance, copyright, legislation, and Fair Use: the strains and what media companies can do now to get in front of the tension between technology and the law.
What publishers find safe and unsafe about proceeding with generative AI.
A checklist of what newsrooms can achieve now with generative AI.