
Artist Rights Institute challenges UK AI Copyright loopholes
The music and creative industries are up in arms over pending UK AI copyright rules that would allow uncompensated text and data mining to train generative AI models. Chris Castle and the Artist Rights Institute explain why the new rules could be so damaging.
by CHRIS CASTLE via Music Tech
The Artist Rights Institute filed a comment, which I drafted, in the UK Intellectual Property Office’s consultation on Copyright and AI. We will be posting excerpts from that comment from time to time.
We strongly disagree that all the world’s culture can be squeezed through the keyhole of “data” to be “mined” as a matter of legal definitions. In fact, a recent study by leading European scholars has found that data mining exceptions were never intended to excuse copyright infringement:
Generative AI is transforming creative fields by rapidly producing texts, images, music, and videos. These AI creations often seem as impressive as human-made works but require extensive training on vast amounts of data, much of which are copyright protected. This dependency on copyrighted material has sparked legal debates, as AI training involves “copying” and “reproducing” these works, actions that could potentially infringe on copyrights. In defense, AI proponents in the United States invoke “fair use” under Section 107 of the [US] Copyright Act [a losing argument in the one reported case on point[1]], while in Europe, they cite Article 4(1) of the 2019 DSM Directive, which allows certain uses of copyrighted works for “text and data mining.”
This study challenges the prevailing European legal stance, presenting several arguments:
1. The exception for text and data mining should not apply to generative AI training because the technologies differ fundamentally – one processes semantic information only, while the other also extracts syntactic information.
2. There is no suitable copyright exception or limitation to justify the massive infringements occurring during the training of generative AI. This concerns the copying of protected works during data collection, the full or partial replication inside the AI model, and the reproduction of works from the training data initiated by the end-users of AI systems like ChatGPT….[2]
Moreover, the existing text and data mining exception in European law was never intended to address AI scraping and training:
Axel Voss, a German centre-right member of the European parliament, who played a key role in writing the EU’s 2019 copyright directive, said that law was not conceived to deal with generative AI models: systems that can generate text, images or music with a simple text prompt.[3]
Confounding culture with data to confuse both the public and lawmakers requires a vulpine lust that we haven’t seen since the breathless Dot Bomb assault on both copyright and the public financial markets. This lust for data, control and money will drive lobbyists and Big Tech’s amen corner to seek copyright exceptions under the banner of “innovation.” Any country that appeases AI platforms in the hope of cashing in on tech at the expense of culture will be appeasing its way towards an inevitable race to the bottom. More countries can predictably be expected to offer ever more accommodating terms in the face of Silicon Valley’s army of lobbyists, who mean to engage in a lightning strike across the world. The fight for the survival of culture is on. The fight for the survival of humanity may literally be the next one up.
We are far beyond any reasonable definition of “text and data mining.” What we can expect is for Big Tech to seek to distract both creators and lawmakers with inapt legal diversions such as trying to pretend that snarfing down all the world’s creations is mere “text and data mining.” The ensuing delay will allow AI platforms to enlarge their training databases, raise more money, and further the AI narrative as they profit from the delay and capital formation.
[1] Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence, Inc. (Case No. 1:20-cv-00613 U.S.D.C. Del. Feb. 11, 2025) (Memorandum Opinion, Doc. 770 rejecting fair use asserted by defendant AI platform) available at https://storage.courtlistener.com/recap/gov.uscourts.ded.72109/gov.uscourts.ded.72109.770.0.pdf (“[The AI platform]’s use is not transformative because it does not have a ‘further purpose or different character’ from [the copyright owner]’s [citations omitted]…I consider the ‘likely effect [of the AI platform’s copying]’….The original market is obvious: legal-research platforms. And at least one potential derivative market is also obvious: data to train legal AIs….Copyrights encourage people to develop things that help society, like [the copyright owner’s] good legal-research tools. Their builders earn the right to be paid accordingly.” Id. at 19-23). See also Kevin Madigan, First of Its Kind Decision Finds AI Training Is Not Fair Use, Copyright Alliance (Feb. 12, 2025) available at https://copyrightalliance.org/ai-training-not-fair-use/ (discussion of AI platform’s landmark loss on fair use defense).
[2] Professor Tim W. Dornis and Professor Sebastian Stober, Copyright Law and Generative AI Training – Technological and Legal Foundations, Recht und Digitalisierung/Digitization and the Law (Dec. 20, 2024)(Abstract) available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4946214.
[3] Jennifer Rankin, EU accused of leaving ‘devastating’ copyright loophole in AI Act, The Guardian (Feb. 19, 2025) available at https://www.theguardian.com/technology/2025/feb/19/eu-accused-of-leaving-devastating-copyright-loophole-in-ai-act