It took me six hours of listening to people with differing points of view discuss AI and copyright at a workshop, organized by the Sussex Centre for Law and Technology at the Sussex Humanities Lab (SHL), to come up with a question that seemed to me significant: what is all this talk about who “wins the AI race”? The US won the “space race” in 1969, and then for 50 years nothing happened.
Fretting about the “AI race”, an argument at least one participant used to oppose restrictions on using copyrighted data for training AI models, is buying into several ideas that are convenient for Big Tech.
One: there is a verifiable endpoint everyone’s trying to reach. That isn’t anything like today’s “AI”, which is a pile of math and statistics predicting the most likely answers to prompts. Instead, those invoking the race mean artificial general intelligence, which would be as much like generative AI as I am like a mushroom.
Two: it’s a worthy goal. But is it? Why don’t we talk about the renewables race, the zero carbon race, or the sustainability race? All of those could be achievable. Why just this well-lobbied fantasy scenario?
Three: we should formulate public policy to eliminate “barriers” that might stop us from winning it. *This* is where we run up against copyright, a subject only a tiny minority used to care about, but that now affects everyone. And, accordingly, everyone has had time to formulate an opinion since the Internet first challenged the historical operation of intellectual property.
The law as it stands is clear: making a copy is the exclusive right of the rightsholder. This is the basis of AI-related lawsuits. For training data to escape that law, it would have to be granted an exemption: ruled fair use (as in the Anthropic and Meta cases), covered by a new exception for temporary copies, or shoehorned into existing exceptions such as parody. Even then, copyright law is administered territorially, so the US may call it fair use but the rest of the world doesn’t have to agree. This is why the esteemed legal scholar Pamela Samuelson has said copyright law poses an existential threat to generative AI.
But, as one participant pointed out, although the entertainment industry dominates these discussions, there are many other sectors with different needs. Science, for example, both uses and studies AI, and is built on massive amounts of public funding. Surely that data should be free to access?
I wanted to be at this meeting because what should happen with AI, training data, and copyright is a conundrum. You do not have to work for a technology company to believe that there is value in allowing researchers both within and outwith companies to work on machine learning and build AI tools. When people balk at the impossible scale of securing permission from every copyright holder of every text, image, or sound, they have a point. The only organizations that could afford that are the companies we’re already mad at for being too big, rich, and powerful.
At the same time, why should we allow those big, rich, powerful companies to plunder our cultural domain without compensating anyone and extract even larger fortunes while doing it? To a published author who sees years of work reflected in a chatbot’s split-second answer to a prompt, it’s lost income and readers.
So for months, as Parliament has wrangled over the Data bill, the argument has narrowed to copyright. Should there be an exception for data mining? Should technology companies have to get permission from creators and rights holders? Or should use of their work be automatically allowed, unless they opt out? All answers seem equally impossible. Technology companies would have to find every copyright holder of every datum to get permission. Licensing by the billion.
If creators must opt out, does that mean one piece at a time? How will they know when they need to opt out and who they have to notify? At the meeting, that was when someone said that the US and China won’t do this. Britain will fall behind internationally. Does that matter?
And yet, we all seemed to converge on this: copyright is the wrong tool. As one person said, technologies that threaten the entertainment industry always bring demands to tighten or expand copyright. See the last 35 years, in which Internet-fueled copying spawned the Digital Millennium Copyright Act and the EU Copyright Directive, and copyright terms expanded from 28 years, renewable once, to author’s life plus 70.
No one could suggest what the right tool would be. But there are good questions. Such as: how do we grant access to information? With business models breaking, is copyright still the right way to compensate creators? One of us believed strongly in the capabilities of collection societies – but these tend to disproportionately benefit the most popular creators, who will survive anyway.
Another proposed the highly uncontroversial idea of taxing the companies. Or levies on devices such as smartphones. I am dubious on this one: we have been there before.
And again, who gets the money? Very successful artists like Paul McCartney, who has been vocal about this? Or do we have a broader conversation about how to enable people to be artists? (And then, inevitably, who gets to be called an artist.)
I did not find clarity in all this. How to resolve generative AI and copyright remains complex and confusing. But I feel better about not having an answer.
Illustrations: Drunk parrot in a Putney garden (by Simon Bisson; used by permission).
Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon or Bluesky.