Conundrum

It took me six hours of listening to people with differing points of view discuss AI and copyright at a workshop, organized by the Sussex Centre for Law and Technology at the Sussex Humanities Lab (SHL), to come up with a question that seemed to me significant: what is all this talk about who “wins the AI race”? The US won the “space race” in 1969, and then for 50 years nothing happened.

Fretting about the “AI race”, an argument at least one participant used to oppose restrictions on using copyrighted data for training AI models, is buying into several ideas that are convenient for Big Tech.

One: there is a verifiable endpoint everyone’s trying to reach. That isn’t anything like today’s “AI”, which is a pile of math and statistics predicting the most likely answers to prompts. Instead, they mean artificial general intelligence, which would be as much like generative AI as I am like a mushroom.

Two: it’s a worthy goal. But is it? Why don’t we talk about the renewables race, the zero carbon race, or the sustainability race? All of those could be achievable. Why just this well-lobbied fantasy scenario?

Three: we should formulate public policy to eliminate “barriers” that might stop us from winning it. *This* is where we run up against copyright, a subject only a tiny minority used to care about, but that now affects everyone. And, accordingly, everyone has had time to formulate an opinion since the Internet first challenged the historical operation of intellectual property.

The law as it stands is clear: making a copy is the exclusive right of the rightsholder. This is the basis of AI-related lawsuits. For training data to escape that law, it would have to be granted an exemption: ruled fair use (as in the Anthropic and Meta cases), create an exception for temporary copies, or shoehorned into existing exceptions such as parody. Even then, copyright law is administered territorially, so the US may call it fair use but the rest of the world doesn’t have to agree. This is why the esteemed legal scholar Pamela Samuelson has said copyright law poses an existential threat to generative AI.

But, as one participant pointed out, although the entertainment industry dominates these discussions, there are many other sectors with different needs. Science, for example, both uses and studies AI, and is built on massive amounts of public funding. Surely that data should be free to access?

I wanted to be at this meeting because what should happen with AI, training data, and copyright is a conundrum. You do not have to work for a technology company to believe that there is value in allowing researchers both within and outwith companies to work on machine learning and build AI tools. When people balk at the impossible scale of securing permission from every copyright holder of every text, image, or sound, they have a point. The only organizations that could afford that are the companies we’re already mad at for being too big, rich, and powerful.

At the same time, why should we allow those big, rich, powerful companies to plunder our cultural domain without compensating anyone and extract even larger fortunes while doing it? To a published author who sees years of work reflected in a chatbot’s split-second answer to a prompt, it’s lost income and readers.

So for months, as Parliament has wrangled over the Data bill, the argument narrowed to copyright. Should there be an exception for data mining? Should technology companies have to get permission from creators and rights holders? Or should use of their work be automatically allowed, unless they opt out? All answers seem equally impossible. Technology companies would have to find every copyright holder of every datum to get permission. Licensing by the billion.

If creators must opt out, does that mean one piece at a time? How will they know when they need to opt out and who they have to notify? At the meeting, that was when someone said that the US and China won’t do this. Britain will fall behind internationally. Does that matter?

And yet, we all seemed to converge on this: copyright is the wrong tool. As one person said, technologies that threaten the entertainment industry always bring demands to tighten or expand copyright. See the last 35 years, in which Internet-fueled copying spawned the Digital Millennium Copyright Act and the EU Copyright Directive, and copyright terms expanded from 28 years, renewable once, to author’s life plus 70.

No one could suggest what the right tool would be. But there are good questions. Such as: how do we grant access to information? With business models breaking, is copyright still the right way to compensate creators? One of us believed strongly in the capabilities of collection societies – but these tend to disproportionately benefit the most popular creators, who will survive anyway.

Another proposed the highly uncontroversial idea of taxing the companies. Or levies on devices such as smartphones. I am dubious on this one: we have been there before.

And again, who gets the money? Very successful artists like Paul McCartney, who has been vocal about this? Or do we have a broader conversation about how to enable people to be artists? (And then, inevitably, who gets to be called an artist.)

I did not find clarity in all this. How to resolve generative AI and copyright remains complex and confusing. But I feel better about not having an answer.

Illustrations: Drunk parrot in a Putney garden (by Simon Bisson; used by permission).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon or Bluesky.

Negative externalities

A sheriff’s office in Texas searched a giant nationwide database of license plate numbers captured by automatic cameras to look for a woman they suspected of self-managing an abortion. As Rindala Alajazi writes at EFF, that’s 83,000 cameras in 6,809 networks belonging to Flock Safety, many of them in states where abortion is legal or protected as a fundamental right until viability.

We’ve known something like this was coming ever since 2022, when the US Supreme Court overturned Roe v. Wade and returned the power to regulate abortion to the individual US states. The resulting unevenness made it predictable that the strongest opponents to legal abortion would turn their attention to interstate travel.

The Electronic Frontier Foundation has been warning for some time about Flock’s database of camera-captured license plates. Recently, Jason Koebler reported at 404 Media that US Immigration and Customs Enforcement has been using Flock’s database to find prospects for deportation. Since ICE does not itself have a contract with Flock, it’s been getting local law enforcement to perform search on its behalf. “Local” refers only to the law enforcement personnel; they have access to camera data that’s shared nationally.

The point is that once the data has been collected it’s very hard to stop mission creep. On its website, Flock says its technology is intended to “solve and eliminate crime” and “protect your community”. That might have worked when we all agreed what was a crime.

***

A new MCTD Cambridge report makes a similar point about menstrual data, when sold at scale. Now, I’m from the generation that managed fertility with a paper calendar, but time has moved on, and fertility tracking apps allow a lot more of the self-quantification that can be helpful in many situations. As Stephanie Felsberger writes in introducing the report, menstrual data is highly revealing of all sorts of sensitive information. Privacy International has studied period-tracking apps, and found that they’ve improved but still pose serious privacy risks.

On the other hand, I’m not so sure about the MCTD report’s third recommendation – that government build a public tracker app within the NHS. The UK doesn’t have anything like the kind of divisive rhetoric around abortion that the US does, but the fact remains that legal abortion is a 1967 carve-out from an 1861 law. In the UK, procuring an abortion is criminal *except* during the first 24 weeks, or if the mother’s life is in danger, or if the fetus has a serious abnormality. And even then, sign-off is required from two doctors.

Investigations and prosecutions of women under that 1861 law have been rising, as Shanti Das reported at the Guardian in January. Pressure in the other direction from US-based anti-choice groups such as the Alliance for Defending Freedom has also been rising. For years it’s seemed like this was a topic no one really wanted to reopen. Now, health care providers are calling for decriminalization, and, as Hannah Al-Oham reported this week, there are two such proposals currently in front of Parliament.

Also relevant: a month ago, Phoebe Davis reported at the Observer that in January the National Police Chiefs’ Council quietly issued guidance advising officers to search homes for drugs that can cause abortions in cases of stillbirths and to seize and examine devices to check Internet searches, messages, and health apps to “establish a woman’s knowledge and intention in relation to the pregnancy”. There was even advice on how to bypass the requirement for a court order to access women’s medical records.

In this context, it’s not clear to me that a publicly owned app is much safer or more private than a commercial one. What’s needed is open source code that can be thoroughly examined that keeps all data on the device itself, encrypted, in a segregated storage space over which the user has control. And even then…you know, paper had a lot of benefits.

***

This week the UK Parliament passed the Data (Use and Access) bill, which now just needs a royal signature to become law. At its site, the Open Rights Group summarizes the worst provisions, mostly a list of ways the bill weakens citizens’ rights over their data.

Brexit was sold to the public on the basis of taking back national sovereignty. But, as then-MEP Felix Reda said the morning after the vote, national sovereignty is a fantasy in a globalized world. Decisions about data privacy can’t be made imagining they are only about *us*.

As ORG notes, the bill has led European Digital Rights to write to the European Commission asking for a review of the UK’s adequacy status. This decision, granted in 2020, was due to expire in June 2025, but the Commission granted a six-month extension to allow the bill’s passage to complete. In 2019, when the UK was at peak Brexit chaos, it seemed possible that the Conservative then-government would allow the UK to leave the EU with no deal in place, net.wars noted the risk to data flows. The current Labour government, with its AI and tech policy ambitions, ought to be more aware of the catastrophe losing adequacy would present. And yet.

Illustrations: Map from the Center for Reproductive Rights showing the current state of abortion rights across the US.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast and a regular guest on the TechGrumps podcast. Follow on Mastodon or Bluesky.

Deja news

At the first event organized by the University of West London group Women Into Cybersecurity, a questioner asked how the debates around the Internet have changed since I wrote the original 1997 book net.wars..

Not much, I said. Some chapters have dated, but the main topics are constants: censorship, freedom of speech, child safety, copyright, access to information, digital divide, privacy, hacking, cybersecurity, and always, always, *always* access to encryption. Around 2010, there was a major change when the technology platforms became big enough to protect their users and business models by opposing government intrusion. That year Google launched the first version of its annual transparency report, for example. More recently, there’s been another shift: these companies have engorged to the point where they need not care much about their users or fear regulatory fines – the stage Ed Zitron calls the rot economy and Cory Doctorow dubs enshittification.

This is the landscape against which we’re gearing up for (yet) another round of recursion. April 25 saw the passage of amendments to the UK’s Investigatory Powers Act (2016). These are particularly charmless, as they expand the circumstances under which law enforcement can demand access to Internet Connection Records, allow the government to require “exceptional lawful access” (read: backdoored encryption) and require technology companies to get permission before issuing security updates. As Mark Nottingham blogs, no one should have this much power. In any event, the amendments reanimate bulk data surveillance and backdoored encryption.

Also winding through Parliament is the Data Protection and Digital Information bill. The IPA amendments threaten national security by demanding the power to weaken protective measures; the data bill threatens to undermine the adequacy decision under which the UK’s data protection law is deemed to meet the requirements of the EU’s General Data Protection Regulation. Experts have already put that adequacy at risk. If this government proceeds, as it gives every indication of doing, the next, presumably Labour, government may find itself awash in an economic catastrophe as British businesses become persona-non-data to their European counterparts.

The Open Rights Group warns that the data bill makes it easier for government, private companies, and political organizations to exploit our personal data while weakening subject access rights, accountability, and other safeguards. ORG is particularly concerned about the impact on elections, as the bill expands the range of actors who are allowed to process personal data revealing political opinions on a new “democratic engagement activities” basis.

If that weren’t enough, another amendment also gives the Department of Work and Pensions the power to monitor all bank accounts that receive payments, including the state pension – to reduce overpayments and other types of fraud, of course. And any bank account connected to those accounts, such as landlords, carers, parents, and partners. At Computer Weekly, Bill Goodwin suggests that the upshot could be to deter landlords from renting to anyone receiving state benefits or entitlements. The idea is that banks will use criteria we can’t access to flag up accounts for the DWP to inspect more closely, and over the mass of 20 million accounts there will be plenty of mistakes to go around. Safe prediction: there will be horror stories of people denied benefits without warning.

And in the EU… Techcrunch reports that the European Commission (always more surveillance-happy and less human rights-friendly than the European Parliament) is still pursuing its proposal to require messaging platforms to scan private communications for child sexual abuse material. Let’s do the math of truly large numbers: billions of messages, even a teeny-tiny percentage of inaccuracy, literally millions of false positives! On Thursday, a group of scientists and researchers sent an open letter pointing out exactly this. Automated detection technologies perform poorly, innocent images may occur in clusters (as when a parent sends photos to a doctor), and such a scheme requires weakening encryption, and in any case, better to focus on eliminating child abuse (taking CSAM along with it).

Finally, age verification, which has been pending in the UK ever since at least 2016, is becoming a worldwide obsession. At least eight US states and the EU have laws mandating age checks, and the Age Verification Providers Association is pushing to make the Internet “age-aware persistently”. Last month, the BSI convened a global summit to kick off the work of developing a worldwide standard. These moves are the latest push against online privacy; age checks will be applied to *everyone*, and while they could be designed to respect privacy and anonymity, the most likely is that they won’t be. In 2022, the French data protection regulator, CNIL, found that current age verification methods are both intrusive and easily circumvented. In the US, Casey Newton is watching a Texas case about access to online pornography and age verification that threatens to challenge First Amendment precedent in the Supreme Court.

Because the debates are so familiar – the arguments rarely change – it’s easy to overlook how profoundly all this could change the Internet. An age-aware Internet where all web use is identified and encrypted messaging services have shut down rather than compromise their users and every action is suspicious until judged harmless…those are the stakes.

Illustrations: Angel sensibly smashes the ring that makes vampires impervious (in Angel, “In the Dark” (S01e03)).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon