net.wars

Predatory inclusion

The recent past is a foreign country; they view the world differently there.

At last week’s We Robot conference on technology, policy, and law, the indefatigably detail-oriented Sue Glueck was the first to call out a reference to the propagation of transparency and accountability by the “US and its allies” as newly out of date. From where we were sitting in Windsor, Ontario, its conjoined fraternal twin, Detroit, Michigan, was clearly visible just across the river. But: recent events.

As Ottawa law professor Teresa Scassa put it, “Before our very ugly breakup with the United States…” Citing, she Anu Bradford, she went on, “Canada was trying to straddle these two [US and EU] digital empires.” Canada’s human rights and privacy traditions seem closer to those of the EU, even though shared geography means the US and Canada are superficially more similar.

We’ve all long accepted that the “technology is neutral” claim of the 1990s is nonsense – see, just this week, Luke O’Brien’s study at Mother Jones of the far-right origins of the web-scraping facial recognition company Clearview AI. The paper Glueck called out, co-authored in 2024 by Woody Hartzog, wants US lawmakers to take a tougher approach to regulating AI and ban entirely some systems that are fundamentally unfair. Facial recognition, for example, is known to be inaccurate and biased, but improving its accuracy raises new dangers of targeting and weaponization, a reality Cynthia Khoo called “predatory inclusion”. If he were writing this paper now, Hartzog said, he would acknowledge that it’s become clear that some governments, not just Silicon Valley, see AI as a tool to destroy institutions. I don’t *think* he was looking at the American flags across the water.

Later, Khoo pointed out her paper on current negotiations between the US and Canada to develop a bilateral law enforcement data-sharing agreement under the US CLOUD Act. The result could allow US police to surveil Canadians at home, undermining the country’s constitutional human rights and privacy laws.

In her paper, Clare Huntington proposed deriving approaches to human relationships with robots from family law. It can, she argued, provide analogies to harms such as emotional abuse, isolation, addiction, invasion of privacy, and algorithmic discrimination. In response, Kate Darling, who has long studied human responses to robots, raised an additional factor exacerbating the power imbalance in such cases: companies, “because people think they’re talking to a chatbot when they’re really talking to a company.” That extreme power imbalance is what matters when trying to mitigate risk (see also Sarah Wynn-Williams’ recent book and Congressional testimony on Facebook’s use of data to target vulnerable teens).

In many cases, however, we are not agents deciding to have relationships with robots but what AJung Moon called “incops”, or “incidentally co-present”. In the case of the Estonian Starship delivery robots you can find in cities from San Francisco to Milton Keynes, that broad category includes human drivers, pedestrians, and cyclists who share their spaces. In a study, Adeline Schneider found that white men tended to be more concerned about damage to the robot, where others worried more about communication, the data they captured, safety, and security. Delivery robots are, however, typically designed with only direct users in mind, not others who may have to interact with it.

These are all social problems, not technological ones, as conference chair Kristen Thomasen observed. Carys Craig later modified it: technology “has compounded the problems”.

This is the perennial We Robot question: what makes robots special? What qualities require new laws? Just as we asked about the Internet in 1995, when are robots just new tools for old rope, and when do they bring entirely new problems? In addition, who is responsible in such cases? This was asked in a discussion of Beatrice Panattoni‘s paper on Italian proposals to impose harsher penalties for crime committed with AI or facilitated by robots. The pre-conference workshop raised the same question. We already know the answer: everyone will try to blame someone or everyone else. But in formulating a legal repsonse, will we tinker around the edges or fundamentally question the criminal justice system? Andrew Selbst helpfully summed up: “A law focusing on specific harms impedes a structural view.”

At We Robot 2012, it was novel to push lawyers and engineers to think jointly about policy and robots. Now, as more disciplines join the conversation, familiar Internet problems surface in new forms. Human-robot interaction is a four-dimensional version of human-computer interaction; I got flashbacks to old hacking debates when Elizabeth Joh wondered in response to Panattoni’s paper if transforming a robot into a criminal should be punished; and a discussion of the use of images of medicalized children for decades in fundraising invoked publicity rights and tricky issues of consent.

Also consent-related, lawyers are starting to use generative AI to draft contracts, a step that Katie Szilagyi and Marina Pavlović suggested further diminishes the bargaining power already lost to “clickwrap”. Automation may remove our remaining ability to object from more specialized circumstances than the terms and conditions imposed on us by sites and services. Consent traditionally depends on a now-absent “meeting of minds”.

The arc of We Robot began with enthusiasm for robots, which waned as big data and generative AI became players. Now, robots/AI are appearing as something being done to us.

Illustrations: Detroit, seen across the river from Windsor, Ontario.

Catoptromancy

It’s a commonly held belief that technology moves fast, and law slowly. This week’s We Robot workshop day gave the opposite impression: these lawyers are moving ahead, while the technology underlying robots is moving slower than we think.

A mainstay of this conference over the years has been Bill Smart‘s and Cindy Grimm‘s demonstrations of the limitations of the technologies that make up robots. This year, that gambit was taken up by Jason Millar and AJung Moon. Their demonstration “robot” comprised six people – one brain, four sensors, and one color sensor. Ordering it to find the purple shirt quickly showed that robot programming isn’t getting any easier. The human “sensors” can receive useful information only as far as their outstretched fingertips, and even then the signal they receive is minimal.

“Many of my students program their robots into a ditch and can’t understand why,” Moon said. It’s the required specificity. For one thing, a color sensor doesn’t see color; it sends a stream of numeric values. It’s all 1s and 0s and tiny engineering decisions whose existence is never registered at the policy level but make all the difference. One of her students, for example, struggled with a robot that kept missing the croissant it was supposed to pick up by 30 centimeters. The explanation turned out to be that the sensor was so slow that the robot was moving a half-second too early, based on historical information. They had to insert a pause before the robot could get it right.

So much of the way we talk about robots and AI misrepresents those inner workings. A robot can’t “smell honey”; it merely has a sensor that’s sensitive to some chemicals and not others. It can’t “see purple” if its sensors are the usual red, green, blue. Even green may not be identifiable to an RGB sensor if the lighting is such that reflections make a shiny green surface look white. Faster and more diverse sensors won’t change the underlying physics. How many lawmakers understand this?

Related: what does it mean to be a robot? Most people attach greater intelligence to things that can move autonomously. But a modern washing machine is smarter than a Roomba, while an iPhone is smarter than either but can’t affect the physical world, as Smart observed at the very first We Robot, in 2012.

This year we are in Canada – to be precise, in Windsor, Ontario, looking across the river to Detroit, Michigan. Canadian law, like the country itself, is a mosaic: common law (inherited from Britain), civil law (inherited from France), and myriad systems of indigenous peoples’ law. Much of the time, said Suzie Dunn, new technology doesn’t require new law so much as reinterpretation and, especially, enforcement of existing law.

“Often you can find criminal law that already applies to digital spaces, but you need to educate the legal system how to apply it,” she said. Analogous: in the late 1990s, editors of the technology section at the Daily Telegraph had a deal-breaking question: “Would this still be a story if it were about the telephone instead of the Internet?”

We can ask that same question about proposed new law. Dunn and Katie Szilagyi asked what robots and AI change that requires a change of approach. They set us to consider scenarios to study this question: an autonomous vehicle kills a cyclist; an autonomous visa system denies entry to a refugee who was identified in her own country as committing a crime when facial recognition software identifies her in images of an illegal LGBTQ protest. In the first case, it’s obvious that all parties will try to blame someone – or everyone – else, probably, as Madeleine Clare Elish suggested in 2016, on the human driver, who becomes the “moral crumple zone”. The second is the kind of case the EU’s AI Act sought to handle by giving individuals the right to meaningful information about the automated decision made about them.

Nadja Pelkey, a curator at Art Windsor-Essex, provided a discussion of AI in a seemingly incompatible context. Citing Georges Bataille, who in 1929 saw museums as mirrors, she invoked the word “catoptromancy”, the use of mirrors in mystical divination. Social and political structures are among the forces that can distort the reflection. So are the many proliferating AI tools such as “personalized experiences” and other types of automation, which she called “adolescent technologies without legal or ethical frameworks in place”.

Where she sees opportunities for AI is in what she called the “invisible archives”. These include much administrative information, material that isn’t digitized, ephemera such as exhibition posters, and publications. She favors small tools and small private models used ethically so they preserve the rights of artists and cultural contexts, and especially consent. In a schematic she outlined a system that can’t be scraped, that allows data to be withdrawn as well as added, and that enables curiosity and exploration. It’s hard to imagine anything less like the “AI” being promulgated by giant companies. *That* type of AI was excoriated in a final panel on technofascism and extractive capitalism.

It’s only later I remember that Pelkey also said that catoptromancy mirrors were first made of polished obsidian.

In other words, black mirrors.

Illustrations: Divination mirror made of polished obsidian by artisans of the Aztec Empire of Mesoamerica between the 15th and 16th centuries (via Wikimedia

A short history of We Robot 2012-

On the eve of We Robot 2025, here are links to my summaries of previous years. 2014 is missing; I didn’t make it that year for family reasons. There was no conference in 2024 in order to move the event back to its original April schedule (covid caused its move to September in 2020). These are my personal impressions; nothing I say here should be taken as representing the conference, its founders, its speakers, or their institutions.

We Robot was co-founded by Michael Froomkin, Ryan Calo, and Ian Kerr to bring together lawyers and engineers to think early about the coming conflicts in robots, law, and policy.

2024 No conference.

2023 The end of cool. After struggling to design a drone delivery service that had any benefits over today’s cycling couriers, we find ourselves less impressed by robot that can do somersaults but not do anything useful.

2022 Insert a human. “Robots” are now “sociotechnical systems”.

Workshop day Coding ethics. The conference struggles to design an ethical robot.

2021 Plausible diversions. How will robots rehape human space?

Workshop day Is the juice worth the squeeze?. We think about how to regulate delivery robots, which will likely have no user-serviceable parts. Title from Woody Hartzog.

2020 (virtual) The zero on the phone. AI exploitation becomes much more visible.

2019 Math, monsters, and metaphors. The trolley problem is dissected; the true danger is less robots than the “pile of math that does some stuff”.

Workshop day The Algernon problem. New participants remind that robots/AI are carrying out the commands of distant owners.

2018 Deception. The conference tries to tease out what makes robots different and revisits Madeleine Clare Elish’s moral crumple zones after the first pedestrian death by self-driving car.

Workshop day Late, noisy, and wrong. Engineers Bill Smart and Cindy Grimm explain why sensors never capture what you think they’re capturing and how AI systems use their data.

2017 Have robot, will legislate. Discussion of risks this year focused on the intermediate sitaution, when automation and human norms clash.

2016 Humans all the way down. Madeline Clare Elish introduces “moral crumple zones”.

Workshop day: The lab and the world. Bill Smart uses conference attendees in formation to show why building a robot is difficult.

2015 Multiplicity. A robot pet dog begs its owner for an upgraded service subscription.

2014 Missed conference

2013 Cautiously apocalyptic. Diversity of approaches to regulation will be needed to handle the diversity of robots.

2012 A really fancy hammer with a gun. Unsentimental engineer Bill Smart provided the title.

Cognitive dissonance

The annual State of the Net, in Washington, DC, always attracts politically diverse viewpoints. This year was especially divided.

Three elements stood out: the divergence between the only remaining member of the Privacy and Civil Liberties Oversight Board (PCLOB) and a recently-fired colleague; a contentious panel on content moderation; and the yay, American innovation! approach to regulation.

As noted previously, on January 29 the days-old Trump administration fired PCLOB members Travis LeBlanc, Ed Felten, and chair Sharon Bradford Franklin; the remaining seat was already empty.

Not to worry, remaining member Beth Williams, said. “We are open for business. Our work conducting important independent oversight of the intelligence community has not ended just because we’re currently sub-quorum.” Flying solo she can greenlight publication, direct work, and review new procedures and policies; she can’t start new projects. A review is ongoing of the EU-US Privacy Framework under Executive Order 14086 (2022). Williams seemed more interested in restricting government censorship and abuse of financial data in the name of combating domestic terrorism.

Soon afterwards, LeBlanc, whose firing has him considering “legal options”, told Brian Fung that the outcome of next year’s reauthorization of Section 702, which covers foreign surveillance programs, keeps him awake at night. Earlier, Williams noted that she and Richard E. DeZinno, who left in 2023, wrote a “minority report” recommending “major” structural change within the FBI to prevent weaponization of S702.

LeBlanc is also concerned that agencies at the border are coordinating with the FBI to surveil US persons as well as migrants. More broadly, he said, gutting the PCLOB costs it independence, expertise, trustworthiness, and credibility and limits public options for redress. He thinks the EU-US data privacy framework could indeed be at risk.

A friend called the panel on content moderation “surreal” in its divisions. Yael Eisenstat and Joel Thayer tried valiantly to disentangle questions of accountability and transparency from free speech. To little avail: Jacob Mchangama and Ari Cohn kept tangling them back up again.

This largely reflects Congressional debates. As in the UK, there is bipartisan concern about child safety – see also the proposed Kids Online Safety Act – but Republicans also separately push hard on “free speech”, claiming that conservative voices are being disproportionately silenced. Meanwhile, organizations that study online speech patterns and could perhaps establish whether that’s true are being attacked and silenced.

Eisenstat tried to draw boundaries between speech and companies’ actions. She can still find on Facebook the sme Telegram ads containing illegal child sexual abuse material that she found when Telegram CEO Pavel Durov was arrested. Despite violating the terms and conditions, they bring Meta profits. “How is that a free speech debate as opposed to a company responsibility debate?”

Thayer seconded her: “What speech interests do these companies have other than to collect data and keep you on their platforms?”

By contrast, Mchangama complained that overblocking – that is, restricting legal speech – is seen across EU countries. “The better solution is to empower users.” Cohn also disliked the UK and European push to hold platforms responsible for fulfilling their own terms and conditions. “When you get to whether platforms are living up to their content moderation standards, that puts the government and courts in the position of having to second-guess platforms’ editorial decisions.”

But Cohn was talking legal content; Eisenstat was talking illegal activity: “We’re talking about distribution mechanisms.” In the end, she said, “We are a democracy, and part of that is having the right to understand how companies affect our health and lives.” Instead, these debates persist because we lack factual knowledge of what goes on inside. If we can’t figure out accountability for these platforms, “This will be the only industry above the law while becoming the richest companies in the world.”

Twenty-five years after data protection became a fundamental right in Europe, the DC crowd still seem to see it as a regulation in search of a deal. Representative Kat Cammack (R-FL), who described herself as the “designated IT person” on the energy and commerce committee, was particularly excited that policy surrounding emerging technologies could be industry-driven, because “Congress is *old*!” and DC is designed to move slowly. “There will always be concerns about data and privacy, but we can navigate that. We can’t deter innovation and expect to flourish.”

Others also expressed enthusiasm for “the great opportunities in front of our country”, compared the EU’s Digital Markets Act to a toll plaza congesting I-95. Samir Jain, on the AI governance panel, suggested the EU may be “reconsidering its approach”. US senator Marsha Blackburn (R-TN) highlighted China’s threat to US cybersecurity without noting the US’s own goal, CALEA.

On that same AI panel, Olivia Zhu, the Assistant Director for AI Policy for the White House Office of Science and Technology Policy, seemed more realistic: “Companies operate globally, and have to do so under the EU AI Act. The reality is they are racing to comply with [it]. Disengaging from that risks a cacophony of regulations worldwide.”

Shortly before, Johnny Ryan, a Senior Fellow at the Irish Council for Civil Liberties posted: “EU Commission has dumped the AI Liability Directive. Presumably for “innovation”. But China, which has the toughest AI law in the world, is out innovating everyone.”

Illustrations: Kat Cammack (R-FL) at State of the Net 2025.

The AI moment

“Why are we still talking about digital transformation?” The speaker was convening a session at last weekend’s UK Govcamp, an event organized by and for civil servants with an interest in digital stuff.

“Because we’ve failed?” someone suggested. These folks are usually *optimists*.

Govcamp is a long-running tradition that began as a guerrilla effort in 2008. At the time, civil servants wanting to harness new technology in the service of government were so thin on the ground they never met until one of them, Jeremy Gould, convened the first Govcamp. These are people who are willing to give up a Saturday in order to do better at their jobs working for us. All hail.

It’s hard to remember now, nearly 15 years on, the excitement in 2010 when David Cameron’s incoming government created the Government Digital Service and embedded it into the Cabinet Office. William Heath immediately ended the Ideal Government blog he’d begun writing in 2004 to press insistently for better use of digital technologies in government. The government had now hired all the people he could have wanted it to, he said, and therefore, “its job is done”.

Some good things followed: tilting government procurement to open the way for smaller British companies, consolidating government publishing, other things less visible but still important. Some data became open. This all has improved processes like applying for concessionary travel passes and other government documents, and made government publishing vastly more usable. The improvement isn’t universal: my application last year to renew my UK driver’s license was sent back because my signature strayed outside the box provided for it.

That’s just one way the business of government doesn’t feel that different. The whole process of developing legislation – green and white papers, public consultations, debates, and amendments – marches on much as it ever has, though with somewhat wider access because the documents are online. Thoughts about how to make it more participatory were the subject of a teacamp in 2013. Eleven years on, civil society is still reading and responding to government consultations in the time-honored way, and policy is still made by the few for the many.

At Govcamp, the conversation spread between the realities of their working lives and the difficulties systems posed for users – that is, the rest of us. “We haven’t removed those little frictions,” one said, evoking the old speed comparisons between Amazon (delivers tomorrow or even today) and the UK government (delivers in weeks, if not months).

“People know what good looks like,” someone else said, in echoing that frustration. That’s 2010-style optimism, from when Amazon product search yielded useful results, search engines weren’t spattered with AI slime and blanketed with ads, today’s algorithms were not yet born, and customer service still had a heartbeat. Here in 2025, we’re all coming up against rampant enshittification, with the result that the next cohort of incoming young civil servants *won’t* know any more what “good” looks like. There will be a whole new layer of necessary education.

Other comments: it’s evolution, not transformation; resistance to change and the requirement to ask permission are embedded throughout the culture; usability is still a problem; trying to change top-down only works in a large organization if it sets up an internal start-up and allows it to cannibalize the existing business; not enough technologists in most departments; the public sector doesn’t have the private sector option of deciding what to ignore; every new government has a new set of priorities. And: the public sector has no competition to push change.

One suggestion was that technological change happens in bursts – punctuated equilibrium. That sort of fits with the history of changing technological trends: computing, the Internet, the web, smartphones, the cloud. Today, that’s “AI”, which prime minister Keir Starmer announced this week he will mainline into the UK’s veins “for everything from spotting potholes to freeing up teachers to teach”.

The person who suggested “punctuated equilibrium” added: “Now is a new moment of change because of AI. It’s a new ‘GDS moment’.” This is plausible in the sense that new paradigms sometimes do bring profound change. Smartphones changed life for homeless people. On the other hand, many don’t do much. Think audio: that was going to be a game-changer, and yet after years of loss-making audio assistants, most of us are still typing.

So is AI one of those opportunities? Many brought up generative AI’s vast consumption of energy and water and rampant inaccuracy. Starmer, like Rishi Sunak before him, seems to think AI can make Britain the envy of other major governments.

Complex systems – such as digital governance – don’t easily change the flow of information or, therefore, the flow of power. It can take longer than most civil servants’ careers. Organizations like Mydex, which seeks to up-end today’s systems to put users in control, have been at work for years now. The upcoming digital identity framework has Mydex chair Alan Mitchell optimistic that the government’s digital identity framework is a breakthrough. We’ll see.

One attendee captured this: “It doesn’t feel like the question has changed from more efficient bureaucracy to things that change lives.” Said another in response, “The technology is the easy bit.”

Illustrations: Sir Humphrey Appleby (Nigel Hawthorne), Bernard Woolley (Derek Fowldes), and Jim Hacker (Paul Eddington) arguing over cultural change in Yes, Minister.

Beware the duck

Once upon a time, “convergence” was a buzzword. That was back in the days when audio was on stereo systems, television was on a TV, and “communications” happened on phones that weren’t computers. The word has disappeared back into its former usage pattern, but it could easily be revived to describe what’s happening to content as humans dive into using generative tools.

Said another way. Roughly this time last year, the annual technology/law/pop culture conference Gikii was awash in (generative) AI. That bubble is deflating, but in the experiments that nonetheless continue a new topic more worthy of attention is emerging: artificial content. It’s striking because what happens at this gathering, which mines all types of popular culture for cues for serious ideas, is often a good guide to what’s coming next in futurelaw.

That no one dared guess which of Zachary Cooper‘s pair of near-identicalaudio clips was AI-generated, and which human-performed was only a starting point. One had more static? Cooper’s main point: “If you can’t tell which clip is real, then you can’t decide which one gets copyright.” Right, because only human creations are eligible (although fake bands can still scam Spotify).

Cooper’s brief, wild tour of the “generative music underground” included using AI tools to create songs whose content is at odds with their genre, whole generated albums built by a human producer making thousands of tiny choices, and the new genre “gencore”, which exploits the qualities of generated sound (Cher and Autotune on steroids). Voice cloning, instrument cloning, audio production plugins, “throw in a bass and some drums”….

Ultimately, Cooper said, “The use of generative AI reveals nothing about the creative relationship to work; it destabilizes the international market by having different authorship thresholds; and there’s no means of auditing any of it.” Instead of uselessly trying to enforce different rights predicated on the use or non-use of a specific set of technologies, he said, we should tackle directly the challenges new modes of production pose to copyright. Precursor: the battles over sampling.

Soon afterwards, Michael Veale was showing us Civitai, an Idaho-based site offering open source generative AI tools, including fine-tuned models. “Civitai exists to democratize AI media creation,” the site explains. “Everything has a valid legal purpose,” Veale said, but the way capabilities can be retrained and chained together to create composites makes it hard to tell which tools, if any, should be taken down, even for creators (see also the puzzlement as Redditors try to work this out). Even environmental regulation can’t help, as one attendee suggested: unlike large language models, these smaller, fine-tuned models (as Jon Crowcroft and I surmised last year would be the future) are efficient; they can run on a phone.

Even without adding artificial content there is always an inherent conflict when digital meets an analog spectrum. This is why, Andy Phippen said, the threshold of 18 for buying alcohol and cigarettes turns into a real threshold of 25 at retail checkouts. Both software and humans fail at determining over-or-under-18, and retailers fear liability. Online age verification as promoted in the Online Safety Act will not work.

If these blurred lines strain the limits of current legal approaches, others expose gaps in the law. Andrea Matwyshyn, for example, has been studying parallels I’ve also noticed between early 20th century company towns and today’s tech behemoths’ anti-union, surveillance-happy working practices. As a result, she believes that regulatory authorities need to start considering closely the impact of data aggregation when companies merge and look for company town-like dynamics”.

Andelka Phillips parodied the overreach of app contracts by imagining the EULA attached to “ThoughtReader app”. A sample clause: “ThoughtReader may turn on its service at any time. By accepting this agreement, you are deemed to accept all monitoring of your thoughts.” Well, OK, then. (I also had a go at this here, 19 years ago.)

Emily Roach toured the history of fan fiction and the law to end up at Archive of Our Own, a “fan-created, fan-run, nonprofit, noncommercial archive for transformative fanworks, like fanfiction, fanart, fan videos, and podfic”, the idea being to ensure that the work fans pour their hearts into has a permanent home where it can’t be arbitrarily deleted by corporate owners. The rules are strict: not so much as a “buy me a coffee” tip link that could lead to a court-acceptable claim of commercial use.

History, the science fiction writer Charles Stross has said, is the science fiction writer’s secret weapon. Also at Gikii: Miranda Mowbray unearthed the 18th century “Digesting Duck” automaton built by Jacques de Vauconson. It was a marvel that appeared to ingest grain and defecate waste and that in its day inspired much speculation about the boundaries between real and mechanical life. Like the amazing ancient Greek automata before it, it was, of course, a purely mechanical fake – it stored the grain in a small compartment and released pellets from a different compartment – but today’s humans confused into thinking that sentences mean sentience could relate.

Illustrations: One onlooker’s rendering of his (incorrect) idea of the interior of Jacques de Vaucanson’s Digesting Duck (via Wikimedia).

Soap dispensers and Skynet

In the TV series Breaking Bad, the weary ex-cop Mike Ehrmantraut tells meth chemist Walter White : “No more half measures.” The last time he took half measures, the woman he was trying to protect was brutally murdered.

Apparently people like to say there are no dead bodies in privacy (although this is easily countered with ex-CIA director General Michael Hayden’s comment, “We kill people based on metadata”). But, as Woody Hartzog told a Senate committee hearing in September 2023, summarizing work he did with Neil Richards and Ryan Durrie, half measures in AI/privacy legislation are still a bad thing.

A discussion at Privacy Law Scholars last week laid out the problems. Half measures don’t work. They don’t prevent societal harms. They don’t prevent AI from being deployed where it shouldn’t be. And they sap the political will to follow up with anything stronger.

In an article for The Brink, Hartzog said, “To bring AI within the rule of law, lawmakers must go beyond half measures to ensure that AI systems and the actors that deploy them are worthy of our trust,”

He goes on to list examples of half measures: transparency, committing to ethical principles, and mitigating bias. Transparency is good, but doesn’t automatically bring accountability. Ethical principles don’t change business models. And bias mitigation to make a technology nominally fairer may simultaneously make it more dangerous. Think facial recognition: debias the system and improve its accuracy for matching the faces of non-male, non-white people, and then it’s used to target those same people with surveillance.

Or, bias mitigation may have nothing to do with the actual problem, an underlying business model, as Arvind Narayanan, author of the forthcoming book AI Snake Oil, pointed out a few days later at an event convened by the Future of Privacy Forum. In his example, the Washington Post reported in 2019 on the case of an algorithm intended to help hospitals predict which patients will benefit from additional medical care. It turned out to favor white patients. But, Narayanan said, the system’s provider responded to the story by saying that the algorithm’s cost model accurately predicted the costs of additional health care – in other words, the algorithm did exactly what the hospital wanted it to do.

“I think hospitals should be forced to use a different model – but that’s not a technical question, it’s politics.”.

Narayanan also called out auditing (another Hartzog half measure). You can, he said, audit a human resources system to expose patterns in which resumes it flags for interviews and which it drops. But no one ever commissions research modeled on the expensive random controlled testing common in medicine that follows up for five years to see if the system actually picks good employees.

Adding confusion is the fact that “AI” isn’t a single thing. Instead, it’s what someone called a “suitcase term” – that is, a container for many different systems built for many different purposes by many different organizations with many different motives. It is absurd to conflate AGI – the artificial general intelligence of science fiction stories and scientists’ dreams that can surpass and kill us all – with pattern-recognizing software that depends on plundering human-created content and the labeling work of millions of low-paid workers

To digress briefly, some of the AI in that suitcase is getting truly goofy. Yum Brands has announced that its restaurants, which include Taco Bell, Pizza Hut, and KFC, will be “AI-first”. Among Yum’s envisioned uses, the company tells Benj Edwards at Ars Technica, are being able to ask an app what temperature to set the oven. I can’t help suspecting that the real eventual use will be data collection and discriminatory pricing. Stuff like this is why Ed Zitron writes postings like The Rot-Com Bubble, which hypothesizes that the reason Internet services are deteriorating is that technology companies have run out of genuinely innovative things to sell us.

That you cannot solve social problems with technology is a long-held truism, but it seems to be especially true of the messy middle of the AI spectrum, the use cases active now that rarely get the same attention as the far ends of that spectrum.

As Neil Richards put it at PLSC, “The way it’s presented now, it’s either existential risk or a soap dispenser that doesn’t work on brown hands when the real problem is the intermediate level of societal change via AI.”

The PLSC discussion included a list of the ways that regulations fail. Underfunded enforcement. Regulations that are pure theater. The wrong measures. The right goal, but weakly drafted legislation. Make the regulation ambiguous, or base it on principles that are too broad. Choose conflicting half-measures – for example, require transparency but add the principle that people should own their own data.

Like Cristina Caffarra a week earlier at CPDP, Hartzog, Richards, and Durrie favor finding remedies that focus on limiting abuses of power. Full measures include outright bans, the right to bring a private cause of action, imposing duties of “loyalty, care, and confidentiality”, and limiting exploitative data practices within these systems. Curbing abuses of power, as he says, is nothing new. The shiny new technology is a distraction.

Or, as Narayanan put it, “Broken AI is appealing to broken institutions.”

Illustrations: Mike (Jonathan Banks) telling Walt (Bryan Cranston) in Breaking Bad (S03e12) “no more half measures”.

Admiring the problem

In one sense, the EU’s barely dry AI Act and the other complex legislation – the Digital Markets Act, Digital Services Act, GDPR, and so on -= is a triumph. Flawed it may be, but it’s a genuine attempt to protect citizens’ human rights against a technology that is being birthed with numerous trigger warnings. The AI-with-everything program at this year’s Computers, Privacy, and Data Protection, reflected that sense of accomplishment – but also the frustration that comes with knowing that all legislation is flawed, all technology companies try to game the system, and gaps will widen.

CPDP has had these moments before: new legislation always comes with a large dollop of frustration over the opportunities that were missed and the knowledge that newer technologies are already rushing forwards. AI, and the AI Act, more or less swallowed this year’s conference as people considered what it says, how it will play internationally, and the necessary details of implementation and enforcement. Two years at this event, inadequate enforcement of GDPR was a big topic.

The most interesting future gaps that emerged this year: monopoly power, quantum sensing, and spatial computing.

For at least 20 years we’ve been hearing about quantum computing’s potential threat to public key encryption – that day of doom has been ten years away as long as I can remember, just as the Singularity is always 30 years away. In the panel on quantum sensing, Chris Hoofnagle argued that, as he and Simson Garfinkel recently wrote at Lawfare and in their new book, quantum cryptanalysis is overhyped as a threat (although there are many opportunities for quantum computing in chemistry and materials science). However, quantum sensing is here now, works (because qubits are fragile), and is cheap. There is plenty of privacy threat here to go around: quantum sensing will benefit entirely different classes of intelligence, particularly remote, undetectable surveillance.

Hoofnagle and Garfinkel are calling this MASINT, for machine and signature intelligence, and believe that it will become very difficult to hide things, even at a national level. In Hoofnagle’s example, a quantum sensor-equipped drone could fly over the homes of parolees to scan for guns.

Quantum sensing and spatial computing have this in common: they both enable unprecedented passive data collection. VR headsets, for example, collect all sorts of biomechanical data that can be mined more easily for personal information than people expect.

Barring change, all that data will be collected by today’s already-powerful entities.

The deeper level on which all this legislation fails particularly exercised Cristina Caffarra, the co-founder of the Centre for Economic Policy Research in the panel on AI and monopoly, saying that all this legislation is basically nibbling around the edges because they do not touch the real, fundamental problem of the power being amassed by the handful of companies who own the infrastructure.

“It’s economics 101. You can have as much downstream competition as you like but you will never disperse the power upstream.” The reports and other material generated by government agencies like the UK’s Competition and Markets Authority are, she says, just “admiring the problem”.

A day earlier, the Novi Sad professor Vladen Joler had already pointed out the fundamental problem: at the dawn of the Internet anyone could start with nothing and build something; what we’re calling “AI” requires billions in investment, so comes pre-monopolized. Many people dismiss Europe for not having its own homegrown Big Tech, but that overlooks open technologies: the Raspberry Pi, Linux, and the web itself, which all have European origins.

In 2010, the now-departing MP Robert Halfon (Con-Harlow) said at an event on reining in technology companies that only a company the size of Google – not even a government – could create Street View. Legend has it that open source geeks heard that as a challenge, and so we have OpenStreetMap. Caffarra’s fiery anger raises the question: at what point do the infrastructure providers become so entrenched that they could choke off an open source competitor at birth? Caffarra wants to build a digital public interest infrastructure using the gaps where Big Tech doesn’t yet have that control.

The Dutch Groenlinks MEP Kim van Sparrentak offered an explanation for why the AI Act doesn’t address market concentration: “They still dream of a European champion who will rule the world.” An analogy springs to mind: people who vote for tax cuts for billionaires because one day that might be *them*. Meanwhile, the UK’s Competition and Markets Authority finds nothing to investigate in Microsoft’s partnership with the French AI startup Mistral.

Van Sparrentak thinks one way out is through public procurement; adopt goals of privacy and sustainability, and support European companies. It makes sense; as the AI Now Institute’s Amba Kak, noted, at the moment almost everything anyone does digitally has to go through the systems of at least one Big Tech company.

As Sebastiano Toffaletti, head of the secretariat of the European SME Alliance, put it, “Even if you had all the money in the world, these guys still have more data than you. If you don’t and can’t solve it, you won’t have anyone to challenge these companies.”

Illustrations: Vladen Joler shows Anatomy of an AI System, a map he devised with Kate Crawford of the human labor, data, and planetary resources that are extracted to make “AI”.

The second greatest show on earth

There is this to be said for seeing your second total eclipse of the sun: if the first one went well, you can be more relaxed about what you get to see. In 2017, sitting in Centennial Park in Nashville, we saw everything. So in Dallas in 2024, I could tell myself, “It will be interesting even if we can’t see the sun.”

As it happened, we had cloud with lots of breaks. The cloud obscured such phenomena as Bailey’s Beads and the diamond ring – but the play of light on the broken clouds as the sun popped back out was amazing all by itself. The corona-surrounded sun playing peek-a-boo with us was stunningly beautiful. And all too soon it was over. It seemed shorter than 2017, even though totality was nearly twice as long – 3:52 compared to about two minutes.

One thing definitely missing from Nashville was a phenomenon that’s less often discussed: the 360-degree sunset all around the horizon. Sitting in Dallas surrounded by buildings, the horizon was not visible as it was in that Nashville park.

On Sunday, April 7, it seemed like half the country was moving into position for today in a process that involved placing a bet on the local weather. I had friends scattered in Vermont, Montreal, and several locations in upstate New York. Our intermittent cloud compared favorably with at least one of the New York locations. Daytime darkness and watching and listening to animals’ reactions is still interesting…but it remains frustrating to know that the Big Show is going on without you.

The hundreds of photos on show hide the real thrill of seeing totality: the sense of connection to humanity past, present, and future, and across the animal kingdom. The strangers around you become part of your life, however briefly. The inexorable movements of earth, sun, and moon put us all in our place.

Small data

Shortly before this gets posted, Jon Crowcroft and I will have presented this year’s offering at Gikii, the weird little conference that crosses law, media, technology, and pop culture. This is what we will possibly may have said, as I understand it, with some added explanation for the slightly less technical audience I imagine will read this.

Two years ago, a team of four researchers – Timnit Gebru, Emily Bender, Margaret Mitchell (writing as Shmargaret Shmitchell), and Angelina McMillan-Major – wrote a now-famous paper called On the Dangers of Stochastic Parrots (PDF) calling into question the usefulness of the large language models (LLMs) that have caused so much ruckus this year. The “Stochastic Four” argued instead of small models built on carefully curated data: less prone to error, less exploitive of people’s data, less damaging to the planet. Gebru got fired over this paper; Google also fired Mitchell soon afterwards. Two years later, neural networks pioneer Geoff Hinton quit Google in order to voice similar concerns.

Despite the hype, LLMs have many problems. They are fundamentally an extractive technology and are resource-intensive. Building LLMs requires massive amounts of training data; so far, the companies have been unwilling to acknowledge their sources, perhaps because (as is happening already) they fear copyright suits.

More important from a technical standpoint, is the issue of model collapse; that is, models degrade when they begin to ingest synthetic AI-generated data instead of human input. We’ve seen this before with Google Flu Trends, which degraded rapidly as incoming new search data included many searches on flu-like symptoms that weren’t actually flu, and others that simply reflected the frequency of local news coverage. “Data pollution” as LLM-generated data fills the web, will mean that the web will be an increasingly useless source of training data for future generations of generative AI. Lots more noise, drowning out the signal (in the photo above, the signal would be the parrot).

Instead, if we follow the lead of the Stochastic Four, the more productive approach is small data – small, carefully curated datasets that train models to match specific goals. Far less resource-intensive, far fewer issues with copyright, appropriation, and extraction.

We know what the LLM future looks like in outline: big, centralized services, because no one else will be able to amass enough data. In that future, surveillance capitalism is an essential part of data gathering. SLM futures could look quite different: decentralized, with realigned incentives. At one point, we wanted to suggest that small data could bring the end of surveillance capitalism; that’s probably an overstatement. But small data could certainly create the ecosystem in which the case for mass data collection would be less compelling.

Jon and I imagined four primary alternative futures: federation, personalization, some combination of those two, and paradigm shift.

Precursors to a federated small data future already exist; these include customer service chatbots, predictive text assistants. In this future, we could imagine personalized LLM servers designed to serve specific needs.

An individualized future might look something like I suggested here in March: a model that fits in your pocket that is constantly updated with material of your own choosing. Such a device might be the closest yet to Vannevar Bush’s 1945 idea of the Memex (PDF), updated for the modern era by automating the dozens of secretary-curators he imagined doing the grunt work of labeling and selection. That future again has precursors in techniques for sharing the computation but not the data, a design we see proposed for health care, where the data is too sensitive to share unless there’s a significant public interest (as in pandemics or very rare illnesses), or in other data analysis designs intended to protect privacy.

In 2007, the science fiction writer Charles Stross suggested something like this, though he imagined it as a comprehensive life log, which he described as a “google for real life”. So this alternative future would look something like Stross’s pocket $10 life log with enhanced statistics-based data analytics.

Imagining what a paradigm shift might look like is much harder. That’s the kind of thing science fiction writers do; it’s 16 years since Stross gave that life log talk. However, in his 2018 history of advertising, The Attention Merchants, Columbia professor Tim Wu argued that industrialization was the vector that made advertising and its grab for our attention part of commerce. A hundred and fifty-odd years later, the centralizing effects of industrialization are being challenged starting with energy via renewables and local power generation and social media via the fediverse. Might language models also play their part in bringing a new, more collaborative and cooperative society?

It is, in other words, just possible that the hot new technology of 2023 is simply a dead end bringing little real change. It’s happened before. There have been, as Wu recounts, counter-moves and movements before, but they didn’t have the technological affordances of our era.

In the Q&A that followed, Miranda Mowbray pointed out that companies are trying to implement the individualized model, but that it’s impossible to do unless there are standardized data formats, and even then hard to do at scale.

Illustrations: Spot the parrot seen in a neighbor’s tree.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Author Wendy M. GrossmanPosted on September 8, 2023September 8, 2023Categories AI, Events, New tech, old knowledgeTags gikii4 Comments on Small data

Posts pagination

Page 1 Page 2 Next page