Safe

That didn’t take long. Since last week’s fret about AI startups ignoring the robots.txt convention, Thomas Claburn has reported at The Register that Cloudflare has developed a scraping prevention tool that identifies and blocks “content extraction” bots attempting to crawl sites at scale.

It’s a stopgap, not a solution. As Cloudflare’s announcement makes clear, the company knows there will be pushback; given these companies’ lack of interest in following existing norms, blocking tools versus scraping bots is basically the latest arms race (previously on this plotline: spam). Also, obviously, the tool only works on sites that are Cloudflare customers. Although these include many of the web’s largest sites, there are hundreds of millions more that won’t, don’t, or can’t pay for its services. If we want to return control to site owners, we’re going to need a more permanent and accessible solution.

In his 1999 book Code and Other Laws of Cyberspace, Lawrence Lessig finds four forms of regulation: norms, law, markets, and architecture. Norms are failing. Markets will just mean prolonged arms races. We’re going to need law and architecture.

***

We appear to be reaching peak “AI” hype, defined by (as at the peak of app hype) the increasing absurdity of things venture capitalists seem willing to fund. I recall reading the comment that at the peak of app silliness a lot of startups were really just putting a technological gloss on services that young men would previously have had supplied by their mothers. The AI bubble seems even less productive of long-term value, calling things “AI” that are not at all novel, and proposing “AI” to patch problems that call for real change.

As an example of the first of those, my new washing machine has a setting called “AI patterns”. The manual explains: it reorders the preset programs on the machine’s dial so the ones you use most appear first. It’s not stupid (although I’ve turned it off anyway, along with the wifi and “smart” features I would rather not pay for), but let’s call it what it is: customizing a menu.

As an example of the second…at Gizmodo, Maxwell Zeff reports that SoftBank is claiming to have developed an “emotion canceling” AI that “alters angry voices into calm ones”. The use SoftBank envisages is to lessen the stress on call center employees by softening the voices of angry customers without changing their actual words. As people pointed out on Mastodon after the article was posted there, there are plenty of smarter ways to reduce those workers’ stress. Like giving them better employment conditions, or – and here’s a really radical thought – designing your services and products so your customers aren’t so frustrated and angry. What this software does is just falsify the sound. My guess is that if it has any effect at all, it will be to make customers even angrier and more frustrated. More anger in the world. Great.

***

Oh! Sarcasm, even if only slight! At the Guardian, Ned Carter Miles reports on “emotional AI” (can we say “oxymoron”?). Among his examples is a team at the University of Groningen that is teaching an AI to recognize sarcasm using scenes from US sitcoms such as Friends and The Big Bang Theory. Even absurd-sounding research can be a good thing. Still, I’m not sure how good a guide sitcoms are to identifying emotions in real-world contexts, even apart from the usual issues of algorithmic bias. After all, actors are given carefully crafted words and work harder to communicate their emotional content than ordinary people normally do.

***

Finally, again in the category of peak-AI-hype is this: at the New York Times Cade Metz is reporting that Ilya Sutskever, a co-founder and former chief scientist at OpenAI, has a new startup whose goal is to create a “safe superintelligence”.

Even if you, unlike me, believe that a “superintelligence” is an imminent possibility, what does “safe” mean, especially in an industry that still treats security and accessibility as add-ons? “Safe” is, like “secure”, meaningless without context and a threat model. Safe from what? Safe for what? To do what? Operated by whom? Owned by whom? With what motives? For how long? We create new intelligent humans all the time. Do we have any ability to ensure they’re “safe” technology? If an AGI is going to be smarter than a human, how can anyone possibly promise it will be, in the industry parlance, “aligned” with our goals? And for what value of “our”? Beware the people who want to build the Torment Nexus!

It’s nonsense. Safety can’t be programmed into a superintelligence any more than Isaac Asimov’s Laws of Robotics could be.

Sutskever’s own comments are equivocal. In a video clip at the Guardian, Sutskever confusingly says both that “AI will solve all our problems” and that it will make fake news, cyber attacks, and weapons much worse and “has the potential to create infinitely stable dictatorships”. Then he adds, “I feel that technology is a force of nature.” Which is exactly the opposite of what technology is…but it suits the industry to push the inevitability narrative that technological progress cannot be stopped.

Cue Douglas Adams: “This is obviously some strange use of the word ‘safe’ I wasn’t previously aware of.”

Illustrations: The Big Bang Theory’s Leonard (Johnny Galecki) teaching Sheldon (Jim Parsons) about sarcasm (Season 1, episode 2, “The Big Bran Hypothesis”).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Changing the faith

The governance of Britain and the governance of the Internet have this in common: the ultimate authority in both cases is to a large extent a “gentleman’s agreement”. For the same reason: both were devised by a relatively small, homogeneous group of people who trusted each other. In the case of Britain, inertia means that even without a written constitution the country goes on electing governments and passing laws as if it had one.

Most people have no reason to know that the Internet’s technical underpinnings are defined by a series of documents known as RFCs, for Request(s) for Comments. RFC 1 was published in April 1969; the most recent, RFC 9598, is dated just last month. While the Internet Engineering Task Force oversees RFCs’ development and administration, it has no power to force anyone to adopt them. Throughout, RFC standards have been created collaboratively by volunteers and adopted on merit.

A fair number of RFCs promote good “Internet citizenship”. There are, for example, email addresses (chiefly, webmaster and postmaster) that anyone running a website is supposed to maintain in order to make it easy for a third party to report problems. Today, probably millions of website owners don’t even know this expectation exists. For Internet curmudgeons over a certain age, however, seeing email to those addresses bounce is frustrating.

Still, many of these good-citizen practices persist. One such is the Robots Exclusion Protocol, updated in 2022 as RFC 9309, which defines a file, “robots.txt”, that website owners can put in place to tell automated web crawlers which parts of the site they may access and copy. This may have mattered less in recent years than it did in 1994, when it was devised. As David Pierce recounts at The Verge, at that time an explosion of new bots was beginning to crawl the web to build directories and indexes (no Google until 1998!). Many of those early websites were hosted on very small systems based in people’s homes or small businesses, and could be overwhelmed by unrestrained crawlers. Robots.txt, devised by a small group of administrators and developers, managed this problem.
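The protocol really is that modest: a plain-text file of User-agent and Disallow lines that a polite crawler is expected to consult before fetching anything. A minimal sketch of how that check works, using Python’s standard urllib.robotparser (the sample rules here are invented for illustration; GPTBot is used as an example crawler name):

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt: every crawler is barred from /private/,
# and one named bot is barred from the whole site.
sample = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(sample)

# A well-behaved crawler consults the rules before fetching a URL.
print(rp.can_fetch("SomeCrawler", "https://example.com/articles/"))  # True
print(rp.can_fetch("SomeCrawler", "https://example.com/private/x"))  # False
print(rp.can_fetch("GPTBot", "https://example.com/articles/"))       # False
```

The crucial point, of course, is that nothing in the protocol enforces any of this: a scraper that never calls the equivalent of `can_fetch` sees no obstacle at all.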

Even without a legal requirement to adopt it, early Internet companies largely saw being good Internet citizens as benefiting them. They, too, were small at the time, and needed good will to bring them the users and customers that have since made them into giants. It served everyone’s interests to comply.

Until more or less now. This week, Katie Paul is reporting at Reuters that “AI” companies are blowing up this arrangement by ignoring robots.txt and scraping whatever they want. This news follows reporting by Randall Lane at Forbes that Perplexity.ai is using its software to generate stories and podcasts using news sites’ work without credit. At Wired, Dhruv Mehrotra and Tim Marchman report a similar story: Perplexity is ignoring robots.txt and scraping areas of sites that owners want left alone. At 404 Media, Emmanuel Maiberg reports that Perplexity also has a dubious history of using fake accounts to scrape Twitter data.

Let’s not just pick on Perplexity; this is the latest in a growing trend. Previously, hiQ Labs scraped data from LinkedIn in order to build services to sell to employers; after years of litigation, the courts finally ruled that hiQ had breached LinkedIn’s terms and conditions. More controversially, in the last few years Clearview AI has responded to widespread criticism by claiming that any photograph published on the Internet is “public” and therefore within its rights to grab for its database and use to identify individuals online and offline. The result has been myriad legal actions under data protection law in the EU and UK, and, in the US, a sheaf of lawsuits. Last week, Kashmir Hill reported at the New York Times that because Clearview lacks the funds to settle a class action lawsuit it has offered a 23% stake to Americans whose faces are in its database.

As Pierce (The Verge) writes, robots.txt used to represent a fair-enough trade: website owners got search engine visibility in return for their data, and the owners of the crawlers got the data but in return sent traffic.

But AI startups ingesting data to build models don’t offer any benefit in return. Where search engines have traditionally helped people find things on other sites, the owners of AI chatbots want to keep the traffic for themselves; Perplexity bills itself as an “answer engine”. A second key difference: none of these businesses are small. As Vladan Joler pointed out last month at CPDP, “AI comes pre-monopolized.” Getting into this area requires billions in funding; by contrast, many early Internet businesses started with just a few hundred dollars.

This all feels like a watershed moment for the Internet. For most of its history, as Charles Arthur writes at The Overspill, every advance has exposed another area where the Internet operates on the basis of good faith. Typically, the result is some form of closure – spam, for example, led the operators of mail servers to close to all but authenticated users. It’s not clear to a non-technical person what stronger measure other than copyright law could replace the genteel agreement of robots.txt, but the alternative will likely be closing open access to large parts of the web – a loss to all of us.

Illustrations: Vladen Joler at CPDP 2024, showing his map of the extractive industries required to underpin “AI”.

Hostages

If you grew up with the slow but predictable schedule of American elections, the abruptness with which a British prime minister can prorogue Parliament and hit the campaign trail is startling. Among the pieces of legislation that fell by the wayside this time is the Data Protection and Digital Information bill, which had reached the House of Lords for scrutiny. The bill had many problems: it proposed to give the Department for Work and Pensions the right to inspect the bank accounts and financial assets of anyone receiving any government benefits, and it undermined aspects of the adequacy agreement that allows UK companies to exchange data with businesses in the EU.

Less famously, it also included the legislative underpinnings for a trust framework for digital verification. On Monday, at a UCL conference on crime science, Sandra Peaston, director of research and development at the fraud prevention organization Cifas, outlined how all this is intended to work and asked some pertinent questions. Among them: whether the new regulator will have enough teeth; whether the certification process is strong enough for (for example) mortgage lenders; and how we know how good the relevant algorithm is at identifying deepfakes.

Overall, I think we should be extremely grateful this bill wasn’t rushed through. Quite apart from the digital rights aspects, the framework for digital identity really needs to be right; there’s just too much risk in getting it wrong.

***

At Bloomberg, Mark Gurman reports that Apple’s arrangement with OpenAI to integrate ChatGPT into the iPhone, iPad, and Mac does not involve Apple paying any money. Instead, Gurman cites unidentified sources to the effect that “Apple believes pushing OpenAI’s brand and technology to hundreds of millions of its devices is of equal or greater value than monetary payments.”

We’ve come across this kind of claim before in arguments between telcos and Internet companies like Netflix or between cable companies and rights holders. The underlying question is who brings more value to the arrangement, or who owns the audience. I can’t help feeling suspicious that this will not end well for users. It generally doesn’t.

***

Microsoft is on a roll. First there was the Recall debacle. Now come accusations by a former employee that it ignored a reported security flaw in order to win a large government contract, as Renee Dudley and Doris Burke report at ProPublica. Result: the Russian SolarWinds cyberattack on numerous US government departments and agencies, including the National Nuclear Security Administration.

This sounds like a variant of Cory Doctorow’s enshittification at the enterprise level (see also: Boeing). They don’t have to be monopolies: these organizations’ evolving culture has let business managers override safety and security engineers. This is how Challenger blew up in 1986.

Boeing is too big and too lacking in competition to be allowed to fail entirely; it will have to find a way back. Microsoft has a lot of customer lock-in. Is it too big to fail?

***

I can’t help feeling a little sad at the news that Raspberry Pi has had an IPO. I see no reason why it shouldn’t be successful as a commercial enterprise, but its values will inevitably change over time. CEO Eben Upton swears they won’t, but he won’t be CEO forever, as even he admits. But: Raspberry Pi could become the “unicorn” Americans keep saying Europe doesn’t have.

***

At that same UCL event, I finally heard someone say something positive about AI – for a meaning of “AI” that *isn’t* chatbots. Sarah Lawson, the university’s chief information security officer, said that “AI and machine learning have really changed the game” when it comes to detecting email spam, which remains the biggest vector for attacks. Dealing with the 2% that evades the filters is still a big job, as it leaves 6,000 emails a week hitting people’s inboxes – but she’ll take it. We really need to be more specific when we say “AI” about what kind of system we mean; success at spam filtering has nothing to say about getting accurate information out of a large language model.

***

Finally, I was highly amused this week when long-time security guy Nick Selby posted on Mastodon about a long-forgotten incident from 1999 in which I disparaged the sort of technology Apple announced this week that’s supposed to organize your life for you – tell you when it’s time to leave for things based on the traffic, juggle meetings and children’s violin recitals, that sort of thing. Selby felt I was ahead of my time because “it was stupid then and is stupid now because even if it works the cost is insane and the benefit really, really dodgy”.

One of the long-running divides in computing is between the folks who want computers to behave predictably and those who want computers to learn from our behavior what’s wanted and do that without intervention. Right now, the latter is in ascendance. Few of us seem to want the “AI features” being foisted on us. But only a small percentage of mainstream users turn off defaults (a friend was recently surprised to learn you can use the history menu to reopen a closed browser tab). So: soon those “AI features” will be everywhere, pointlessly and extravagantly consuming energy, water, and human patience. How you use information technology used to be a choice. Now, it feels like we’re hostages.

Illustrations: Raspberry Pi: the little computer that could (via Wikimedia).

Soap dispensers and Skynet

In the TV series Breaking Bad, the weary ex-cop Mike Ehrmantraut tells meth chemist Walter White: “No more half measures.” The last time he took half measures, the woman he was trying to protect was brutally murdered.

Apparently people like to say there are no dead bodies in privacy (although this is easily countered with ex-CIA director General Michael Hayden’s comment, “We kill people based on metadata”). But, as Woody Hartzog told a Senate committee hearing in September 2023, summarizing work he did with Neil Richards and Ryan Durrie, half measures in AI/privacy legislation are still a bad thing.

A discussion at Privacy Law Scholars last week laid out the problems. Half measures don’t work. They don’t prevent societal harms. They don’t prevent AI from being deployed where it shouldn’t be. And they sap the political will to follow up with anything stronger.

In an article for The Brink, Hartzog said, “To bring AI within the rule of law, lawmakers must go beyond half measures to ensure that AI systems and the actors that deploy them are worthy of our trust.”

He goes on to list examples of half measures: transparency, committing to ethical principles, and mitigating bias. Transparency is good, but doesn’t automatically bring accountability. Ethical principles don’t change business models. And bias mitigation to make a technology nominally fairer may simultaneously make it more dangerous. Think facial recognition: debias the system and improve its accuracy for matching the faces of non-male, non-white people, and then it’s used to target those same people with surveillance.

Or, bias mitigation may have nothing to do with the actual problem, an underlying business model, as Arvind Narayanan, author of the forthcoming book AI Snake Oil, pointed out a few days later at an event convened by the Future of Privacy Forum. In his example, the Washington Post reported in 2019 on the case of an algorithm intended to help hospitals predict which patients will benefit from additional medical care. It turned out to favor white patients. But, Narayanan said, the system’s provider responded to the story by saying that the algorithm’s cost model accurately predicted the costs of additional health care – in other words, the algorithm did exactly what the hospital wanted it to do.

“I think hospitals should be forced to use a different model – but that’s not a technical question, it’s politics.”

Narayanan also called out auditing (another Hartzog half measure). You can, he said, audit a human resources system to expose patterns in which resumes it flags for interviews and which it drops. But no one ever commissions research modeled on the expensive randomized controlled trials common in medicine that follows up for five years to see whether the system actually picks good employees.

Adding confusion is the fact that “AI” isn’t a single thing. Instead, it’s what someone called a “suitcase term” – that is, a container for many different systems built for many different purposes by many different organizations with many different motives. It is absurd to conflate AGI – the artificial general intelligence of science fiction stories and scientists’ dreams that can surpass and kill us all – with pattern-recognizing software that depends on plundering human-created content and the labeling work of millions of low-paid workers.

To digress briefly, some of the AI in that suitcase is getting truly goofy. Yum Brands has announced that its restaurants, which include Taco Bell, Pizza Hut, and KFC, will be “AI-first”. Among Yum’s envisioned uses, the company tells Benj Edwards at Ars Technica, is the ability to ask an app what temperature to set the oven. I can’t help suspecting that the real eventual use will be data collection and discriminatory pricing. Stuff like this is why Ed Zitron writes postings like The Rot-Com Bubble, which hypothesizes that the reason Internet services are deteriorating is that technology companies have run out of genuinely innovative things to sell us.

That you cannot solve social problems with technology is a long-held truism, but it seems to be especially true of the messy middle of the AI spectrum, the use cases active now that rarely get the same attention as the far ends of that spectrum.

As Neil Richards put it at PLSC, “The way it’s presented now, it’s either existential risk or a soap dispenser that doesn’t work on brown hands when the real problem is the intermediate level of societal change via AI.”

The PLSC discussion included a list of the ways that regulations fail. Underfunded enforcement. Regulations that are pure theater. The wrong measures. The right goal, but weakly drafted legislation. Ambiguous regulation, or regulation based on principles that are too broad. Conflicting half measures – for example, requiring transparency while also adopting the principle that people should own their own data.

Like Cristina Caffarra a week earlier at CPDP, Hartzog, Richards, and Durrie favor finding remedies that focus on limiting abuses of power. Full measures include outright bans, the right to bring a private cause of action, imposing duties of “loyalty, care, and confidentiality”, and limiting exploitative data practices within these systems. Curbing abuses of power, as Hartzog says, is nothing new. The shiny new technology is a distraction.

Or, as Narayanan put it, “Broken AI is appealing to broken institutions.”

Illustrations: Mike (Jonathan Banks) telling Walt (Bryan Cranston) in Breaking Bad (S03e12) “no more half measures”.

Admiring the problem

In one sense, the EU’s barely dry AI Act and its other complex legislation – the Digital Markets Act, Digital Services Act, GDPR, and so on – are a triumph. Flawed they may be, but they represent a genuine attempt to protect citizens’ human rights against a technology that is being birthed with numerous trigger warnings. The AI-with-everything program at this year’s Computers, Privacy, and Data Protection conference reflected that sense of accomplishment – but also the frustration that comes with knowing that all legislation is flawed, all technology companies try to game the system, and gaps will widen.

CPDP has had these moments before: new legislation always comes with a large dollop of frustration over the opportunities that were missed and the knowledge that newer technologies are already rushing forwards. AI, and the AI Act, more or less swallowed this year’s conference as people considered what it says, how it will play internationally, and the necessary details of implementation and enforcement. Two years ago at this event, inadequate enforcement of GDPR was a big topic.

The most interesting future gaps that emerged this year: monopoly power, quantum sensing, and spatial computing.

For at least 20 years we’ve been hearing about quantum computing’s potential threat to public key encryption – that day of doom has been ten years away as long as I can remember, just as the Singularity is always 30 years away. In the panel on quantum sensing, Chris Hoofnagle argued that, as he and Simson Garfinkel recently wrote at Lawfare and in their new book, quantum cryptanalysis is overhyped as a threat (although there are many opportunities for quantum computing in chemistry and materials science). However, quantum sensing is here now, works (because qubits are fragile), and is cheap. There is plenty of privacy threat here to go around: quantum sensing will benefit entirely different classes of intelligence, particularly remote, undetectable surveillance.

Hoofnagle and Garfinkel are calling this MASINT, for machine and signature intelligence, and believe that it will become very difficult to hide things, even at a national level. In Hoofnagle’s example, a quantum sensor-equipped drone could fly over the homes of parolees to scan for guns.

Quantum sensing and spatial computing have this in common: they both enable unprecedented passive data collection. VR headsets, for example, collect all sorts of biomechanical data that can be mined more easily for personal information than people expect.

Barring change, all that data will be collected by today’s already-powerful entities.

The deeper level on which all this legislation fails particularly exercised Cristina Caffarra, co-founder of the Centre for Economic Policy Research. In the panel on AI and monopoly, she argued that all this legislation is basically nibbling around the edges, because none of it touches the real, fundamental problem: the power being amassed by the handful of companies that own the infrastructure.

“It’s economics 101. You can have as much downstream competition as you like but you will never disperse the power upstream.” The reports and other material generated by government agencies like the UK’s Competition and Markets Authority are, she says, just “admiring the problem”.

A day earlier, the Novi Sad professor Vladan Joler had already pointed out the fundamental problem: at the dawn of the Internet anyone could start with nothing and build something; what we’re calling “AI” requires billions in investment, so comes pre-monopolized. Many people dismiss Europe for not having its own homegrown Big Tech, but that overlooks open technologies: the Raspberry Pi, Linux, and the web itself, which all have European origins.

In 2010, the now-departing MP Robert Halfon (Con-Harlow) said at an event on reining in technology companies that only a company the size of Google – not even a government – could create Street View. Legend has it that open source geeks heard that as a challenge, and so we have OpenStreetMap. Caffarra’s fiery anger raises the question: at what point do the infrastructure providers become so entrenched that they could choke off an open source competitor at birth? Caffarra wants to build a digital public interest infrastructure using the gaps where Big Tech doesn’t yet have that control.

The Dutch Groenlinks MEP Kim van Sparrentak offered an explanation for why the AI Act doesn’t address market concentration: “They still dream of a European champion who will rule the world.” An analogy springs to mind: people who vote for tax cuts for billionaires because one day that might be *them*. Meanwhile, the UK’s Competition and Markets Authority finds nothing to investigate in Microsoft’s partnership with the French AI startup Mistral.

Van Sparrentak thinks one way out is through public procurement: adopt goals of privacy and sustainability, and support European companies. It makes sense; as the AI Now Institute’s Amba Kak noted, at the moment almost everything anyone does digitally has to go through the systems of at least one Big Tech company.

As Sebastiano Toffaletti, head of the secretariat of the European SME Alliance, put it, “Even if you had all the money in the world, these guys still have more data than you. If you can’t solve that, you won’t have anyone to challenge these companies.”

Illustrations: Vladen Joler shows Anatomy of an AI System, a map he devised with Kate Crawford of the human labor, data, and planetary resources that are extracted to make “AI”.

Microsoft can remember it for you wholesale

A new theory: somewhere in the Silicon Valley universe there’s a cadre of techies who have eidetic memories and they’re feeling them start to slip. Panic time.

That’s my best explanation for Microsoft’s latest wheeze, a new feature for its Copilot assistant that will take what’s variously called a “snapshot” or a “screenshot” of your computer (all three monitors?) every five seconds and store it for future reference. Microsoft hasn’t explained much about Recall’s inner technical workings, but according to the announcement, the data will be stored locally and will be searchable via semantic associations and some sort of “AI”. Microsoft also says the data will not be used to train AI models.

The general anger and dismay at this plan brings back, almost nostalgically, memories of the 1990s, when Microsoft was near-universally hated as the evil monopolist dominating computing. In 2008, when Google was ten years old, a BBC presenter asked me if I thought Google would ever be hated as much as Microsoft was (not then, no). In 2012, veteran journalist Charles Arthur published the book Digital Wars about how Microsoft had stagnated and lost its lead. And then suddenly, in the last few years, it’s back on top.

Possibilities occur that Microsoft doesn’t mention. For example: could software be embedded in Windows to draw inferences from the data Recall saves? And could those inferences be forwarded to the company or used to target you with ads? That seems like a far more efficient way to invade users’ privacy than copying the data itself, if that’s what the company ultimately wants to do.

Lots of things on our computers already retain a “memory” of what we’ve been doing. Operating systems generate logs to help debug problems. Word processors retain a changelog, which powers the ability to undo mistakes. Web browsers have user-configurable histories; email software has archives; media players retain playlists. All of those are useful – but part of that usefulness is that they are contextual, limited, and either easily terminated by closing the relevant application or relatively easily edited to remove items that shouldn’t be kept.

It’s hard for almost everyone who isn’t Microsoft to understand the point of keeping everything by default. It seems like a feature only developers could love. I certainly would like Windows to be better at searching for stored files or my (Firefox) browser to be better at reloading that article I was reading yesterday. I have even longed for a personal version of Vannevar Bush’s Memex. As part of that, I might welcome a feature that let me hit a button to record the last five useful minutes of a meeting, or save a social media post to a local archive. But the key to that sort of memory expansion is curation, not remembering everything promiscuously. For most people, selective forgetting is how we survive the torrents of irrelevance hurled at us every day.

What Recall sounds most like is the lifelog science fiction writer Charlie Stross imagined in 2007 might be our future. Plummeting storage costs and expanding capacity, he reasoned, would make it possible to store *everything* in your pocket. Even then, there were (a very few) people doing that sort of thing, most notably Steve Mann, a University of Toronto professor who started wearing devices to comprehensively capture his life as a 1990s graduate student. Over the years, Mann has shrunk his personal gadget array from a laptop and peripherals to glasses and pocket devices. Many more people capture their surroundings now – but they do it on their phones. If Apple or Google were proposing a Recall feature for iOS or Android, the idea would seem a lot less weird.

The real issue is that there are many people who would like to be able to know what someone *else* has been doing on their computer at all times. Helicopter parents. Schools and teachers under government compulsion (see for example Prevent (PDF)). Employers. Border guards. Corporate spies. The Department of Work and Pensions. Authoritarian governments. Law enforcement and security agencies. Criminals. Domestic abusers… So developing any feature like this must include considering how to protect it against these threats. This does not appear to have happened.

Many others have written about the privacy issues in all this – the UK’s Information Commissioner’s Office is already investigating. At The Register, Richard Speed does a particularly good job of looking at some of the fine details. On Mastodon, Kevin Beaumont says inspection of the Copilot+ software suggests that Recall stores the text it extracts from all those snapshots in an easily copiable SQLite database.
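To see why “an easily copiable SQLite database” is alarming, consider how little tooling it takes to mine one. This sketch uses a hypothetical schema – the real Recall layout isn’t documented here – and an in-memory database for illustration; the point is that SQLite files need no special software, so any process (or piece of malware) with file access can copy and query them wholesale.

```python
import sqlite3

# Hypothetical schema for illustration -- not the real Recall layout.
# An in-memory database is used here; Recall's store would be an on-disk file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE captures (ts TEXT, app TEXT, extracted_text TEXT)"
)
conn.execute(
    "INSERT INTO captures VALUES "
    "('2024-06-01T10:00', 'browser', 'your password reset link')"
)

# One trivial query surfaces everything ever displayed that matches a keyword.
rows = conn.execute(
    "SELECT ts, app FROM captures WHERE extracted_text LIKE '%password%'"
).fetchall()
print(rows)  # [('2024-06-01T10:00', 'browser')]
```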

But there’s still more. The kind of archive Recall appears to construct can teach an attacker how the target thinks: not just what passwords they choose but how they devise them. Those patterns can be highly valuable. Granted, few targets are worth that level of attention, but it happens, as Peter Davies, a technical director at Thales, has often warned.

Recall is not the only move – see also flawed-AI-with-everything – that suggests that the computer industry, like some politicians and governments, is badly losing touch with the public. Increasingly, what they want to do seems unrelated to what the rest of us want. If they think things like Recall are a good idea, they need to read more Philip K. Dick. And then don’t invent the Torment Nexus.

Illustrations: Arnold Schwarzenegger seeking better memories in the 1990 film Total Recall.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

The apostrophe apocalypse

It was immediately tempting to view the absence of apostrophes on new street signs in a North Yorkshire town as a real-life example of computer systems crushing human culture. Then, near-simultaneously, Apple launched an ad (which it now regrets) showing just that process, raising the temptation even more. But no.

In fact, as Brandon Vigliarolo writes at The Register, not only is the removal of apostrophes in place names not new in the UK, but it also long precedes computers. The US Board on Geographic Names declared apostrophes unwanted as long ago as its founding year, 1890, apparently to avoid implying possession. This decision by the BGN, which has only made five exceptions in its history, was later embedded in the US’s Geographic Names Information System and British Standard 7666. When computers arrived to power databases, the practice carried on.

All that said, it’s my experience that the older British generation are more resentful of American-derived changes to their traditional language than they are of computer-driven alterations (one such neighbor complains about “sidewalk”). So campaigns to reinstate missing apostrophes seem likely to persist.

Blaming computers seemed like a coherent narrative, not least because new technology often disrupts social customs. Railways brought standardized time, and the desire to simplify things for computers led to the 2023 decision to eliminate leap seconds in 2035 (after 18 years of debate). Instead, the apostrophe apocalypse is a more ordinary story of central administrators prioritizing their own convenience over local culture and custom (which may itself be contested). It still seems like people should be allowed to keep their street signs. I mean.

***

Of course language changes over time and usage. The character limits imposed by texting (and therefore exTwitter and other microblogging sites) brought us many abbreviations that are now commonplace in daily life, just as long before that the telegraph’s cost per word spawned its own compressed dialect. A new example popped up recently in Charles Arthur’s The Overspill.

Arthur highlighted an article at Level Up Coding/Medium by Fareed Khan that offered ways to distinguish between human-written and machine-generated text. It turns out that chatbots use distinctively different words than we do. Khan was able to generate a list of about 100 words that may indicate a chatbot has been at work, as well as a web app that can check a block of text or a file in one go. The word “delve” was at the top.
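Khan’s approach can be sketched in a few lines. The five-word list below is an illustrative sample of my own, not his actual ~100-word list, and as he and others note, a high score only suggests machine authorship; it proves nothing.

```python
import re

# Words reported to appear unusually often in chatbot output.
# This short list is illustrative only, not Khan's actual list.
MARKER_WORDS = {"delve", "tapestry", "multifaceted", "landscape", "underscore"}

def marker_score(text: str) -> float:
    """Fraction of the words in `text` that are on the marker list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(1 for w in words if w in MARKER_WORDS) / len(words)

sample = "Let us delve into the multifaceted landscape of innovation."
print(round(marker_score(sample), 2))  # a high score only *suggests* machine text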

I had missed Khan’s source material, an earlier claim by Y Combinator founder Paul Graham that “delve” used in an email pitch is a clear sign of ChatGPT-generated text. At the Guardian, Alex Hern suggests that an underlying cause may be the fact that much of the labeling necessary to train the large language models that power chatbots is carried out by badly paid people in the global South – including Africa, where “delve” is more commonly used than in Western countries.

At the Premium Times, Chiamaka Okafor argues that therefore identifying “delve” as a marker of “robotic text” penalizes African writers. “We are losing sight of an opportunity to rewrite the AI narratives that exclude people in the global majority,” she writes. A reminder: these chatbots are just math and statistics predicting the next word. They will always regress to the mean. And now they’ll penalize us for being different.

***

Just two years ago, researchers fretted that we were running out of “high-quality text” on which to train large language models. We’ve been seeing the results since, as sites hosting user-generated content strike deals with LLM owners, leading to contentious disputes between those owners and sites’ users, who feel betrayed and ripped off. Reddit began by charging for access to its API, then made a deal with Google to use its database of posts for training for an injection of cash that enabled it to go public. Yesterday, Reddit announced a similar deal with OpenAI – and the stock went up. In reality, these deals are asset-stripping a site that has consistently lost money for 18 years.

The latest site to sell its users’ content is the technical site Stack Overflow. Developers who offer mutual aid by answering each other’s questions are exactly the user base you would expect to be most offended by the news that the site’s owner, the investment group Prosus, which bought the site in 2021 for $1.8 billion, has made a deal giving OpenAI access to all its content. And so it proved: developers promptly began altering or removing their posts to protest the deal. Shortly thereafter, the site’s moderators began restoring those posts and suspending the users.

There’s no way this ends well; Internet history’s many such stories never have. The site’s original owners, who created the culture, are gone. The new ones don’t care what users *believe* their rights are if the terms and conditions grant an irrevocable license to everything they post. Inertia makes it hard to build a replacement; alienation thins out the old site. As someone posted to Twitter a few years ago, “On the Internet your home always leaves you.”

‘Twas ever thus. And so it will be until people stop taking the bait in the first place.

Illustrations: Apple’s canceled “crusher” ad.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

The toast bubble

To The Big Bang Theory (“The Russian Rocket Reaction”, S5e05):

Howard: Someone has to go up with the telescope as a payload specialist, and guess who that someone is!
Sheldon: Muhammed Li.
Howard: Who’s Muhammed Li?
Sheldon: Muhammed is the most common first name in the world, Li the most common surname, and as I didn’t know the answer I thought that gave me a mathematical edge.

Experts tell me that exchange doesn’t perfectly explain how generative AI works; it’s too simplistic. Generative AI – or a Sheldon made more nuanced by his writers – takes into account contextual information to calculate the probable next word. So it wouldn’t pick from all the first names and surnames in the world. It might, however, pick from the names of all the payload specialists or some other group it correlated, or confect one.
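The difference between Sheldon’s context-free guess and a context-aware prediction can be sketched with a toy bigram model. The corpus below is invented for illustration; real LLMs condition on thousands of tokens rather than one word, but the principle – pick the most probable continuation given context – is the same.

```python
from collections import Counter, defaultdict

# A toy bigram model. Unlike Sheldon's context-free "most common name" guess,
# even this minimal model conditions its prediction on the preceding word.
# The corpus is invented for illustration.
corpus = ("the payload specialist ran the telescope "
          "the payload specialist held the payload bay").split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev_word: str) -> str:
    """Most probable next word given one word of context."""
    return bigrams[prev_word].most_common(1)[0][0]

print(predict("payload"))  # "specialist": seen twice after "payload", vs "bay" once
```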

More than a year on, I still can’t find a use for generative “AI”, which remains unreliable and inscrutable. At Exponential View, Azeem Azhar has written about the “answer engine” Perplexity.ai. While it’s helpful that Perplexity provides references for its answers, it was producing misinformation by the third question I asked it, and offered no improvement when challenged. Wikipedia spent many years being accused of unreliability, too, but at least there you can read the talk page and understand how the editors arrived at the text they present.

On The Daily Show this week, Jon Stewart ranted about AI and interviewed FTC chair Lina Khan. Well-chosen video clips showed AI company heads’ true colors, telling the public AI is an assistant for humans while telling money people and each other that AI will enable greater productivity with fewer workers and help eliminate “the people tax”.

More interesting, however, was Khan’s note that the FTC is investigating the investments and partnerships in AI to understand if they’re giving current technology giants undue influence in the marketplace. If, in her example, all the competitors in a market outsource their pricing decisions to the same algorithm they may be guilty of price fixing even if they’re not actively colluding. And these markets are consolidating at an ever-earlier stage. Snapchat and WhatsApp had millions of users by the time Facebook thought it prudent to buy them rather than let them become dangerous competitors. AI is pre-consolidating: the usual suspects have been buying up AI startups and models at pace.

“More profound than fire or electricity,” Google CEO Sundar Pichai tells a camera at one point, speaking about AI. The last time I heard this level of hyperbole it was about the Internet in the 1990s, shortly before the bust. A friend’s answer to this sort of thing has never varied: “I’d rather have indoor plumbing.”

***

Last week the Federal District Court in Manhattan sentenced FTX CEO Sam Bankman-Fried to 25 years in prison for stealing $8 billion. In the end, you didn’t have to understand anything complicated about cryptocurrencies; it was just good old embezzlement.

And then the price of bitcoin went *up*. At the Guardian, Molly White explains that this is because cryptoevangelists are pushing the idea that the sector can reach its full potential, now that Bankman-Fried and other bad apples have been purged. But, as she says, nothing has really changed. No new use case has come along to make cryptocurrencies more useful, more valuable, or more trustworthy.

Both cryptocurrencies and generative AI are bubbles. The difference is that the AI bubble will likely leave behind it some technologies and knowledge that are genuinely useful; it will be like the Internet, which boomed and busted before settling in to change the world. Cryptocurrencies are more like the Dutch tulips. Unfortunately, in the meantime both these bubbles are consuming energy at an insane rate. How many wildfires is bitcoin worth?

***

I’ve seen a report suggesting that the last known professional words of the late Ross Anderson may have been, “Do they take us for fools?”

He was referring to the plans, debated in the House of Commons on March 25, to amend the Investigatory Powers Act to allow the government to pre-approve (or disapprove) new security features technology firms want to introduce. The government is of course saying it’s all perfectly innocent, intended to keep the country safe. But recent clashes in the decades-old conflict over strong encryption have seen the technology companies roll out features like end-to-end encryption (Meta) and decide not to implement others, like client-side scanning (Apple). The latest in a long line of UK governments that want access to encrypted text was hardly going to take that quietly. So here we are, debating this yet again. Yet the laws of mathematics still haven’t changed: there is no such thing as a security hole that only “good guys” can use.

***

Returning to AI, it appears that costs may lead Google to charge for access to its AI-enhanced search, as Alex Hern reports at the Guardian. Hern thinks this is good news for its AI-focused startup competitors, which already charge for top-tier tools and who are at risk of being undercut by Google. I think it’s good for users by making it easy to avoid the AI “enhancement”. Of course, DuckDuckGo already does this without all the tracking and monopoly mishegoss.

Illustrations: Jon Stewart uninspired by Mark Zuckerberg’s demonstration of AI making toast.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Facts are scarified

The recent doctored Palace photo has done almost as much as the arrival of generative AI to raise fears that in future we will completely lose the ability to identify fakes. The royal photo was sloppily composited – no AI needed – for reasons unknown (though Private Eye has a suggestion). A lot of conspiracy theorizing could be avoided if the palace would release the untouched original(s), but as things are, the photograph is a perfect example of how to provide the fuel for spreading nonsense to 400 million people.

The most interesting thing about the incident was discovering the rules media apply to retouching photos. AP specified, for example, that it does not use altered or digitally manipulated images. It allows cropping and minor adjustments to color and tone where necessary, but bans more substantial changes, even retouching to remove red eye. As Holly Hunter’s character says, trying to uphold standards in the 1987 movie Broadcast News (written by James Brooks), “We are not here to stage the news.”

The desire to make a family photo as appealing as possible is understandable; the motives behind spraying the world with misinformation are less clear and more varied. For this reason, I’ve long argued here that combating misinformation and disinformation is similar to cybersecurity: both problems are complex, and the actors and agendas diverse. At last year’s Disinformation Summit in Cambridge, cybersecurity was, sadly, one of the missing communities.

Just a couple of weeks ago the BBC announced its adoption of C2PA, a standard for authenticating images developed by a group of technology and media companies including the BBC, the New York Times, Microsoft, and Adobe. The BBC says that many media organizations are beginning to adopt C2PA, and even Meta is considering it. Edits must be signed, and they create a chain of provenance all the way back to the original photo. In 2022, the BBC and the Royal Society co-hosted a workshop on digital provenance, following a Royal Society report, at which C2PA featured prominently.
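The chain-of-provenance idea can be sketched in miniature. This is emphatically not C2PA’s actual format – real manifests use CBOR and X.509 certificates, not a toy shared HMAC key – but it shows why altering any link in a signed edit chain is detectable.

```python
import hashlib
import hmac

# A drastically simplified sketch of chain-of-provenance: each edit record
# commits to a hash of the previous record and is signed. Real C2PA manifests
# use X.509 certificates and CBOR, not this toy shared key.
SIGNING_KEY = b"demo-key-not-a-real-certificate"

def add_record(chain: list, action: str, payload: bytes) -> None:
    """Append a signed record that chains to the previous one."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    body = f"{prev_hash}|{action}|{hashlib.sha256(payload).hexdigest()}"
    chain.append({
        "body": body,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
        "sig": hmac.new(SIGNING_KEY, body.encode(), "sha256").hexdigest(),
    })

def verify(chain: list) -> bool:
    """Every record must link to its predecessor and carry a valid signature."""
    prev_hash = "genesis"
    for rec in chain:
        expected_sig = hmac.new(SIGNING_KEY, rec["body"].encode(), "sha256").hexdigest()
        if not rec["body"].startswith(prev_hash + "|"):
            return False
        if not hmac.compare_digest(rec["sig"], expected_sig):
            return False
        prev_hash = hashlib.sha256(rec["body"].encode()).hexdigest()
    return True

chain = []
add_record(chain, "capture", b"raw-photo-bytes")
add_record(chain, "crop", b"cropped-photo-bytes")
print(verify(chain))   # True: intact chain

chain[1]["body"] = chain[1]["body"].replace("crop", "recrop")
print(verify(chain))   # False: any altered link breaks verification
```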

That’s potentially a valuable approach for publishing and broadcast, where the conduit to the public is controlled by one of a relatively small number of organizations. And you can see why those organizations would want it: they need, and in many cases are struggling to retain, public trust. It is, however, too complex a process for the hundreds of millions of people with smartphone cameras posting images to social media, and unworkable for citizen journalists capturing newsworthy events in real time. Ancillary issue: sophisticated phone cameras try so hard to normalize the shots we take that they falsify the image at source. In 2020, Californians attempting to capture the orange color of their smoke-filled sky were defeated by autocorrection that turned it grey. So, many images are *originally* false.

In a lengthy blog post, Neal Krawetz analyzes the difficulties with C2PA. He lists security flaws, but he also objects to its “appeal to authority” approach, which he dubs a “logical fallacy”. In the context of the Internet, it’s worse than that; we already know what happens when a tiny handful of commercial companies (in this case, chiefly Adobe) become the gatekeepers for billions of people.

All of this was why I was glad to hear about work in progress at a workshop last week, led by Mansoor Ahmed-Rengers, a PhD candidate studying system security: Human-Oriented Proof Standard (HOPrS). The basic idea is to build an “Internet-wide, decentralised, creator-centric and scalable standard that allows creators to prove the veracity of their content and allows viewers to verify this with a simple ‘tick’.” Co-sponsoring the workshop was Open Origins, a project to distinguish between synthetic and human-created content.

It’s no accident that HOPrS’ mission statement echoes the ethos of the original Internet; as security researcher Jon Crowcroft explains, it’s part of long-running work on redecentralization. Among HOPrS’ goals, Ahmed-Rengers listed: minimal centralization; the ability for anyone to prove their content; Internet-wide scalability; open decision making; minimal disruption to workflow; and easy interpretability of proof/provenance. The project isn’t trying to cover all bases – that’s impossible. Given the variety of motivations for fakery, there will have to be a large ecosystem of approaches. Rather, HOPrS is focusing specifically on the threat model of an adversary determined to sow disinformation, giving journalists and citizens the tools they need to understand what they’re seeing.

Fakes are as old as humanity. In a brief digression, we were reminded that the early days of photography were full of fakery: the Cottingley Fairies, the Loch Ness monster, many dozens of spirit photographs. The Cottingley Fairies, cardboard cutouts photographed by Elsie Wright, 16, and Frances Griffiths, 9, were accepted as genuine by Sherlock Holmes creator Sir Arthur Conan Doyle, famously a believer in spiritualism. To today’s eyes, trained on millions of photographs, they instantly read as fake. Or take Ireland’s Knock apparitions, flat, unmoving, and, philosophy professor David Berman explained in 1979, magic lantern projections. Our generation, who’ve grown up with movies and TV, would I think have instantly recognized that as fake, too. Which I believe tells us something: yes, we need tools, but we ourselves will get better at detecting fakery, as unlikely as it seems right now. The speed with which the royal photo was dissected showed how much we’ve learned just since generative AI became available.

Illustrations: The first of the Cottingley Fairies photographs (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Anachronistics

“In my mind, computers and the Internet arrived at the same time,” my twenty-something companion said, delivering an entire mindset education in one sentence.

Just a minute or two earlier, she had asked in some surprise, “Did bulletin board systems predate the Internet?” Well, yes: BBSs were a software package running on a single back room computer with a modem users dialed into, whereas the Internet is this giant sprawling mess of millions of computers connected together…simple first, complex later.

Her confusion is understandable: from her perspective, computers and the Internet did arrive at the same time, since her first conscious encounters with them were simultaneous.

But still, speaking as someone who first programmed a (mainframe, with punch cards) computer in 1972 as a student, who got her first personal computer in 1982, and got online in 1991 by modem and 1999 by broadband and to whom the sequence of events is memorable: wow.

A 25-year-old today was born in 1999 (the year I got broadband). Her counterpart 15 years hence (born 2014, the year a smartphone replaced my personal digital assistant) may think smartphones and the Internet were simultaneous. And sometime around 2045 *her* counterpart born in 2020 (two years before ChatGPT was released) might think generative text and image systems were contemporaneous with the first computers.

I think this confusion must have something to do with the speed of change in a relatively narrow sector. I’m sure that even though they all entered my life simultaneously, by the time I was 25 I knew that radio preceded TV (because my parents grew up with radio), bicycles preceded cars, and that handwritten manuscripts predated printed books (because medieval manuscripts). But those transitions played out over multiple lifetimes, if not centuries, and all those memories were personal. Few of us reminisce about the mainframes of the 1960s because most of us didn’t have access to them.

And yet, misremembering the timeline of those earlier technologies probably matters less than misunderstanding the sequence of events in information technology. Jumbling the arrival dates of the pieces of information technology means failing to understand dependencies. What currently passes for “AI” could not exist without being able to train models on the giant piles of data that the Internet and the web made possible, and that took 20 years to build. Neural networks pioneer Geoff Hinton developed the ideas behind today’s deep neural networks as long ago as the 1980s, but it took until the last decade for them to become workable, because it took that long to build sufficiently powerful computers and to amass enough training data. How do you understand the ongoing battle between those who wish to protect privacy via data protection laws and those who want data to flow freely without hindrance if you do not understand what those masses of data are important for?

This isn’t the only such issue. A surprising number of people who should know better seem to believe that the solution to all our ills with social media is to destroy Section 230, apparently believing that if S230 allowed Big Tech to get big, it must be wrong. The reality is that it also allows small sites to exist, and it is the legal framework that makes content moderation possible. Improve it by all means, but understand its true purpose first.

Reviewing movies and futurist projections such as Vannevar Bush’s 1945 essay As We May Think (PDF) and Alan Turing’s 1950 paper, Computing Machinery and Intelligence (PDF), doesn’t really help because so many ideas arrive long before they’re feasible. The crew in the original 1966 Star Trek series (to say nothing of secret agent Maxwell Smart in 1965) were talking over wireless personal communicators. A decade earlier, Arthur C. Clarke (in The Nine Billion Names of God) and Isaac Asimov (in The Last Question) were putting computers – albeit analog ones – in their stories. Asimov in particular imagined a sequence that now looks prescient, beginning with something like a mainframe, moving on to microcomputers, and finishing up with a vast fully interconnected network that can only be held in hyperspace. (OK, it took trillions of years, starting in 2061, but still…) Those writings undoubtedly inspired the technologists of the last 50 years when they decided what to invent.

This all led us to fakes: as the technology to create fake videos, images, and texts continues to improve, she wondered if we will ever be able to keep up. Just about every journalism site is asking some version of that question; they’re all awash in stories about new levels of fakery. My 25-year-old discussant believes the fakes will always be improving faster than our methods of detection – an arms race like computer security, to which I’ve compared problems of misinformation/disinformation before.

I’m more optimistic. I bet even a few years from now today’s versions of generative “AI” will look as primitive to us as the special effects in a 1963 episode of Dr Who or the magic lantern used to create the Knock apparitions do to generations raised on movies, TV, and computer-generated imagery. Humans are adaptable; we will find ways to identify what is authentic that aren’t obvious in the shock of the new. We might even go back to arguing in pubs.

Illustrations: Secret agent Maxwell Smart (Don Adams) talking on his shoe phone (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.