Changing the faith

The governance of Britain and the governance of the Internet have this in common: the ultimate authority in both cases is to a large extent a “gentleman’s agreement”. For the same reason: both were devised by a relatively small, homogeneous group of people who trusted each other. In the case of Britain, inertia means that even without a written constitution the country goes on electing governments and passing laws as if it had one.

Most people have no reason to know that the Internet’s technical underpinnings are defined by a series of documents known as RFCs, for Requests for Comments. RFC 1 was published in April 1969; the most recent, RFC 9598, is dated just last month. While the Internet Engineering Task Force oversees RFCs’ development and administration, it has no power to force anyone to adopt them. Throughout, RFC standards have been created collaboratively by volunteers and adopted on merit.

A fair number of RFCs promote good “Internet citizenship”. There are, for example, email addresses (chiefly, webmaster and postmaster) that anyone running a website is supposed to maintain in order to make it easy for a third party to report problems. Today, probably millions of website owners don’t even know this expectation exists. For Internet curmudgeons of a certain age, however, seeing email to those addresses bounce is frustrating.

Still, many of these good-citizen practices persist. One such is the Robots Exclusion Protocol, updated in 2022 as RFC 9309, which defines a file, “robots.txt”, that website owners can put in place to tell automated web crawlers which parts of the site they may access and copy. This may have mattered less in recent years than it did in 1994, when it was devised. As David Pierce recounts at The Verge, at that time an explosion of new bots was beginning to crawl the web to build directories and indexes (no Google until 1998!). Many of those early websites were hosted on very small systems based in people’s homes or small businesses, and could be overwhelmed by unrestrained crawlers. Robots.txt, devised by a small group of administrators and developers, managed this problem.
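The mechanism itself is almost quaintly simple: a plain text file at the site’s root, listing per-crawler rules that compliant bots are expected to honor. A minimal sketch, using the directives RFC 9309 defines (the crawler name and paths here are illustrative, not real):

```
# robots.txt – served from https://example.com/robots.txt
# Compliant crawlers read the group matching their User-agent token.

User-agent: *          # all crawlers...
Disallow: /private/    # ...must stay out of /private/
Allow: /               # ...but may fetch everything else

User-agent: BadBot     # a hypothetical crawler, excluded entirely
Disallow: /
```

Note that nothing enforces this; a crawler that ignores the file gets no error, which is exactly why the protocol only ever worked as long as everyone chose to play along.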

Even without a legal requirement to adopt it, early Internet companies largely saw being good Internet citizens as benefiting them. They, too, were small at the time, and needed good will to bring them the users and customers that have since made them into giants. It served everyone’s interests to comply.

Until more or less now. This week, Katie Paul is reporting at Reuters that “AI” companies are blowing up this arrangement by ignoring robots.txt and scraping whatever they want. This news follows reporting by Randall Lane at Forbes that Perplexity is using its software to generate stories and podcasts using news sites’ work without credit. At Wired, Dhruv Mehrotra and Tim Marchman report a similar story: Perplexity is ignoring robots.txt and scraping areas of sites that owners want left alone. At 404 Media, Emmanuel Maiberg reports that Perplexity also has a dubious history of using fake accounts to scrape Twitter data.

Let’s not just pick on Perplexity; this is the latest in a growing trend. Previously, hiQ Labs tried scraping data from LinkedIn in order to build services to sell employers; the courts finally ruled in 2022 that hiQ had violated LinkedIn’s terms and conditions. More controversially, in the last few years Clearview AI has responded to widespread criticism by claiming that any photograph published on the Internet is “public” and therefore within its rights to grab for its database and use to identify individuals online and offline. The result has been myriad legal actions under data protection law in the EU and UK, and, in the US, a sheaf of lawsuits. Last week, Kashmir Hill reported at the New York Times that because Clearview lacks the funds to settle a class action lawsuit it has offered a 23% stake to Americans whose faces are in its database.

As Pierce (The Verge) writes, robots.txt used to represent a fair-enough trade: website owners got search engine visibility in return for their data, and the owners of the crawlers got the data and in return sent traffic.

But AI startups ingesting data to build models don’t offer any benefit in return. Where search engines have traditionally helped people find things on other sites, the owners of AI chatbots want to keep the traffic for themselves. Perplexity bills itself as an “answer engine”. A second key difference is this: none of these businesses are small. As Vladan Joler pointed out last month at CPDP, “AI comes pre-monopolized.” Getting into this area requires billions in funding; by contrast, many early Internet businesses started with just a few hundred dollars.

This all feels like a watershed moment for the Internet. For most of its history, as Charles Arthur writes at The Overspill, every advance has exposed another area where the Internet operates on the basis of good faith. Typically, the result is some form of closure – spam, for example, led the operators of mail servers to close to all but authenticated users. It’s not clear to a non-technical person what stronger measure other than copyright law could replace the genteel agreement of robots.txt, but the alternative will likely be closing open access to large parts of the web – a loss to all of us.

Illustrations: Vladan Joler at CPDP 2024, showing his map of the extractive industries required to underpin “AI”.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Review: Money in the Metaverse

Money in the Metaverse: Digital Assets, Online Identities, Spatial Computing, and Why Virtual Worlds Mean Real Business
by David Birch and Victoria Richardson
London Publishing Partnership
ISBN: 978-1-916749-05-4

In my area of London there are two buildings whose architecture unmistakably identifies them as former banks. Time has moved on, and one houses a Pizza Express, the other a Tesco Direct. The obviously-built-to-be-a-Post-Office building, too, is now a restaurant, and the post office itself now occupies a corner of a newsagent’s. They illustrate a point David Birch has frequently made: there is nothing permanent about our financial arrangements. Banking itself is only a few hundred years old.

Writing with Victoria Richardson in their new book, Money in the Metaverse, Birch argues this point anew. At one time paper notes seemed as shocking and absurd as cryptocurrencies and non-fungible tokens do today. The skeptic reads that and wonders if the early days of paper notes were as rife with fraud and hot air as NFTs have been. Is the metaverse even still a thing? It’s all AI hype round here now.

Birch and Richardson, however, believe that increasingly our lives will be lived online – a flight to the “cyburbs”, they call it. In one of their early examples of our future, they suggest it will be good value to pay for a virtual ticket (NFT) to sit next to a friend to listen to a concert in a virtual auditorium. It may be relevant that they were likely writing this during the acute phase of the covid pandemic. By now, most of the people I zoomed with then are back doing things in the real world and are highly resistant to returning to virtual, or even hybrid, meetups.

But exactly how financial services might operate isn’t really their point and would be hard to get right even if it were. Instead, their goal is to explain various novel financial technologies and tools such as NFTs, wallets, smart contracts, and digital identities and suggest possible strategies for businesses to use them to build services. Some of the underlying ideas have been around for at least a couple of decades: software agents that negotiate on an individual’s behalf, and support for multiple disconnected identities to be used in the different roles in life we all have, for example. Others are services that seem to have little to do with the metaverse, such as paperless air travel, already being implemented, and virtual tours of travel destinations, which have been with us in some form since video arrived on the web.

The key question – whether the metaverse will see mass adoption – is not one Birch and Richardson can answer. Certainly, I’m dubious about some of the use cases they propose – such as the idea of gamifying life insurance by offering reduced premiums to those who reach various thresholds of physical activity or healthy living. Insurance is supposed to manage risk by pooling it; their proposal would penalize disability and illness.

A second question occurs: what new kinds of crime will these technologies enable? Just this week, Fortune reported that cashlessness has brought a new level of crime to Sweden. Why should the metaverse be different? This, too, is beyond the scope of Birch and Richardson’s work, which is to explain but not to either hype or critique. The overall impression the book leaves, however, is of a too-clean computer-generated landscape or smart city mockup, where the messiness of real life is missing.


As the world and all knows by now, the UK is celebrating this year’s American Independence Day by staging a general election. The preliminaries are mercifully short by US standards, in that the period between the day it was called and the day the winners will be announced is only about six weeks. I thought the announcement would bring more sense of relief than it did. Instead, these six weeks seem interminable for two reasons: first, the long, long wait for the results, and second, the dominant driver for votes is largely negative – voting against, rather than voting for.

Labour, which is in polling position to win by a lot, is best served by saying and doing as little as possible, lest a gaffe damage its prospects. The Conservatives seem to be just trying not to look as hopeless as they feel. The only party with much exuberance is the far-right upstart Reform, which measures success in terms of whether it gets a larger share of the vote than the Conservatives and whether Nigel Farage wins a Parliamentary seat on his eighth try. Then there are the Greens, who are at least motivated by genuine passion for their cause, and whose only MP is retiring this year. For them, sadly, success would be replacing her.

Particularly odd is the continuation of the trend visible in recent years for British right-wingers to adopt the rhetoric and campaigning style of the current crop of US Republicans. This week, they’ve been spinning the idea that Labour may win a dangerous “supermajority”. “Supermajority” has meaning in the US, where the balance of powers – presidency, House of Representatives, Senate – can all go in one party’s direction. It has no meaning in the UK, where Parliament is sovereign. All it means is that Labour could wind up with a Parliamentary majority so large that they can pass any legislation they want. But this has been the Conservatives’ exact situation for the last five years, ever since the 2019 general election gave Boris Johnson a majority of 80. We should probably be grateful they largely wasted the opportunity squabbling among themselves.

This week saw the launch, day by day, of each party manifesto in turn. At one time, this would have led to extensive analysis and comparisons. This year, what discussion there is focuses on costs: whose platform commits to the most unfunded spending, and therefore who will raise taxes the most? Yet my very strong sense is that few among the electorate are focused on taxes; we’d all rather have public services that work and an end to the cost-of-living crisis. You have to be quite wealthy before private health care offers better value than paying taxes. But here may lie the explanation for both this and the weird Republican-ness of 2024 right-wing UK rhetoric: they’re playing to the same wealthy donors.

In this context, it’s not surprising that there’s not much coverage of what little the manifestos have to say about digital rights or the Internet. The exception is Computer Weekly, which finds the Conservatives promising more of the same and Labour offering a digital infrastructure plan that includes building data centers and easing various business regulations, but does not reintroduce the just-abandoned Data Protection and Digital Information bill.

In the manifesto itself: “Labour will build on the Online Safety Act, bringing forward provisions as quickly as possible, and explore further measures to keep everyone safe online, particularly when using social media. We will also give coroners more powers to access information held by technology companies after a child’s death.” The latter is a reference to recent cases such as that of 14-year-old Molly Russell, whose parents fought for five years to gain access to her Instagram account after her death.

Elsewhere, the manifesto also says, “Too often we see families falling through the cracks of public services. Labour will improve data sharing across services, with a single unique identifier, to better support children and families.”

“A single unique identifier” brings a kind of PTSD flashback: the last Labour government, in power from 1997 to 2010, largely built the centralized database state, and was obsessed with national ID cards, which were finally killed by David Cameron’s incoming coalition government. At the time, one of the purported benefits was streamlining government interaction. So I’m suspicious: this number could easily be backed by biometrics, checked on the spot via phone apps anywhere – and grow into what?

In terms of digital technologies, the LibDems mostly talk about health care, mandating interoperability for NHS systems and improving both care and efficiency. That can only be assessed if the detail is known. Also of interest: the LibDems’ proposed anti-SLAPP law, increasingly needed.

The LibDems also commit to advocate for a “Digital Bill of Rights”. I’m not sure it’s worth the trouble: “digital rights” as a set of civil liberties separate from human rights is antiquated, and many aspects are already enshrined in data protection, competition, and other law. In 2019, under the influence of then-deputy leader Tom Watson, this was a Labour policy. The LibDems are unlikely to have any power, but they lead in my area.

I wish the manifestos mattered and that we could have a sensible public debate about what technology policy should look like and what the priorities should be. But in a climate where everyone votes to get one lot out, the real battle begins on July 5, when we find out what kind of bargain we’ve made.

Illustrations: Polling station in Canonbury, London, in 2019 (via Wikimedia).



If you grew up with the slow but predictable schedule of American elections, the abruptness with which a British prime minister can prorogue Parliament and hit the campaign trail is startling. Among the pieces of legislation that fell by the wayside this time is the Data Protection and Digital Information bill, which had reached the House of Lords for scrutiny. The bill had many problems: it proposed to give the Department for Work and Pensions the right to inspect the bank accounts and financial assets of anyone receiving government benefits, and it undermined aspects of the adequacy agreement that allows UK companies to exchange data with businesses in the EU.

Less famously, it also included the legislative underpinnings for a trust framework for digital verification. On Monday, at a UCL conference on crime science, Sandra Peaston, director of research and development at the fraud prevention organization Cifas, outlined how all this is intended to work and asked some pertinent questions. Among them: whether the new regulator will have enough teeth; whether the certification process is strong enough for (for example) mortgage lenders; and how we know how good the relevant algorithm is at identifying deepfakes.

Overall, I think we should be extremely grateful this bill wasn’t rushed through. Quite apart from the digital rights aspects, the framework for digital identity really needs to be right; there’s just too much risk in getting it wrong.


At Bloomberg, Mark Gurman reports that Apple’s arrangement with OpenAI to integrate ChatGPT into the iPhone, iPad, and Mac does not involve Apple paying any money. Instead, Gurman cites unidentified sources to the effect that “Apple believes pushing OpenAI’s brand and technology to hundreds of millions of its devices is of equal or greater value than monetary payments.”

We’ve come across this kind of claim before in arguments between telcos and Internet companies like Netflix or between cable companies and rights holders. The underlying question is who brings more value to the arrangement, or who owns the audience. I can’t help feeling suspicious that this will not end well for users. It generally doesn’t.


Microsoft is on a roll. First there was the Recall debacle. Now come accusations by a former employee that it ignored a reported security flaw in order to win a large government contract, as Renee Dudley and Doris Burke report at ProPublica. The result: the Russian SolarWinds cyberattack on numerous US government departments and agencies, including the National Nuclear Security Administration.

This sounds like a variant of Cory Doctorow’s enshittification at the enterprise level (see also: Boeing). They don’t have to be monopolies: these organizations’ evolving culture has let business managers override safety and security engineers. This is how Challenger blew up in 1986.

Boeing is too big and too lacking in competition to be allowed to fail entirely; it will have to find a way back. Microsoft has a lot of customer lock-in. Is it too big to fail?


I can’t help feeling a little sad at the news that Raspberry Pi has had an IPO. I see no reason why it shouldn’t be successful as a commercial enterprise, but its values will inevitably change over time. CEO Eben Upton swears they won’t, but he won’t be CEO forever, as even he admits. But: Raspberry Pi could become the “unicorn” Americans keep saying Europe doesn’t have.


At that same UCL event, I finally heard someone say something positive about AI – for a meaning of “AI” that *isn’t* chatbots. Sarah Lawson, the university’s chief information security officer, said that “AI and machine learning have really changed the game” when it comes to detecting email spam, which remains the biggest vector for attacks. Dealing with the 2% that evades the filters is still a big job, as it leaves 6,000 emails a week hitting people’s inboxes – but she’ll take it. We really need to be more specific when we say “AI” about what kind of system we mean; success at spam filtering has nothing to say about getting accurate information out of a large language model.


Finally, I was highly amused this week when long-time security guy Nick Selby posted on Mastodon about a long-forgotten incident from 1999 in which I disparaged the sort of technology Apple announced this week that’s supposed to organize your life for you – tell you when it’s time to leave for things based on the traffic, juggle meetings and children’s violin recitals, that sort of thing. Selby felt I was ahead of my time because “it was stupid then and is stupid now because even if it works the cost is insane and the benefit really, really dodgy”.

One of the long-running divides in computing is between the folks who want computers to behave predictably and those who want computers to learn from our behavior what’s wanted and do that without intervention. Right now, the latter is in ascendance. Few of us seem to want the “AI features” being foisted on us. But only a small percentage of mainstream users turn off defaults (a friend was recently surprised to learn you can use the history menu to reopen a closed browser tab). So: soon those “AI features” will be everywhere, pointlessly and extravagantly consuming energy, water, and human patience. How you use information technology used to be a choice. Now, it feels like we’re hostages.

Illustrations: Raspberry Pi: the little computer that could (via Wikimedia).


Soap dispensers and Skynet

In the TV series Breaking Bad, the weary ex-cop Mike Ehrmantraut tells meth chemist Walter White: “No more half measures.” The last time he took half measures, the woman he was trying to protect was brutally murdered.

Apparently people like to say there are no dead bodies in privacy (although this is easily countered with ex-CIA director General Michael Hayden’s comment, “We kill people based on metadata”). But, as Woody Hartzog told a Senate committee hearing in September 2023, summarizing work he did with Neil Richards and Ryan Durrie, half measures in AI/privacy legislation are still a bad thing.

A discussion at Privacy Law Scholars last week laid out the problems. Half measures don’t work. They don’t prevent societal harms. They don’t prevent AI from being deployed where it shouldn’t be. And they sap the political will to follow up with anything stronger.

In an article for The Brink, Hartzog said, “To bring AI within the rule of law, lawmakers must go beyond half measures to ensure that AI systems and the actors that deploy them are worthy of our trust.”

He goes on to list examples of half measures: transparency, committing to ethical principles, and mitigating bias. Transparency is good, but doesn’t automatically bring accountability. Ethical principles don’t change business models. And bias mitigation to make a technology nominally fairer may simultaneously make it more dangerous. Think facial recognition: debias the system and improve its accuracy for matching the faces of non-male, non-white people, and then it’s used to target those same people with surveillance.

Or, bias mitigation may have nothing to do with the actual problem, an underlying business model, as Arvind Narayanan, author of the forthcoming book AI Snake Oil, pointed out a few days later at an event convened by the Future of Privacy Forum. In his example, the Washington Post reported in 2019 on the case of an algorithm intended to help hospitals predict which patients will benefit from additional medical care. It turned out to favor white patients. But, Narayanan said, the system’s provider responded to the story by saying that the algorithm’s cost model accurately predicted the costs of additional health care – in other words, the algorithm did exactly what the hospital wanted it to do.

“I think hospitals should be forced to use a different model – but that’s not a technical question, it’s politics.”

Narayanan also called out auditing (another Hartzog half measure). You can, he said, audit a human resources system to expose patterns in which resumes it flags for interviews and which it drops. But no one ever commissions research modeled on the expensive randomized controlled trials common in medicine, following up for five years to see if the system actually picks good employees.

Adding confusion is the fact that “AI” isn’t a single thing. Instead, it’s what someone called a “suitcase term” – that is, a container for many different systems built for many different purposes by many different organizations with many different motives. It is absurd to conflate AGI – the artificial general intelligence of science fiction stories and scientists’ dreams that can surpass and kill us all – with pattern-recognizing software that depends on plundering human-created content and the labeling work of millions of low-paid workers.

To digress briefly, some of the AI in that suitcase is getting truly goofy. Yum Brands has announced that its restaurants, which include Taco Bell, Pizza Hut, and KFC, will be “AI-first”. Among Yum’s envisioned uses, the company tells Benj Edwards at Ars Technica, are being able to ask an app what temperature to set the oven. I can’t help suspecting that the real eventual use will be data collection and discriminatory pricing. Stuff like this is why Ed Zitron writes postings like The Rot-Com Bubble, which hypothesizes that the reason Internet services are deteriorating is that technology companies have run out of genuinely innovative things to sell us.

That you cannot solve social problems with technology is a long-held truism, but it seems to be especially true of the messy middle of the AI spectrum, the use cases active now that rarely get the same attention as the far ends of that spectrum.

As Neil Richards put it at PLSC, “The way it’s presented now, it’s either existential risk or a soap dispenser that doesn’t work on brown hands when the real problem is the intermediate level of societal change via AI.”

The PLSC discussion included a list of the ways that regulations fail. Underfunded enforcement. Regulations that are pure theater. The wrong measures. The right goal, but weakly drafted legislation. Ambiguous regulation, or regulation based on principles that are too broad. Conflicting half measures – for example, requiring transparency while also adopting the principle that people should own their own data.

Like Cristina Caffarra a week earlier at CPDP, Hartzog, Richards, and Durrie favor finding remedies that focus on limiting abuses of power. Full measures include outright bans, the right to bring a private cause of action, imposing duties of “loyalty, care, and confidentiality”, and limiting exploitative data practices within these systems. Curbing abuses of power, as Hartzog says, is nothing new. The shiny new technology is a distraction.

Or, as Narayanan put it, “Broken AI is appealing to broken institutions.”

Illustrations: Mike (Jonathan Banks) telling Walt (Bryan Cranston) in Breaking Bad (S03e12) “no more half measures”.
