net.wars

A three-hour tour

It should be easy for the UK’s Competition Authority to shut down the proposed merger of Vodafone and Three, two of the UK’s four major mobile network providers. Remaining as competition post-merger would be EE (owned by BT) and Virgin Media O2 (owned by the Spanish company Telefónica and the US-listed company Liberty Global).

The trade union Unite is correctly calling the likely consequences: higher prices, fewer choices, job losses, and poorer customer service. In response, Vodafone and Three are dangling a shiny object of temptation: investment in building 5G network.

Well, hogwash. I would say “Don’t do this” even if I weren’t a Three customer (who left Vodafone years ago). Let them agree to collaborate on building a sbared network and compete on quality and services, but not merge. See the US broadband market, where prices are high, speeds are low, and frustrated consumers rarely have more than one option and take heed.

***

It’s a relief to see some sanity arriving around generative AI. As a glance at the archives will show, I’ve never been a fan; last year Jon Crowcroft and I predicted the eventual demise of large language models due to model collapse. Now, David Gray Widder and Mar Hicks warn in a paper that although the generative AI bubble is deflating, its damage will persist: “…carbon can’t be put back in the ground, workers continue to need to fend off AI’s disciplining effects, and the poisonous effect on our information commons will be hard to undo.”

This week offers worked examples. Re disinformation, at The Verge Sarah Jeong describes the change in our relationship with photographs arriving with new smartphones’ ability to fake realistic images. At The Register, Dan Robinson reports that data centers and AI are causing a substantial rise in water use in the US state of Virginia.

As evidence of the deflating bubble, Widder and Hicks cite the recent Goldman Sachs report arguing that generative AI is unlikely ever to pay back its investment.

And yet: to exploit generative AI, companies and governments are reversing or delaying programs to lower carbon emissions. Also alarmingly, Widder and Hicks wonder if generative AI was always meant to fail and its promoters’ only real goals were to scoop up profits and use the inevitability narrative to make generative AI a vector for embedding infrastructural dependencies (for example, on cloud computing).

That outcome doesn’t have to have been a plan – or a conspiracy theory, just as robber barons don’t actually need to conspire in order to serve each other’s interests. It could just as well be a circumstances-led pivot. But companies that have put money into generative AI will want to scrounge whatever return they can get. So the idea that we will be left with infrastructure that’s a poor fit for our actual needs is a disturbing – and entirely possible – outcome.

***

It’s fascinating – and an example of how you never know where new technologies will lead – to learn that people are using DNA testing to prove they qualify for citizenship in other countries such as Ireland, where a single grandparent will get you in. In some cases, such as the children of unmarried Irish women who were transported to England, this use of DNA testing rights historic wrongs. For others, it opens new opportunities such as the right to live in the EU. Unfortunately, it’s easy to imagine that in countries where citizenship by birthright is a talking point for the right wing this type of DNA testing could be mooted as a requirement. I’d like to think that rounding up babies for deportation is beyond even the most bigoted authoritarians, but…

***

The controversial British technology entrepreneur Mike Lynch has died a billionaire’s death; his superyacht sank in a tornado off the coast of Sicily. I interviewed him for Salon in 2000, when he was newly Britain’s first software billionaire. It was the first time I heard of the theorem developed by Thomas Bayes, an 18th century minister and mathematician (which now is everywhere), and for a long time afterwards I wasn’t certain I’d correctly understood his comments about perception and philosophy. This was exacerbated by early experience with his software in 1996, when it was still a consumer desktop search product fronted by an annoying cartoon dog – I thought it unusably slow compared to pre-Google search engines. By 2000, Autonomy had pivoted to enterprise software, which seemed a better fit.

In 2011, Sharon Bertsch McGrayne‘s book, The Theory That Would Not Die, explained things more clearly. That year, Lynch hit a business peak by selling Autonomy to Hewlett-Packard for $11 billion. A year later, he left HP, and set up Invoke Capital to invest in companies with fundamental technology ideas that scale.

Soon afterwards, HP wrote down $8.8 billion and accused Lynch of accounting fraud. The last 12 years of his life were spent in courtrooms: first a UK civil case, decided for HP in 2022, which Lynch was appealing, then a fight against extradition, and finally a criminal trial in the US, where former Autonomy CFO Sushovan Hussein had already been sent to jail for five years. Lynch’s fatal yacht trip was to celebrate his acquittal.

Illustrations: A Customs and Border Protection scientist reads a DNA profile to determine the origin of a commodity (via Wikimedia.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

The fear factor

Be careful what you allow the authorities to do to people you despise, because one day those same tools will be turned against you.

In the last few weeks, the shocking stabbing of three young girls at a dance class in Southport became the spark to ignite riots across the UK by people who apparently believed social media theories that the 17-year-old boy responsible was Muslim, a migrant, or a terrorist. With the boy a week from his 18th birthday, the courts ruled police could release his name in order to make clear he was not Muslim and born in Wales. It failed to stop the riots.

Police and the courts have acted quickly; almost 800 people have been arrested, 350 have been charged, and hundreds are in custody. In a moving development, on a night when more than 100 riots were predicted, tens of thousands of ordinary citizens thronged city streets and formed protective human chains around refugee centers in order to block the extremists. The riots have quieted down, but police are still busy arresting newly-identified suspects. And the inevitable question is being asked: what do we do next to keep the streets safe and calm?

London mayor Sadiq Khan quickly called for a review of the Online Safety Act, saying he doesn’t believe it’s fit for purpose. Cabinet minister Nick Thomas-Symonds (Labour-Torfaen) has suggested the month-old government could change the law.

Meanwhile, prime minister Keir Starmer favours a wider rollout of live facial recognition to track thugs and prevent them from traveling to places where they plan to cause social unrest, copying systems the police use to prevent football hooligans from even boarding trains to matches. This proposal is startling because: before standing for Parliament Starmer was a human rights lawyer. One could reasonably expect him to know that facial recognition systems have a notorious history of inaccuracy due to biases baked into their algorithms via training data, and that in the UK there is no regulatory framework to provide oversight. Silkie Carlo, the director of Big Brother Watch immediately called the proposal “alarming” and “ineffective”, warning that it turns people into “walking ID cards”.

As the former head of Liberty, Shami Chakrabarti used to say when ID cards were last proposed, moves like these fundamentally change the relationship between the citizen and the state. Such a profound change deserves more thought than a reflex fear reaction in a crisis. As Ciaran Thapar argues at the Guardian, today’s violence has many causes, beginning with the decay of public services for youth, mental health, and , and it’s those causes that need to be addressed. Thapar invokes his memories of how his community overcame the “open, violent racism” of the 1980s Thatcher years in making his recommendations.

Much of the discussion of the riots has blamed social media for propagating hate speech and disinformation, along with calls for rethinking the Online Safety Act. This is also frustrating. First of all, the OSA, which was passed in 2023, isn’t even fully implemented yet. When last seen, Ofcom, the regulator designated to enforce it, was in the throes of recruiting people by the dozen, working out what sites will be in scope (about 150,000, they said), and developing guidelines. Until we see the shape of the regulation in practice, it’s too early to say the act needs expansion.

Second, hate speech and incitement to violence are already illegal under other UK laws. Just this week, a woman was jailed for 15 months for a comment to a Facebook group with 5,100 members that advocated violence against mosques and the people inside them. The OSA was not needed to prosecute her.

And third, while Elon Musk and Mark Zuckerberg definitely deserve to have anger thrown their way, focusing solely on the ills of social media makes no sense given the decades that right-wing newspapers have spent sowing division and hatred. Even before Musk, Twitter often acted as a democratization of the kind of angry, hate-filled coverage long seen in the Daily Mail (and others). These are the wedges that created the divisions that malicious actors can now exploit by disseminating disinformation, a process systematically explained by Renee DiResta in her new book, Invisible Rulers.

The FBI’s investigation of the January 6, 2021 insurrection at the US Capitol provides a good exemplar for how modern investigations can exploit new technologies. Law enforcement applied facial recognition to CCTV footage and massive databases, and studied social media feeds, location data and cellphone tracking, and other data. As Charlie Warzel and Stuart A. Thompson wrote at the New York Times in 2021, even though most of us agree with the goal of catching and punishing insurrectionists and rioters, the data “remains vulnerable to use and abuse” against protests of other types – such as this year’s pro-Palestinian encampments.

The same argument applies in the UK. Few want violence in the streets. But the unilateral imposition of live facial recognition, among other tracking technologies, can’t be allowed. There must be limits and safeguards. ID cards issued in wartime could be withdrawn when peace came; surveillance technologies, once put in place, tend to become permanent.

Illustrations: The CCTV camera at 22 Portobello Road, where George Orwell once lived.

Gather ye lawsuits while ye may

Most of us howled with laughter this week when the news broke that Elon Musk is suing companies for refusing to advertise on his exTwitter platform. To be precise, Musk is suing the World Federation of Advertisers, Unilever, Mars, CVS, and Ørsted in a Texas court.

How could Musk, who styles himself a “free speech absolutist”, possibly force companies to advertise on his site? This is pure First Amendment stuff: both the right to free speech (or to remain silent) and freedom of assembly. It adds to the nuttiness of it all that last November Musk was telling advertisers to “go fuck yourselves” if they threatened him with a boycott. Now he’s mad because they responded in kind.

Does the richest man in the world even need advertisers to finance his toy?

At Techdirt, Mike Masnick catalogues the “so much stupid here”.

The WFA initiative that offends Musk is the Global Alliance for Responsible Media, which develops guidelines for content moderation – things like a standard definition for “hate speech” to help sites operate consistent and transparent policies and reassure advertisers that their logos don’t appear next to horrors like the livestreamed shooting in Christchurch, New Zealand. GARM’s site says: membership is voluntary, following its guidelines is voluntary, it does not provide a rating service, and it is apolitical.

Pre-Musk, Twitter was a member. After Musk took over, he pulled exTwitter out of it – but rejoined a month ago. Now, Musk claims that refusing to advertise on his site might be a criminal matter under RICO. So he’s suing himself? Blink.

Enter US Republicans, who are convinced that content moderation exists only to punish conservative speech. On July 10, House Judiciary Committee, under the leadership of Jim Jordan (R-OH), released an interim report on its ongoing investigation of GARM.

The report says GARM appears to “have anti-democratic views of fundamental American freedoms” and likens its work to restraint of trade Among specific examples, it says GARM’s recommended that its members stop advertising on exTwitter, threatened Spotify when podcaster Joe Rogan told his massive audience that young, healthy people don’t need to be vaccinated against covid, and considered blocking news sites such as Fox News, Breitbart, and The Daily Wire. In addition, the report says, GARM advised its members to use fact-checking services like NewsGuard and the Global Disinformation Index “which disproportionately label right-of-center news sites as so-called misinformation”. Therefore, the report concludes, GARM’s work is “likely illegal under the antitrust laws”.

I don’t know what a court would have made of that argument – for one thing, GARM can’t force anyone to follow its guidelines. But now we’ll never know. Two days after Musk filed suit, the WFA announced it’s shuttering GARM immediately because it can’t afford to defend the lawsuit and keep operating even though it believes it’s complied with competition rules. Such is the role of bullies in our public life.

I suppose Musk can hope that advertisers decide it’s cheaper to buy space on his site than to fight the lawsuit?

But it’s not really a laughing matter. GARM is just one of a number of initiatives that’s come under attack as we head into the final three months of campaigning before the US presidential election. In June, Renee DiResta, author of the new book Invisible Rulers, announced that her contract as the research manager of the Stanford Internet Observatory was not being renewed. Founding director Alex Stamos was already gone. Stanford has said the Observatory will continue under new leadership, but no details have been published. The Washington Post says conspiracy theorists have called DiResta and Stamos part of a government-private censorship consortium.

Meanwhile, one of the Observatory’s projects, a joint effort with the University of Washington called the Election Integrity Partnership, has announced, in response to various lawsuits and attacks, that it will not work on the 2024 or future elections. At the same time, Meta is shutting down CrowdTangle next week, removing a research tool that journalists and academics use to study content on Facebook and Instagram. While CrowdTangle will be replaced with Meta Content Library, access will be limited to academics and non-profits, and those who’ve seen it say it’s missing useful data that was available through CrowdTangle.

The concern isn’t the future of any single initiative; it’s the pattern of these things winking out. As work like DiResta’s has shown, the flow of funds financing online political speech (including advertising) is dangerously opaque. We need access and transparency for those who study it, and in real time, not years after the event.

In this, as so much else, the US continues to clash with the EU, which it accused in December of breaching its rules with respect to disinformation, transparency, and extreme content. Last month, it formally charged Musk’s site for violating the Digital Services Act, for which Musk could be liable for a fine of up to 6% of exTwitter’s global revenue. Among the EU’s complaints is the lack of a searchable and reliable advertisement repository – again, an important element of the transparency we need. Its handling of disinformation and calls to violence during the current UK riots may be added to the investigation.

Musk will be suing *us*, next.

Illustrations: A cartoon caricature of Christina Rossetti by her brother Dante Gabriel Rossetti 1862, showing her having a tantrum after reading The Times’ review of her poetry (via Wikimedia).

Review: Invisible Rulers

Invisible Rulers: The People Who Turn Lies Into Reality
by Renée DiResta
Hachette
ISBN: 978-1-54170337-7

For the last week, while violence has erupted in British cities, commentators asked, among other things: what has social media contributed to the inflammation? Often, the focus lands on specific famous people such as Elon Musk, who told exTwitter that the UK is heading for civil war (which basically shows he knows nothing about the UK).

It’s a particularly apt moment to read Renée DiResta‘s new book, Invisible Rulers: The People Who Turn Lies Into Reality. Until June, DiResta was the technical director of the Stanford Internet Observatory, which studies misinformation and disinformation online.

In her book, DiResta, like James Ball in The Other Pandemic and Naomi Klein in Doppelganger, traces how misinformation and disinformation propagate online. Where Ball examined his subject from the inside out (having spent his teenaged years on 4chan) and Klein is drawn from the outside in, DiResta’s study is structural. How do crowds work? What makes propaganda successful? Who drives engagement? What turns online engagement into real world violence?

One reason these questions are difficult to answer is the lack of transparency regarding the money flowing to influencers, who may have audiences in the millions. The trust they build with their communities on one subject, like gardening or tennis statistics, extends to other topics when they stray. Someone making how-to knitting videos one day expresses concern about their community’s response to a new virus, finds engagement, and, eventually, through algorithmic boosting, greater profit in sticking to that topic instead. The result, she writes, is “bespoke realities” that are shaped by recommendation engines and emerge from competition among state actors, terrorists, ideologues, activists, and ordinary people. Then add generative AI: “We can now mass-produce unreality.”

DiResta’s work on this began in 2014, when she was checking vaccination rates in the preschools she was looking at for her year-old son in the light of rising rates of whooping cough in California. Why, she wondered, were there all these anti-vaccine groups on Facebook, and what went on in them? When she joined to find out, she discovered a nest of evangelists promoting lies to little opposition, a common pattern she calls “asymmetry of passion”. The campaign group she helped found succeeded in getting a change in the law, but she also saw that the future lay in online battlegrounds shaping public opinion. When she presented her discoveries to the Centers for Disease Control, however, they dismissed it as “just some people online”. This insouciance would, as she documents in a later chapter, come back to bite during the covid emergency, when the mechanisms already built whirred into action to discredit science and its institutions.

Asymmetry of passion makes those holding extreme opinions seem more numerous than they are. The addition of boosting algorithms and “charismatic leaders” such as Musk or Robert F. Kennedy, Jr (your mileage may vary) adds to this effect. DiResta does a good job of showing how shifts within groups – anti-vaxx groups that also fear chemtrails and embrace flat earth, flat earth groups that shift to QAnon – lead eventually from “asking questions” to “take action”. See also today’s UK.

Like most of us, DiResta is less clear on potential solutions. She gives some thought to the idea of prebunking, but more to requiring transparency: platforms around content moderation decisions, influencers around their payment for commercial and political speech, and governments around their engagement with social media platforms. She also recommends giving users better tools and introducing some friction to force a little more thought before posting.

The Observatory’s future is unclear, as several other key staff have left; Stanford told The Verge in June that the Observatory would continue under new leadership. It is just one of several election integrity monitors whose future is cloudy; in March Facebook announced it would shut down research tool CrowdTangle on August 14. DiResta’s book is an important part of its legacy.

Crowdstricken

This time two weeks ago the media were filled with images from airports clogged with travelers unable to depart because of…a software failure. Not a cyberattack, and not, as in 2017, limited to a single airline’s IT systems failure.

The outage wasn’t just in airports: NHS hospitals couldn’t book appointments, the London Stock Exchange news service and UK TV channel Sky News stopped functioning, and much more. It was the biggest computer system outage not caused by an attack to date, a watershed moment like 1988’s Internet worm.

Experienced technology observers quickly predicted: “bungled software update”. There are prior examples aplenty. In February, an AT&T outage lasted more than 12 hours, spanned 50 US states, Puerto Rico, and the US Virgin Islands, and blocked an estimated 25,000 attempted calls to the 911 emergency service. Last week, the Federal Communications Commission attributed the cause to an employee’s addition of a “misconfigured network element” to expand capacity without following the established procedure of peer review. The resulting cascade of failures was an automated response designed to prevent a misconfigured device from propagating. AT&T has put new preventative controls in place, and FCC chair Jessica Rosenworcel said the agency is considering how to increase accountabiliy for failing to follow best practice.

Much of this history is recorded in Peter G. Neumann’s ongoing RISKS Forum mailing list. In 2014, an update Apple issued to fix a flaw in a health app blocked users of its then-new iPhone 6 from connecting. In 2004, a failed modem upgrade knocked Cox Communications subscribers offline. My first direct experience was in the 1990s, when for a day CompuServe UK subsccribers had to dial Germany to pick up our email.

In these previous cases, though, the individuals affected had a direct relationship with the screw-up company. What’s exceptional about Crowdstrike is that the directly affected “users” were its 29,000 huge customer businesses. It was those companies’ resulting failures that turned millions of us into hostages to technological misfortune.

What’s more, in those earlier outages only one company and their direct customers were involved, and understanding the problem was relatively simple. In the case of Crowdstrike, it was hard to pinpoint the source of the problem at first because the direct effects were scattered (only Windows PCs awake to receive Crowdstrike updates) and the indirect effects were widespread.

The technical explanation of what happened, simplified, goes like this: Crowdstrike issued an update to its Falcon security software to block malware it spotted exploiting a vulnerability in Windows. The updated Falcon software sparked system crashes as PCs reacted to protect themselves against potential low-level damage (like a circuit breaker in your house tripping to protect your wiring from overload). Crowdstrike realized the error and pushed out a corrected update 79 minutes later. That fixed machines that hadn’t yet installed the faulty update. The machines that had updated in those 79 minutes, however, were stuck in a doom loop, crashing every time they restarted. Hence the need for manual intervention to remove those files in order to reboot successfully.

Microsoft initially estimated that 8.5 million PCs were affected – but that’s probably a wild underestimate as the only machines it could count were those that had crash reporting turned on.

The root cause is still unclear. Crowdstrike has said it found a hole in its Content Validator Tool, which should have caught the flaw. Microsoft is complaining that a 2009 interoperability agreement forced on it by the EU required it to allow Crowdstrike’s software to operate at the very low level on Windows machines that pushed the systems to crash. It’s wrong, however, to blame companies for enabling automated updates; security protection has to respond to new threats in real time.

The first financial estimates are emerging. Delta Airlines estimates the outage, which borked its crew tracking system for a week, cost it $500 million. CEO Ed Bastian told CNN, “They haven’t offered us anything.” Delta has hired lawyer David Boies, whose high-profile history began with leading the successful 1990s US government prosecution of Microsoft, to file its lawsuit.

Delta will need to take a number. Massachusetts-based Plymouth County Retirement Association has already filed a class action suit on behalf of Crowdstrike shareholders in Texas federal court, where Crowdstrike is headquartered, for misrepresenting its software and its capabilities. Crowdstrike says the case lacks merit.

Lawsuits are likely the only way companies will get recompense unless they have insurance to cover supplier-caused system failures. Like all software manufacturers, Crowdstrike has disclaimed all liability in its terms of use.

In a social media post, Federal Trade Commission chair Lina Khan said that, “These incidents reveal how concentration can create fragile systems.”

Well, yes. Technology experts have long warned of the dangers of monocultures that make our world more brittle. The thing is, we’re stuck with them because of scale. There were good reasons why the dozens of early network and operating systems consolidated: it’s simpler and cheaper for hiring, maintenance, and even security. Making our world less brittle will require holding companies – especially those that become significant points of failure – to meet higher standards of professionalism, including product liability for software, and requiring their customers to boost their resilience.

As for Crowdstrike, it is doomed to become that worst of all things for a company: a case study at business schools everywhere.

Illustrations: XKCD’s Dependency comic, altered by Mary Branscombe to reflect Crowdstrike’s reality.

Boxed up

If the actions of the owners of streaming services are creating the perfect conditions for the return of piracy, it’s equally true that the adtech industry’s decisions continue to encourage installing ad blockers as a matter of self-defense. This is overall a bad thing, since most of us can’t afford to pay for everything we want to read online.

This week, Google abruptly aborted a change it’s been working on for four years: it will abandon its plan to replace third-party cookies with new technology it called Privacy Sandbox. From the sounds of it, Google will continue working on the Sandbox, but will continue to retain third-party cookies. The privacy consequences of this are…muddy.

To recap: there are two kinds of cookies, which are small files websites place on your computer, distinguished by their source and use. Sites use first-party cookies to give their pages the equivalent of memory. They’re how the site remembers which items you’ve put in your cart, or that you’ve logged in to your account. These are the “essential cookies” that some consent banners mention, and without them you couldn’t use the web interactively.

Third-party cookies are trackers. Once a company deposits one of these things on your computer, it can use it to follow along as you browse the web, collecting data about you and your habits the whole time. To capture the ickiness of this, Demos researcher Carl Miller has suggested renaming them slime trails. Third-party cookies are why the same ads seem to follow you around the web. They are also why people in the UK and Europe see so many cookie consent banners: the EU’s General Data Protection Regulation requires all websites to obtain informed consent before dropping them on our machines. Ad blockers help here. They won’t stop you from seeing the banners, but they can save you the time you’d have to spend adjusting settings on the many sites that make it hard to say no.

The big technology companies are well aware that people hate both ads and being tracked in order to serve ads. In 2020, Apple announced that its Safari web browser would block third-party cookies by default, continuing work it started in 2017. This was one of several privacy-protecting moves the company made; in 2021, it began requiring iPhone apps to offer users the opportunity to opt out of tracking for advertising purposes at installation. In 2022, Meta estimated Apple’s move would cost it $10 billion that year.

If the cookie seemed doomed at that point, it seemed even more so when Google announced it was working on new technology that would do away with third-party cookies in its dominant Chrome browser. Like Apple, however, Google proposed to give users greater control only over the privacy invasions of third parties without in any way disturbing Google’s own ability to track users. Privacy advocates quickly recognized this.

At Ars Technica, Ron Amadeo describes the Sandbox’s inner workings. Briefly, it derives a list of advertising topics from the websites users visits, and shares those with web pages when they ask. This is what you turn on when you say yes to Chrome’s “ad privacy feature”. Back when it was announced, EFF’s Bennett Cyphers was deeply unimpressed: instead of new tracking versus old tracking, he asked, why can’t we have *no* tracking? Just a few days ago, EFF followed up with the news that its Privacy Badger browser add-on now opts users out of the Privacy Sandbox (EFF has also published manual instructions.).

Google intended to make this shift in stages, beginning the process of turning off third-party cookies in January 2024 and finishing the job in the second half of 2024. Now, when the day of completion should be rapidly approaching, the company has said it’s over – that is, it no longer plans to turn off third-party cookies. As Thomas Claburn writes at The Register, implementing the new technology still requires a lot of work from a lot of companies besides Google. The technology will remain in the browser – and users will “get” to choose which kind of tracking they prefer; Kevin Purdy reports at Ars Technica that the company is calling this a “new experience”.

At The Drum, Kendra Barnett reports that the UK’s Information Commissioner’s Office is unhappy about Google’s decision. Even though it had also identified possible vulnerabilities in the Sandbox’s design, the ICO had welcomed the plan to block third-party cookies.

I’d love to believe that Google’s announcement might have been helped by the fact that Sandbox is already the subject of legal action. Last month the privacy-protecting NGO noyb complained to the Austrian data protection authority, arguing that Sandbox tracking still requires user consent. Real consent, not obfuscated “ad privacy feature” stuff, as Richard Speed explains at The Register. But far more likely it’s money, At the Press Gazette, Jim Edwards reports that Sandbox could cost publishers 60% of their revenue “from programmatically sold ads”. Note, however, that the figure is courtesy of adtech company Criteo, likely a loser under Sandbox.

The question is what comes next. As Cyphers said, we deserve real choices: *whether* we are tracked, not just who gets to do it. Our lives should not be the leverage big technology companies use to enhance their already dominant position.

Illustrations: A sandbox (via Wikimedia)

Twenty comedians walk into a bar…

The Internet was, famously, created to withstand a bomb outage. In 1998 Matt Blaze and Steve Bellovin said it, in 2002 it was still true, and it remains true today, after 50 years of development: there are more efficient ways to kill the Internet than dropping a bomb.

Take today. The cybersecurity company Crowdstrike pushed out a buggy update, and half the world is down. Airports, businesses, the NHS appointment booking system, supermarkets, the UK’s train companies, retailers…all showing the Blue Screen of Death. Can we say “central points of failure”? Because there are two: Crowdstrike, whose cybersecurity is widespead, and Microsoft, whose Windows operating system is everywhere.

Note this hasn’t killed the *Internet*. It’s temporarily killed many systems *connected to* the Internet. But if you’re stuck in an airport where nothing’s working and confronted with a sign that says “Cash only” when you only have cards…well, at least you can go online to read the news.

The fix will be slow, because it involves starting the computer in safe mode and manually deleting files. Like Y2K remediation, one computer at a time.

***

Speaking of things that don’t work, three bits from the generative AI bubble. First, last week Goldman Sachs issued a scathing report on generative AI that concluded it is unlikely to ever repay the trillion-odd dollars companies are spending on it, while its energy demands could outstrip available supply. Conclusion: generative AI is a bubble that could nonetheless take a long time to burst.

Second, at 404 Media Emanuel Weiburg reads a report from the Tony Blair Institute that estimates that 40% of tasks performed by public sector workers could be partially automated. Blair himself compares generative AI to the industrial revolution. This comparison is more accurate than he may realize, since the industrial revolution brought climate change, and generative AI pours accelerant on it.

TBI’s estimate conflicts with that provided to Goldman by MIT economist Daron Acemoglu, who believes that AI will impact at most 4.6% of tasks in the next ten years. The source of TBI’s estimate? ChatGPT itself. It’s learned self-promotion from parsing our output?

Finally, in a study presented at ACM FAccT, four DeepMind researchers interviewed 20 comedians who do live shows and use AI to participate in workshops using large language models to help write jokes. “Most participants felt the LLMs did not succeed as a creativity support tool, by producing bland and biased comedy tropes, akin to ‘cruise ship comedy material from the 1950s, but a bit less racist’.” Last year, Julie Seabaugh at the LA Times interviewed 13 professional comedians and got similar responses. Ahmed Ahmed compared AI-generated comedy to eating processed foods and, crucially, it “lacks timing”.

***

Blair, who spent his 1997-2007 premiership pushing ID cards into law, has also been trying to revive this longheld obsession. Two days after Keir Starmer took office, Blair published a letter in the Sunday Times calling for its return. As has been true throughout the history of ID cards (PDF), every new revival presents it as a solution to a different problem. Blair’s 2024 reason is to control immigration (and keep the far-right Reform party at bay). Previously: prevent benefit fraud, combat terorism, streamline access to health, education, and other government services (“the entitlement card”), prevent health tourism.

Starmer promptly shot Blair down: “not part of the government’s plans”. This week Alan West, a home office minister 2007-2010 under Gordon Brown, followed up with a letter to the Guardian calling for ID cards because they would “enhance national security in the areas of terrorism, immigration and policing; facilitate access to online government services for the less well-off; help to stop identity theft; and facilitate international travel”.

Neither Blair (born 1953) nor West (born 1948) seems to realize how old and out of touch they sound. Even back then, the “card” was an obvious decoy. Given pervasive online access, a handheld reader, and the database, anyone’s identity could be checked anywhere at any time with no “card” required.

To sound modern they should call for institutionalizing live facial recognition, which is *already happening* by police fiat. Or sprinkled AI bubble on their ID database.

Databases and giant IT projects that failed – like the Post Office scandal – that was the 1990s way! We’ve moved on, even if they haven’t.

***

If you are not a deposed Conservative, Britain this week is like waking up sequentially from a series of nightmares. Yesterday, Keir Starmer definitively ruled out leaving the European Convention on Human Rights – Starmer’s background as a human rights lawyer to the fore. It’s a relief to hear after 14 years of Tory ministers – David Cameron,, Boris Johnson, Suella Braverman, Liz Truss, Rishi Sunak – whining that human rights law gets in the way of their heart’s desires. Like: building a DNA database, deporting refugees or sending them to Rwanda, a plan to turn back migrants in boats at sea.

Principles have to be supported in law; under the last government’s Public Order Act 2023 curbing “disruptive protest”, yesterday five Just Stop Oil protesters were jailed for four and five years. Still, for that brief moment it was all The Brotherhood of Man.

Illustrations: Windows’ Blue Screen of Death (via Wikimedia).

The return of piracy

In Internet terms, it’s been at least a generation since the high-profile fights over piracy – that is, the early 2000s legal actions against unauthorized sites offering music, TV, and films, and the people who used them. Visits to the news site TorrentFreak this week feel like a revival.

The wildest story concerns Z-Library, for some years the largest shadow book collection. Somewhere someone must be busily planning a true crime podcast series. Z-Library was briefly offline in 2022, when the US Department of Justice seized many of its domains. Shortly afterwards there arrived Anna’s Archive, a search engine for shadow libraries – Z-Library and many others, and the journal article shadow repository Sci-Hub. Judging from a small sampling exercise, you can find most books that have been out for longer than a month. Newer books tend to be official ebooks stripped of digital rights management.

In November 2022, the Russian nationals Anton Napolsky and Valeriia Ermakova were arrested in Argentina, alleged to be Z-Library administrators. The US requested extradition, and an Argentinian judge agreed. They appealed to the Argentinian supreme court, asking to be classed as political refugees. This week, a story in local publication La Voz, made its way north. As Ashley Belanger explains at Ars Technica, Napolsky and Ermakova decided not to wait for a judgment, escaped house arrest back in May, and vanished. The team running Z-library say the pair are innocent of copyright infringement.

Also missing in court: Anna’s Archive’s administrators. As noted here in February; the library service company OCLC sued Anna’s Archive for having exploited a security hole in its website in order to scrape 2,2TB of its book metadata. This might have gone unnoticed, except that the admins published the news on its blog. OCLC is claiming that the breach has cost millions to remediate its systems.

This week saw an update to the case: OCLC has moved for summary judgment as Anna’s Archive’s operators have failed to turn up in court. At TorrentFreak, Ernesto van der Sar reports that OCLC is also demanding millions in damages and injunctive relief barring Anna’s from continuing to publish the scraped data, though it does not ask for the site to be blocked. (The bit demanding that Anna’s Archive pay the costs of remediating OCLC’s flawed systems is puzzling; do we make burglars who climb in through open windows pay for locksmiths?)

And then there is the case of the Internet Archive’s Open Library, which claims its scanned-in books are legal under the novel theory of controlled digital lending. When the Internet Archive responded to the covid crisis by removing those controls in 2020, four major publishers filed suit. In 2023, the US District Court for the Southern District of New York ruled against the Internet Archive, saying its library enables copyright infringement. Since then, the Archive has removed 500,000 books.

This is the moment when lessons from the past of music, TV, and video piracy could be useful. Critics always said that the only workable answer to piracy is legal, affordable services, and they were right, as shown by Pandora, Spotify, Netflix, which launched its paid streaming service in 2007, and so many others.

It’s been obvious for at least two years that things are now going backwards. Last December, in one of many such stories, the Discovery/Warner Brothers merger ended a licensing agreement with Sony, leading the latter to delete from Playstation users’ libraries TV shows they had paid for in the belief that they would remain permanently available. The next generation is learning the lesson. Many friends over 40 say they can no longer play CDs or DVD; teenaged friends favor physical media because they’ve already learned that digital services can’t be trusted.

Last September, we learned that Hollywood studios were deleting finished, but unaired programs and parts of their back catalogues for tax reasons. Sometimes, shows have abruptly disappeared mid-season. This week, Paramount removed decades of Comedy Central video clips; last month it axed the MTV news archives. This is *our* culture, even if it’s *their* copyright.

Meanwhile, the design of streaming services has stagnated. The complaints people had years ago about interfaces that make it hard to find the shows they want to see are the same ones they have now. Content moves unpredictably from one service to another. Every service is bringing in ads and raising prices. The benefits that siphoned users from broadcast and cable are vanishing.

As against this, consider pirate sites: they have the most comprehensive libraries; there are no ads; you can use the full-featured player of your choice; no one other than you can delete them; and they are free. Logically, piracy should be going back up, and at least one study suggests it is. If only they paid creators…

The lesson: publishers may live to regret attacking the Internet Archive rather than finding ways to work with it – after all, it sends representatives to court hearings and obeys rulings; if so ordered, they might even pay authors. In 20 years, no one’s managed to sink The Pirate Bay; there’ll be no killing the shadow libraries either, especially since my sampling finds that the Internet Archive’s uncorrected scans are often the worst copies to read. Why let the pirate sites be the one to offer the best services?

Illustrations: The New York Public Library, built 1911 (via Wikimedia).

Safe

That didn’t take long. Since last week’s fret about AI startups ignoring the robots.txt convention, Thomas Claburn has reported at The Register that Cloudflare has developed a scraping prevention tool that identifies and blocks “content extraction” bots attempting to crawl sites at scale.

It’s a stopgap, not a solution. As Cloudflare’s announcement makes clear, the company knows there will be pushback; given these companies’ lack of interest in following existing norms, blocking tools versus scraping bots is basically the latest arms race (previously on this plotline: spam). Also, obviously, the tool only works on sites that are Cloudflare customers. Although these include many of the web’s largest sites, there are hundreds of millions more that won’t, don’t, or can’t pay for its services. If we want to return control to site owners, we’re going to need a more permanent and accesible solution.

In his 1999 book Code and Other Laws of Cyberspace, Lawrence Lessig finds four forms of regulation: norms, law, markets, and architecture. Norms are failing. Markets will just mean prolonged arms races. We’re going to need law and architecture.

***

We appear to be reaching peak “AI” hype, defined by (as in the peak of app hype) the increasing absurdity of things venture capitalists seem willing to fund. I recall reading the comment that at the peak of app silliness a lot of startups were really just putting a technological gloss on services that young men will previously have had supplied by their mothers. The AI bubble seems to be even less productive of long-term value, calling things “AI” that are not at all novel, and proposing “AI” to patch problems that call for real change.

As an example of the first of those, my new washing machine has a setting called “AI patterns”. The manual explains: it reorders the preset programs on the machine’s dial so the ones you use most appear first. It’s not stupid (although I’ve turned it off anyway, along with the wifi and “smart” features I would rather not pay for), but let’s call it what it is: customizing a menu.

As an example of the second…at Gizmodo, Maxwell Zeff reports that Softbank is claiming to have developed an “emotion canceling” AI that “alters angry voices into calm ones”. The use Softbank envisages is to lessen the stress for call center employees by softening the voices of angry customers without changing their actual words. There are, as people pointed out on Mastodon after the article was posted there, a lot smarter alternatives to reducing those individuals’ stress. Like giving them better employment conditions, or – and here’s a really radical thought – designing your services and products so your customers aren’t so frustrated and angry. What this software does is just falsify the sound. My guess is that if there is a result it will be to make customers even more angry and frustrated. More anger in the world. Great.

***

Oh! Sarcasm, even if only slight! At the Guardian, Ned Carter Miles reports on “emotional AI” (can we say “oxymoron”?). Among his examples is a team at the University of Groningen that is teaching an AI to recognize sarcasm using scenes from US sitcoms such as Friends and The Big Bang Theory. Even absurd-sounding research can be a good thing. I’m still not sure how good a guide sitcoms are for identifying emotions in real-world context even apart from the usual issues of algorithmic bias. After all, actors are given carefully crafted words and work harder to communicate their emotional content than ordinary people normally do.

***

Finally, again in the category of peak-AI-hype is this: at the New York Times Cade Metz is reporting that Ilya Sutskever, a co-founder and former chief scientist at OpenAI, has a new startup whose goal is to create a “safe superintelligence”.

Even if you, unlike me, believe that a “superintelligence” is an imminent possibility, what does “safe” mean, especially in an industry that still treats security and accessibility as add-ons? “Safe” is, like “secure”, meaningless without context and a threat model. Safe from what? Safe for what? To do what? Operated by whom? Owned by whom? With what motives? For how long? We create new intelligent humans all the time. Do we have any ability to ensure they’re “safe” technology? If an AGI is going to be smarter than a human, how can anyone possibly promise it will be, in the industry parlance, “aligned” with our goals? And for what value of “our”? Beware the people who want to build the Torment Nexus!

It’s nonsense. Safety can’t be programmed into a superintelligence any more than Isaac Asimov’s Laws of Robotics.

Sutskever’s own comments are equivocal. In a video clip at the Guardian, Sutsekver confusingly says both that “AI will solve all our problems” and that it will make fake news, cyber attacks, and weapons much worse and “has the potential to create infinitely stable dictatorships”. Then he adds, “I feel that technology is a force of nature.” Which is exactly the opposite of what technology is…but it suits the industry to push the inevitability narrative that technological progress cannot be stopped.

Cue Douglas Adams: “This is obviously some strange use of the word ‘safe’ I wasn’t previously aware of.”

Illustrations: The Big Bang Theory‘s Leonard (Johnny Galecki) teaching Sheldon (Jim Parsons) about sarcasm (Season 1, episode 2, “The Big Bran Hypothesis”).

Changing the faith

The governance of Britain and the governance of the Internet have this in common: the ultimate authority in both cases is to a large extent a “gentleman’s agreement”. For the same reason: both were devised by a relatively small, homogeneous group of people who trusted each other. In the case of Britain, inertia means that even without a written constitution the country goes on electing governments and passing laws as if.

Most people have no reason to know that the Internet’s technical underpinnings are defined by a series of documents known as RFCs, for Requests(s) for Comments. RFC1 was defined in April 1969; the most recent, RFC9598, is dated just last month. While the Internet Engineering Task Force oversees RFCs’ development and administration, it has no power to force anyone to adopt them. Throughout, RFC standards have been created collaboratively by volunteers and adopted on merit.

A fair number of RFCs promote good “Internet citizenship”. There are, for example, email addresses (chiefly, webmaster and postmaster) that anyone running a website is supposed to maintain in order to make it easy for a third party to report problems. Today, probably millions of website owners don’t even know this expectation exists. For Internet curmudgeons over a certain age, however, seeing email to those addresses bounce is frustrating.

Still, many of these good-citizen practices persist. One such is the Robots Exclusion Protocol, updated in 2022 as RFC 9309, which defines a file, “robots.txt”, that website owners can put in place to tell automated web crawlers which parts of the site they may access and copy. This may have mattered less in recent years than it did in 1994, when it was devised. As David Pierce recounts at The Verge, at that time an explosion of new bots were beginning to crawl the web to build directories and indexes (no Google until 1998!). Many of those early websites were hosted on very small systems based in people’s homes or small businesses, and could be overwhelmed by unrestrained crawlers. Robots txt, devised by a small group of administrators and developers, managed this problem.

Even without a legal requirement to adopt it, early Internet companies largely saw being good Internet citizens as benefiting them. They, too, were small at the time, and needed good will to bring them the users and customers that have since made them into giants. It served everyone’s interests to comply.

Until more or less now. This week, Katie Paul is reporting at Reuters that “AI” companies are blowing up this arrangement by ignoring robots.txt and scraping whatever they want. This news follows reporting by Randall Lane at Forbes that Perplexity.ai is using its software to generate stories and podcasts using news sites’ work without credit. At Wired, Druv Mehrotra and Tim Marchman report a similar story: Perplexity is ignoring robots.txt and scraping areas of sites that owners want left alone. At 404 Media, Emmanuel Maiberg reports that Perplexity also has a dubious history of using fake accounts to scrape Twitter data.

Let’s not just pick on Perplexity; this is the latest in a growing trend. Previously, hiQ Labs tried scraping data from LinkedIn in order to build services to sell employers, the courts finally ruled in 2019 that hiQ violated LinkedIn’s terms and conditions. More controversially, in the last few years Clearview AI has been responding to widespread criticism by claiming that any photograph published on the Internet is “public” and therefore within its rights to grab for its database and use to identify individuals online and offline. The result has been myriad legal actions under data protection law in the EU and UK, and, in the US, a sheaf of lawsuits. Last week, Kashmir Hill reported at the New York Times, that because Clearview lacks the funds to settle a class action lawsuit it has offered a 23% stake to Americans whose faces are in its database.

As Pierce (The Verge) writes, robots.txt used to represent a fair-enough trade: website owners got search engine visibility in return for their data, and the owners of the crawlers got the data but in return sent traffic.

But AI startups ingesting data to build models don’t offer any benefit in return. Where search engines have traditionally helped people find things on other sites, the owners of AI chatbots want to keep the traffic for themselves. Perplexity bills itself as an “answer engine”. A second key difference is this: none of these businesses are small. As Vladen Joler pointed out last month at CPDP, “AI comes pre-monopolized.” Getting into this area requires billions in funding; by contrast many early Internet businesses started with just a few hundred dollars.

This all feels like a watershed moment for the Internet. For most of its history, as Charles Arthur writes at The Overspill, every advance has exposed another area where the Internet operates on the basis of good faith. Typically, the result is some form of closure – spam, for example, led the operators of mail servers to close to all but authenticated users. It’s not clear to a non-technical person what stronger measure other than copyright law could replace the genteel agreement of robots.txt, but the alternative will likely be closing open access to large parts of the web – a loss to all of us.

Illustrations: Vladen Joler at CPDP 2024, showing his map of the extractive industries required to underpin “AI”.