Mean time between failures

Normal people should not know names like “US-East-1”, someone said this week on Mastodon, or more or less. “US-East-1” is the section of Amazon Web Services that went out last month to widely disruptive effect. What this social media poster was getting at, while contemplating this week’s Cloudflare outage, is the fact that the series of recent Internet-related outages has made network nodes that were previously only known to technical specialists into household names.

For the history-minded, there was a moment like this in 1988, when a badly-written worm put the Internet on newspapers’ front pages for the first time. The Internet was then so little known that every story had to explain what it was – primarily, then, a network connecting government, academic, and corporate scientific research institutions. Now, stories are explaining network architecture. I guess that’s progress?

Much less detailed knowledge was needed to understand what happened on Tuesday, when Cloudflare went down, taking with it access to Spotify, Uber, Grindr, Ikea, Microsoft Copilot, Politico, and even, in London, its VPN service (says Wikipedia). Cloudflare offers content delivery and protection against distributed denial of service attacks, and as such it interposes itself into all sorts of Internet interactions. I often see it demanding action to prove I’m not a robot; in that mode it’s hard to miss. That said, many sites really do need the protection it offers against large-scale attacks. Attacks at scale require defense at scale.

Ironically, one of the sites lost in the Cloudflare outage was DownDetector, a site that helps you know if the site you can’t reach is down for everyone or just you, one of several such debugging tools for figuring out who needs to fix what.

So, Cloudflare was Tuesday. Amazon’s outage was just about a month ago, on October 20. Microsoft Azure, another DNS error, was just a week later. All three of these had effects across large parts of the network.

Is this a trend or just a random coincidental cluster in a sea of possibilities?

One thing that’s dispiriting about these outages is that so often the causes are traceable to issues that have been well-understood for years. With Amazon it was a DNS error. Microsoft also had a DNS issue “following an inadvertent configuration change”. Cloudflare’s issue may have been less predictable; The Verge reports its problem was a software crash caused by a “feature file” used by its bot management system abruptly doubling in size, taking it above the size the software was designed to handle.

Also at The Verge, Emma Roth thinks it’s enough of a trend that website owners need to start thinking about backup – that is, failover – plans. Correctly, she says the widespread impact of these outages shows how concentrated infrastructure service provision has become. She cites Signal CEO Meredith Whittaker: the encrypted messaging service can’t find an alternative to using one of the three or four major cloud providers.

At Krebs on Security, Brian Krebs warns that sites that managed to pivot their domains away from Cloudflare to keep themselves available during the outage need to examine their logs for signs of the attacks Cloudflare normally protects them from, and put effort into fixing the common vulnerabilities they find. And then also: consider spreading the load so there isn’t a single point of failure. As I understand it, Netflix did this after the 2017 AWS outage.
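As a rough illustration of what “no single point of failure” means in practice, the Python sketch below tries a list of providers in order and returns the first one that answers. The provider names and the two `fetch` functions are hypothetical stand-ins that simulate an outage; real failover usually happens at the DNS or load-balancer level rather than in application code, and, as Krebs notes, the backup path needs its own protection.

```python
from typing import Callable, Optional


def first_available(
    providers: list[tuple[str, Callable[[], str]]]
) -> Optional[tuple[str, str]]:
    """Try each (name, fetch) pair in order; return the first success."""
    for name, fetch in providers:
        try:
            return name, fetch()
        except ConnectionError:
            continue  # this provider is down; try the next one
    return None  # every provider failed


def primary_down() -> str:
    # Simulated outage at the (hypothetical) primary provider.
    raise ConnectionError("primary provider unreachable")


def backup_up() -> str:
    # Simulated healthy (hypothetical) backup provider.
    return "page served from backup"


result = first_available([("primary", primary_down), ("backup", backup_up)])
print(result)  # ('backup', 'page served from backup')
```

With only the primary in the list, the same call returns `None`, which is the single-point-of-failure case.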

For any single one of these giant providers, significant outages are not common. This was, Jon Brodkin says at Ars Technica, Cloudflare’s worst outage since 2019. That one was due to a badly written firewall rule. But increasing size also brings increasing complexity, and, as these outages have also shown, even the largest network can be disrupted at scale by a very small mistake.

Elsewhere, a software engineer friend and I have been talking about “mean time between failures”, a measure normally applied to hard drives, servers, or other components. There, it’s much more easily measured – run a load of drives, time when they fail, take an average… With the Internet, so much depends on your individual setup. But beyond that: what counts as failure? My friend suggested setting thresholds based on impact: number of people, length of time, extent of cascading failures. Being able to quantify outages might help get a better sense of whether it’s a trend or a random cluster. The bottom line, though, is clear already: increasing centralization means that when outages occur they are further-reaching and disruptive in unpredictable ways. This trend can only continue, even if the outages themselves become rarer.
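To make the arithmetic concrete, here is a minimal Python sketch of both ideas: the classic component calculation (total operating time divided by the number of failures), and the suggestion of counting only outages that cross impact thresholds. The `Outage` record, the threshold values, and the sample log are all invented for illustration; none of the figures come from real incident data, and the sketch ignores cascading failures entirely.

```python
from dataclasses import dataclass
from datetime import datetime


def mtbf_hours(total_operating_hours: float, failures: int) -> float:
    """Classic MTBF: total operating time divided by number of failures."""
    if failures == 0:
        raise ValueError("no failures observed; MTBF is undefined")
    return total_operating_hours / failures


@dataclass
class Outage:
    start: datetime
    duration_hours: float
    people_affected: int


def counts_as_failure(o: Outage, min_hours: float = 1.0,
                      min_people: int = 1_000_000) -> bool:
    """Hypothetical threshold: long enough AND wide enough to count."""
    return o.duration_hours >= min_hours and o.people_affected >= min_people


def mean_days_between(outages: list[Outage]) -> float:
    """Average gap in days between consecutive qualifying outages."""
    starts = sorted(o.start for o in outages if counts_as_failure(o))
    if len(starts) < 2:
        raise ValueError("need at least two qualifying outages")
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(starts, starts[1:])]
    return sum(gaps) / len(gaps)


# 100 drives run for 1,000 hours each, with 4 failures observed.
print(mtbf_hours(100 * 1000, 4))  # 25000.0

# Made-up outage log; the second entry is too short to count.
log = [
    Outage(datetime(2025, 1, 1), 3.0, 5_000_000),
    Outage(datetime(2025, 1, 11), 0.2, 2_000_000),
    Outage(datetime(2025, 1, 21), 2.0, 8_000_000),
    Outage(datetime(2025, 1, 31), 6.0, 1_500_000),
]
print(mean_days_between(log))  # 15.0
```

Move the thresholds and the answer changes, which is the point: the number depends on what you decide counts as a failure.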

Most of us have no control over the infrastructure decisions sites and services make, or even any real way to know what they are. We can counter this to some extent by diversifying our own dependencies.

In the first decade or two of the Internet, we could always revert to older ways of doing things. Increasingly, this is impossible because either those older methods have been turned off or because technology has taken us places where the old ways didn’t go. We need to focus a lot more on making the new systems robust, or face a future as hostages.

Illustrations: Traffic jam in New York’s Herald Square, 1973 (via Wikimedia).

Also this week
– At the Plutopia podcast, we interview Jennifer Granick, the Surveillance and Cybersecurity counsel at the ACLU, about the expansion of government and corporate surveillance and the increasing threat to civil liberties.
– At Skeptical Inquirer, I interview juggling mathematician Colin Wright about spreading enthusiasm for mathematics.

Wendy M. Grossman is an award-winning journalist. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon or Bluesky.

It’s always DNS…

Years ago, someone in tech support at Telewest, then the cable supplier for southwest London, told me that if my broadband went out I should hope its television service went down too: the volume of complaints would get it fixed much faster. You could see this in action some years later, in 2017, when Amazon Web Services went down, taking with it Netflix. Until that moment few had realized that Netflix built its streaming service on Amazon’s cloud computing platform to take advantage of its flexibility in up- and down-sizing infrastructure. The source – an engineer’s typing error – was quickly traced and fixed, and later I was told the incident led Netflix to diversify its suppliers. You would think!

Even so, Netflix was one of the companies affected on Monday, when a DNS error took out a chunk of AWS, and people from gamers on Roblox to governments with mission-critical dependencies were affected. On the list of the affected are both the expected (Alexa and Ring) and the unexpected (Apple TV, Snapchat, Hulu, Google, Fortnite, Lyft, T-Mobile, Verizon, Venmo, Zoom, and the New York Times). To that add the UK government. At the Guardian, Simon Goodley says the UK government has awarded AWS £1.7 billion in contracts across 35 public sector authorities, despite warnings from the Treasury, the Financial Conduct Authority, and the Prudential Regulation Authority. Among the AWS-dependent: the Home Office, the Department for Work and Pensions, HM Revenue and Customs, and the Cabinet Office.

First, to explain the mistake – so common that experts said “It’s always DNS” and so old that early Internet pioneers said “We shouldn’t be having DNS errors any more”. The Domain Name System, conceived in 1983 by Paul Mockapetris, is a core piece of how the Internet routes traffic. When you type or click on a domain name such as “pelicancrossing.net”, behind the scenes a computer translates that name into a series of dotted numbers that identify the request’s destination. An error in those numbers, no matter how small, means the message – data, search request, email, whatever – can’t reach its destination, just as you can’t reach the recipient you want if you get a telephone number wrong. The upshot of all that is that DNS errors snarl traffic. In the AWS case, the error affected just one of its 30 regions, which is why Monday’s outages were patchy.
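For the curious, the lookup can be sketched in a few lines of Python. `socket.gethostbyname` is the standard library call that asks the system’s resolver to translate a name into an IPv4 address; the `fake_lookup` table, its `.test` names, and the addresses in it are invented so the sketch runs without a network connection. It is a toy to show why a wrong or missing record makes a destination unreachable, not a description of how AWS’s systems work.

```python
import ipaddress
import socket

# Invented records for illustration; .test names and 192.0.2.x addresses
# are reserved for documentation and examples.
FAKE_RECORDS = {
    "shop.test": "192.0.2.10",   # a correct record
    "typo.test": "192.0.2.999",  # one bad number: not a valid address
}


def fake_lookup(name: str) -> str:
    """Stand-in for a real resolver, so the sketch runs offline."""
    try:
        return FAKE_RECORDS[name]
    except KeyError:
        raise socket.gaierror(f"no record for {name}")


def resolve(name: str, lookup=socket.gethostbyname):
    """Return the address for a name, or None if the lookup fails."""
    try:
        address = lookup(name)
        ipaddress.ip_address(address)  # raises ValueError if malformed
        return address
    except (OSError, ValueError):
        return None


print(resolve("shop.test", fake_lookup))     # 192.0.2.10
print(resolve("typo.test", fake_lookup))     # None: the numbers are wrong
print(resolve("missing.test", fake_lookup))  # None: no record at all
```

Calling `resolve("pelicancrossing.net")` without the second argument would use the real resolver instead, which needs a working network connection.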

As Dan Milmo and Graeme Wearden write at the Guardian, the outage has focused many minds on the need to diversify cloud computing. Together, Amazon (30%), Microsoft Azure (20%), and Google (13%) control 63% of the market worldwide. There have been many such warnings.

At The Register, Carly Page reports on the individual level: smart homes turned dumb. Eight Sleep beds stuck in an upright position and lost their temperature controls. App-controlled litter boxes stopped communicating. “Smart” light bulbs stayed dark. The Internet of Other People’s Things at its finest.

Also at The Register, Corey Quinn suggests the DNS error was ultimately attributable to an ongoing exodus of senior AWS engineers who took with them essential institutional memory. Once you’ve reached a certain level of scale, Quinn writes, every problem is complex and being able to remember that a similar issue on a previous occasion was traced improbably to a different system in a corner somewhere can be crucial. As departures continue, Quinn believes failures like these will become more common.

If that global picture is dispiriting, consider also the question of dependence within organizations; if your country depends on a single company’s infrastructure to power mission-critical systems, the diversity in the rest of the world won’t help you if that single company goes out. In the UK, Sam Trendall reports at Public Technology, the government activated incident-response mechanisms. Notable among the failures as prime minister Keir Starmer pushes for a mandatory digital ID: the government’s new One Login, as well as some UK banks. This outage provides evidence for the digital sovereignty many have been advocating.

I admit to mixed feelings. I agree with the many who believe the public sector should embrace digital sovereignty…but I also know that the UK government has a terrible record of failed IT projects, no matter who builds them. In 2010, fixing that was part of the motivation for setting up the Government Digital Service, as first GDS leader Mike Bracken writes at Public Digital. Yet the failures keep coming; see also the Post Office Horizon scandal. Bracken believes the solution is to invest in public sector capacity and digital expertise in order to end this litany of expensive failures.

At TechRadar, Benedict Collins rounds up further expert commentary, largely in agreement about the lessons we should learn. But will we? We should have learned in 2017.

Still, it would be a mistake to focus solely on Amazon. It is just one of many centralized points of failure. The Internet Archive is dangerously important as a unique resource for archived web pages. And the UK is not the only government flying at high risk. Consider South Korea, where a few weeks ago a data center fire may have consumed 85TB of government data – with no backups. It seems we never really learn.

Illustrations: Traffic jam in New York’s Herald Square, 1973 (via Wikimedia).


Email to Ofgem

So, the US has claimed victory against the UK.

Regular readers may recall that in February the UK’s Home Office secretly asked Apple to put a backdoor in the Advanced Data Protection encryption it offers as a feature for iCloud users. In March, Apple challenged the order. The US objected to the requirement that the backdoor should apply to all users worldwide. How dare the Home Office demand the ability to spy on Americans?

On Tuesday, US director of national intelligence Tulsi Gabbard announced the UK is dropping its demand for the backdoor in Apple’s encryption “that would have enabled access to the protected encrypted data of American citizens”. The key here is “American citizens”. The announcement – which the Home Office is refusing to comment on – ignores everyone else and also the requirement for secrecy. It’s safe to say that few other countries would succeed in pressuring the UK in this way.

As Bill Goodwin reports at Computer Weekly, the US deal does nothing to change the situation for people in Britain or elsewhere. The Investigatory Powers Act (2016) is unchanged. As Parmy Olson writes at Bloomberg, the Home Office can go on issuing Technical Capability Notices to Apple and other companies demanding information on their users, and the criminalization of disclosure will keep the companies silent. The Home Office can still order technology companies operating in the UK to weaken their security. And we will not know they’ve done it. Surprisingly, support for this point of view comes from the Federal Trade Commission, which has posted a letter to companies deploring foreign anti-encryption policy (ignoring how often undermining encryption has been US policy, too) and foreign censorship of Americans’ speech. This is far from over, even in the US.

Within the UK, the situation remains as dangerously uncertain as ever. With all countries interconnected, the UK’s policy risks the security of everyone everywhere. And, although US media may have forgotten, the US has long spied on its citizens by getting another country to do it.

Apple has remained silent, but so far has not withdrawn its legal challenge. Also continuing is the case filed by Privacy International, Liberty, and two individuals. In a recent update, PI says both legal cases will be heard over seven days in 2026, as much as possible in the open.

***

For non-UK folk: The Office of Gas and Electricity Markets (Ofgem) is the regulator for Britain’s energy market. Its job is to protect consumers.

To Ofgem:

Today’s Guardian (and many others) carries the news that Tesla EMEA has filed an application to supply British homes and businesses with energy.

Please do not approve this application.

I am a journalist who has covered the Internet and computer industries for 35 years. As we all know, Tesla is owned by Elon Musk. Quite apart from his controversial politics and actions within the US government, Elon Musk has shown himself to be an unstable personality who runs his companies recklessly. Many who have Tesla cars love them – but the cars have higher rates of quality control problems than those from other manufacturers, and Musk’s insistence on marketing the “Full Self-Driving” feature has cost lives according to the US National Highway Traffic Safety Administration, which launched yet another investigation into the company just yesterday. In many cases, when individuals have sought data from Tesla to understand why their relatives died in car fires or crashes, the company has refused to help them. During the covid emergency, thousands of Tesla workers got covid because Musk insisted on reopening the Tesla factory. This is not a company people should trust with their homes.

With Starlink, Musk has exercised his considerable global power by turning off communications in Ukraine while it was fighting back Russian attacks. SpaceX launches continue to crash. According to the children’s commissioner’s latest report, far more children encounter pornography online on Musk’s X than on pornography sites, a problem that has gotten far worse since Musk took it over.

More generally, he is an enemy of workers’ rights. Misinformation on X helped fuel the Southport riots, and Musk himself has considered trying to oust Keir Starmer as prime minister.

Many are understandably awed by his technological ideas. But he uses these to garner government subsidies and undermine public infrastructure, which he then is able to wield as a weapon to suit his latest whims.

Musk is already far too powerful in the world. His actions in the White House have shown he is either unable to understand or entirely uninterested in the concerns and challenges that face people living on sums that to him seem negligible. He is even less interested in – and often actively opposes – social justice, fairness, and equity. No amount of separation between him and Tesla EMEA will be sufficient to counter his control of and influence over his company. Tesla’s board, just weeks ago, voted to award him $30 billion in shares to “energise and focus” him.

Please do not grant him a foothold in Britain’s public infrastructure. Whatever his company is planning, it does not have British interests at heart.

Ofgem is accepting public comments on Tesla’s application until close of business on Friday, August 22, 2025.

Illustration: Artist Dominic Wilcox’s Stained Glass Driverless Sleeper Car.


Drought conditions

At 404 Media, Matthew Gault was first to spot a press release from the UK’s National Drought Group offering a list of things we can do to save water. The meeting makes sense: people think of the UK as a rainy country, but an increasing number of parts of the UK are experiencing extraordinarily dry weather. This “green and pleasant England” is brown.

Last on the Group’s list of things we can do to save water at home: “Delete old emails and pictures as data centres require vast amounts of water to cool their systems.”

I had to look up the National Drought Group. Says Water Magazine: “The National Drought Group includes the Met[eorology] Office, government, regulators, water companies, farmers, the [Canal and River Trust], angling groups and conservation experts. With further warm, dry weather expected, the NDG will continue to meet regularly to coordinate the national response and safeguard water supplies for people, agriculture, and the environment.”

For those outside the UK: its ten water companies are particularly unpopular just now. Created by privatization during Margaret Thatcher’s decade as prime minister, six are being sued for £500 million for “underreporting sewage spills”. Others are being sued for overcharging 35 million household water customers. As just one example, Thames Water will raise prices by 35% over the next three years (on top of other recent rises), and expects customers to pay £7.5 billion for a new reservoir in Oxfordshire. It already has £17 billion in debt, and this week we learned environment secretary Steve Reed has made contingency plans in case the company goes bust. As George Monbiot writes at the Guardian, money that should have been invested in infrastructure went instead to shareholders. Climate change is a factor, sure, but so is poor water management.

All this being the case, the impact consumers can have by doing even the most effective things is dwarfed by the water companies’ failures. Deleting emails is not one of the most effective things.

At his The Weird Turn Pro Substack, Andy Masley provides some useful comparisons. Basic conclusion: you’d have to delete billions of emails to equal the savings of fixing your leaking toilet (if you have one). The whole thing reminds me of a while back when everyone was being told to save electricity by unplugging everything to extinguish all those standby lights. Last year, Which pointed out that the savings are really, really small.

The bizarre idea of deleting emails is coming, at least in part, from a government that is proposing a raft of technology-related legislation and wants, in the next five to ten years, to mastermind all sorts of IT projects, from making AI pervasive throughout government to bringing in a digital ID card. Are they thinking about the data centers they’ll need and the impact they’ll have on water management? Maybe instead tell people not to use generative AI or mine cryptocurrencies?

This much is true: data centers are a problem across the world because they require extreme amounts of water for cooling. In recent examples: at the New York Times, Eli Tan visits the US state of Georgia. At Rest of World, last year Ushar Daniele and Khadija Alam predicted upcoming water shortages in Malaysia, and Claudia Urquieta and Daniela Dib found protests in Chile, where 28 new data centers are planned.

Telling people to delete emails and pictures is just embarrassing – and sad, if people actually do it and sacrifice personal history they care about. As Masley writes, “Major governments should really know better than this.”

***

Two weeks ago we noted the arrival of age verification in the UK. Related, on May 8 the Wikimedia Foundation announced it had filed a legal challenge to the categorization provisions of the Online Safety Act (not the Act itself). The basic problem: there is little in the Act to distinguish between Wikipedia, a crowd-edited provider of highly curated information, and Facebook…or X.

The Foundation says nearly 260,000 volunteers worldwide in 300 languages contribute to Wikipedia. I do myself, but verified or not, I’m in no danger. Many are contributing factual information in countries where the facts offend an authoritarian government intent on shutting them up. The Foundation argues that 1) Wikipedia is “one of the world’s most trusted and widely used digital public goods”; 2) it is at risk of being placed in the highest-risk category because of its size and interactive structure; 3) being so categorized would force it to verify the identity of contributors, placing many at risk; 4) categorization could endanger the existence of tools the site uses to combat harmful content; 5) “criminal anonymous abuse”, which is what the Category 1 duty is supposed to help solve, isn’t a problem Wikipedia has. Instead, identifying volunteers is more likely to expose them to it.

So bad news: on August 11, the High Court of Justice dismissed the case.

The better news is that Justice Jeremy Johnson warned that if Ofcom does place Wikipedia in Category 1, it would have to be justifiable as proportionate. The judge also acknowledged the testimony of a user identified as “BLN”, who provided evidence of the extensive threats editors can face.

No one claims Wikipedia is perfect. But it remains an extraordinary collaborative achievement and a public good. It would be a horrifying consequence if legislation intended to protect children deprived them of it.

Illustrations: Kew Green, August 2025.


Magic math balls

So many ironies, so little time. According to the Financial Times (and syndicated at Ars Technica), the US government, which itself has traditionally demanded law enforcement access to encrypted messages and data, is pushing the UK to drop its demand that Apple weaken its encryption. Normally, you want to say, Look here, countries are entitled to have their own laws whether the US likes it or not. But this is not a law we like!

This all began in February, when the Washington Post reported that the UK’s Home Office had issued Apple with a Technical Capability Notice. Issued under the Investigatory Powers Act (2016) and supposed to be kept secret, the TCN demanded that Apple undermine the end-to-end encryption used for iCloud’s Advanced Data Protection feature. Much protest ensued, followed by two legal cases in front of the Investigatory Powers Tribunal, one brought by Apple, the other by Privacy International and Liberty. WhatsApp has joined Apple’s legal challenge.

Meanwhile, Apple withdrew ADP in the UK. Some people argued this didn’t really matter, as few used it, which I’d call a failure of user experience design rather than an indication that people didn’t care about it. More of us saw it as setting a dangerous precedent for both encryption and the use of secret notices undermining cybersecurity.

The secrecy of TCNs is clearly wrong and presents a moral hazard for governments that may prefer to keep vulnerabilities secret so they can take advantage for surveillance purposes. Hopefully, the Tribunal will eventually agree and force a change in the law. The Foundation for Information Policy Research (obDisclosure: I’m a FIPR board member) has published a statement explaining the issues.

According to the Financial Times, the US government is applying a sufficiently potent threat of tariffs to lead the UK government to mull how to back down. Even without that particular threat, it’s not clear how much the UK can resist. As Angus Hanton documented last year in the book Vassal State, the US has many well-established ways of exerting its influence here. And the vectors are growing; Keir Starmer’s Labour government seems intent on embedding US technology and companies into the heart of government infrastructure despite the obvious and increasing risks of doing so. When I read Hanton’s book earlier this year, I thought remaining in the EU might have provided some protection, but Caroline Donnelly warns at Computer Weekly that they, too, are becoming dangerously dependent on US technology, specifically Microsoft.

It’s tempting to blame everything on the present administration, but the reality is that the US has long used trade policy and treaties to push other countries into adopting laws regardless of their citizens’ preferences.

***

As if things couldn’t get any more surreal, this week the Trump administration *also* issued an executive order banning “woke AI” in the federal government. AI models are in future supposed to be “politically neutral”. So, as Kevin Roose writes at the New York Times, the culture wars are coming for AI.

The US president is accusing chatbots of “Marxist lunacy”, where the rest of the world calls them inaccurate, biased toward repeating and expanding historical prejudices, and inconsistent. We hear plenty about chatbots adopting Nazi tropes; I haven’t heard of one promoting workers’ and migrants’ rights.

If we know one thing about AI models it’s that they’re full of crap all the way down. The big problem is that people are deploying them anyway. At the Canary, Steve Topple reports that the UK’s Department for Work and Pensions admits in a newly-published report that its algorithm for assessing whether benefit claimants might commit fraud is ageist and racist. A helpful executive order would set must-meet standards for *accuracy*. But we do not live in those times.

The Guardian reports that two more Trump EOs expedite building new data centers, promote exports of American AI models, expand the use of AI in the federal government, and intend to solidify US dominance in the field. Oh, and Trump would really like it if people would stop calling it “artificial” and find a new name. Seven years ago, “aspirational intelligence” seemed like a good idea. But that was back when we heard a lot about incorporating ethics. So…“magic math ball”?

These days, development seems to proceed ethics-free. DWP’s report, for example, advocates retraining its flawed algorithm but says continuing to operate it is “reasonable and proportionate”. In 2021, for European Digital Rights Initiative, Agathe Balayn and Seda Gürses found, “Debiasing locates the problems and solutions in algorithmic inputs and outputs, shifting political problems into the domain of design, dominated by commercial actors.” In other words, no matter what you think is “neutral”, training data, model, and algorithms are only as “neutral” as their wider context allows them to be.

Meanwhile, nothing to curb the escalating waste. At 404 Media, Emanuel Maiberg finds that Spotify is publishing AI-generated songs from dead artists without anyone’s permission. On Monday, MSNBC’s Rachel Maddow told viewers that there’s so much “AI slop” about her that they’ve posted Is That Really Rachel? to catalog and debunk it.

As Ed Zitron writes, the opportunity costs are enormous.

In the UK, the US, and many other places, data centers are threatening the water supply.

But sure, let’s make more of that.

Illustrations: Magic 8 ball toy (via frankieleon at Wikimedia).


The Skype of it all

This week, Microsoft shuttered Skype. For a lot of people, it’s a sad but nostalgic moment. Sad, because for older Internet users it brings back memories of the connections it facilitated; nostalgic because hardly anyone seemed to be using it any more. As Chris Stokel-Walker wrote at Wired in 2021, somehow when covid arrived and poured accelerant on remote communications, everyone turned to Zoom instead. Stokel-Walker blamed the Microsoft team for lacking focus on the bit that mattered most: keeping the video link up and stable. Zoom had better video, true, but also far better usability in terms of getting people to calls.

Skype’s service – technically, VoIP, for Voice over Internet Protocol – was pioneering in its time, which arguably peaked around 2010. Like CompuServe before it and Twitter since, there was a period when everyone had their Skype ID on their business cards. In 2005, when eBay bought it for $1.3 billion, it was being widely copied. In 2009, when eBay sold it to an investor group, it was valued at $2.75 billion.

In 2011, Microsoft bought it for $8.5 billion in cash, to general puzzlement as to *why* and why for *so much*. I thought eBay would somehow embed it into its transaction infrastructure as it had Paypal, which it had bought in 2002 for $1.5 billion (and then in 2014 spun off as a public company). Similarly, Wired talked of Microsoft embedding it into its Xbox Live network. Instead, the company fiddled with the app in the general shift from desktop to mobile. Ironic, given that Skype was a *phone* app. If it struggled like Facebook did to make the change, it’s kind of embarrassing.

Forgotten in all this is the fact that although Skype was the first VoIP application to gain mainstream acceptance, it was not the first to connect phone calls over the Internet. That was the long-forgotten Free World Dial-Up project, pioneered by Jeff Pulver. On the ground I imagined Free World Dial-Up as looking something like the switchboard and radio phone that Radar O’Reilly (Gary Burghoff) used in the TV series M*A*S*H (1972-1983) to patch phone calls transmitted via radio networks. As Pulver described it, calls were sent across the Internet between servers, each connected to a box that patched the calls into the local phone system.

Rereading my notes from my 1995 interview with Pulver, when he was just getting his service up and running, it’s astonishing to remember how many hurdles there were for his prototype VoIP project to overcome – and this was all being done by volunteers. In many countries outside North America, charges for local phone calls made it financially risky to run a server. Some countries had prohibitive licensing regulations that made it illegal to offer such a service if you weren’t a telephone company. The hardware and software were readily available but had to be bought and required tinkering to set up. Plus, few outside the business world had continuous high-speed connections; most of us were using modems to dial up a service provider.

Small surprise that those early calls were not great. A Chicago recipient of a test call said she’d had better connections over the traditional phone network to Harare. Network lag made it more like a store-and-forward audio clipping service than a phone call. This didn’t matter as much to people with a history in ham radio, like Pulver himself; they were used to the cognitive effort to understand despite static and dropouts.

On the other hand, international calling was so wildly expensive at the time that, even so, FWD opened up calling for half a million people.

FWD was the experiment that proved the demand and the potential. Soon, numerous companies were setting up to offer VoIP services via desktop applications of varying quality and usability. It was into this hodge-podge that Skype was launched in 2003 from Estonia. For a time, it kept getting better: it began with free calling between Skype users and paid calls to phone lines, and moved on to offering local phone numbers around the world, as Google Voice does now.

Around the early 2000s it was popular to predict that VoIP services would kill off telephone companies. This was a moment when network neutrality, now under threat, was crucial; had telcos been allowed to discriminate against VoIP traffic, we’d all still be paying through the nose for international calling and probably wouldn’t have had video calling during the covid lockdowns.

Instead, the telcos themselves have become VoIP companies. In 2007, BT was the first to announce it was converting its entire network to IP. That process is supposed to complete this year. My landline is already a VoIP line. (Downside: no electricity, no telecommunications.)

Pulver, I find, is still pushing away at the boundaries of telecommunications. His website these days is full of virtualized conversations (vCons) and Supply Chain Integrity, Transparency, and Trust (SCITT), which he explains here (PDF). The first is an IETF proposed standard for AI-enhanced digital records. The second is an IETF proposed framework that intends to define “a set of interoperable building blocks that will allow implementers to build integrity and accountability into software supply chain systems to help assure trustworthy operation”. This is the sort of thing that may make a big difference to companies while being invisible and/or frustrating to most of us.

As for Skype, it will fade from human memory. If it ever comes up, we’ll struggle to explain what it was to a generation who have no idea that calling across the world was ever difficult and expensive.

Illustrations: Radar O’Reilly (Gary Burghoff) in the TV series M*A*S*H with his radio telephone setup.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon or Bluesky.

A hole is a hole

We told you so.

By “we” I mean thousands of privacy advocates, human rights activists, technical experts, and information security journalists.

By “so”, I mean: we all said repeatedly over decades that there is no such thing as a magic hole that only “good guys” can use. If you build a supposedly secure system but put in a hole to give the “authorities” access to communications, that hole can and will be exploited by “bad guys” you didn’t want spying on you.

The particular hole Chinese hackers used to spy on the US was created by the Communications Assistance for Law Enforcement Act (1994). CALEA mandates that telecommunications providers design their equipment so that they can wiretap any customer if law enforcement presents a warrant. At TechCrunch, Zack Whittaker recaps much of the history, tracing technology giants’ new emphasis on end-to-end encryption to the 2013 Snowden revelations of the government’s spying on US citizens.

The mid-1990s were a time of profound change for telecommunications: the Internet was arriving, exchanges were converting from analog to digital, and deregulation was providing new competition for legacy telcos. In those pre-broadband years, hundreds of ISPs offered dial-up Internet access. Law enforcement could no longer just call up a single central office to place a wiretap. When CALEA was introduced, critics were clear and prolific; for an in-depth history see Susan Landau and Whit Diffie’s book, Privacy on the Line (originally published 1998, second edition 2007). The net.wars archive includes a compilation of years of related arguments, and at Techdirt, Mike Masnick reviews the decades of law enforcement insistence that they need access to encrypted text. “Lawful access” is the latest term of art.

In the immediate post-9/11 shock, some of those who insisted on the 1990s version of today’s “lawful access” – key escrow – took the opportunity to tell their opponents (us) that the attacks proved we’d been wrong. One such was the just-departed Jack Straw, the home secretary from 1997 to (June) 2001, who blamed BBC Radio Four and “…large parts of the industry, backed by some people who I think will now recognise they were very naive in retrospect”. That comment sparked the first net.wars column. We could now say, “Right back atcha.”

Whatever you call an encryption backdoor, building a hole into communications security was, is, and will always be a dangerous idea, as the Dutch government recently told the EU. Now, we have hard evidence.

***

The time is long gone when people used to be snobbish about Internet addresses (see net.wars-the-book, chapter three). Most of us are therefore unlikely to have thought much about the geekishly popular “.io”. It could be a new-fangled generic top-level domain – but it’s not. We have been reading linguistic meaning into what is in fact a country code. Which is all fine and good, except that the country it belongs to is the Chagos Islands, also known as the British Indian Ocean Territory, which I had never heard of until the British government announced recently that it will hand the islands back to Mauritius (instead of asking the Chagos Islanders what they want…). Gareth Edwards made the connection: when that transfer happens, .io will cease to exist (h/t Charles Arthur’s The Overspill).

Edwards goes on to discuss the messy history of orphaned country code domains: Yugoslavia and the Soviet Union. As a result, ICANN, the naming authority, now has strict rules that mandate termination in such cases. This time, there’s a lot at stake: .io is a favorite among gamers, crypto companies, and many others, some of them substantial businesses. Perhaps a solution – such as setting .io up anew as a gTLD with its domains intact – will be created. But meantime, it’s worth noting that the widely used .tv (Tuvalu), .fm (Federated States of Micronesia), and .ai (Anguilla) are *also* country code domains.

***

The story of what’s going on with Automattic, the owner of the blogging platform WordPress.com, and WP Engine, which provides hosting and other services for businesses using WordPress, is hella confusing. It’s also worrying: WordPress, which is open source content management software overseen by the WordPress Foundation, powers a little over 40% of the Internet’s top ten million websites and more than 60% of sites overall (including this one).

At Heise Online, Kornelius Kindermann offers one of the clearer explanations: Automattic, whose CEO, Matthew Mullenweg, is also a director of the WordPress Foundation and a co-creator of the software, wants WP Engine, which has been taken over by the investment company Silver Lake, to pay “trademark royalties” of 8% to the WordPress Foundation to support the software. WP Engine doesn’t wanna. Kindermann estimates the sum involved at $35 million. After the news of all that broke, 159 employees have announced they are leaving Automattic.

The more important point is that, like the users of the encrypted services governments want to compromise, the owners of .io domains, or, ultimately, the Chagos Islanders themselves, WP Engine’s customers, some of them businesses worth millions, are hostages of uncertainty surrounding the decisions of others. Open source software is supposed to give users greater control. But as always, complexity brings experts and financial opportunities, and once there’s money everyone wants some of it.

Illustrations: View of the Chagos Archipelago taken during ISS Expedition 60 (NASA, via Wikimedia).

A three-hour tour

It should be easy for the UK’s Competition and Markets Authority to shut down the proposed merger of Vodafone and Three, two of the UK’s four major mobile network providers. Remaining as competition post-merger would be EE (owned by BT) and Virgin Media O2 (owned by the Spanish company Telefónica and the US-listed company Liberty Global).

The trade union Unite is correctly calling the likely consequences: higher prices, fewer choices, job losses, and poorer customer service. In response, Vodafone and Three are dangling a shiny object of temptation: investment in building a 5G network.

Well, hogwash. I would say “Don’t do this” even if I weren’t a Three customer (who left Vodafone years ago). Let them agree to collaborate on building a shared network and compete on quality and services, but not merge. See the US broadband market, where prices are high, speeds are low, and frustrated consumers rarely have more than one option, and take heed.

***

It’s a relief to see some sanity arriving around generative AI. As a glance at the archives will show, I’ve never been a fan; last year Jon Crowcroft and I predicted the eventual demise of large language models due to model collapse. Now, David Gray Widder and Mar Hicks warn in a paper that although the generative AI bubble is deflating, its damage will persist: “…carbon can’t be put back in the ground, workers continue to need to fend off AI’s disciplining effects, and the poisonous effect on our information commons will be hard to undo.”

This week offers worked examples. Re disinformation, at The Verge Sarah Jeong describes the change in our relationship with photographs arriving with new smartphones’ ability to fake realistic images. At The Register, Dan Robinson reports that data centers and AI are causing a substantial rise in water use in the US state of Virginia.

As evidence of the deflating bubble, Widder and Hicks cite the recent Goldman Sachs report arguing that generative AI is unlikely ever to pay back its investment.

And yet: to exploit generative AI, companies and governments are reversing or delaying programs to lower carbon emissions. Also alarmingly, Widder and Hicks wonder if generative AI was always meant to fail and its promoters’ only real goals were to scoop up profits and use the inevitability narrative to make generative AI a vector for embedding infrastructural dependencies (for example, on cloud computing).

That outcome doesn’t have to have been a plan – or a conspiracy theory – just as robber barons don’t actually need to conspire in order to serve each other’s interests. It could just as well be a circumstances-led pivot. But companies that have put money into generative AI will want to scrounge whatever return they can get. So the idea that we will be left with infrastructure that’s a poor fit for our actual needs is a disturbing – and entirely possible – outcome.

***

It’s fascinating – and an example of how you never know where new technologies will lead – to learn that people are using DNA testing to prove they qualify for citizenship in other countries such as Ireland, where a single grandparent will get you in. In some cases, such as the children of unmarried Irish women who were transported to England, this use of DNA testing rights historic wrongs. For others, it opens new opportunities such as the right to live in the EU. Unfortunately, it’s easy to imagine that in countries where citizenship by birthright is a talking point for the right wing this type of DNA testing could be mooted as a requirement. I’d like to think that rounding up babies for deportation is beyond even the most bigoted authoritarians, but…

***

The controversial British technology entrepreneur Mike Lynch has died a billionaire’s death; his superyacht sank in a tornado off the coast of Sicily. I interviewed him for Salon in 2000, when he was newly Britain’s first software billionaire. It was the first time I heard of the now-ubiquitous theorem developed by Thomas Bayes, an 18th-century minister and mathematician, and for a long time afterwards I wasn’t certain I’d correctly understood Lynch’s comments about perception and philosophy. This was exacerbated by early experience with his company’s software in 1996, when it was still a consumer desktop search product fronted by an annoying cartoon dog – I thought it unusably slow compared to pre-Google search engines. By 2000, Autonomy had pivoted to enterprise software, which seemed a better fit.

In 2011, Sharon Bertsch McGrayne’s book, The Theory That Would Not Die, explained things more clearly. That year, Lynch hit a business peak by selling Autonomy to Hewlett-Packard for $11 billion. A year later, he left HP, and set up Invoke Capital to invest in companies with fundamental technology ideas that scale.

Soon afterwards, HP wrote down $8.8 billion and accused Lynch of accounting fraud. The last 12 years of his life were spent in courtrooms: first a UK civil case, decided for HP in 2022, which Lynch was appealing, then a fight against extradition, and finally a criminal trial in the US, where former Autonomy CFO Sushovan Hussain had already been sent to jail for five years. Lynch’s fatal yacht trip was to celebrate his acquittal.

Illustrations: A Customs and Border Protection scientist reads a DNA profile to determine the origin of a commodity (via Wikimedia).

Crowdstricken

This time two weeks ago the media were filled with images from airports clogged with travelers unable to depart because of…a software failure. Not a cyberattack, and not, as in 2017, limited to a single airline’s IT systems failure.

The outage wasn’t just in airports: NHS hospitals couldn’t book appointments, the London Stock Exchange news service and UK TV channel Sky News stopped functioning, and much more. It was the biggest computer system outage not caused by an attack to date, a watershed moment like 1988’s Internet worm.

Experienced technology observers quickly predicted: “bungled software update”. There are prior examples aplenty. In February, an AT&T outage lasted more than 12 hours, spanned 50 US states, Puerto Rico, and the US Virgin Islands, and blocked an estimated 25,000 attempted calls to the 911 emergency service. Last week, the Federal Communications Commission attributed the cause to an employee’s addition of a “misconfigured network element” to expand capacity without following the established procedure of peer review. The resulting cascade of failures was an automated response designed to prevent a misconfigured device from propagating. AT&T has put new preventative controls in place, and FCC chair Jessica Rosenworcel said the agency is considering how to increase accountability for failing to follow best practice.

Much of this history is recorded in Peter G. Neumann’s ongoing RISKS Forum mailing list. In 2014, an update Apple issued to fix a flaw in a health app blocked users of its then-new iPhone 6 from connecting. In 2004, a failed modem upgrade knocked Cox Communications subscribers offline. My first direct experience was in the 1990s, when for a day CompuServe UK subscribers had to dial Germany to pick up our email.

In these previous cases, though, the individuals affected had a direct relationship with the screw-up company. What’s exceptional about Crowdstrike is that the directly affected “users” were its 29,000 huge customer businesses. It was those companies’ resulting failures that turned millions of us into hostages to technological misfortune.

What’s more, in those earlier outages only one company and its direct customers were involved, and understanding the problem was relatively simple. In the case of Crowdstrike, it was hard to pinpoint the source of the problem at first because the direct effects were scattered (only Windows PCs awake to receive Crowdstrike updates) and the indirect effects were widespread.

The technical explanation of what happened, simplified, goes like this: Crowdstrike issued an update to its Falcon security software to block malware it spotted exploiting a vulnerability in Windows. The updated Falcon software sparked system crashes as PCs reacted to protect themselves against potential low-level damage (like a circuit breaker in your house tripping to protect your wiring from overload). Crowdstrike realized the error and pushed out a corrected update 79 minutes later. That fixed machines that hadn’t yet installed the faulty update. The machines that had updated in those 79 minutes, however, were stuck in a doom loop, crashing every time they restarted. Hence the need for manual intervention to remove those files in order to reboot successfully.

Microsoft initially estimated that 8.5 million PCs were affected – but that’s probably a wild underestimate as the only machines it could count were those that had crash reporting turned on.

The root cause is still unclear. Crowdstrike has said it found a hole in its Content Validator Tool, which should have caught the flaw. Microsoft is complaining that a 2009 interoperability agreement forced on it by the EU required it to allow Crowdstrike’s software to operate at the very low level on Windows machines that pushed the systems to crash. It’s wrong, however, to blame companies for enabling automated updates; security protection has to respond to new threats in real time.

The first financial estimates are emerging. Delta Airlines estimates the outage, which borked its crew tracking system for a week, cost it $500 million. CEO Ed Bastian told CNN, “They haven’t offered us anything.” Delta has hired lawyer David Boies, whose high-profile history began with leading the successful 1990s US government prosecution of Microsoft, to file its lawsuit.

Delta will need to take a number. Massachusetts-based Plymouth County Retirement Association has already filed a class action suit on behalf of Crowdstrike shareholders in Texas federal court, where Crowdstrike is headquartered, for misrepresenting its software and its capabilities. Crowdstrike says the case lacks merit.

Lawsuits are likely the only way companies will get recompense unless they have insurance to cover supplier-caused system failures. Like all software manufacturers, Crowdstrike has disclaimed all liability in its terms of use.

In a social media post, Federal Trade Commission chair Lina Khan said that, “These incidents reveal how concentration can create fragile systems.”

Well, yes. Technology experts have long warned of the dangers of monocultures that make our world more brittle. The thing is, we’re stuck with them because of scale. There were good reasons why the dozens of early network and operating systems consolidated: it’s simpler and cheaper for hiring, maintenance, and even security. Making our world less brittle will require holding companies – especially those that become significant points of failure – to meet higher standards of professionalism, including product liability for software, and requiring their customers to boost their resilience.

As for Crowdstrike, it is doomed to become that worst of all things for a company: a case study at business schools everywhere.

Illustrations: XKCD’s Dependency comic, altered by Mary Branscombe to reflect Crowdstrike’s reality.

Twenty comedians walk into a bar…

The Internet was, famously, created to withstand a bomb outage. In 1998 Matt Blaze and Steve Bellovin said it, in 2002 it was still true, and it remains true today, after 50 years of development: there are more efficient ways to kill the Internet than dropping a bomb.

Take today. The cybersecurity company Crowdstrike pushed out a buggy update, and half the world is down. Airports, businesses, the NHS appointment booking system, supermarkets, the UK’s train companies, retailers…all showing the Blue Screen of Death. Can we say “central points of failure”? Because there are two: Crowdstrike, whose cybersecurity is widespread, and Microsoft, whose Windows operating system is everywhere.

Note this hasn’t killed the *Internet*. It’s temporarily killed many systems *connected to* the Internet. But if you’re stuck in an airport where nothing’s working and confronted with a sign that says “Cash only” when you only have cards…well, at least you can go online to read the news.

The fix will be slow, because it involves starting the computer in safe mode and manually deleting files. Like Y2K remediation, one computer at a time.

***

Speaking of things that don’t work, three bits from the generative AI bubble. First, last week Goldman Sachs issued a scathing report on generative AI that concluded it is unlikely to ever repay the trillion-odd dollars companies are spending on it, while its energy demands could outstrip available supply. Conclusion: generative AI is a bubble that could nonetheless take a long time to burst.

Second, at 404 Media Emanuel Maiberg reads a report from the Tony Blair Institute that estimates that 40% of tasks performed by public sector workers could be partially automated. Blair himself compares generative AI to the industrial revolution. This comparison is more accurate than he may realize, since the industrial revolution brought climate change, and generative AI pours accelerant on it.

TBI’s estimate conflicts with that provided to Goldman by MIT economist Daron Acemoglu, who believes that AI will impact at most 4.6% of tasks in the next ten years. The source of TBI’s estimate? ChatGPT itself. It’s learned self-promotion from parsing our output?

Finally, in a study presented at ACM FAccT, four DeepMind researchers asked 20 comedians who do live shows and use AI to take part in workshops using large language models to help write jokes. “Most participants felt the LLMs did not succeed as a creativity support tool, by producing bland and biased comedy tropes, akin to ‘cruise ship comedy material from the 1950s, but a bit less racist’.” Last year, Julie Seabaugh at the LA Times interviewed 13 professional comedians and got similar responses. Ahmed Ahmed compared AI-generated comedy to eating processed foods and, crucially, it “lacks timing”.

***

Blair, who spent his 1997-2007 premiership pushing ID cards into law, has also been trying to revive this longheld obsession. Two days after Keir Starmer took office, Blair published a letter in the Sunday Times calling for their return. As has been true throughout the history of ID cards (PDF), every new revival presents them as a solution to a different problem. Blair’s 2024 reason is to control immigration (and keep the far-right Reform party at bay). Previously: prevent benefit fraud, combat terrorism, streamline access to health, education, and other government services (“the entitlement card”), prevent health tourism.

Starmer promptly shot Blair down: “not part of the government’s plans”. This week Alan West, a home office minister 2007-2010 under Gordon Brown, followed up with a letter to the Guardian calling for ID cards because they would “enhance national security in the areas of terrorism, immigration and policing; facilitate access to online government services for the less well-off; help to stop identity theft; and facilitate international travel”.

Neither Blair (born 1953) nor West (born 1948) seems to realize how old and out of touch they sound. Even back then, the “card” was an obvious decoy. Given pervasive online access, a handheld reader, and the database, anyone’s identity could be checked anywhere at any time with no “card” required.

To sound modern they should call for institutionalizing live facial recognition, which is *already happening* by police fiat. Or sprinkle AI bubble on their ID database.

Databases and giant IT projects that failed – like the Post Office scandal – that was the 1990s way! We’ve moved on, even if they haven’t.

***

If you are not a deposed Conservative, Britain this week is like waking up sequentially from a series of nightmares. Yesterday, Keir Starmer definitively ruled out leaving the European Convention on Human Rights – Starmer’s background as a human rights lawyer to the fore. It’s a relief to hear after 14 years of Tory ministers – David Cameron, Boris Johnson, Suella Braverman, Liz Truss, Rishi Sunak – whining that human rights law gets in the way of their heart’s desires. Like: building a DNA database, deporting refugees or sending them to Rwanda, a plan to turn back migrants in boats at sea.

Principles have to be supported in law; under the last government’s Public Order Act 2023 curbing “disruptive protest”, yesterday five Just Stop Oil protesters were jailed for four and five years. Still, for that brief moment it was all The Brotherhood of Man.

Illustrations: Windows’ Blue Screen of Death (via Wikimedia).
