Soap dispensers and Skynet

In the TV series Breaking Bad, the weary ex-cop Mike Ehrmantraut tells meth chemist Walter White : “No more half measures.” The last time he took half measures, the woman he was trying to protect was brutally murdered.

Apparently people like to say there are no dead bodies in privacy (although this is easily countered with ex-CIA director General Michael Hayden’s comment, “We kill people based on metadata”). But, as Woody Hartzog told a Senate committee hearing in September 2023, summarizing work he did with Neil Richards and Ryan Durrie, half measures in AI/privacy legislation are still a bad thing.

A discussion at Privacy Law Scholars last week laid out the problems. Half measures don’t work. They don’t prevent societal harms. They don’t prevent AI from being deployed where it shouldn’t be. And they sap the political will to follow up with anything stronger.

In an article for The Brink, Hartzog said, “To bring AI within the rule of law, lawmakers must go beyond half measures to ensure that AI systems and the actors that deploy them are worthy of our trust,”

He goes on to list examples of half measures: transparency, committing to ethical principles, and mitigating bias. Transparency is good, but doesn’t automatically bring accountability. Ethical principles don’t change business models. And bias mitigation to make a technology nominally fairer may simultaneously make it more dangerous. Think facial recognition: debias the system and improve its accuracy for matching the faces of non-male, non-white people, and then it’s used to target those same people with surveillance.

Or, bias mitigation may have nothing to do with the actual problem, an underlying business model, as Arvind Narayanan, author of the forthcoming book AI Snake Oil, pointed out a few days later at an event convened by the Future of Privacy Forum. In his example, the Washington Post reported in 2019 on the case of an algorithm intended to help hospitals predict which patients will benefit from additional medical care. It turned out to favor white patients. But, Narayanan said, the system’s provider responded to the story by saying that the algorithm’s cost model accurately predicted the costs of additional health care – in other words, the algorithm did exactly what the hospital wanted it to do.

“I think hospitals should be forced to use a different model – but that’s not a technical question, it’s politics.”.

Narayanan also called out auditing (another Hartzog half measure). You can, he said, audit a human resources system to expose patterns in which resumes it flags for interviews and which it drops. But no one ever commissions research modeled on the expensive random controlled testing common in medicine that follows up for five years to see if the system actually picks good employees.

Adding confusion is the fact that “AI” isn’t a single thing. Instead, it’s what someone called a “suitcase term” – that is, a container for many different systems built for many different purposes by many different organizations with many different motives. It is absurd to conflate AGI – the artificial general intelligence of science fiction stories and scientists’ dreams that can surpass and kill us all – with pattern-recognizing software that depends on plundering human-created content and the labeling work of millions of low-paid workers

To digress briefly, some of the AI in that suitcase is getting truly goofy. Yum Brands has announced that its restaurants, which include Taco Bell, Pizza Hut, and KFC, will be “AI-first”. Among Yum’s envisioned uses, the company tells Benj Edwards at Ars Technica, are being able to ask an app what temperature to set the oven. I can’t help suspecting that the real eventual use will be data collection and discriminatory pricing. Stuff like this is why Ed Zitron writes postings like The Rot-Com Bubble, which hypothesizes that the reason Internet services are deteriorating is that technology companies have run out of genuinely innovative things to sell us.

That you cannot solve social problems with technology is a long-held truism, but it seems to be especially true of the messy middle of the AI spectrum, the use cases active now that rarely get the same attention as the far ends of that spectrum.

As Neil Richards put it at PLSC, “The way it’s presented now, it’s either existential risk or a soap dispenser that doesn’t work on brown hands when the real problem is the intermediate level of societal change via AI.”

The PLSC discussion included a list of the ways that regulations fail. Underfunded enforcement. Regulations that are pure theater. The wrong measures. The right goal, but weakly drafted legislation. Make the regulation ambiguous, or base it on principles that are too broad. Choose conflicting half-measures – for example, require transparency but add the principle that people should own their own data.

Like Cristina Caffarra a week earlier at CPDP, Hartzog, Richards, and Durrie favor finding remedies that focus on limiting abuses of power. Full measures include outright bans, the right to bring a private cause of action, imposing duties of “loyalty, care, and confidentiality”, and limiting exploitative data practices within these systems. Curbing abuses of power, as he says, is nothing new. The shiny new technology is a distraction.

Or, as Narayanan put it, “Broken AI is appealing to broken institutions.”

Illustrations: Mike (Jonathan Banks) telling Walt (Bryan Cranston) in Breaking Bad (S03e12) “no more half measures”.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Admiring the problem

In one sense, the EU’s barely dry AI Act and the other complex legislation – the Digital Markets Act, Digital Services Act, GDPR, and so on -= is a triumph. Flawed it may be, but it’s a genuine attempt to protect citizens’ human rights against a technology that is being birthed with numerous trigger warnings. The AI-with-everything program at this year’s Computers, Privacy, and Data Protection, reflected that sense of accomplishment – but also the frustration that comes with knowing that all legislation is flawed, all technology companies try to game the system, and gaps will widen.

CPDP has had these moments before: new legislation always comes with a large dollop of frustration over the opportunities that were missed and the knowledge that newer technologies are already rushing forwards. AI, and the AI Act, more or less swallowed this year’s conference as people considered what it says, how it will play internationally, and the necessary details of implementation and enforcement. Two years at this event, inadequate enforcement of GDPR was a big topic.

The most interesting future gaps that emerged this year: monopoly power, quantum sensing, and spatial computing.

For at least 20 years we’ve been hearing about quantum computing’s potential threat to public key encryption – that day of doom has been ten years away as long as I can remember, just as the Singularity is always 30 years away. In the panel on quantum sensing, Chris Hoofnagle argued that, as he and Simson Garfinkel recently wrote at Lawfare and in their new book, quantum cryptanalysis is overhyped as a threat (although there are many opportunities for quantum computing in chemistry and materials science). However, quantum sensing is here now, works (because qubits are fragile), and is cheap. There is plenty of privacy threat here to go around: quantum sensing will benefit entirely different classes of intelligence, particularly remote, undetectable surveillance.

Hoofnagle and Garfinkel are calling this MASINT, for machine and signature intelligence, and believe that it will become very difficult to hide things, even at a national level. In Hoofnagle’s example, a quantum sensor-equipped drone could fly over the homes of parolees to scan for guns.

Quantum sensing and spatial computing have this in common: they both enable unprecedented passive data collection. VR headsets, for example, collect all sorts of biomechanical data that can be mined more easily for personal information than people expect.

Barring change, all that data will be collected by today’s already-powerful entities.

The deeper level on which all this legislation fails particularly exercised Cristina Caffarra, the co-founder of the Centre for Economic Policy Research in the panel on AI and monopoly, saying that all this legislation is basically nibbling around the edges because they do not touch the real, fundamental problem of the power being amassed by the handful of companies who own the infrastructure.

“It’s economics 101. You can have as much downstream competition as you like but you will never disperse the power upstream.” The reports and other material generated by government agencies like the UK’s Competition and Markets Authority are, she says, just “admiring the problem”.

A day earlier, the Novi Sad professor Vladen Joler had already pointed out the fundamental problem: at the dawn of the Internet anyone could start with nothing and build something; what we’re calling “AI” requires billions in investment, so comes pre-monopolized. Many people dismiss Europe for not having its own homegrown Big Tech, but that overlooks open technologies: the Raspberry Pi, Linux, and the web itself, which all have European origins.

In 2010, the now-departing MP Robert Halfon (Con-Harlow) said at an event on reining in technology companies that only a company the size of Google – not even a government – could create Street View. Legend has it that open source geeks heard that as a challenge, and so we have OpenStreetMap. Caffarra’s fiery anger raises the question: at what point do the infrastructure providers become so entrenched that they could choke off an open source competitor at birth? Caffarra wants to build a digital public interest infrastructure using the gaps where Big Tech doesn’t yet have that control.

The Dutch Groenlinks MEP Kim van Sparrentak offered an explanation for why the AI Act doesn’t address market concentration: “They still dream of a European champion who will rule the world.” An analogy springs to mind: people who vote for tax cuts for billionaires because one day that might be *them*. Meanwhile, the UK’s Competition and Markets Authority finds nothing to investigate in Microsoft’s partnership with the French AI startup Mistral.

Van Sparrentak thinks one way out is through public procurement; adopt goals of privacy and sustainability, and support European companies. It makes sense; as the AI Now Institute’s Amba Kak, noted, at the moment almost everything anyone does digitally has to go through the systems of at least one Big Tech company.

As Sebastiano Toffaletti, head of the secretariat of the European SME Alliance, put it, “Even if you had all the money in the world, these guys still have more data than you. If you don’t and can’t solve it, you won’t have anyone to challenge these companies.”

Illustrations: Vladen Joler shows Anatomy of an AI System, a map he devised with Kate Crawford of the human labor, data, and planetary resources that are extracted to make “AI”.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Microsoft can remember it for you wholesale

A new theory: somewhere in the Silicon Valley universe there’s a cadre of techies who have eidetic memories and they’re feeling them start to slip. Panic time.

That’s my best explanation for Microsoft’s latest wheeze, a new feature for its Copilot assistant that will take what’s variously called a “snapshot” or a “screenshot” of your computer (all three monitors?) every five seconds and store it for future reference. Microsoft hasn’t explained much about Recall’s inner technical workings, but according to the announcement, the data will be stored locally and will be searchable via semantic associations and some sort of “AI”. Microsoft also says the data will not be used to train AI models.

The general anger and dismay at this plan brings back, almost nostalgically, memories of the 1990s, when Microsoft was near-universally hated as the evil monopolist dominating computing. In 2008, when Google was ten years old, a BBC presenter asked me if I thought Google would ever be hated as much as Microsoft was (not then, no). In 2012, veteran journalist Charles Arthur published the book Digital Wars about how Microsoft had stagnated and lost its lead. And then suddenly, in the last few years, it’s back on top.

Possibilities occur that Microsoft doesn’t mention. For example: could software might be embedded into Windows to draw inferences from the data Recall saves? And could those inferences be forwarded to the company or used to target you with ads? That seems like a far more efficient way to invade users’ privacy than copying the data itself, if that’s what the company ultimately wants to do.

Lots of things on our computers already retain a “memory” of what we’ve been doing. Operating systems generate logs to help debug problems. Word processors retain a changelog, which powers the ability to undo mistakes. Web browsers have user-configurable histories; email software has archives; media players retain playlists. All of those are useful – but part of that usefulness is that they are contextual, limited, and either easily terminated by closing the relevant application or relatively easily edited to remove items that shouldn’t be kept.

It’s hard for almost everyone who isn’t Microsoft to understand the point of keeping everything by default. It seems like a feature only developers could love. I certainly would like Windows to be better at searching for stored files or my (Firefox) browser to be better at reloading that article I was reading yesterday. I have even longed for a personal version of Vannevar Bush’s Memex. As part of that, I might welcome a feature that let me hit a button to record the last five useful minutes of a meeting, or save a social media post to a local archive. But the key to that sort of memory expansion is curation, not remembering everything promiscuously. For most people, selective forgetting is how we survive the torrents of irrelevance hurled at us every day.

What Recall sounds most like is the lifelog science fiction writer Charlie Stross imagined in 2007 might be our future. Plummeting storage costs and expanding capacity, he reasoned, would make it possible to store *everything* in your pocket. Even then, there were (a very few) people doing that sort of thing, most notably Steve Mann, a University of Toronto professor who started wearing devices to comprhensively capture his life as a 1990s graduate student. Over the years, Mann has shrunk his personal gadget array from a laptop and peripherals to glasses and pocket devices. Many more people capture their surroundings now – but they do it on their phones. If Apple or Google were proposing a Recall feature for iOS or Android, the idea would seem a lot less weird.

The real issue is that there are many people who would like to be able to know what somone *else* has been doing on their computer at all times. Helicopter parents. Schools and teachers under government compulsion (see for example Prevent (PDF)). Employers. Border guards. Corporate spies. The Department of Work and Pensions. Authoritarian governments. Law enforcement and security agencies. Criminals. Domestic abusers… So developing any feature like this must include considering how to protect it against these threats. This does not appear to have happened.

Many others have written about the privacy issues in all this – the UK’s Information Commission’s Office is already investigating. At The Register, Richard Speed does a particularly good job of looking at some of the fine details. On Mastodon, Kevin Beaumont says inspection of the Copilot+ software suggests that Recall stores the text it extracts from all those snapshots into an easily copiable SQlite database.

But there’s still more. The kind of archive Recall appears to construct can teach an attacker how the target thinks: not just what passwords they choose but how they devise them.Those patterns can be highly valuable. Granted, few targets are worth that level of attention, but it happens, as Peter Davies, a technical director at eThales, has often warned.

Recall is not the only move – see also flawed-AI-with-everything – that suggests that the computer industry, like some politicians and governments, is badly losing touch with the public. Increasingly, what they want to do seems unrelated to what the rest of us want. If they think things like Recall are a good idea they need to read more Philip K. Dick. And then don’t invent the Torment Nexus.

Illustrations: Arnold Schwarzenegger seeking better memories in the 1990 film Total Recall.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon..

The apostrophe apocalypse

It was immediately tempting to view the absence of apostrophes on new street signs in a North Yorkshire town as a real-life example of computer systems crushing human culture. Then, near-simultaneously, Apple launched an ad (which it now regrets) showing just that process, raising the temptation even more. But no.

In fact, as Brandon Vigliarolo writes at The Register, not only is the removal of apostrophes in place names not new in the UK, but it also long precedes computers. The US Board on Geographic Names declared apostrophes unwanted as long ago as its founding year, 1890, apparently to avoid implying possession. This decision by the BGN, which has only made five exceptions in its history, was later embedded in the US’s Geographic Names Information System and British Standard 7666. When computers arrived to power databases, the practice carried on.

All that said, it’s my experience that the older British generation are more resentful of American-derived changes to their traditional language than they are of computer-driven alterations (one such neighbor complains about “sidewalk”). So campaigns to reinstate missing apostrophes seem likely to persist.

Blaming computers seemed like a coherent narrative, not least because new technology often disrupts social customs. Railways brought standardized time, and the desire to simplify things for computers led to the 2023 decision to eliminate leap seconds in 2035 (after 18 years of debate). Instead, the apostrophe apocalypse is a more ordinary story of central administrators preferencing their own convenience over local culture and custom (which may itself be contested). It still seems like people should be allowed to keep their street signs. I mean.

***

Of course language changes over time and usage. The character limits imposed by texting (and therefore exTwitter and other microblogging sites) brought us many abbreviations that are now commonplace in daily life, just as long before that the telegraph’s cost per word spawned its own compressed dialect. A new example popped up recently in Charles Arthur’s The Overspill.

Arthur highlighted an article at Level Up Coding/Medium by Fareed Khan that offered ways to distinguish between human-written and machine-generated text. It turns out that chatbots use distinctively different words than we do. Khan was able to generate a list of about 100 words that may indicate a chatbot has been at work, as well as a web app that can check a block of text or a file in one go. The word “delve” was at the top.

I had missed Khan’s source material, an earlier claim by YCombinator founder Paul Graham that “delve” used in an email pitch is a clear sign of ChatGPT-generated text. At the Guardian, Alex Hern suggests that an underlying cause may be the fact that much of the labeling necessary to train the large language models that power chatbots is carried out by badly paid people in the global South – including Africa, where “delve” is more commonly used than in Western countries.

At the Premium Times, Chiamaka Okafor argues that therefore identifying “delve” as a marker of “robotic text” penalizes African writers. “We are losing sight of an opportunity to rewrite the AI narratives that exclude people in the global majority,” she writes. A reminder: these chatbots are just math and statistics predicting the next word. They will always regress to the mean. And now they’ll penalize us for being different.

***

Just two years ago, researchers fretted that we were running out of “high-quality text” on which to train large language models. We’ve been seeing the results since, as sites hosting user-generated content strike deals with LLM owners, leading to contentious disputes between those owners and sites’ users, who feel betrayed and ripped off. Reddit began by charging for access to its API, then made a deal with Google to use its database of posts for training for an injection of cash that enabled it to go public. Yesterday, Reddit announced a similar deal with OpenAI – and the stock went up. In reality, these deals are asset-stripping a site that has consistently lost money for 18 years.

The latest site to sell its users’ content is the technical site Stack Overflow, Developers who offer mutual aid by answering each other’s questions are exactly the user base you would expect to be most offended by the news that the site’s owner, the investment group Prosus, which bought the site in 2021 for $1.8 billion, has made a deal giving OpenAI access to all its content. And so it proved: developers promptly began altering or removing their posts to protest the deal. Shortly thereafter, the site’s moderators began restoring those posts and suspending the users.

There’s no way this ends well; Internet history’s many such stories never have. The site’s original owners, who created the culture, are gone. The new ones don’t care what users *believe* their rights are if the terms and conditions grant an irrevocable license to everything they post. Inertia makes it hard to build a replacement; alienation thins out the old site. As someone posted to Twitter a few years ago, “On the Internet your home always leaves you.”

‘Twas ever thus. And so it will be until people stop taking the bait in the first place.

Illustrations: Apple’s canceled “crusher” ad.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Review: More Than a Glitch

More Than a Glitch: Confronting Race, Gender, and Ability Bias in Tech
By Meredith Broussard
MIT Press
ISBN: 978-0-262-04765-4

At the beginning of the 1985 movie Brazil, a family’s life is ruined when a fly gets stuck in a typewriter key so that the wrong man is carted away to prison. It’s a visual play on “computer bug”, so named after a moth got trapped in a computer at Harvard.

Based on her recent book More Than a Glitch, NYU associate professor Meredith Broussard, would call both the fly and the moth a “glitch”. In the movie, the error is catastrophic for Buttle-not-Tuttle and his family, but it’s a single, ephemeral mistake that can be prevented with insecticide and cross-checking. A “bug” is more complex and more significant: it’s “substantial”, “a more serious matter that makes software fail”. It “deserves attention”. It’s the difference between the lone rotten apple in a bushel full of good ones and a barrel that causes all the apples put it in to rot.

This distinction is Broussard’s prelude to her fundamental argument that the lack of fairness in computer systems is persistent, endemic, and structural. In the book, she examines numerous computer systems that are already out in the world causing trouble. After explaining the fundamentals of machine bias, she goes through a variety of sectors and applications to examine failures of fairness in each one. In education, proctoring software penalizes darker-skinned students by failing to identify them accurately, and algorithms used to estimate scores on tests canceled during the pandemic penalized exceptional students from unexpected backgrounds. In health, long-practiced “race correction” that derives from slavery preferences white patients for everything from painkillers to kidney transplants – and gets is embedded into new computer systems built to replicate existing practice. If computer developers don’t understand the way in which the world is prejudiced – and they don’t – how can the systems they create be more neutral than the precursors they replace? Broussard delves inside each system to show why, not just how, it doesn’t work as intended.

In other cases Broussard highlights, part of the problem is rigid inflexibility in back-end systems that need to exchange data. There’s little benefit in having 58 gender options if the underlying database only supports two choices. At a doctor’s office, Broussard is told she can only check one box for race; she prefer to check both “black” and “white” because in medical settings it may affect her treatment. The digital world remains only partially accessible. And, as Broussard discovered when she was diagnosed with breast cancer, even supposed AI successes like reading radiology films are overhyped. This section calls back to her 2018 book, Artificial Unintelligence, which did a good job of both explaining how machine learning and “AI” computer systems work and why a lot of the things the industry says work…really don’t (see also self-driving cars).

Broussard concludes by advocating for public interest technology and a rethink. New technology imitates the world it comes from; computers “predict the status quo”. Making change requires engineering technology so that it performs differently. It’s a tall order, and Broussard knows that. But wasn’t that the whole promise the technology founder made? That they could change the world to empower the rest of us?

Intents and purposes

One of the basic principles of data protection law is the requirement for consent for change of use. For example, giving a site a mobile number for two-factor authentication doesn’t entitle it to sell that number to a telemarketing company. Providing a home address to enable package delivery doesn’t also invite ads trying to manipulate my vote in an election. Governments, too, are subject to data protection law, but they have more scope than most to carve out – or simply take – exceptions for themselves.

And so to the UK’s Department of Work and Pensions, whose mission in life is supposed to be to provide people with the financial support the state has promised them, whether that’s welfare or state pensions – overall, about 23 million people. Schools Week reports that Jen Persson at Defend Digital Me has discovered that the DWP has a secret deal with the Department of Education granting it access to the National Pupil Database for the purpose of finding benefit fraud.

“Who knows their family’s personal confidential records are in the haystack used to find the fraudulent needle?” Persson asks.

Every part of this is a mess. First of all, it turns schools into hostile environments for those already at greatest risk. Second, as we saw as long ago as 2010, parents and children have little choice about the data schools collect and keep. The breadth and depth of this data has been expanding long enough to burn out the UK’s first campaigner on children’s privacy rights (Terri Dowty, with Action for Rights of Children), and keep the second (Persson) fully occupied for some years now.

Persson told Schools Week that more than 15 million of the people on the NPD have long since left school. That sounds right; the database was created in 2002, five years into Tony Blair’s database-loving Labour government. In the 2009 report Database State, written under the aegis of the Foundation for Information Policy Research, Ross Anderson, Terri Dowty, Philip Inglesant, William Heath, and Angela Sasse surveyed 46 government databases. They found that a quarter of them were “almost certainly illegal” under human rights or data protection law, and noted that Britain was increasingly centralizing all such data.

“The emphasis on data capture, form-filling, mechanical assessment and profiling damages professional responsibility and alienates the citizen from the state. Over two-thirds of the population no longer trust the government with their personal data,” they wrote then.

The report was published while Blair’s government was trying to implement the ID card enshrined in the 2006 ID Cards Act. This latest in a long string of such proposals following the withdrawal of ID cards after the end of World War II was ultimately squelched when David Cameron’s coalition government took office in 2010. The act was repealed in 2011.

These bits of history are relevant for three reasons: 1) there is no reason to believe that the Labour government everyone expects will win office in the next nine months will be any less keen on dataveillance; 2) tackling benefit fraud was what they claimed they wanted the ID card for in 2006; 3) you really don’t need an ID *card* if you have biometrics and ubiquitous, permanent access online to a comprehensive government database. This was obvious even in 2006, and now we’re seeing it in action.

Dowty often warned that children were used as experimental subjects on which British governments sharpened the policies they intended to expand to the rest of the population. And so it is proving: the use of education data to look for benefit fraud is the opening act for the provision in the Data Protection and Digital Information bill empowering the DWP to demand account data from banks and other financial institutions, again to reduce benefit fraud.

The current government writes, “The new proposals would allow regular checks to be carried out on the bank accounts held by benefit claimants to spot increases in their savings which push them over the benefit eligibility threshold, or when people send [sic] more time overseas than the benefit rules allow for.” The Information Commissioner’s Office has called the measure disproportionate, and says it does not provide sufficient safeguards.

Big Brother Watch, which is campaigning against this proposal, argues that it reverses the fundamental principle of the presumption of innocence. All pervasive “monitoring” does that; you are continuously a suspect except at the specific points where you’ve been checked and found innocent. .

In a commercial context, we’d call the coercion implicit in repurposing data given under compulsion bait and switch. We’d also bear in mind the Guardian’s recent expose: the DWP has been demanding back huge sums of money from carers who’ve made minor mistakes in reporting their income. As BBW also wrote, even a tiny false positive rate will give the DWP hundreds of thousands of innocent people to harass.

Thirty years ago, when I was first learning about the dangers of rampant data collection, it occurred to me that the only way you can ensure that data can’t be leaked, exploited, or used maliciously is to not collect in the first place. This isn’t a choice anyone can make now. But there are alternatives that reverse the trend toward centralization that Anderson et. al identified in 2009.

Illustrations: Haystacks at a Moldovan village (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Review: A History of Fake Things on the Internet

A History of Fakes on the Internet
By Walter J. Scheirer
Stanford University Press
ISBN 2023017876

One of Agatha Christie’s richest sources of plots was the uncertainty of identity in England’s post-war social disruption. Before then, she tells us, anyone arriving to take up residence in a village brought a letter of introduction; afterwards, old-time residents had to take newcomers at their own valuation. Had she lived into the 21st century, the arriving Internet would have given her whole new levels of uncertainty to play with.

In his recent book A History of Fake Things on the Internet, University of Notre Dame professor Walter J. Scheirer describes creating and detecting online fakes as an ongoing arms race. Where many people project doomishly that we will soon lose the ability to distinguish fakery from reality, Scheirer is more optimistic. “We’ve had functional policies in the past; there is no good reason we can’t have them again,” he concludes, adding that to make this happen we need a better understanding of the media that support the fakes.

I have a lot of sympathy with this view; as I wrote recently, things that fool people when a medium is new are instantly recognizable as fake once they become experienced. We adapt. No one now would be fooled by the images that looked real in the early days of photography. Our perceptions become more sophisticated, and we learn to examine context. Early fakes often work simply because we don’t know yet that such fakes are possible. Once we do know, we exercise much greater caution before believing. Teens who’ve grown up applying filters to the photos and videos they upload to Instagram and TikTok, see images very differently than those of us who grew up with TV and film.

Schierer begins his story with the hacker counterculture that saw computers as a source of subversive opportunities. His own research into media forensics began with Photoshop. At the time, many, especially in the military, worried that nation-states would fake content in order to deceive and manipulate. What they found, in much greater volume, was memes and what Schierer calls “participatory fakery” – that is, the cultural outpouring of fakes for entertainment and self-expression, most of it harmless. Further chapters consider cheat codes in games, the slow conversion of hackers into security practitioners, adversarial algorithms and media forensics, shock-content sites, and generative AI.

Through it all, Schierer remains optimistic that the world we’re moving into “looks pretty good”. Yes, we are discovering hundreds of scientific papers with faked data, faked results, or faked images, but we also have new analysis tools to use to detect them and Retraction Watch to catalogue them. The same new tools that empower malicious people enable many more positive uses for storytelling, collaboration, and communication. Perhaps forgetting that the computer industry relentlessly ignores its own history, he writes that we should learn from the past and react to the present.

The mention of scientific papers raises an issue Schierer seems not to worry about: waste. Every retracted paper represents lost resources – public funding, scientists’ time and effort, and the same multiplied into the future for anyone who attempts to build on that paper. Figuring out how to automate reliable detection of chatbot-generated text does nothing to lessen the vast energy, water, and human resources that go into building and maintaining all those data centers and training models (see also filtering spam). Like Scheirer, I’m largely optimistic about our ability to adapt to a more slippery virtual reality. But the amount of wasted resources is depressing and, given climate change, dangerous.

Deja news

At the first event organized by the University of West London group Women Into Cybersecurity, a questioner asked how the debates around the Internet have changed since I wrote the original 1997 book net.wars..

Not much, I said. Some chapters have dated, but the main topics are constants: censorship, freedom of speech, child safety, copyright, access to information, digital divide, privacy, hacking, cybersecurity, and always, always, *always* access to encryption. Around 2010, there was a major change when the technology platforms became big enough to protect their users and business models by opposing government intrusion. That year Google launched the first version of its annual transparency report, for example. More recently, there’s been another shift: these companies have engorged to the point where they need not care much about their users or fear regulatory fines – the stage Ed Zitron calls the rot economy and Cory Doctorow dubs enshittification.

This is the landscape against which we’re gearing up for (yet) another round of recursion. April 25 saw the passage of amendments to the UK’s Investigatory Powers Act (2016). These are particularly charmless, as they expand the circumstances under which law enforcement can demand access to Internet Connection Records, allow the government to require “exceptional lawful access” (read: backdoored encryption) and require technology companies to get permission before issuing security updates. As Mark Nottingham blogs, no one should have this much power. In any event, the amendments reanimate bulk data surveillance and backdoored encryption.

Also winding through Parliament is the Data Protection and Digital Information bill. The IPA amendments threaten national security by demanding the power to weaken protective measures; the data bill threatens to undermine the adequacy decision under which the UK’s data protection law is deemed to meet the requirements of the EU’s General Data Protection Regulation. Experts have already put that adequacy at risk. If this government proceeds, as it gives every indication of doing, the next, presumably Labour, government may find itself awash in an economic catastrophe as British businesses become persona-non-data to their European counterparts.

The Open Rights Group warns that the data bill makes it easier for government, private companies, and political organizations to exploit our personal data while weakening subject access rights, accountability, and other safeguards. ORG is particularly concerned about the impact on elections, as the bill expands the range of actors who are allowed to process personal data revealing political opinions on a new “democratic engagement activities” basis.

If that weren’t enough, another amendment also gives the Department of Work and Pensions the power to monitor all bank accounts that receive payments, including the state pension – to reduce overpayments and other types of fraud, of course. And any bank account connected to those accounts, such as landlords, carers, parents, and partners. At Computer Weekly, Bill Goodwin suggests that the upshot could be to deter landlords from renting to anyone receiving state benefits or entitlements. The idea is that banks will use criteria we can’t access to flag up accounts for the DWP to inspect more closely, and over the mass of 20 million accounts there will be plenty of mistakes to go around. Safe prediction: there will be horror stories of people denied benefits without warning.

And in the EU… Techcrunch reports that the European Commission (always more surveillance-happy and less human rights-friendly than the European Parliament) is still pursuing its proposal to require messaging platforms to scan private communications for child sexual abuse material. Let’s do the math of truly large numbers: billions of messages, even a teeny-tiny percentage of inaccuracy, literally millions of false positives! On Thursday, a group of scientists and researchers sent an open letter pointing out exactly this. Automated detection technologies perform poorly, innocent images may occur in clusters (as when a parent sends photos to a doctor), and such a scheme requires weakening encryption, and in any case, better to focus on eliminating child abuse (taking CSAM along with it).

Finally, age verification, which has been pending in the UK ever since at least 2016, is becoming a worldwide obsession. At least eight US states and the EU have laws mandating age checks, and the Age Verification Providers Association is pushing to make the Internet “age-aware persistently”. Last month, the BSI convened a global summit to kick off the work of developing a worldwide standard. These moves are the latest push against online privacy; age checks will be applied to *everyone*, and while they could be designed to respect privacy and anonymity, the most likely is that they won’t be. In 2022, the French data protection regulator, CNIL, found that current age verification methods are both intrusive and easily circumvented. In the US, Casey Newton is watching a Texas case about access to online pornography and age verification that threatens to challenge First Amendment precedent in the Supreme Court.

Because the debates are so familiar – the arguments rarely change – it’s easy to overlook how profoundly all this could change the Internet. An age-aware Internet where all web use is identified and encrypted messaging services have shut down rather than compromise their users and every action is suspicious until judged harmless…those are the stakes.

Illustrations: Angel sensibly smashes the ring that makes vampires impervious (in Angel, “In the Dark” (S01e03)).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon

Selective enforcement

This week, as a rider to the 21st Century Peace Through Strength Act, which provides funding for defense in Ukraine, Israel, and Taiwan, the US Congress passed provisions for banning the distribution of TikTok if owner ByteDance has not divested it within 270 days. President Joe Biden signed it into law on Wednesday, and, as Mike Masnick says at Techdirt, ByteDance’s lawsuit is imminently expected, largely on First Amendment grounds. ACLU agrees. Similar arguments won when ByteDance challenged a 2023 Montana law.

For context: Pew Research says TikTok is the fifth-most popular social media service in the US. An estimated 150 million Americans – and 62% of 18-29-year-olds – use it.

The ban may not be a slam-dunk to fail in court. US law, including the constitution, includes many restrictions on foreign influence, from requiring registration for those acting as agents to requiring presidents to have been born US citizens. Until 2017, foreigners were barred from owning US broadcast networks.

So it seems to this non-lawyer as though a lot hinges on how the court defines TikTok and what precedents apply. This is the kind of debate that goes back to the dawn of the Internet: is a privately-owned service built of user-generated content more like a town square, a broadcaster, a publisher, or a local pub? “Broadcast”, whether over the air or via cable, implies being assigned a channel on a limited resource; this clearly doesn’t apply to apps and services carried over the presumably-infinite Internet. Publishing implies editorial control, which social media lacks. A local pub might be closest: privately owned, it’s where people go to connect with each other. “Congress may make no law…abridging the freedom of speech”…but does that cover denying access to one “place” where speech takes place when there are many other options?

TikTok is already banned in Pakistan, Nepal, and Afghanistan, and also India, where it is one of 500 apps that have been banned since 2020. ByteDance will argue that the ban hurts US creators who use TikTok to build businesses. But as NPR reports, in India YouTube and Instagram rolled out short video features to fill the gap for hyperlocal content that the loss of TikTok opened up, and four years on creators have adapted to other outlets.

It will be more interesting if ByteDance claims the company itself has free speech rights. In a country where commercial companies and other organizations are deemed to have “free speech” rights entitling them to donate as much money as they want to political causes (as per the Supreme Court’s ruling in Citizens United v. Federal Election Commission), that might make a reasonable argument.

On the other hand, there is no question that this legislation is full of double standards. If another country sought to ban any of the US-based social media, American outrage would be deafening. If the issue is protecting the privacy of Americans against rampant data collection, then, as Free Press argues, pass a privacy law that will protect Americans from *every* service, not just this one. The claim that the ban is to protect national security is weakened by the fact that the Chinese government, like apparently everyone else, can buy data on US citizens even if it’s blocked from collecting it directly from ByteDance.

Similarly, if the issue is the belief that social media inevitably causes harm to teenagers, as author and NYU professor Jonathan Haidt insists in his new book, then again, why only pick on TikTok? Experts who have really studied this terrain, such as Danah Boyd and others, insist that Haidt is oversimplifying and pushing parents to deny their children access to technologies whose influence is largely positive. I’m inclined to agree; between growing economic hardship, expanding wars, and increasing climate disasters young people have more important things to be anxious about than social media. In any case, where’s the evidence that TikTok is a bigger source of harm than any other social medium?

Among digital rights activists, the most purely emotional argument against the TikTok ban revolves around the original idea of the Internet as an open network. Banning access to a service in one country (especially the country that did the most to promote the Internet as a vector for free speech and democratic values) is, in this view, a dangerous step toward the government control John Perry Barlow famously rejected in 1996. And yet, to increasing indifference, no-go signs are all over the Internet. *Six* years after GDPR came into force, Europeans are still blocked from many US media sites that can’t be bothered to comply with it. Many other media links don’t work because of copyright restrictions, and on and on.

The final double standard is this: a big element in the TikTok ban is the fear that the Chinese government, via its control over companies hosted there, will have access to intimate personal information about Americans. Yet for more than 20 years this has been the reality for non-Americans using US technology services outside the US: their data is subject to NSA surveillance. This, and the lack of redress for non-Americans, is what Max Schrems’ legal cases have been about. Do as we say, not as we do?

Illustrations: TikTok CEO Shou Zi Chew, at the European Commission in 2024 (by Lukasz Kobus at Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Irrevocable

One of the biggest advances in computing in my lifetime is the “Undo” button. Younger people will have no idea of this, but at one time if you accidentally deleted the piece you’d spent hours typing into your computer, it was just…gone forever.

This week, UK media reported on what seems to be an unusual but not unique case: a solicitor accidentally opened the wrong client’s divorce case on her computer screen and went on to apply for a final decree for the couple concerned. The court granted the divorce in a standardly automated 21 minutes, even though the specified couple had not yet agreed on a financial settlement. Despite acknowledging the error, the court now refuses to overturn the decree. UK lawyers of my acquaintance say that this obvious unfairness may be because granting the final decree sets in motion other processes that are difficult to reverse.

That triggers a memory of the time I accidentally clicked on “cancel” instead of “check in” on a flight reservation, and casually, routinely, clicked again to confirm. I then watched in horror as the airline website canceled the flight. The undo button in this case was to phone customer service. Minutes later, they reinstated the reservation and thereafter I checked in without incident. Undone!

Until the next day, when I arrived in the US and my name wasn’t on the manifest. The one time I couldn’t find my boarding pass… After a not-long wait that seeemd endless in a secondary holding area (which I used to text people to tell them where I was just in case) I explained the rogue cancellation and was let go. Whew! (And yes, I know: citizen, white, female privilege.)

“Ease of use” should include making it hard to make irrecoverable mistakes. And maybe a grace period before automated processes cascade.

The Guardian quotes family court division head Sir Andrew McFarlane explaining that the solicitor’s error was not easy to make: “Like many similar online processes, an operator may only get to the final screen where the final click of the mouse is made after traveling through a series of earlier screens,” Huh? If you think you have opened the right case, then those are the screens you would expect to see. Why wouldn’t you go ahead?

At the Law Gazette, John Hyde reports that the well-known law firm in question, Vardag, is backing the young lawyer who made the error, describing it as a “slip up with the drop down menu” on “the new divorce portal”, noting that similar errors had happened “a few times” and felt like a design error.

“Design errors” can do a lot of damage. Take paying a business or person via online banking. In the UK, until recently, you entered account name, number, and sort code, and confirmed to send. If you made a mistake, tough. If the account information was sent by a scammer instead of the recipient you thought, tough. It was only in 2020 that most banks began participating in “Confirmation of payee”, which verifies the account with the receiving bank and checks with you that the name is correct. In 2020, Which? estimated that confirming payee could have saved £320 million in bank transfer fraud since 2017.

Similarly, while many more important factors caused the Horizon scandal, software design played its part: subpostmasters could not review past transactions as they could on paper.

Many computerized processes are blocked unless precursor requirements have been completed and checked for compliance. A legally binding system seems like it similarly ought to incorporate checks to ensure that all necessary steps had been completed.

Arguably, software design is failing users. In ecommerce, user-hostile software design is deceptive, or “dark”, patterns, user interfaces built deliberately to manipulate users into buying/spending more than they intended. The clutter that makes Amazon unusable directs shoppers to its house brands.

User interface design is where I began writing about computers circa 1990. Windows 3 was new, and the industry was just discovering that continued growth depended on reaching past those who *liked* software to be difficult. I vividly recall being told by a usability person at then-market leader Lotus about the first time her company’s programmers watched ordinary people using their software. First one fails to complete task. “Well, that’s a stupid person.” Second one. “Well, that’s a stupid person, too.” Third one. “Where do you find these people?” But after watching a couple more, they got it.

In the law firm’s case, the designers likely said, “This system is just for expert users”. True, but what they’re expert in is law, not software. Hopefully the software will now be redesigned to reflect the rule that it should be as easy as possible to do the work but as hard as possible to make unrecoverable mistakes (the tolerance principle). It’s a simple idea that goes all the way back to Donald Norman’s classic 1988. book The Design of Everyday Things.

At a guess, if today’s “AI” automation systems become part of standard office work making mistakes will become easier rather than harder, partly because it makes systems more inscrutable. In addition, the systems being digitized are increasingly complex with more significant consequences reaching deep into people’s lives, and intended to serve the commissioning corporations’ short-term desires. It will not be paranoid to believe the world is stacked against us.

Illustrations: Cary Grant and Rosalind Russell as temporarily divorced newspapermen in His Girl Friday (1944).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.