Review: A History of Fake Things on the Internet

A History of Fakes on the Internet
By Walter J. Scheirer
Stanford University Press
ISBN 2023017876

One of Agatha Christie’s richest sources of plots was the uncertainty of identity in England’s post-war social disruption. Before then, she tells us, anyone arriving to take up residence in a village brought a letter of introduction; afterwards, old-time residents had to take newcomers at their own valuation. Had she lived into the 21st century, the arriving Internet would have given her whole new levels of uncertainty to play with.

In his recent book A History of Fake Things on the Internet, University of Notre Dame professor Walter J. Scheirer describes creating and detecting online fakes as an ongoing arms race. Where many people project doomishly that we will soon lose the ability to distinguish fakery from reality, Scheirer is more optimistic. “We’ve had functional policies in the past; there is no good reason we can’t have them again,” he concludes, adding that to make this happen we need a better understanding of the media that support the fakes.

I have a lot of sympathy with this view; as I wrote recently, things that fool people when a medium is new are instantly recognizable as fake once they become experienced. We adapt. No one now would be fooled by the images that looked real in the early days of photography. Our perceptions become more sophisticated, and we learn to examine context. Early fakes often work simply because we don’t know yet that such fakes are possible. Once we do know, we exercise much greater caution before believing. Teens who’ve grown up applying filters to the photos and videos they upload to Instagram and TikTok, see images very differently than those of us who grew up with TV and film.

Schierer begins his story with the hacker counterculture that saw computers as a source of subversive opportunities. His own research into media forensics began with Photoshop. At the time, many, especially in the military, worried that nation-states would fake content in order to deceive and manipulate. What they found, in much greater volume, was memes and what Schierer calls “participatory fakery” – that is, the cultural outpouring of fakes for entertainment and self-expression, most of it harmless. Further chapters consider cheat codes in games, the slow conversion of hackers into security practitioners, adversarial algorithms and media forensics, shock-content sites, and generative AI.

Through it all, Schierer remains optimistic that the world we’re moving into “looks pretty good”. Yes, we are discovering hundreds of scientific papers with faked data, faked results, or faked images, but we also have new analysis tools to use to detect them and Retraction Watch to catalogue them. The same new tools that empower malicious people enable many more positive uses for storytelling, collaboration, and communication. Perhaps forgetting that the computer industry relentlessly ignores its own history, he writes that we should learn from the past and react to the present.

The mention of scientific papers raises an issue Schierer seems not to worry about: waste. Every retracted paper represents lost resources – public funding, scientists’ time and effort, and the same multiplied into the future for anyone who attempts to build on that paper. Figuring out how to automate reliable detection of chatbot-generated text does nothing to lessen the vast energy, water, and human resources that go into building and maintaining all those data centers and training models (see also filtering spam). Like Scheirer, I’m largely optimistic about our ability to adapt to a more slippery virtual reality. But the amount of wasted resources is depressing and, given climate change, dangerous.

Deja news

At the first event organized by the University of West London group Women Into Cybersecurity, a questioner asked how the debates around the Internet have changed since I wrote the original 1997 book net.wars..

Not much, I said. Some chapters have dated, but the main topics are constants: censorship, freedom of speech, child safety, copyright, access to information, digital divide, privacy, hacking, cybersecurity, and always, always, *always* access to encryption. Around 2010, there was a major change when the technology platforms became big enough to protect their users and business models by opposing government intrusion. That year Google launched the first version of its annual transparency report, for example. More recently, there’s been another shift: these companies have engorged to the point where they need not care much about their users or fear regulatory fines – the stage Ed Zitron calls the rot economy and Cory Doctorow dubs enshittification.

This is the landscape against which we’re gearing up for (yet) another round of recursion. April 25 saw the passage of amendments to the UK’s Investigatory Powers Act (2016). These are particularly charmless, as they expand the circumstances under which law enforcement can demand access to Internet Connection Records, allow the government to require “exceptional lawful access” (read: backdoored encryption) and require technology companies to get permission before issuing security updates. As Mark Nottingham blogs, no one should have this much power. In any event, the amendments reanimate bulk data surveillance and backdoored encryption.

Also winding through Parliament is the Data Protection and Digital Information bill. The IPA amendments threaten national security by demanding the power to weaken protective measures; the data bill threatens to undermine the adequacy decision under which the UK’s data protection law is deemed to meet the requirements of the EU’s General Data Protection Regulation. Experts have already put that adequacy at risk. If this government proceeds, as it gives every indication of doing, the next, presumably Labour, government may find itself awash in an economic catastrophe as British businesses become persona-non-data to their European counterparts.

The Open Rights Group warns that the data bill makes it easier for government, private companies, and political organizations to exploit our personal data while weakening subject access rights, accountability, and other safeguards. ORG is particularly concerned about the impact on elections, as the bill expands the range of actors who are allowed to process personal data revealing political opinions on a new “democratic engagement activities” basis.

If that weren’t enough, another amendment also gives the Department of Work and Pensions the power to monitor all bank accounts that receive payments, including the state pension – to reduce overpayments and other types of fraud, of course. And any bank account connected to those accounts, such as landlords, carers, parents, and partners. At Computer Weekly, Bill Goodwin suggests that the upshot could be to deter landlords from renting to anyone receiving state benefits or entitlements. The idea is that banks will use criteria we can’t access to flag up accounts for the DWP to inspect more closely, and over the mass of 20 million accounts there will be plenty of mistakes to go around. Safe prediction: there will be horror stories of people denied benefits without warning.

And in the EU… Techcrunch reports that the European Commission (always more surveillance-happy and less human rights-friendly than the European Parliament) is still pursuing its proposal to require messaging platforms to scan private communications for child sexual abuse material. Let’s do the math of truly large numbers: billions of messages, even a teeny-tiny percentage of inaccuracy, literally millions of false positives! On Thursday, a group of scientists and researchers sent an open letter pointing out exactly this. Automated detection technologies perform poorly, innocent images may occur in clusters (as when a parent sends photos to a doctor), and such a scheme requires weakening encryption, and in any case, better to focus on eliminating child abuse (taking CSAM along with it).

Finally, age verification, which has been pending in the UK ever since at least 2016, is becoming a worldwide obsession. At least eight US states and the EU have laws mandating age checks, and the Age Verification Providers Association is pushing to make the Internet “age-aware persistently”. Last month, the BSI convened a global summit to kick off the work of developing a worldwide standard. These moves are the latest push against online privacy; age checks will be applied to *everyone*, and while they could be designed to respect privacy and anonymity, the most likely is that they won’t be. In 2022, the French data protection regulator, CNIL, found that current age verification methods are both intrusive and easily circumvented. In the US, Casey Newton is watching a Texas case about access to online pornography and age verification that threatens to challenge First Amendment precedent in the Supreme Court.

Because the debates are so familiar – the arguments rarely change – it’s easy to overlook how profoundly all this could change the Internet. An age-aware Internet where all web use is identified and encrypted messaging services have shut down rather than compromise their users and every action is suspicious until judged harmless…those are the stakes.

Illustrations: Angel sensibly smashes the ring that makes vampires impervious (in Angel, “In the Dark” (S01e03)).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon

Core values

Follow the money; follow the incentives.

Cybersecurity is an intractable problem for many of the same reasons climate change is: often the people paying the cost are not the people who derive the benefits. The foundation of the Workshop on the Economics of Information Security is often traced to the 2001 paper Why Information Security is Hard, by the late Ross Anderson. There were earlier hints, most notably in the 1999 paper Users Are Not the Enemy by Angela Sasse and Anne Adams.

Anderson’s paper directly examined and highlighted the influence of incentives on security behavior. Sasse’s paper was ostensibly about password policies and the need to consider human factors in designing them. But hidden underneath was the fact that the company department that called her in was not the IT team or the help desk team but accounting. Help desk costs to support users who forgot their passwords were rising so fast they threatened to swamp the company.

At the 23rd WEIS, held this week in Dallas (see also 2020), papers studied questions like which values drive people’s decisions when hit by ransomware attacks (Zinaida Benenson); whether the psychological phenomenon of delay discounting could be used to understand the security choices people make (Einar Snekkenes); and whether a labeling scheme would help get people to pay for security (L Jean Camp).

The latter study found that if you keep the label simple, people will actually pay for security. It’s a seemingly small but important point: throughout the history of personal computing, security competes with so many other imperatives that it’s rarely a factor in purchasing decisions. Among those other imperatives: cost, convenience, compatibility with others, and ease of use. But also: it remains near-impossible to evaluate how secure a product or provider is. Only the largest companies are in a position to ask detailed questions of cloud providers, for example,

Or, in an example provided by Chitra Marti, rare is the patient who can choose a hospital based on the security arrangements it has in place to protect its data. Marti asked a question I haven’t seen before: what is the role of market concentration in cybersecurity? To get at this, Marti looked at the decade’s experience of electronic medical records in hospitals since the big post-2008 recession push to digitize. Since 2010, more than 150 million records have been breached.

Of course, monoculture is a known problem in cybersecurity as it is in agriculture: if every machine runs the same software all machines are vulnerable to the same attacks. Similarly, the downsides of monopoly – poorer service, higher prices, lower quality – are well known. Marti’s study tying the two together found correlations in the software hospitals run and rarely change, even after a breach, though they do adopt new security measures. Hospitals choose software vendors for all sorts of reasons such as popularity, widspread use in their locality, or market leadership. The difficulty of deciding to change may be exacerbated by positive benefits to their existing choice that would be lost and outweigh the negatives.

These broader incentives help explain, as Richard Clayton set out, why distributed denial of service attacks remain so intractable. A key problem is “reflectors”, which amplify attacks by using spoofed IP addresses to send requests where the size of the response will dwarf the request. With this technique, a modest amount of outgoing traffic lands a flood on the chosen target (the one whose IP address has been spoofed). Fixing infrastructure to prevent these reflectors is tedious and only prevents damage to others. Plus, the provider involved may have to sacrifice the money they are paid to carry the traffic. For reasons like these, over the years the size of DDoS attacks has grown until only the largest anti-DDoS providers can cope with them. These realities are also why the early effort to push providers to fix their systems – RFC 2267 – failed. The incentives, in classic WEIS terms, are misaligned.

Clayton was able to use the traffic data he was already collecting to create a short list of the largest reflected amplified DDoS attacks each week and post it on a private Slack channel so providers could inspect their logs to trace it back to the source

At this point a surprising thing happened: the effort made a difference. Reflected amplified attacks dropped noticeably. The reasons, he and Ben Collier argue in their paper, have to do with the social connections among network engineers, the most senior of whom helped connect the early Internet and have decades-old personal relationships with their peers that have been sustained through forums such as NANOG and M3AAWG. This social capital and shared set of values kicked in when Clayton’s action lists moved the problem from abuse teams into the purview of network engineer s. Individual engineers began racing ahead; Amazon recently highlighted AWS engineer Tom Scholl’s work tracing back traffic and getting attacks stopped.

Clayton concluded by proposing “infrastructural capital” to cover the mix of human relationships and the position in the infrastructure that makes them matter. It’s a reminder that underneath those giant technology companies there still lurks the older ethos on which the Internet was founded, and humans whose incentives are entirely different from profit-making. And also: that sometimes intractable problems can be made less intractable.

Illustrations: WEIS waits for the eclipse.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Game of carrots

The big news of the week has been the result of the Epic Games v. Google antitrust trial. A California jury took four hours to agree with Epic that Google had illegally tied together its Play Store and billing service, so that app makers could only use the Play Store to distribute their apps if they also used Google’s service for billing, giving Google a 30% commission. Sort of like, I own half the roads in this town, and if you want to sell anything to my road users you have to have a store in my mall and pay me a third of your sales revenue, and if you don’t like it, tough, because you can’t reach my road users any other way. Meanwhile, the owner of the other half of the town’s roads is doing exactly the same thing, so you can’t win.

At his BIG Substack, antitrust specialist Matt Stoller, who has been following the trial closely, gloats, “the breakup of Big Tech begins”. Maybe not so fast: Epic lost its similar case against Apple. Both of these cases are subject to appeal. Stoller suggests, however, that the latest judgment will carry more weight because it came from a jury of ordinary citizens rather than, as in the Apple case, a single judge. Stoller believes the precedent set by a jury trial is harder to ignore in future cases.

At The Verge, Sean Hollister, who has been covering the trial in detail, offers a summary of 20 key points he felt the trial established. Written before the verdict, Hollister’s assessment of Epic’s chances proved correct.

Even if the judgment is upheld in the higher courts, it will be a while before users see any effects. But: even if the judgment is overturned in the higher courts, my guess is that the technology companies will begin to change their behavior at least a bit, in self-defense. The real question is, what changes will benefit us, the people whose lives are increasingly dominated by these phones?

I personally would like it to be much easier to use an Android phone without ever creating a Google account, and to be confident that the phone isn’t sending masses of tracking data to either Google or the phone’s manufacturer.

But…I would still like to be able to download the apps I want from a source I can trust. I care less about who provides the source than I do about what data they collect about me and the cost.

I want that source to be easy to access, easy to use, and well-stocked, defining “well-stocked” as “has the apps I want” (which, granted, is a short list). The nearest analogy that springs to mind is TV channels. You don’t really care what channel the show you want to watch is on; you just want to be able to watch the show without too much hassle. If there weren’t so many rights holders running their own streaming services, the most sensible business logic would be for every show to be on every service. Then instead of competing on their catalogues, the services would be competing on privacy, or interface design, or price. Why shouldn’t we have independent app stores like that?

Mobile phones have always been more tightly controlled than the world of desktop computing, largely because they grew out of the tightly controlled telecommunications world. Desktop computing, like the Internet, served first the needs of the military and academic research, and they remain largely open even when they’re made by the same companies who make mobile phone operating systems. Desktop systems also developed at a time when American antitrust law still sought to increase competition.

It did not stay that way. As current FTC chair Lina Khan made her name pointing out in 2017, antitrust thinking for the last several decades has been limited to measuring consumer prices. The last big US antitrust case to focus on market effects was Microsoft, back in 1995. In the years since, it’s been left to the EU to act as the world’s antitrust enforcer. Against Google, the EU has filed three cases since 2010: over Shopping (Google was found guilty in 2017 and fined €2.4 billion, upheld on appeal in 2021); Android, over Google apps and the Play Store (Google was found guilty in 2018 and fined €4.3 billion and required to change some of its practices); and AdSense (fined €1.49 billion in 2019). But fines – even if the billions eventually add up to real money – don’t matter enough to companies with revenues the size of Google’s. Being ordered to restructure its app store might.

At the New York Times, Steve Lohr compares the Microsoft and Epic v Google cases. Microsoft used its contracts with PC makers to prevent them from preinstalling its main web browser rival, Netscape, in order to own users’ path into the accelerating digital economy. Google’s contracts instead paid Apple, Samsung, Mozilla, and others to favor it on their systems – “carrots instead of sticks,” NYU law professor Harry First told Lohr.

The best thing about all this is that the Epic jury was not dazzled by the incomprehensibility effect of new technology. Principles are coming back into focus. Tying – leveraging your control over one market in order to dominate another – is no different if you say it in app stores than if you say it in gas stations or movie theaters.

Illustrations: “The kind of anti-trust legislation that is needed”, by J.S. Pughe (via Library of Congress).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon

Planned incompatibility

My first portable music player was a monoaural Sony cassette player a little bigger than a deck of cards. I think it was intended for office use as a dictation machine, but I hauled it to folk clubs and recorded the songs I liked, and used it to listen to music while in transit. Circa 1977, I was the only one on most planes.

At the time, each portable device had its own charger with its own electrical specification and plug type. Some manufacturers saw this as an opportunity, and released so-called “universal” chargers that came with an array of the most common plugs and user-adjustable settings so you could match the original amps and volts. Sony reacted by ensuring that each new generation had a new plug that wasn’t included on the universal chargers…which would then copy it….which would push Sony to come up with yet another new plug And so on. All in the name of consumer safety, of course.

Sony’s modern equivalent (which of course includes Sony itself) doesn’t need to invent new plugs because more sophisticated methods are available. They can instead insert a computer chip that the main device checks to ensure the part is “genuine”. If the check fails, as it might if you’ve bought your replacement part from a Chinese seller on eBay, the device refuses to let the new part function. This is how Hewlett-Packard has ensured that its inkjet printers won’t work with third-party cartridges, it’s one way that Apple has hobbled third-party repair services, and it’s how, as this week’s news tells us, the PS5 will check its optonal disc drives.

Except the PS5 has a twist: in order to authenticate the drive the PS5 has to use an Internet connection to contact Sony’s server. I suppose it’s better than John Deere farm equipment, which, Cory Doctorow writes in his new book, The Internet Con: How to Seize the Means of Computation, requires a technician to drive out to a remote farm and type in a code before the new part will work while the farmer waits impatiently. But not by much, if you’re stuck somewhere offline.

“It’s likely that this is a security measure in order to ensure that the disc drive is a legitimate one and not a third party,” Video Gamer speculates. Checking the “legitimacy” of an optional add-on is not what I’d call “security”; in general it’s purely for the purpose of making it hard for customers to buy third-party add-ons (a goal the article does nod at later). Like other forms of digital rights management, the nuisance all accrues to the customer and the benefits, such as they are, accrue only to the manufacturer.

As Doctorow writes, part-pairing, as this practice is known, originated with cars (for this reason, it’s also often known as “VIN” locking, from vehicle information number), brought in to reducee the motivation to steal cars in order to strip them and sell their parts (which *is* security). The technology sector has embraced and extended this to bolster the Gilette business model: sell inkjet printers cheap and charge higher-than-champagne prices for ink. Apple, Doctorow writes, has used this approach to block repairs in order to sustain new phone sales – good for Apple, but wasteful for the environment and expensive for us. The most appalling of his examples, though, is wheelchairs, which are “VIN-locked and can’t be serviced by a local repair shop”, and medical devices. Making on-location repairs impossible in these cases is evil.

The PS5, though, compounds part-pairing by requiring an Internet connection, a trend that really needs not to catch on. As hundreds of Tesla drivers discovered the hard way during an app server outage it’s risky to presume those connections will always be there when you need them. Over the last couple of decades, we’ve come to accept that software is not a purchase but a subscription service subject to license. Now, hardware is going the same way, as seemed logical from the late-1990s moment when MIT’s Neil Gershenfeld proposed Things That Think. Back then, I imagined the idea applying to everyday household items, not devices that keep our bodies functioning. This oncoming future is truly dangerous, as Andrea Matwyshyn has been pointing out..

For Doctorow, the solution is to mandate and enforce interoperability as well as other regulations such as antitrust law. The right to repair laws that are appearing inany jurisdictions (and which companies like Apple and John Deere have historically opposed). Requiring interoperability would force companies to enable – or at least not to hinder – third-party repairs.

But more than that is going to be needed if we are to avoid a future in which every piece of our personal infrastructures is turned into a subscription service. At The Register, Richard Speed reminds that Microsoft will end support for Windows 10 in 2025, potentially leaving 400 million PCs stranded. We have seen this before.

I’m not sure anyone in government circles is really thinking about the implications for an aging population. My generation still owns things; you can’t delete my library of paper books or charge me for each reread. But today’s younger generation, for whom everything is a rental…what will they do at retirement age, when income drops but nothing gets cheaper in a world where everything stops working the minute you stop paying? If we don’t force change now, this will be their future.

Illustrations: A John Deere tractor.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon

The documented life

For various reasons, this week I asked my GP for printed verification of my latest covid booster. They handed me what appears to be a printout of the entire history of my interactions with the practice back to 1997.

I have to say, reading it was a shock. I expected them to have kept records of tests ordered and the results. I didn’t think about them keeping everything I said on the website’s triage form, which they ask you to use when requesting an appointment, treatment, or whatever. Nor did I expect notes beginning “Pt dropped in to ask…”

The record doesn’t, however, show all details of all conversations I’ve had with everyone in the practice. It notes medical interactions, like noting a conversation in which I was advised about various vaccinations. It doesn’t mention that on first acquaintance with the GP to whom I’m assigned I asked her about her attitudes toward medical privacy and alternative treatments such as acupuncture. “Are you interviewing me?” she asked. A little bit, yes.

There are also bits that are wrong or outdated.

I think if you wanted a way to make the privacy case, showing people what’s in modern medical records would go a long way. That said, one of the key problems in current approaches to the issues surrounding mass data collection is that everything is siloed in people’s minds. It’s rare for individuals to look at a medical record and connect it to the habit of mind that continues to produce Google, Meta, Amazon, and an ecosystem of data brokers that keeps getting bigger no matter how many data protection laws we pass. Medical records hit a nerve in an intimate way that purchase histories mostly don’t. Getting the broad mainstream to see the overall picture, where everything connects into giant, highly detailed dossiers on all of us, is hard.

And it shouldn’t be. Because it should be obvious by now that what used to be considered a paranoid view has a lot of reality. Governments aren’t highly motivated to curb commercial companies’ data collecction because that all represents data that can be subpoenaed without the risk of exciting a public debate or having to justify a budget. In the abstract, I don’t care that much who knows what about me. Seeing the data on a printout, though, invites imagining a hostile stranger reading it. Today, that potentially hostile stranger is just some other branch of the NHS, probably someone looking for clues in providing me with medical care. Five or twenty years from now…who knows?

More to the point, who knows what people will think is normal? Thirty years ago, “normal” meant being horrified at the idea of cameras watching everywhere. It meant fingerprints were only taken from criminal suspects. And, to be fair, it meant that governments could intercept people’s phone calls by making a deal with just one legacy giant telephone company (but a lot of people didn’t fully realize that). Today’s kids are growing up thinking of constantly being tracked as normal, I’d like to think that we’re reaching a turning point where what Big Tech and other monopolists have tried to convince is is normal is thoroughly rejected. It’s been a long wait.

I think the real shock in looking at records like this is seeing yourself through someone else’s notes. This is very like the moment in the documentary Erasing David, when the David of the title gets his phone book-sized records from a variety of companies. “What was I angry about on November 2006?” he muses, staring at the note of a moment he had long forgotten but the company hadn’t. I was relieved to see there were no such comments. On the other hand, also missing were a couple of things I distinctly remember asking them to write down.

But don’t get me wrong: I am grateful that someone is keeping these notes besides me. I have medical records! For the first 40 years of my life, doctors routinely refused to show patients any of their medical records. Even when I was leaving the US to move overseas in 1981, my then-doctor refused to give me copies, saying, “There’s nothing there that would be any use to you.” I took that to mean there were things he didn’t want me to see. Or he didn’t want to take the trouble to read through and see that there weren’t. So I have no record of early vaccinations or anything else from those years. At some point I made another attempt and was told the records had been destroyed after seven years. Given that background, the insousiance with which the receptionist printed off a dozen pages of my history and handed it over was a stunning advance in patient rights.

For the last 30-plus years, therefore, I’ve kept my own notes. There isn’t, after checking, anything in the official record that I don’t have. There may, of course, be other notes they don’t share with patients.

Whether for purposes malign (surveillance, control) or benign (service), undocumented lives are increasingly rare. In an ideal world, there’d be a way for me and the medical practice to collaborate to reconcile discrepancies and rectify omissions. The notion of patients controlling their own data is still far from acceptance. That requires a whole new level of trust.

Illustrations: Asclepius, god of medieine, exhibited in the Museum of Epidaurus Theatre (Michael F. Mehnert via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon

The end of cool

For a good bit of this year’s We Robot, it felt like abstract “AI” – that is, algorithms running on computers with no mobility – had swallowed the robots whose future this conference was invented to think about. This despite a pre-conference visit to Boston Dynamics, which showed off its Atlas
‘s ability to do gymnastics. It’s cute, but is it useful? Your washing machine is smarter, and its intelligence solves real problems like how to use less water.

There’s always some uncertainty about boundaries at this event: is a machine learning decision making system a robot? At the inaugural We Robot in 2012, the engineer Bill Smart summed up the difference: “My iPhone can’t stab me in my bed.” Of course, neither could an early Roomba, which most would agree was the first domestic robot. However, it was also dumb as a floor tile, achieving cleanliness through random repetition rather than intelligent mapping. In the Roomba 1.0 sense, a “robot” is “a device that does boring things so I don’t have to”. Not cool, but useful, and solves a real problem

During a session in which participants played a game designed to highlight the conflicts inherent in designing an urban drone delivery system, Lael Odhner offered yet another definition: “A robot is a literary device we use to voice our discomfort with technology.” In the context of an event where participants think through the challenges robots bring to law and policy, this may be the closest approximation.

In the design exercise, our table’s three choices were: fund the FAA (so they can devise and enforce rules and policies), build it as a municipally-owned public service both companies and individuals can use as customers, and ban advertising on the drones for reasons of both safety and offensiveness. A similar exercise last year produced more specific rules, but also led us to realize that a drone delivery service had no benefits over current delivery services.

Much depends on scale. One reason we chose a municipal public service was the scale of noise and environmental impact inevitably generated by multiple competing commercial services. In a paper, Woody Hartzog examined the meaning of “scale”: is scale *more*, or is scale *different*? You can argue, as net.wars often has, that scale *creates* difference, but it’s rarely clear where to place the threshold, or how reaching it changes a technology’s harms or who it makes vulnerable. Ryan Calo and Daniella DiPaola suggested that rather than associate vulnerability with particular classes of people we should see it as variable with circumstances: “Everyone is vulnerable sometimes, and vulnerability is a state that can be created and manipulated toward particular ends.” This seems a more logical and fairer approach.

An aspect of this is that there are two types of rules: harm rules, which empower institutions to limit harm, and power rules, which empower individuals to protect themselves. A possible worked example soon presented itself in Kegan J Strawn;s and Daniel Sokol‘s paper on safety techniques in mobile robots, which suggested copying medical ethics’ consent approach. Then someone described the street scene in which every pedestrian had to give consent to every passing experimental Tesla, a possibly an even worse scenario than ad-bearing delivery drones. Pedestrians get nothing out of the situation, and Teslas don’t become safer. What you really want is for car companies not to test the safety of autonomous vehicles on public roads with pedestrians as unwitting crash test dummies.

I try to think every year how our ideas about inegrating robots into society are changing over time. An unusual paper from Maria P. Angel considered this question with respect to privacy scholarship by surveying 1990s writing and 20 years of papers presented at Privacy Law Scholars. We Robot co-founders Calo, Michael Froomkin, and Ian Kerr partly copied its design. Angel’s conclusion is roughly that the 1990s saw calls for an end to self-regulation while the 2000s moved from privacy as necessary for individual autonomy and self-determination to collective benefits and most recently to its importance for human flourishing.

As Hartzog commented, he came to the first We Robot with the belief that “Robots are magic”, only to encounter Smart’s “really fancy hammers.” And, Smart and Cindy Grimm added in 2018, controlled by sensors that are “late, noisy, and wrong”. Hartzog’s early excitement was shared by many of us; the future looked so *interesting* when it was almost entirely imaginary.

Over time, the robotic future has become more nowish, and has shifted in response to technological development; the discussion has become more about real systems (2022) than imagined future ones. The arrival of real robots on our streets – for example, San Francisco’s 2017 use of security robots to deter homeless camps – changed parts of the discussion from theoretical to practical.

In the mid-2010s, much discussion focused on problems of fairness, especially to humans in the loop, who, Madeleine Claire Elish correctly predicted in 2016 would be blamed for failures. More recently, the proliferation of data-gathering devices (sensors, cameras) into everything from truckers’ cabs to agriculture and the arrival of new algorithmic systems dubbed AI has raised awareness of the companies behind these technologies. And, latterly, that often the technology diverts attention from the better possibilities of structural change.

But that’s not as cool.

Illustrations: Boston Dynamics’ Atlas robots doing synchronized backflips (via YouTube).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Small data

Shortly before this gets posted, Jon Crowcroft and I will have presented this year’s offering at Gikii, the weird little conference that crosses law, media, technology, and pop culture. This is what we will possibly may have said, as I understand it, with some added explanation for the slightly less technical audience I imagine will read this.

Two years ago, a team of four researchers – Timnit Gebru, Emily Bender, Margaret Mitchell (writing as Shmargaret Shmitchell), and Angelina McMillan-Major – wrote a now-famous paper called On the Dangers of Stochastic Parrots (PDF) calling into question the usefulness of the large language models (LLMs) that have caused so much ruckus this year. The “Stochastic Four” argued instead of small models built on carefully curated data: less prone to error, less exploitive of people’s data, less damaging to the planet. Gebru got fired over this paper; Google also fired Mitchell soon afterwards. Two years later, neural networks pioneer Geoff Hinton quit Google in order to voice similar concerns.

Despite the hype, LLMs have many problems. They are fundamentally an extractive technology and are resource-intensive. Building LLMs requires massive amounts of training data; so far, the companies have been unwilling to acknowledge their sources, perhaps because (as is happening already) they fear copyright suits.

More important from a technical standpoint, is the issue of model collapse; that is, models degrade when they begin to ingest synthetic AI-generated data instead of human input. We’ve seen this before with Google Flu Trends, which degraded rapidly as incoming new search data included many searches on flu-like symptoms that weren’t actually flu, and others that simply reflected the frequency of local news coverage. “Data pollution” as LLM-generated data fills the web, will mean that the web will be an increasingly useless source of training data for future generations of generative AI. Lots more noise, drowning out the signal (in the photo above, the signal would be the parrot).

Instead, if we follow the lead of the Stochastic Four, the more productive approach is small data – small, carefully curated datasets that train models to match specific goals. Far less resource-intensive, far fewer issues with copyright, appropriation, and extraction.

We know what the LLM future looks like in outline: big, centralized services, because no one else will be able to amass enough data. In that future, surveillance capitalism is an essential part of data gathering. SLM futures could look quite different: decentralized, with realigned incentives. At one point, we wanted to suggest that small data could bring the end of surveillance capitalism; that’s probably an overstatement. But small data could certainly create the ecosystem in which the case for mass data collection would be less compelling.

Jon and I imagined four primary alternative futures: federation, personalization, some combination of those two, and paradigm shift.

Precursors to a federated small data future already exist; these include customer service chatbots, predictive text assistants. In this future, we could imagine personalized LLM servers designed to serve specific needs.

An individualized future might look something like I suggested here in March: a model that fits in your pocket that is constantly updated with material of your own choosing. Such a device might be the closest yet to Vannevar Bush’s 1945 idea of the Memex (PDF), updated for the modern era by automating the dozens of secretary-curators he imagined doing the grunt work of labeling and selection. That future again has precursors in techniques for sharing the computation but not the data, a design we see proposed for health care, where the data is too sensitive to share unless there’s a significant public interest (as in pandemics or very rare illnesses), or in other data analysis designs intended to protect privacy.

In 2007, the science fiction writer Charles Stross suggested something like this, though he imagined it as a comprehensive life log, which he described as a “google for real life”. So this alternative future would look something like Stross’s pocket $10 life log with enhanced statistics-based data analytics.

Imagining what a paradigm shift might look like is much harder. That’s the kind of thing science fiction writers do; it’s 16 years since Stross gave that life log talk. However, in his 2018 history of advertising, The Attention Merchants, Columbia professor Tim Wu argued that industrialization was the vector that made advertising and its grab for our attention part of commerce. A hundred and fifty-odd years later, the centralizing effects of industrialization are being challenged starting with energy via renewables and local power generation and social media via the fediverse. Might language models also play their part in bringing a new, more collaborative and cooperative society?

It is, in other words, just possible that the hot new technology of 2023 is simply a dead end bringing little real change. It’s happened before. There have been, as Wu recounts, counter-moves and movements before, but they didn’t have the technological affordances of our era.

In the Q&A that followed, Miranda Mowbray pointed out that companies are trying to implement the individualized model, but that it’s impossible to do unless there are standardized data formats, and even then hard to do at scale.

Illustrations: Spot the parrot seen in a neighbor’s tree.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Wendy M. GrossmanPosted on Categories AI, Events, New tech, old knowledgeTags 1 Comment on Small data

Review: Making a Metaverse That Matters

Making a Metaverse That Matters: From Snow Crash and Second Life to A Virtual World Worth Fighting For
By Wagner James Au
Publisher: Wiley
ISBN: 978-1-394-15581-1

A couple of years ago, when “the metaverse” was the hype-of-the-month, I kept wondering why people didn’t just join 20-year-old Second Life, or a game world. Even then the idea wasn’t new: the first graphical virtual world, Habitat, launched in 1988. And even *that* was preceded by text-based MUDs that despite their limitations afforded their users the chance to explore a virtual world and experiment with personal identity.

I never really took to Second Life. The initial steps – download the software, install it, choose a user name and password, and then an avatar – aren’t difficult. The trouble begins after that: what do I do now? Fly to an island, and then…what?

I *did*, once, have a commission to interview a technology company executive, who dressed his avatar in a suit and tie to give a lecture in a virtual auditorium and then joined me in the now-empty auditorium to talk, now changedinto jeans, T-shirt, and baseball cap.

In his new book, Making a Metaverse That Matters, the freelance journalist Wagner James Au argues that this sort of image consciousness derives from allowing humanoid avatars; they lead us to bring the constraints of our human societies into the virtual world, where instead we could free our selves. Humanoid form leads people to observe the personal space common in their culture, apply existing prejudices, and so on. Au favors blocking markers such as gender and skin color that are the subject of prejudice offline. I’m not convinced this will make much difference; even on text-based systems with numbers instead of names disguising your real-life physical characteristics takes work.

Au spent Second Life’s heyday as its embedded reporter; his news and cultural reports eventually became his 1999 book, The Making of Second Life: Notes from a New World. Part of his new book reassesses that work and reports regrets. He wishes he had been a stronger critic back then instead of being swayed by his own love for the service. Second Life’s biggest mistake, he thinks, was persistently refusing to call itself a game or add game features. The result was a dedicated user base that stubbornly failed to grow beyond about 600,000 as most people joined and reacted the way I did: what now? But some of those 600,000 benefited handsomely, as Au documents: some remade their lives, and a few continue to operate million-dollar businesses built inside the service.

Au returns repeatedly to Snow Crash author Neal Stephenson‘s original conception of the metaverse, a single pervasive platform. The metaverse of Au’s dreams has community as its core value, is accessible to all, is a game (because non-game virtual worlds have generally failed), and collaborative for creators. In other words, pretty much the opposite of anything Meta is likely to build.


Whatever you’re starting to binge-watch, slow down. It’s going to be a long wait for fresh content out of Hollywood.

Yesterday, the actors union, SAG-AFTRA, went out on strike alongside the members of the Writers Guild of America, who have been “>walking picket lines since May 2. Like the writers, actors have seen their livelihoods shrink as US TV shows’ seasons shorten, “reruns” that pay residuals fade into the past, and DVD royalties dry up, while royalties from streaming remain tiny by comparison. At the Hollywood and Levine podcast, the veteran screenwriter Ken Levine gives the background to the WGA’s action. But think of it this way: the writers and cast of The Big Bang Theory may be the last to share fairly in the enormous profits their work continues to generate.

The even bigger threat? AI that makes it possible to capture the actor’s likeness and then reuse it ad infinitum in new work. This, as Malia Mendez writes at the LA Times, is the big fear. In a world where Harrison Ford at 80 is making movies in which he’s aged down to look 40 and James Earl Jones has agreed to clone his voice for reuse after his death, it’s arguably a rational big fear.

We’ve had this date for a long time. In the late 1990s I saw a demonstration of “vactors” – virtual actors that were created by scanning a human actor moving in various ways and building a library of movements that thereafter could be rendered at will. At the time, the state of the art was not much advanced from the liquid metal man in Terminator 2. Rendering film-quality characters was very slow, but that was then and this is now, and how long before rendering moving humans can be done in high-def in real-time at action speed?

The studios are already pushing actors into allowing synthesized reuse. California law grants public figures, including actors, publicity rights that prevent the commercial use of their name and likeness without consent. However, Mendez reports that current contracts already require actors to waive those rights to grant the studios digital simulation or digital creation rights. The effects are worst in reality television, where the line is blurred between the individual as a character on a TV show and the individual in their off-screen life. She quotes lawyer Ryan Schmidt: “We’re at this Napster 2001 moment…”

That moment is even closer for voice actors. Last year, Actors Equity announced a campaign to protect voice actors from their synthesized counterparts. This week, one of those synthesizers is providing commentary – more like captions, really – for video clips like this one at Wimbledon. As I said last year, while synthesized voices will be good enough for many applications such as railway announcements, there are lots of situations that will continue to require real humans. Sports commentary is one; commentators aren’t just there to provide information, they’re *also* there to sell the game. Their human excitement at the proceedings is an important part of that.

So SAG-AFTRA, like the Writers Guild of America, is seeking limitations on how studios may use AI, payment for such uses, and rules on protecting against misuse. In another LA Times story, Anoushka Sakoui reports that the studios’ offer included requiring “a performer’s consent for the creation and use of digital replicas or for digital alterations of a performance”. Like publishers “offering” all-rights-in perpetuity contracts to journalists and authors since the 1990s, the studios are trying to ensure they have all the rights they could possibly want.

“You cannot change the business model as much as it has changed and not expect the contract to change, too,” SAG-AFTRA president Fran Drescher said yesterday in a speech that has been widely circulated.

It was already clear this is going to be a long strike that will damage tens of thousands of industry workers and the economy of California. Earlier this week, Dominic Patten reported at Deadline that the Association of Movie and Television Producers plans to delay resuming talks with the WGA until October. By then, Patten reports producers saying, writers will be losing their homes and be more amenable to accepting the AMPTP’s terms. The AMPTP officially denies this, saying it’s committed to reaching a deal. Nonetheless, there are no ongoing talks. As Ken Levine pointed out in a pair of blogposts written during the 2007 writers strike, management is always in control of timing.

But as Levine also says, in the “old days” a top studio mogul could simply say, “Let’s get this done” and everyone would get around the table and make a deal. The new presence of tech giants Netflix, Amazon, and Apple in the AMPTP membership makes this time different. At some point, the strike will be too expensive for legacy Hollywood studios. But for Apple, TV production is a way to sell services and hardware. For Amazon, it’s a perk that comes with subscribing to its Prime delivery service. Only Netflix needs a constant stream of new work – and it can commission it from creators across the globe. All three of them can wait. And the longer they drag this out, the more the traditional studios will lose money and weaken as competitors.

Legacy Hollywood doesn’t seem to realize it yet, but this strike is existential for them, too.

Illustrations: SAG-AFTRA president Fran Drescher, announcing the strike on Thursday.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon.