The toast bubble

To The Big Bang Theory (“The Russian Rocket Reaction”, S5e05):

Howard: Someone has to go up with the telescope as a payload specialist, and guess who that someone is!
Sheldon: Muhammed Li.
Howard: Who’s Muhammed Li?
Sheldon: Muhammed is the most common first name in the world, Li the most common surname, and as I didn’t know the answer I thought that gave me a mathematical edge.

Experts tell me that exchange doesn’t perfectly explain how generative AI works; it’s too simplistic. Generative AI – or a Sheldon made more nuanced by his writers – takes into account contextual information to calculate the probable next word. So it wouldn’t pick from all the first names and surnames in the world. It might, however, pick from the names of all the payload specialists or some other group it correlated, or confect one.
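For the curious, here is a toy sketch in Python of the difference between Sheldon's context-free guess and a guess conditioned on context. Everything in it is an assumption made for illustration: the observations are invented, the function names are hypothetical, and real generative systems estimate probabilities over word fragments with a neural network rather than counting entries in a list.

```python
from collections import Counter

# Toy observations of (context, name). All values are invented for
# illustration; they are not real name statistics.
OBSERVATIONS = [
    ("general population", "Muhammed Li"),
    ("general population", "Muhammed Li"),
    ("general population", "Muhammed Li"),
    ("general population", "Maria Garcia"),
    ("payload specialist", "Howard Wolowitz"),
    ("payload specialist", "Howard Wolowitz"),
    ("payload specialist", "Maria Garcia"),
]


def sheldon_guess(observations):
    """Ignore context: return the most frequent name overall."""
    counts = Counter(name for _, name in observations)
    return counts.most_common(1)[0][0]


def contextual_guess(observations, context):
    """Condition on context: return the most frequent name seen with it."""
    counts = Counter(name for ctx, name in observations if ctx == context)
    if not counts:
        # No evidence for this context: fall back to the context-free guess.
        return sheldon_guess(observations)
    return counts.most_common(1)[0][0]
```

Note that the contextual guess is still only the likeliest answer given what the system has seen; nothing in it checks whether the answer is true.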

More than a year on, I still can’t find a use for generative “AI” that is so unreliable and inscrutable. At Exponential View, Azeem Azhar has written about the “answer engine” Perplexity.ai. While it’s helpful that Perplexity provides references for its answers, it was producing misinformation by the third question I asked it, and offered no improvement when challenged. Wikipedia spent many years being accused of unreliability, too, but at least there you can read the talk page and understand how the editors arrived at the text they present.

On The Daily Show this week, Jon Stewart ranted about AI and interviewed FTC chair Lina Khan. Well-chosen video clips showed AI company heads’ true colors, telling the public AI is an assistant for humans while telling money people and each other that AI will enable greater productivity with fewer workers and help eliminate “the people tax”.

More interesting, however, was Khan’s note that the FTC is investigating the investments and partnerships in AI to understand if they’re giving current technology giants undue influence in the marketplace. If, in her example, all the competitors in a market outsource their pricing decisions to the same algorithm they may be guilty of price fixing even if they’re not actively colluding. And these markets are consolidating at an ever-earlier stage. Instagram and WhatsApp had millions of users by the time Facebook thought it prudent to buy them rather than let them become dangerous competitors. AI is pre-consolidating: the usual suspects have been buying up AI startups and models at pace.

“More profound than fire or electricity,” Google CEO Sundar Pichai tells a camera at one point, speaking about AI. The last time I heard this level of hyperbole it was about the Internet in the 1990s, shortly before the bust. A friend’s answer to this sort of thing has never varied: “I’d rather have indoor plumbing.”

***

Last week the Federal District Court in Manhattan sentenced FTX CEO Sam Bankman-Fried to 25 years in prison for stealing $8 billion. In the end, you didn’t have to understand anything complicated about cryptocurrencies; it was just good old embezzlement.

And then the price of bitcoin went *up*. At the Guardian, Molly White explains that this is because cryptoevangelists are pushing the idea that the sector can reach its full potential, now that Bankman-Fried and other bad apples have been purged. But, as she says, nothing has really changed. No new use case has come along to make cryptocurrencies more useful, more valuable, or more trustworthy.

Both cryptocurrencies and generative AI are bubbles. The difference is that the AI bubble will likely leave behind it some technologies and knowledge that are genuinely useful; it will be like the Internet, which boomed and busted before settling in to change the world. Cryptocurrencies are more like the Dutch tulips. Unfortunately, in the meantime both these bubbles are consuming energy at an insane rate. How many wildfires is bitcoin worth?

***

I’ve seen a report suggesting that the last known professional words of the late Ross Anderson may have been, “Do they take us for fools?”

He was referring to the plans, debated in the House of Commons on March 25, to amend the Investigatory Powers Act to allow the government to pre-approve (or disapprove) new security features technology firms want to introduce. The government is of course saying it’s all perfectly innocent, intended to keep the country safe. But recent clashes in the decades-old conflict over strong encryption have seen the technology companies roll out features like end-to-end encryption (Meta) and decide not to implement others, like client-side scanning (Apple). The latest in a long line of UK governments that want access to encrypted text was hardly going to take that quietly. So here we are, debating this yet again. Yet the laws of mathematics still haven’t changed: there is no such thing as a security hole that only “good guys” can use.

***

Returning to AI, it appears that costs may lead Google to charge for access to its AI-enhanced search, as Alex Hern reports at the Guardian. Hern thinks this is good news for its AI-focused startup competitors, which already charge for top-tier tools and which are at risk of being undercut by Google. I think it’s good for users because it makes it easy to avoid the AI “enhancement”. Of course, DuckDuckGo already does this without all the tracking and monopoly mishegoss.

Illustrations: Jon Stewart uninspired by Mark Zuckerberg’s demonstration of AI making toast.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Facts are scarified

The recent doctored Palace photo has done almost as much as the arrival of generative AI to raise fears that in future we will completely lose the ability to identify fakes. The royal photo was sloppily composited – no AI needed – for reasons unknown (though Private Eye has a suggestion). A lot of conspiracy theorizing could be avoided if the palace would release the untouched original(s), but as things are, the photograph is a perfect example of how to provide the fuel for spreading nonsense to 400 million people.

The most interesting thing about the incident was discovering the rules media apply to retouching photos. AP specified, for example, that it does not use altered or digitally manipulated images. It allows cropping and minor adjustments to color and tone where necessary, but bans more substantial changes, even retouching to remove red eye. As Holly Hunter’s character says, trying to uphold standards in the 1987 movie Broadcast News (written by James Brooks), “We are not here to stage the news.”

The desire to make a family photo as appealing as possible is understandable; the motives behind spraying the world with misinformation are less clear and more varied. I’ve long argued here that for this reason combating misinformation and disinformation is similar to cybersecurity because of the complexity of the problem and the diversity of actors and agendas. At last year’s Disinformation Summit in Cambridge cybersecurity was, sadly, one of the missing communities.

Just a couple of weeks ago the BBC announced its adoption of C2PA, a standard for authenticating images developed by a group of technology and media companies including the BBC, the New York Times, Microsoft, and Adobe. The BBC says that many media organizations are beginning to adopt C2PA, and even Meta is considering it. Edits must be signed, creating a chain of provenance all the way back to the original photo. In 2022, the BBC and the Royal Society co-hosted a workshop on digital provenance, following a Royal Society report, at which C2PA featured prominently.
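To make “a chain of provenance” concrete, here is a minimal sketch in Python of the general idea: each record covers a hash of the image plus the signature of the record before it, so altering or dropping any earlier step breaks verification. This is not C2PA's actual format or API. The record fields and function names are hypothetical, and the HMAC with shared keys is a stand-in chosen to keep the sketch self-contained; the real standard relies on certificate-based public-key signatures and its own manifest format.

```python
import hashlib
import hmac
import json


def sign(key: bytes, payload: bytes) -> str:
    """Stand-in signature: an HMAC-SHA256 tag (an assumption, not C2PA)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def add_record(chain, key, signer, action, image_bytes):
    """Return a new chain with one more signed record appended."""
    record = {
        "signer": signer,
        "action": action,
        "image_hash": hashlib.sha256(image_bytes).hexdigest(),
        # Link to the previous record; empty for the original capture.
        "previous": chain[-1]["signature"] if chain else "",
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = sign(key, payload)
    return chain + [record]


def verify(chain, keys):
    """Check every signature and every link back to the first record."""
    previous = ""
    for record in chain:
        body = {k: v for k, v in record.items() if k != "signature"}
        if body["previous"] != previous:
            return False  # a step is missing or out of order
        key = keys.get(body["signer"])
        if key is None:
            return False  # unknown signer
        payload = json.dumps(body, sort_keys=True).encode()
        if not hmac.compare_digest(sign(key, payload), record["signature"]):
            return False  # record was altered after signing
        previous = record["signature"]
    return True
```

A usage example, again with made-up signers and keys:

```python
keys = {"camera": b"camera-key", "editor": b"editor-key"}
chain = add_record([], keys["camera"], "camera", "capture", b"raw pixels")
chain = add_record(chain, keys["editor"], "editor", "crop", b"cropped pixels")
print(verify(chain, keys))
```

A chain like this only shows who signed what and in which order; it says nothing about whether the original capture was truthful.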

That’s potentially a valuable approach for publishing and broadcast, where the conduit to the public is controlled by one of a relatively small number of organizations. And you can see why those organizations would want it: they need, and in many cases are struggling to retain, public trust. It is, however, too complex a process for the hundreds of millions of people with smartphone cameras posting images to social media, and unworkable for citizen journalists capturing newsworthy events in real time. Ancillary issue: sophisticated phone cameras try so hard to normalize the shots we take that they falsify the image at source. In 2020, Californians attempting to capture the orange color of their smoke-filled sky were defeated by autocorrection that turned it grey. So, many images are *originally* false.

In a lengthy blog posting, Neal Krawetz analyzes difficulties with C2PA. He lists security flaws, but also opposes the “appeal to authority” approach, which he dubs a “logical fallacy”. In the context of the Internet, it’s worse than that; we already know what happens when a tiny handful of commercial companies (in this case, chiefly Adobe) become the gatekeepers for billions of people.

All of this was why I was glad to hear about work in progress at a workshop last week, led by Mansoor Ahmed-Rengers, a PhD candidate studying system security: Human-Oriented Proof Standard (HOPrS). The basic idea is to build an “Internet-wide, decentralised, creator-centric and scalable standard that allows creators to prove the veracity of their content and allows viewers to verify this with a simple ‘tick’.” Co-sponsoring the workshop was Open Origins, a project to distinguish between synthetic and human-created content.

It’s no accident that HOPrS’ mission statement echoes the ethos of the original Internet; as security researcher Jon Crowcroft explains, it’s part of long-running work on redecentralization. Among HOPrS’ goals, Ahmed-Rengers listed: minimal centralization; the ability for anyone to prove their content; Internet-wide scalability; open decision making; minimal disruption to workflow; and easy interpretability of proof/provenance. The project isn’t trying to cover all bases – that’s impossible. Given the variety of motivations for fakery, there will have to be a large ecosystem of approaches. Rather, HOPrS is focusing specifically on the threat model of an adversary determined to sow disinformation, giving journalists and citizens the tools they need to understand what they’re seeing.

Fakes are as old as humanity. In a brief digression, we were reminded that the early days of photography were full of fakery: the Cottingley Fairies, the Loch Ness monster, many dozens of spirit photographs. The Cottingley Fairies, cardboard cutouts photographed by Elsie Wright, 16, and Frances Griffiths, 9, were accepted as genuine by Sherlock Holmes creator Sir Arthur Conan Doyle, famously a believer in spiritualism. To today’s eyes, trained on millions of photographs, they instantly read as fake. Or take Ireland’s Knock apparitions, flat, unmoving, and, philosophy professor David Berman explained in 1979, magic lantern projections. Our generation, who’ve grown up with movies and TV, would I think have instantly recognized that as fake, too. Which I believe tells us something: yes, we need tools, but we ourselves will get better at detecting fakery, as unlikely as it seems right now. The speed with which the royal photo was dissected showed how much we’ve learned just since generative AI became available.

Illustrations: The first of the Cottingley Fairies photographs (via Wikimedia).

Anachronistics

“In my mind, computers and the Internet arrived at the same time,” my twenty-something companion said, delivering an entire mindset education in one sentence.

Just a minute or two earlier, she had asked in some surprise, “Did bulletin board systems predate the Internet?” Well, yes: BBSs were a software package running on a single back room computer with a modem users dialed into, whereas the Internet is this giant sprawling mess of millions of computers connected together…simple first, complex later.

Her confusion is understandable: from her perspective, computers and the Internet did arrive at the same time, since her first conscious encounters with them were simultaneous.

But still, speaking as someone who first programmed a (mainframe, with punch cards) computer in 1972 as a student, who got her first personal computer in 1982, and got online in 1991 by modem and 1999 by broadband and to whom the sequence of events is memorable: wow.

A 25-year-old today was born in 1999 (the year I got broadband). Her counterpart 15 years hence (born 2014, the year a smartphone replaced my personal digital assistant) may think smartphones and the Internet were simultaneous. And sometime around 2045 *her* counterpart born in 2020 (two years before ChatGPT was released) might think generative text and image systems were contemporaneous with the first computers.

I think this confusion must have something to do with the speed of change in a relatively narrow sector. I’m sure that even though they all entered my life simultaneously, by the time I was 25 I knew that radio preceded TV (because my parents grew up with radio), bicycles preceded cars, and that handwritten manuscripts predated printed books (because medieval manuscripts). But those transitions played out over multiple lifetimes, if not centuries, and all those memories were personal. Few of us reminisce about the mainframes of the 1960s because most of us didn’t have access to them.

And yet, misunderstanding the timeline of earlier technologies probably matters less than misunderstanding the sequence of events in information technology. Jumbling the arrival dates of the pieces of information technology means failing to understand dependencies. What currently passes for “AI” could not exist without being able to train models on giant piles of data that the Internet and the web made possible, and that took 20 years to build. Neural networks pioneer Geoff Hinton helped develop key ideas behind today’s neural networks as long ago as the 1980s, but it took until the last decade for them to become workable. That’s because it took that long to build sufficiently powerful computers and to amass enough training data. How do you understand the ongoing battle between those who wish to protect privacy via data protection laws and those who want data to flow freely without hindrance if you do not understand what those masses of data are important for?

This isn’t the only such issue. A surprising number of people who should know better seem to believe that the solution to all our ills with social media is to destroy Section 230, apparently believing that if S230 allowed Big Tech to get big, it must be wrong. Instead, the reality is also that it allows small sites to exist and it is the legal framework that allows content moderation. Improve it by all means, but understand its true purpose first.

Reviewing movies and futurist projections such as Vannevar Bush’s 1945 essay As We May Think and Alan Turing’s 1950 paper Computing Machinery and Intelligence doesn’t really help because so many ideas arrive long before they’re feasible. The crew in the original 1966 Star Trek series (to say nothing of secret agent Maxwell Smart in 1965) were talking over wireless personal communicators. A decade earlier, Arthur C. Clarke (in The Nine Billion Names of God) and Isaac Asimov (in The Last Question) were putting computers – albeit analog ones – in their stories. Asimov in particular imagined a sequence that now looks prescient, beginning with something like a mainframe, moving on to microcomputers, and finishing up with a vast fully interconnected network that can only be held in hyperspace. (OK, it took trillions of years, starting in 2061, but still...) Those writings undoubtedly inspired the technologists of the last 50 years when they decided what to invent.

This all led us to fakes: as the technology to create fake videos, images, and texts continues to improve, she wondered if we will ever be able to keep up. Just about every journalism site is asking some version of that question; they’re all awash in stories about new levels of fakery. My 25-year-old discussant believes the fakes will always be improving faster than our methods of detection – an arms race like computer security, to which I’ve compared problems of misinformation / disinformation before.

I’m more optimistic. I bet even a few years from now today’s versions of generative “AI” will look as primitive to us as the special effects in a 1963 episode of Dr Who or the magic lantern used to create the Knock apparitions do to generations raised on movies, TV, and computer-generated imagery. Humans are adaptable; we will find ways to identify what is authentic that aren’t obvious in the shock of the new. We might even go back to arguing in pubs.

Illustrations: Secret agent Maxwell Smart (Don Adams) talking on his shoe phone (via Wikimedia).

The good fight

This week saw a small gathering to celebrate the 25th anniversary (more or less) of the Foundation for Information Policy Research, a think tank led by Cambridge and Edinburgh University professor Ross Anderson. FIPR’s main purpose is to produce tools and information that campaigners for digital rights can use. Obdisclosure: I am a member of its advisory council.

What, Anderson asked those assembled, should FIPR be thinking about for the next five years?

When my turn came, I said something about the burnout that comes to many campaigners after years of fighting the same fights. Digital rights organizations – Open Rights Group, EFF, Privacy International, to name three – find themselves trying to explain the same realities of math and technology decade after decade. Small wonder so many burn out eventually. The technology around the debates about copyright, encryption, and data protection has changed over the years, but in general the fundamental issues have not.

In part, this is because what people want from technology doesn’t change much. A tangential example of this presented itself this week, when I read the following in the New York Times, written by Peter C Baker about the “Beatles'” new mash-up recording:

“So while the current legacy-I.P. production boom is focused on fictional characters, there’s no reason to think it won’t, in the future, take the form of beloved real-life entertainers being endlessly re-presented to us with help from new tools. There has always been money in taking known cash cows — the Beatles prominent among them — and sprucing them up for new media or new sensibilities: new mixes, remasters, deluxe editions. But the story embedded in “Now and Then” isn’t “here’s a new way of hearing an existing Beatles recording” or “here’s something the Beatles made together that we’ve never heard before.” It is Lennon’s ideas from 45 years ago and Harrison’s from 30 and McCartney and Starr’s from the present, all welded together into an officially certified New Track from the Fab Four.”

I vividly remembered this particular vision of the future because just a few days earlier I’d had occasion to look it up – a March 1992 interview for Personal Computer World with the ILM animator Steve Williams, who the year before had led the team that produced the liquid metal man for the movie Terminator 2. Williams imagined CGI would become pervasive (as it has):

“…computer animation blends invisibly with live action to create an effect that has no counterpart in the real world. Williams sees a future in which directors can mix and match actors’ body parts at will. We could, he predicts, see footage of dead presidents giving speeches, films starring dead or retired actors, even wholly digital actors. The arguments recently seen over musicians who lip-synch to recordings during supposedly ‘live’ concerts are likely to be repeated over such movie effects.”

Williams’ latest work at the time was on Death Becomes Her. Among his calmer predictions was that as CGI became increasingly sophisticated the boundary between computer-generated characters and enhancements would become invisible. Thirty years on, the big excitement recently has been Harrison Ford’s de-aging for Indiana Jones and the Dial of Destiny. That used CGI, AI, and other tools to digitally swap in his face from 1980s footage.

Side note: in talking about the Ford work to Wired, ILM supervisor Andrew Whitehurst, exactly like Williams in 1992, called the new technology “another pencil”.

Williams also predicted endless legal fights over copyright and other rights. That at least was spot-on; AI and the perpetual reuse of retained footage without further payment is part of what the recent SAG-AFTRA strikes were about.

Yet the problem here isn’t really technology; it’s the incentives. The eternal desire of Hollywood’s businessfolk is to guarantee their return on investment, and they think recycling old successes is the safest way to do that. Closer to digital rights, law enforcement always wants greater access to private communications; the frustration is that incoming generations of politicians don’t understand the laws of mathematics any better than their predecessors in the 1990s.

Many of the speakers focused on the issue of getting government to listen to and understand the limits of technology. Increasingly, though, a new problem is that, as Bruce Schneier writes in his latest book, A Hacker’s Mind, everyone has learned to think like hackers and subvert the systems they’re supposed to protect. The Silicon Valley mantra of “ask forgiveness, not permission” has become pervasive, whether it’s a technology platform deciding to collect masses of data about us or a police force deciding to stick a live facial recognition pilot next to Oxford Circus tube station. Except no one asks for forgiveness either.

Five years ago, at FIPR’s 20th anniversary, when GDPR was new, Anderson predicted (correctly) that the battles over encryption would move to device access. Today, it’s less clear what’s next. Facial recognition represents a step change; it overrides consent and embeds distrust in our public infrastructure.

If I were to predict the battles of the next five years, I’d look at the technologies being deployed around European and US borders to surveil migrants. Migrants make easy targets for this type of experimentation because they can’t afford to protest and can’t vote. “Automated suspicion,” Euronews.next calls it. That habit of mind is dangerous.

Illustrations: The liquid metal man in Terminator 2 reconstituting itself.

The one hundred

Among the highlights of this week’s hearings of the Covid Inquiry were comments made by Helen MacNamara, who was the deputy cabinet secretary during the relevant time, about the effect of the lack of diversity. The absence of women in the room, she said, led to a “lack of thought” about a range of issues, including dealing with childcare during lockdowns, the difficulties encountered by female medical staff in trying to find personal protective equipment that fit, and the danger lockdowns would inevitably pose when victims of domestic abuse were confined with their abusers. Also missing was anyone who could have identified issues for ethnic minorities, disabled people, and other communities. Even the necessity of continuing free school lunches was lost on the wealthy white men in charge, none of whom were ever poor enough to need them. Instead, MacNamara said, they spent “a disproportionate amount” of their time fretting about football, hunting, fishing, and shooting.

MacNamara’s revelations explain a lot. Of course a group with so little imagination about or insight into other people’s lives would leave huge, gaping holes. Arrogance would ensure they never saw those as failures.

I was listening to this while reading posts on Mastodon complaining that this week’s much-vaunted AI Safety Summit was filled with government representatives and techbros, but weak on human rights and civil society. I don’t see any privacy organizations on the guest list, for example, and only the largest technology platforms needed apply. Granted, the limit of 100 meant there wasn’t room for everyone. But these are all choices seemingly designed to make the summit look as important as possible.

From this distance, it’s hard to get excited about a bunch of bigwigs getting together to alarm us about a technology that, as even the UK government itself admits, may – even most likely will – never happen. In the event, they focused on a glut of disinformation and disruption to democratic polls. Lots of people are thinking about the first of these, and the second needs local solutions. Many technology and policy experts are advocating openness and transparency in AI regulation.

Me, I’d rather they’d given some thought to how to make “AI” (any definition) sustainable, given the massive resources today’s math-and-statistics systems demand. And I would strongly favor a joint resolution to stop using these systems for surveillance and eliminate predictive systems that pretend to be able to spot potential criminals in advance or decide who is deserving of benefits, admission into retail stores, or parole. But this summit wasn’t about *us*.

***

A Mastodon post reminded me that November 2 – yesterday – was the 35th anniversary of the Morris Worm and therefore the 35th anniversary of the day I first heard of the Internet. Anniversaries don’t matter much, but any history of the Internet would include this now largely forgotten (or never-known) event.

Morris’s goals were pretty anodyne by today’s standards. He wanted, per Wikipedia, to highlight flaws in some computer systems. Instead, the worm replicated out of control and paralyzed parts of this obscure network that linked university and corporate research institutions, who now couldn’t work. It put the Internet on the front pages for the first time.

Morris became the first person to be convicted of a felony under the brand-new Computer Fraud and Abuse Act (1986); that didn’t stop him from becoming a tenured professor at MIT in 2006. The heroes of the day were the unsung people who worked hard to disable the worm and restore full functionality. But it’s the worm we remember.

It was another three years before I got online myself, in 1991, and two or three more years after that before I got direct Internet access via the now-defunct Demon Internet. Everyone has a different idea of when the Internet began, usually based on when they got online. For many of us, it was November 2, 1988, the day when the world learned how important this technology they had never heard of had already become.

***

This week also saw the first anniversary of Twitter’s takeover. Despite a variety of technical glitches and numerous user-hostile decisions, the site has not collapsed. Many people I used to follow are either gone or posting very little. Even though I’m not experiencing the increased abuse and disinformation I see widely reported, there’s diminishing reward for checking in.

There’s still little consensus on a replacement. About half of my Twitter list have settled in on Mastodon. Another third or so are populating Bluesky. I hear some are finding Threads useful, but until it has a desktop client I’m out (and maybe even then, given its ownership). A key issue, however, is that uncertainty about which site will survive (or “win”) leads many people to post the same thing on multiple services. But you don’t dare skip one just in case.

For both philosophical and practical reasons, I’m hoping more people will get comfortable on Mastodon. Any corporate-owned system will merely replicate the situation in which we become hostages to business interests who have as little interest in our welfare as Boris Johnson did according to MacNamara and other witnesses. Mastodon is not a safe harbor from horrible human behavior, but with no ads and no algorithm determining what you see, at least the system isn’t designed to profit from it.

Illustrations: Former deputy cabinet secretary Helen MacNamara testifying at the Covid Inquiry.

The end of cool

For a good bit of this year’s We Robot, it felt like abstract “AI” – that is, algorithms running on computers with no mobility – had swallowed the robots whose future this conference was invented to think about. This despite a pre-conference visit to Boston Dynamics, which showed off its Atlas robot’s ability to do gymnastics. It’s cute, but is it useful? Your washing machine is smarter, and its intelligence solves real problems like how to use less water.

There’s always some uncertainty about boundaries at this event: is a machine learning decision making system a robot? At the inaugural We Robot in 2012, the engineer Bill Smart summed up the difference: “My iPhone can’t stab me in my bed.” Of course, neither could an early Roomba, which most would agree was the first domestic robot. However, it was also dumb as a floor tile, achieving cleanliness through random repetition rather than intelligent mapping. In the Roomba 1.0 sense, a “robot” is “a device that does boring things so I don’t have to”. Not cool, but useful, and solves a real problem.

During a session in which participants played a game designed to highlight the conflicts inherent in designing an urban drone delivery system, Lael Odhner offered yet another definition: “A robot is a literary device we use to voice our discomfort with technology.” In the context of an event where participants think through the challenges robots bring to law and policy, this may be the closest approximation.

In the design exercise, our table’s three choices were: fund the FAA (so they can devise and enforce rules and policies), build it as a municipally-owned public service both companies and individuals can use as customers, and ban advertising on the drones for reasons of both safety and offensiveness. A similar exercise last year produced more specific rules, but also led us to realize that a drone delivery service had no benefits over current delivery services.

Much depends on scale. One reason we chose a municipal public service was the scale of noise and environmental impact inevitably generated by multiple competing commercial services. In a paper, Woody Hartzog examined the meaning of “scale”: is scale *more*, or is scale *different*? You can argue, as net.wars often has, that scale *creates* difference, but it’s rarely clear where to place the threshold, or how reaching it changes a technology’s harms or who it makes vulnerable. Ryan Calo and Daniella DiPaola suggested that rather than associate vulnerability with particular classes of people we should see it as variable with circumstances: “Everyone is vulnerable sometimes, and vulnerability is a state that can be created and manipulated toward particular ends.” This seems a more logical and fairer approach.

An aspect of this is that there are two types of rules: harm rules, which empower institutions to limit harm, and power rules, which empower individuals to protect themselves. A possible worked example soon presented itself in Kegan J Strawn’s and Daniel Sokol’s paper on safety techniques in mobile robots, which suggested copying medical ethics’ consent approach. Then someone described the street scene in which every pedestrian had to give consent to every passing experimental Tesla, possibly an even worse scenario than ad-bearing delivery drones. Pedestrians get nothing out of the situation, and Teslas don’t become safer. What you really want is for car companies not to test the safety of autonomous vehicles on public roads with pedestrians as unwitting crash test dummies.

I try to think every year how our ideas about integrating robots into society are changing over time. An unusual paper from Maria P. Angel considered this question with respect to privacy scholarship by surveying 1990s writing and 20 years of papers presented at Privacy Law Scholars. We Robot co-founders Calo, Michael Froomkin, and Ian Kerr partly copied its design. Angel’s conclusion is roughly that the 1990s saw calls for an end to self-regulation while the 2000s moved from privacy as necessary for individual autonomy and self-determination to collective benefits and most recently to its importance for human flourishing.

As Hartzog commented, he came to the first We Robot with the belief that “Robots are magic”, only to encounter Smart’s “really fancy hammers.” And, Smart and Cindy Grimm added in 2018, controlled by sensors that are “late, noisy, and wrong”. Hartzog’s early excitement was shared by many of us; the future looked so *interesting* when it was almost entirely imaginary.

Over time, the robotic future has become more nowish, and has shifted in response to technological development; the discussion has become more about real systems (2022) than imagined future ones. The arrival of real robots on our streets – for example, San Francisco’s 2017 use of security robots to deter homeless camps – changed parts of the discussion from theoretical to practical.

In the mid-2010s, much discussion focused on problems of fairness, especially to humans in the loop, who, Madeleine Claire Elish correctly predicted in 2016, would be blamed for failures. More recently, the proliferation of data-gathering devices (sensors, cameras) into everything from truckers’ cabs to agriculture and the arrival of new algorithmic systems dubbed AI have raised awareness of the companies behind these technologies. And, latterly, of how often the technology diverts attention from the better possibilities of structural change.

But that’s not as cool.

Illustrations: Boston Dynamics’ Atlas robots doing synchronized backflips (via YouTube).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Doom cyberfuture

Midway through this year’s gikii miniconference for pop culture-obsessed Internet lawyers, Jordan Hatcher proposed that generational differences are the key to understanding the huge gap between the Internet pioneers, who saw regulation as the enemy, and the current generation, who are generally pushing for it. While this is a bit too pat – it’s easy to think of Millennial libertarians and I’ve never thought of Boomers as against regulation, just, rationally, against bad Internet law that sticks – it’s an intriguing idea.

Hatcher, because this is gikii and no idea can be presented without a science fiction tie-in, illustrated this with 1990s movies, which spread the “DCF-84 virus” – that is, “doom cyberfuture-84”. The “84” is not chosen for Orwell but for the year William Gibson’s Neuromancer was published. Boomers – he mentioned John Perry Barlow, born 1947, and Lawrence Lessig, born 1961 – were instead infected with the “optimism virus”.

It’s not clear which 1960s movies might have seeded us with that optimism. You could certainly make the case that 1968’s 2001: A Space Odyssey ends on a hopeful note (despite an evil intelligence out to kill humans along the way), but you don’t even have to pick a different director to find dystopia: I see your 2001 and give you Dr Strangelove (1964). Even Woodstock (1970) is partly dystopian; the consciousness of the Vietnam war permeates every rain-soaked frame. But so did the belief that peace could win: so, wash.

For younger people’s pessimism, Hatcher cited 1995’s Johnny Mnemonic (based on a Gibson short story) and Strange Days.

I tend to think that if 1990s people are more doom-laden than 1960s people it has more to do with real life. Boomers were born in a time of economic expansion, relatively affordable education and housing, and when they protested a war the government eventually listened. Millennials were born in a time when housing and education meant a lifetime of debt, and when millions of them protested a war they were ignored.

In any case, Hatcher is right about the stratification of demographic age groups. This is particularly noticeable in social media use; you can often date people’s arrival on the Internet by which communications medium they prefer. Over dinner, I commented on the nuisance of typing on a phone versus a real keyboard, and two younger people laughed at me: so much easier to type on a phone! They were among the crowd whose papers studied influencers on TikTok (Taylor Annabell, Thijs Kelder, Jacob van de Kerkhof, Haoyang Gui, and Catalina Goanta) and the privacy dangers of dating apps (Tima Otu Anwana and Paul Eberstaller), the kinds of subjects I rarely engage with because I am a creature of text, like most journalists. Email and the web feel like my native homes in a way that apps, game worlds, and video services never will. That dates me both chronologically and by my first experiences of the online world (1991).

Most years at this event there’s a new show or movie that fires many people’s imagination. Last year it was Upload with a dash of Severance. This year, real technological development overwhelmed fiction, and the star of the show was generative AI and large language models. Besides my paper with Jon Crowcroft, there was one from Marvin van Bekkum, Tim de Jonge, and Frederik Zuiderveen Borgesius that compared the science fiction risks of AI – Skynet, Roko’s basilisk, and an ordering of Asimov’s Laws that puts obeying orders above not harming humans (see XKCD, above) – to the very real risks of the “AI” we have: privacy, discrimination, and environmental damage.

Other AI papers included one by Colin Gavaghan, who asked if it actually matters if you can’t tell whether the entity that’s communicating with you is an AI? Is that what you really need to know? You can see his point: if you’re being scammed, the fact of the scam matters more than the nature of the perpetrator, though your feelings about it may be quite different.

A standard explanation of what put the “science” in science fiction (or the “speculative” in “speculative fiction”) used to be that the authors ask, “What if?” What if a planet had six suns whose interplay meant that darkness only came once every 1,000 years? Would the reaction really be as Ralph Waldo Emerson imagined it? (Isaac Asimov’s Nightfall). What if a new link added to the increasingly complex Boston MTA accidentally turned the system into a Mobius strip? (A Subway Named Mobius, by Armin Joseph Deutsch). And so on.

In that sense, gikii is often speculative law, thought experiments that tease out new perspectives. What if Prime Day becomes a culturally embedded religious holiday (Megan Rae Blakely)? What if the EU’s trademark system applied in the Star Trek universe (Simon Sellers)? What if, as in Max Gladstone’s Craft Sequence books, law is practical magic (Antonia Waltermann)? In the trademark example, time travel is a problem, as competing interests can travel further and further back to get the first registration. In the latter…well, I’m intrigued by the idea that a law making dumping sewage in England’s rivers illegal could physically stop it from happening without all the pesky apparatus of law enforcement and parliamentary hearings.

Waltermann concluded by suggesting that to some extent law *is* magic in our world, too. A useful reminder: be careful what law you wish for because you just may get it. Boomer!

Illustrations: Part of XKCD’s analysis of Asimov’s Laws of Robotics.


Small data

Shortly before this gets posted, Jon Crowcroft and I will have presented this year’s offering at Gikii, the weird little conference that crosses law, media, technology, and pop culture. This is what we will possibly may have said, as I understand it, with some added explanation for the slightly less technical audience I imagine will read this.

Two years ago, a team of four researchers – Timnit Gebru, Emily Bender, Margaret Mitchell (writing as Shmargaret Shmitchell), and Angelina McMillan-Major – wrote a now-famous paper called On the Dangers of Stochastic Parrots (PDF) calling into question the usefulness of the large language models (LLMs) that have caused so much ruckus this year. The “Stochastic Four” argued instead for small models built on carefully curated data: less prone to error, less exploitive of people’s data, less damaging to the planet. Gebru got fired over this paper; Google also fired Mitchell soon afterwards. Two years later, neural networks pioneer Geoff Hinton quit Google in order to voice similar concerns.

Despite the hype, LLMs have many problems. They are fundamentally an extractive technology and are resource-intensive. Building LLMs requires massive amounts of training data; so far, the companies have been unwilling to acknowledge their sources, perhaps because (as is happening already) they fear copyright suits.

More important, from a technical standpoint, is the issue of model collapse; that is, models degrade when they begin to ingest synthetic AI-generated data instead of human input. We’ve seen this before with Google Flu Trends, which degraded rapidly as incoming new search data included many searches on flu-like symptoms that weren’t actually flu, and others that simply reflected the frequency of local news coverage. “Data pollution”, as LLM-generated data fills the web, will mean that the web will be an increasingly useless source of training data for future generations of generative AI. Lots more noise, drowning out the signal (in the photo above, the signal would be the parrot).

Instead, if we follow the lead of the Stochastic Four, the more productive approach is small data – small, carefully curated datasets that train models to match specific goals. Far less resource-intensive, far fewer issues with copyright, appropriation, and extraction.

We know what the LLM future looks like in outline: big, centralized services, because no one else will be able to amass enough data. In that future, surveillance capitalism is an essential part of data gathering. Small language model (SLM) futures could look quite different: decentralized, with realigned incentives. At one point, we wanted to suggest that small data could bring the end of surveillance capitalism; that’s probably an overstatement. But small data could certainly create the ecosystem in which the case for mass data collection would be less compelling.

Jon and I imagined four primary alternative futures: federation, personalization, some combination of those two, and paradigm shift.

Precursors to a federated small data future already exist; these include customer service chatbots and predictive text assistants. In this future, we could imagine personalized LLM servers designed to serve specific needs.

An individualized future might look something like I suggested here in March: a model that fits in your pocket that is constantly updated with material of your own choosing. Such a device might be the closest yet to Vannevar Bush’s 1945 idea of the Memex (PDF), updated for the modern era by automating the dozens of secretary-curators he imagined doing the grunt work of labeling and selection. That future again has precursors in techniques for sharing the computation but not the data, a design we see proposed for health care, where the data is too sensitive to share unless there’s a significant public interest (as in pandemics or very rare illnesses), or in other data analysis designs intended to protect privacy.
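As a rough illustration of what “sharing the computation but not the data” can mean, here is a minimal sketch in Python. The two sites, their readings, and the function names are all hypothetical, and real designs add protections this sketch leaves out; the only point is that each site hands over a summary rather than its records.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocalSummary:
    """What a site shares: aggregates only, never individual records."""
    total: float
    count: int


def summarize_locally(readings: List[float]) -> LocalSummary:
    # Runs at the (hypothetical) site that holds the data; raw readings stay there.
    return LocalSummary(total=sum(readings), count=len(readings))


def combine(summaries: List[LocalSummary]) -> float:
    # Runs at the coordinator, which only ever sees per-site aggregates.
    total = sum(s.total for s in summaries)
    count = sum(s.count for s in summaries)
    if count == 0:
        raise ValueError("no data to combine")
    return total / count


# Two made-up sites, each summarizing its own made-up readings.
site_a = summarize_locally([120.0, 130.0, 125.0])
site_b = summarize_locally([110.0, 140.0])
overall_mean = combine([site_a, site_b])
```

The coordinator gets the same overall mean it would have computed from the pooled readings, without ever holding them.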

In 2007, the science fiction writer Charles Stross suggested something like this, though he imagined it as a comprehensive life log, which he described as a “google for real life”. So this alternative future would look something like Stross’s pocket $10 life log with enhanced statistics-based data analytics.

Imagining what a paradigm shift might look like is much harder. That’s the kind of thing science fiction writers do; it’s 16 years since Stross gave that life log talk. However, in his 2016 history of advertising, The Attention Merchants, Columbia professor Tim Wu argued that industrialization was the vector that made advertising and its grab for our attention part of commerce. A hundred and fifty-odd years later, the centralizing effects of industrialization are being challenged, starting with energy, via renewables and local power generation, and social media, via the fediverse. Might language models also play their part in bringing about a new, more collaborative and cooperative society?

It is, in other words, just possible that the hot new technology of 2023 is simply a dead end bringing little real change. It’s happened before. There have been, as Wu recounts, counter-moves and movements before, but they didn’t have the technological affordances of our era.

In the Q&A that followed, Miranda Mowbray pointed out that companies are trying to implement the individualized model, but that it’s impossible to do unless there are standardized data formats, and even then hard to do at scale.

Illustrations: Spot the parrot seen in a neighbor’s tree.


The data grab

It’s been a good week for those who like mocking flawed technology.

Numerous outlets have reported, for example, that “AI is getting dumber at math”. The source is a study conducted by researchers at Stanford and the University of California Berkeley comparing GPT-3.5’s and GPT-4’s output in March and June 2023. The researchers found that, among other things, GPT-4’s success rate at identifying prime numbers dropped from 84% to 51%. In other words, in June 2023 ChatGPT-4 did little better than chance at identifying prime numbers. That’s psychic level.

The researchers blame “drift”, the problem that improving one part of a model may have unhelpful knock-on effects in other parts of the model. At Ars Technica, Benj Edwards is less sure, citing qualified critics who question the study’s methodology. It’s equally possible, he suggests, that as the novelty fades, people’s attempts to do real work surface problems that were there all along. With no access to the algorithm itself and limited knowledge of the training data, we can only conduct such studies by controlling inputs and observing the outputs, much like diagnosing allergies by giving a child a series of foods in turn and waiting to see which ones make them sick. Edwards advocates greater openness on the part of the companies, especially as software developers begin building products on top of their generative engines.

Unrelated, the New Zealand discount supermarket chain Pak’nSave offered an “AI” meal planner that, set loose, promptly began turning out recipes for “poison bread sandwiches”, “Oreo vegetable stir-fry”, and “aromatic water mix” – which turned out to be a recipe for highly dangerous chlorine gas.

The reason is human-computer interaction: humans, told to provide a list of available ingredients, predictably became creative. As for the computer…anyone who’s read Janelle Shane’s 2019 book, You Look Like a Thing and I Love You, or her Twitter reports on AI-generated recipes could predict this outcome. Computers have no real world experience against which to judge their output!

Meanwhile, the San Francisco Chronicle reports, Waymo and Cruise driverless taxis are making trouble at an accelerating rate. The cars have gotten stuck in low-hanging wires after thunderstorms, driven through caution tape, blocked emergency vehicles and emergency responders, and behaved erratically enough to endanger cyclists, pedestrians, and other vehicles. If they were driven by humans they’d have lost their licenses by now.

In an interesting side note that reminds us of the cars’ potential as a surveillance network, Axios reports that in a ten-day study in May Waymo’s driverless cars found that human drivers in San Francisco speed 33% of the time. A similar exercise in Phoenix, Arizona observed human drivers speeding 47% of the time on roads with a 35mph speed limit. These statistics of course bolster the company’s main argument for adoption: improving road safety.

The study should – but probably won’t – be taken as a warning of the potential for the cars’ data collection to become embedded in both law enforcement and their owners’ business models. The frenzy surrounding ChatGPT-* is fueling an industry-wide data grab as everyone tries to beef up their products with “AI” (see also previous such exercises with “meta”, “nano”, and “e”), consequences to be determined.

Among the newly-discovered data grabbers is Intel, whose graphics processing unit (GPU) drivers are collecting telemetry data, including how you use your computer, the kinds of websites you visit, and other data points. You can opt out, assuming you a) realize what’s happening and b) are paying attention at the right moment during installation.

Google announced recently that it would scrape everything people post online to use as training data. Again, an opt-out can be had if you have the knowledge and access to follow the 30-year-old robots.txt protocol. In practical terms, I can configure my own site, pelicancrossing.net, to block Google’s data grabber, but I can’t stop it from scraping comments I leave on other people’s blogs or anything I post on social media sites or that’s professionally published (though those sites may block Google themselves). This data repurposing feels like it ought to be illegal under data protection and copyright law.
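For readers who haven’t met robots.txt, the sketch below shows roughly how that opt-out works, using the parser in Python’s standard library. The crawler name ExampleAIBot and the example.org address are made up for illustration; the real token for any given crawler has to come from its operator’s documentation, and honoring the file is voluntary on the crawler’s part.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask the parser what a well-behaved crawler would conclude from this file.
ai_bot_may_fetch = parser.can_fetch("ExampleAIBot", "https://example.org/columns/")
other_bot_may_fetch = parser.can_fetch("SomeOtherBot", "https://example.org/columns/")
```

Here `ai_bot_may_fetch` comes out `False` and `other_bot_may_fetch` comes out `True`. The file only governs the site it is served from, which is why it does nothing for comments and posts hosted elsewhere.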

In Australia, Gizmodo reports that the company has asked the Australian government to relax copyright laws to facilitate AI training.

Soon after Google’s announcement the law firm Clarkson filed a class action lawsuit against Google to join its action against OpenAI. The suit accuses Google of “stealing” copyrighted works and personal data.

“Google does not own the Internet,” Clarkson wrote in its press release. Will you tell it, or shall I?

Whatever has been going on until now with data slurping in the interests of bombarding us with microtargeted ads is small stuff compared to the accelerating acquisition for the purpose of feeding AI models. Arguably, AI could be a public good in the long term as it improves, and therefore allowing these companies to access all available data for training is in the public interest. But if that’s true, then the *public* should own the models, not the companies. Why should we consent to the use of our data so they can sell it back to us and keep the proceeds for their shareholders?

It’s all yet another example of why we should pay attention to the harms that are clear and present, not the theoretical harm that someday AI will be general enough to pose an existential threat.

Illustrations: IBM Watson, Jeopardy champion.

Wendy M. Grossman is the 2013 winner of the Enigma Award and contributing editor for the Plutopia News Network podcast. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon.

Solidarity

Whatever you’re starting to binge-watch, slow down. It’s going to be a long wait for fresh content out of Hollywood.

Yesterday, the actors union, SAG-AFTRA, went out on strike alongside the members of the Writers Guild of America, who have been walking picket lines since May 2. Like the writers, actors have seen their livelihoods shrink as US TV shows’ seasons shorten, “reruns” that pay residuals fade into the past, and DVD royalties dry up, while royalties from streaming remain tiny by comparison. At the Hollywood and Levine podcast, the veteran screenwriter Ken Levine gives the background to the WGA’s action. But think of it this way: the writers and cast of The Big Bang Theory may be the last to share fairly in the enormous profits their work continues to generate.

The even bigger threat? AI that makes it possible to capture the actor’s likeness and then reuse it ad infinitum in new work. This, as Malia Mendez writes at the LA Times, is the big fear. In a world where Harrison Ford at 80 is making movies in which he’s aged down to look 40 and James Earl Jones has agreed to clone his voice for reuse after his death, it’s arguably a rational big fear.

We’ve had this date for a long time. In the late 1990s I saw a demonstration of “vactors” – virtual actors that were created by scanning a human actor moving in various ways and building a library of movements that thereafter could be rendered at will. At the time, the state of the art was not much advanced from the liquid metal man in Terminator 2. Rendering film-quality characters was very slow, but that was then and this is now, and how long before rendering moving humans can be done in high-def in real-time at action speed?

The studios are already pushing actors into allowing synthesized reuse. California law grants public figures, including actors, publicity rights that prevent the commercial use of their name and likeness without consent. However, Mendez reports that current contracts already require actors to waive those rights to grant the studios digital simulation or digital creation rights. The effects are worst in reality television, where the line is blurred between the individual as a character on a TV show and the individual in their off-screen life. She quotes lawyer Ryan Schmidt: “We’re at this Napster 2001 moment…”

That moment is even closer for voice actors. Last year, Actors Equity announced a campaign to protect voice actors from their synthesized counterparts. This week, one of those synthesizers is providing commentary – more like captions, really – for video clips like this one at Wimbledon. As I said last year, while synthesized voices will be good enough for many applications such as railway announcements, there are lots of situations that will continue to require real humans. Sports commentary is one; commentators aren’t just there to provide information, they’re *also* there to sell the game. Their human excitement at the proceedings is an important part of that.

So SAG-AFTRA, like the Writers Guild of America, is seeking limitations on how studios may use AI, payment for such uses, and rules on protecting against misuse. In another LA Times story, Anousha Sakoui reports that the studios’ offer included requiring “a performer’s consent for the creation and use of digital replicas or for digital alterations of a performance”. Like publishers “offering” all-rights-in-perpetuity contracts to journalists and authors since the 1990s, the studios are trying to ensure they have all the rights they could possibly want.

“You cannot change the business model as much as it has changed and not expect the contract to change, too,” SAG-AFTRA president Fran Drescher said yesterday in a speech that has been widely circulated.

It was already clear this is going to be a long strike that will damage tens of thousands of industry workers and the economy of California. Earlier this week, Dominic Patten reported at Deadline that the Alliance of Motion Picture and Television Producers (AMPTP) plans to delay resuming talks with the WGA until October. By then, Patten reports producers saying, writers will be losing their homes and be more amenable to accepting the AMPTP’s terms. The AMPTP officially denies this, saying it’s committed to reaching a deal. Nonetheless, there are no ongoing talks. As Ken Levine pointed out in a pair of blogposts written during the 2007 writers strike, management is always in control of timing.

But as Levine also says, in the “old days” a top studio mogul could simply say, “Let’s get this done” and everyone would get around the table and make a deal. The new presence of tech giants Netflix, Amazon, and Apple in the AMPTP membership makes this time different. At some point, the strike will be too expensive for legacy Hollywood studios. But for Apple, TV production is a way to sell services and hardware. For Amazon, it’s a perk that comes with subscribing to its Prime delivery service. Only Netflix needs a constant stream of new work – and it can commission it from creators across the globe. All three of them can wait. And the longer they drag this out, the more the traditional studios will lose money and weaken as competitors.

Legacy Hollywood doesn’t seem to realize it yet, but this strike is existential for them, too.

Illustrations: SAG-AFTRA president Fran Drescher, announcing the strike on Thursday.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon.