Unclear and unpresent dangers

Monthly computer magazines used to fret that their news pages would be out of date by the time the new issue reached readers. This week in AI, a blog posting is out of date before you hit send.

This – Friday – morning, the Italian data protection authority, Il Garante, has ordered ChatGPT to stop processing the data of Italian users until it complies with the General Data Protection Regulation. Il Garante’s objections, per Apple’s translation posted by Ian Brown: ChatGPT provides no legal basis for collecting and processing the massive store of personal data used to train the model, and it fails to filter out users under 13.

This may be the best possible answer to the complaint I’d been writing below.

On Wednesday, the Future of Life Institute published an open letter calling for a six-month pause on developing systems more powerful than OpenAI’s current state of the art, GPT-4. Aside from Elon Musk, Steve Wozniak, and Skype co-founder Jaan Tallinn, most of the signatories are unfamiliar names to most of us, though the companies and institutions they represent aren’t – Pinterest, the MIT Center for Artificial Intelligence, UC Santa Cruz, Ripple, ABN-Amro Bank. Almost immediately, there was a dispute over the validity of the signatures.

My first reaction was on the order of: huh? The signatories are largely people who are inventing this stuff. They don’t have to issue a call. They can just *stop*, work to constrain the negative impacts of the services they provide, and lead by example. Or isn’t that sufficiently performative?

A second reaction: what about all those AI ethics teams that Silicon Valley companies are disbanding? Just in the last few weeks, such teams have been axed or cut at Microsoft and Twitch; Twitter, of course, ditched such fripperies last November in Musk’s inaugural wave of cost-cutting. The letter does not call for reinstating them.

The problem, as familiar critics such as Emily Bender pointed out almost immediately, is that the threats the letter focuses on are distant not-even-thunder. As she went on to say in a Twitter thread, the artificial general intelligence of the Singularitarians’ rapture is nowhere in sight. By focusing on distant threats – longtermism – we ignore the real and present problems whose roots are being embedded ever more deeply into the infrastructure now being built: exploited workers, culturally appropriated data, lack of transparency around the models and algorithms used to build these systems… basically, all the ways they impinge upon human rights.

This isn’t the first time such a letter has been written and circulated. In 2015, Stephen Hawking, Musk, and about 150 others similarly warned of the dangers of the rise of “superintelligences”. Just a year later, in 2016, ProPublica investigated the algorithm behind COMPAS, a risk-scoring criminal justice system in use in US courts in several states. Under Julia Angwin’s scrutiny, the algorithm failed at both accuracy and fairness; it was heavily racially biased. *That*, not some distant fantasy, was the real threat to society.

“Threat” is the key issue here. This is, at heart, a letter about a security issue, and solutions to security issues are – or should be – responses to threat models. What is *this* threat model, and what level of resources to counter it does it justify?

Today, I’m far more worried by the release onto public roads of Teslas running Full Self-Driving, helmed by drivers with an inflated sense of the technology’s reliability, than I am about all human work being wiped away any time soon. This matters because, as Jessie Singer, author of There Are No Accidents, keeps reminding us, what we call “accidents” are the results of policy decisions. If we ignore the problems we are presently building in favor of fretting about a projected fantasy future, that, too, is a policy decision, and the collateral damage is not an accident. Can’t we do both? I imagine people saying. Yes. But only if we *do* both.

In a talk this week for a French international research group, the legal scholar Lilian Edwards discussed the EU’s forthcoming AI Act. This effort began well before today’s generative tools exploded into public consciousness, and isn’t likely to conclude before 2024. It is, therefore, much more focused on the kinds of risks attached to public sector scandals like COMPAS and those documented in Cathy O’Neil’s 2016 book Weapons of Math Destruction, which laid bare the problems of algorithmic scoring with little to tether it to reality.

With or without a moratorium, what will “AI” look like in 2024? It has changed out of recognition just since the last draft text was published. Prediction from this biological supremacist: it still won’t be sentient.

All this said, as Edwards noted, even if the letter’s proposal is self-serving, a moratorium on development is not necessarily a bad idea. It’s just that if the risk is long-term and existential, what will six months do? If the real risk is the hidden continued centralization of data and power, then those six months could be genuinely destructive. So far, it seems like its major function is as a distraction. Resist.

Illustrations: IBM’s Watson, which beat two of Jeopardy‘s greatest champions in 2011. It has since failed to transform health care.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.

Memex 2.0

As language models get cheaper, it’s dawned on me what kind of “AI” I’d like to have: a fully personalized chatbot that has been trained on my 30-plus years of output plus all the material I’ve read, watched, listened to, and taken notes on all these years. A clone of my brain, basically, with more complete and accurate memory, updated alongside my own. Then I could discuss with it: what’s interesting to write about for this week’s net.wars?

I was thinking of what’s happened with voice synthesis. In 2011, it took the Scottish company CereProc months to build a text-to-speech synthesizer from recordings of Roger Ebert’s voice. Today, voice synthesizers are all over the place – not personalized like Ebert’s, but able to read a set text plausibly enough to scare voice actors.

I was also thinking of the Stochastic Parrots paper, whose second anniversary was celebrated last week by authors Emily Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell. An important part of the paper advocates for smaller, better-curated language models: more is not always better. I can’t find a stream for the event, but here’s the reading list collected during the proceedings. There’s lots I’d rather eliminate from my personal assistant. Eliminating unwanted options upfront has long been a widespread Internet failure, from shopping sites (“never show me pet items”) to news sites (“never show me fashion trends”). But that sort of selective display is more difficult and expensive than including everything and offering only inclusion filters.

A computational linguistics expert tells me that we’re an unknown amount of time away from my dream of the wg-bot. Probably, if such a thing becomes possible it will be based on someone’s large language model and fine-tuned with my stuff. Not sure I entirely like this idea; it means the model will be trained on stuff I haven’t chosen or vetted and whose source material is unknown, unless we get a grip on forcing disclosure or the proposed BLOOM academic open source language model takes over the world.
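If fine-tuning on a personal archive ever does become practical, the mechanics would presumably look something like the sketch below: start from someone else’s pretrained causal language model and keep training it on your own files. This is a minimal illustration only, assuming the Hugging Face libraries; the base model name, file paths, and training settings are placeholders, not anything the expert recommended.

```python
# Hypothetical sketch: fine-tune an existing causal language model on a
# personal archive of writing. Model name, paths, and settings are placeholders.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "gpt2"  # stand-in for whichever large model the bot would be built on
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# 30-plus years of columns, notes, and reviews gathered into plain-text files.
corpus = load_dataset("text", data_files={"train": "my_archive/*.txt"})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="wg-bot", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized["train"],
    # mlm=False -> ordinary next-token (causal) language modeling
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # the result still inherits whatever was baked into BASE_MODEL
```

The final comment is the point: whatever personal layer goes on top, the base model’s unvetted training data comes along for the ride.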

I want to say that one advantage to training a chatbot on your own output is you don’t have to worry so much about copyright. However, the reality is that most working writers have sold all rights to most of their work to large publishers, which means that such a system is a new version of digital cholera. In my own case, by the time I’d been in this business for 15 years, more than half of the publications I’d written for were defunct. I was lucky enough to retain at least non-exclusive rights to my most interesting work, but after so many closures and sales I couldn’t begin to guess – or even know how to find out – who owns the rights to the rest of it. The question is moot in any case: unless I choose to put those group reviews of Lotus 1-2-3 books back online, probably no one else will, and if I do no one will care.

On Mastodon, Lilian Edwards raised the specter of the upcoming new! improved! version of the copyright wars launched by the arrival of the Internet: “The real generative AI copyright wars aren’t going to be these tiny skirmishes over artists and Stability AI. Its going to be a war that puts filesharing 2.0 and the link tax rolled into one in the shade.” Edwards is referring to this case, in which artists are demanding billions from the company behind the Stable Diffusion engine.

Edwards went on to cite a Wall Street Journal piece that discusses publishers’ alarmed response to what they perceive as new threats to their business. First: that the large piles of data used to train generative “AI” models are appropriated without compensation. This is the steroid-fueled analogue to the link tax, under which search engines in Australia pay newspapers (primarily the Murdoch press) for including them in news search results. A similar proposal is pending in Canada.

The second is that users, satisfied with the answers they receive from these souped-up search services, will no longer bother to visit the sources – especially since few of them, most notably Google, seem inclined to offer citations to back up any of the things they say.

The third is outright plagiarism without credit in the chatbots’ output, which is already happening.

The fourth point of contention is whether the results of generative AI should be themselves subject to copyright. So far, the consensus appears to be no, when it comes to artwork. But some publishers who have begun using generative chatbots to create “content” no doubt claim copyright in the results. It might make more sense to copyright the *prompt*. (And some bright corporate non-soul may yet try.)

At Walled Culture, Glyn Moody discovers that the EU has unexpectedly done something right by requiring positive opt-in to copyright protection against text and data mining. I’d like to see this as a ray of hope for avoiding the worst copyright conflicts, but given the transatlantic rhetoric around privacy laws and data flows, it seems much more likely to incite another trade conflict.

It now dawns on me that the system I outlined in the first paragraph is in fact Vannevar Bush’s Memex. Not the web, which was never sufficiently curated, but this, primed full of personal intellectual history. The “AI” represents those thousands of curating secretaries he thought the future would hold. As if.

Illustrations: Stable Diffusion rendering of “stochastic parrots”, as prompted by Jon Crowcroft.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.

Performing intelligence

“Oh, great,” I thought when news broke of the release of GPT-4. “Higher-quality deception.”

Most of the Internet disagreed; having gone mad only a few weeks ago over ChatGPT, everyone’s now agog over this latest model. It passed all these tests!

One exception was the journalist Paris Marx, who commented on Twitter: “It’s so funny to me that the AI people think it’s impressive when their programs pass a test after being trained on all the answers.”

Agreed. It’s also so funny to me that they call this “AI” and don’t like it when researchers like the computational linguist Emily Bender call it a “stochastic parrot”. On Marx’s Tech Won’t Save Us podcast, Goldsmiths professor Dan McQuillan, author of Resisting AI: An Anti-fascist Approach to Artificial Intelligence, calls it a “bullshit engine” whose developers’ sole goal is plausibility – plausibility that, as Bender has said, allows us imaginative humans to think we detect a mind behind it, and the result is to risk devaluing humans.

Let’s walk back to an earlier type of system that has been widely deployed: benefits scoring systems. A couple of weeks ago, Lighthouse Reports and Wired magazine teamed up on an investigation of these systems, calling them “suspicion machines”.

Their work focuses on the welfare benefits system used in Rotterdam between 2017 and 2021, which scored benefits recipients on 315 variables according to the likelihood that their claims were fraudulent. In detailed, worked case analyses, they find systemic discrimination: points were added to your risk score for being female, for being female with children (men weren’t asked about children), for being non-white, and for ethnicity (knowing Dutch is a requirement for welfare recipients). Other variables included missed meetings, age, and “lacks organizing skills”, which was just one of 54 variables based on caseworkers’ subjective assessments. Any comment a caseworker added translated into a 1 added to the risk score, even if the comment was positive. The top-scoring 10% were flagged for further investigation.
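To make the mechanics concrete, here is a deliberately simplified sketch of an additive risk score of the kind described above. It is not the city’s actual model; the variable names and weights are invented (the real system used 315 variables), but it captures the two reported details: every caseworker comment adds a point, and the top-scoring tenth is flagged.

```python
# Illustrative only: a toy additive risk scorer with invented weights.
import numpy as np

WEIGHTS = {                           # positive weight = more "suspicious"
    "is_female": 0.8,
    "has_children": 0.6,
    "missed_meetings": 1.2,
    "lacks_organizing_skills": 0.9,   # one of many subjective caseworker judgments
}

def risk_score(person: dict) -> float:
    """Weighted sum of variables, plus 1 for every caseworker comment of any kind."""
    score = sum(w * float(person.get(k, 0)) for k, w in WEIGHTS.items())
    score += len(person.get("caseworker_comments", []))  # even positive comments add risk
    return score

def flag_top_decile(people: list[dict]) -> list[dict]:
    """Flag the highest-scoring 10% of recipients for fraud investigation."""
    scores = np.array([risk_score(p) for p in people])
    cutoff = np.quantile(scores, 0.9)
    return [p for p, s in zip(people, scores) if s >= cutoff]
```

Even in this toy form the design choice is visible: the score can only ever accumulate suspicion.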

This is the system that Accenture, the city’s technology partner on the early versions, said at its unveiling in 2018 was an “ethical solution” and promised “unbiased citizen outcomes”. Instead, Wired says, the algorithm “fails the city’s own test of fairness”.

The project’s point wasn’t to pick on Rotterdam; of the dozens of cities they contacted, it just happened to be the only one that was willing to share the code behind the algorithm, along with the list of variables, prior evaluations, and the data scientists’ handbook. It even – after being threatened with court action under freedom of information laws – shared the mathematical model itself.

The overall conclusion: the system was so inaccurate it was little better than random sampling “according to some metrics”.

What strikes me, aside from the details of this design, is the initial choice of scoring benefits recipients for risk of fraud. Why not score them for risk of missing out on help they’re entitled to? The UK government’s figures on benefits fraud indicate that in 2021-2022 overpayment (including error as well as fraud) amounted to 4% of total expenditure, and *underpayment* to 1.2%. Underpayment is a lot less, but it’s still substantial (£2.6 billion). Yes, I know, the point of the scoring system is to save money, but the point of the *benefits* system is to help people who need it. The suspicion was always there, but the technology has altered the balance.
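For scale, a rough back-of-the-envelope conversion of those same figures into absolute amounts (treating the £2.6 billion underpayment as the 1.2% share):

$$\text{total expenditure} \approx \frac{£2.6\,\text{bn}}{0.012} \approx £217\,\text{bn}, \qquad \text{overpayment} \approx 0.04 \times £217\,\text{bn} \approx £8.7\,\text{bn}.$$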

This was the point the writer Ellen Ullman noted in her 1996 book Close to the Machine: the hard-edged nature of these systems and their ability to surveil people in new ways “infect” their owners with suspicion, even of people they’ve long trusted, and even when the system itself was intended to be helpful. On a societal scale, these “suspicion machines” embed increased division in our infrastructure; in his book, McQuillan warns us to watch for “functionality that contributes to violent separations of ‘us and them’.”

Along those lines, it’s disturbing that OpenAI, the owner of ChatGPT and GPT-4 (and several other generative AI gewgaws), has now decided to keep secret the details of its large language models. That is, we have no sight into what data was used in training, what software and hardware methods were used, or how energy-intensive they are. If there’s a machine loose in the world’s computer systems pretending to be human, shouldn’t we understand how it works? It would help damp down our imagining that we see a mind in there.

The company’s argument appears to be that because these models could become harmful it’s bad to publish how they work because then bad actors will use them to create harm. In the cybersecurity field we call this “security by obscurity” and there is a general consensus that it does not work as a protection.

In a lengthy article at New York magazine, Elizabeth Weil quotes Daniel Dennett’s assessment of these machines: “counterfeit people” that should be seen as the same sort of danger to our system as counterfeit money. Bender suggests that rather than trying to make fake people we should be focusing more on making tools to help people.

What makes me tie all this to the large language models behind GPT is that in both cases it’s all about mining our shared cultural history, with all its flaws and misjudgments, in response to a prompt, and pretending the results have meaning and create new knowledge. And *that’s* what’s being embedded into the world’s infrastructure. Have we learned nothing from Clever Hans?

Illustrations: Clever Hans, performing in Leipzig in 1912 (photograph by Karl Krall, via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.

Re-centralizing

But first, a housekeeping update. Net.wars has moved – to a new address and new blogging software. For details, see here. If you read net.wars via RSS, adjust your feed to https://netwars.pelicancrossing.net. Past posts’ old URLs will continue to work, as will the archive index page, which lists every net.wars column back to November 2001. And because of the move: comments are now open for the first time in probably about ten years. I will also shortly set up a mailing list for those who would rather get net.wars by email.

***

This week the Ada Lovelace Institute held a panel discussion of ethics for researchers in AI. Arguably, not a moment too soon.

At Noema magazine, Timnit Gebru writes, as Mary L. Gray and Siddharth Suri have written before her, that what today passes for “AI” and “machine learning” is, underneath, the work of millions of poorly paid, marginalized workers who add labels, evaluate content, and provide verification. At Wired, Gebru adds that their efforts are ultimately directed by a handful of Silicon Valley billionaires whose interests are far from what’s good for the rest of us. That would be the “rest of us” who are being used, willingly or not, knowingly or not, as experimental research subjects.

Two weeks ago, for example, a company called Koko ran an experiment offering chatbot-written, human-overseen mental health counseling without informing the 4,000 people who sought help via the “Koko Cares” Discord server. In a Twitter thread, company co-founder Rob Morris said those users rated the bot’s responses highly until they found out a bot had written them.

People can build relationships with anything, including chatbots, as was proved in 1966 with the release of the experimental chatbot therapist Eliza. People found Eliza’s responses comforting even though they knew it was a bot. Here, however, informed consent processes seem to have been ignored. Morris’s response, when widely criticized for the unethical nature of this little experiment, was to say it was exempt from informed consent requirements because helpers could choose whether to use the chatbot’s responses and Koko had no plan to publish the results.

One would like it to be obvious that *publication* is not the biggest threat to vulnerable people in search of help. One would also like modern technology CEOs to have learned the right lesson from prior incidents such as Facebook’s 2012 experiment that manipulated users’ newsfeeds to study their moods. Facebook COO Sheryl Sandberg apologized for *how the experiment was communicated*, but not for doing it. At the time, logic suggested that such companies would simply continue doing the research without publishing the results. Though isn’t tweeting publication?

It seems clear that scale is part of the problem here, as in the old saying: one death is a tragedy; a million deaths are a statistic. Even the most sociopathic chatbot owner is unlikely to enlist an experimental chatbot to respond to a friend or family member in distress. But once a screen intervenes, the thousands of humans on the other side are just a pile of user IDs; that’s part of how we get so much online abuse. To those with unlimited control over the system, we must all look like ants. And who wouldn’t experiment on ants?

In that sense, the efforts of the Ada Lovelace panel to sketch out the diligence researchers should follow are welcome. But the reality of human nature is that it will always be possible to find someone unscrupulous to do unethical research – and the reality of business nature is not to care much about research ethics if the resulting technology will generate profits. Listening to all those earnest, worried researchers left me writing this comment: MBAs need ethics. MBAs, government officials, and anyone else who is in charge of how new technologies are used and whose decisions affect the lives of the people those technologies are imposed upon.

This seemed even more true a day later, at the annual activists’ gathering Privacy Camp. In a panel on the proliferation of surveillance technology at the borders, speakers noted that every new technology that could be turned to helping migrants is instead being weaponized against them. The Border Violence Monitoring Network has collected thousands of such testimonies.

The especially relevant bit came when Hope Barker, a senior policy analyst with BVMN, noted this problem with the forthcoming AI Act: accountability is aimed at developers and researchers, not users.

Granted, technology that’s aborted in the lab isn’t available for abuse. But no technology stays the same after leaving the lab; it gets adapted, altered, updated, merged with other technologies, and turned to uses the researchers never imagined – as Wendy Hall noted in moderating the Ada Lovelace panel. And if we have learned anything from the last 20 years it is that over time technology services enshittify, to borrow Cory Doctorow’s term in a rant which covers the degradation of the services offered by Amazon, Facebook, and soon, he predicts, TikTok.

The systems we call “AI” today have this in common with those services: they are centralized. They are technologies that re-advantage large organizations and governments because they require amounts of data and computing power that are beyond the capabilities of small organizations and individuals to acquire. We can only rent them or be forced to use them. The ur-evil AI, HAL in Stanley Kubrick’s 2001: A Space Odyssey, taught us to fear an autonomous rogue. But the biggest danger with “AIs” of the type we are seeing today, which are being put into decision-making and law enforcement, is not the technology, nor the people who invented it, but the expanding desires of their controllers.

Illustrations: HAL, in 2001.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns back to November 2001. Comment here, or follow on Mastodon or Twitter.