Memex 2.0

As language models get cheaper, it’s dawned on me what kind of “AI” I’d like to have: a fully personalized chat bot that has been trained on my 30-plus years of output plus all the material I’ve read, watched, listened to, and taken notes on all these years. A clone of my brain, basically, with more complete and accurate memory updated alongside my own. Then I could discuss with it: what’s interesting to write about for this week’s net.wars?

I was thinking of what’s happened with voice synthesis. In 2011, it took the Scottish company Cereproc months to build a text-to-speech synthesizer from recordings of Roger Ebert’s voice. Today, voice synthesizers are all over the place – not personalized like Ebert’s, but able to read a set text plausibly enough to scare voice actors.

I was also thinking of the Stochastic Parrots paper, whose first anniversary was celebrated last week by authors Emily Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell. An important part of the paper advocates for smaller, better-curated language models: more is not always better. I can’t find a stream for the event, but here’s the reading list collected during the proceedings. There’s lots I’d rather eliminate from my personal assistant. Eliminating unwanted options upfront has long been a widespread Internet failure, from shopping sites (“never show me pet items”) to news sites (“never show me fashion trends”). But that sort of selective display is more difficult and expensive than including everything and offering only inclusion filters.

A computational linguistics expert tells me that we’re an unknown amount of time away from my dream of the wg-bot. Probably, if such a thing becomes possible it will be based on someone’s large language model and fine-tuned with my stuff. Not sure I entirely like this idea; it means the model will be trained on stuff I haven’t chosen or vetted and whose source material is unknown, unless we get a grip on forcing disclosure or the proposed BLOOM academic open source language model takes over the world.

I want to say that one advantage to training a chatbot on your own output is you don’t have to worry so much about copyright. However, the reality is that most working writers have sold all rights to most of their work to large publishers, which means that such a system is a new version of digital cholera. In my own case, by the time I’d been in this business for 15 years, more than half of the publications I’d written for were defunct. I was lucky enough to retain at least non-exclusive rights to my most interesting work, but after so many closures and sales I couldn’t begin to guess – or even know how to find out – who owns the rights to the rest of it. The question is moot in any case: unless I choose to put those group reviews of Lotus 1-2-3 books back online, probably no one else will, and if I do no one will care.

On Mastodon, Edwards raised the specter of the upcoming new! improved! version of the copyright wars launched by the arrival of the Internet: “The real generative AI copyright wars aren’t going to be these tiny skirmishes over artists and Stability AI. Its going to be a war that puts filesharing 2.0 and the link tax rolled into one in the shade.” Edwards is referring to this case, in which artists are demanding billions from the company behind the Stable Diffusion engine.

Edwards went on to cite a Wall Street Journal piece that discusses publishers’ alarmed response to what they perceive as new threats to their business. First: that the large piles of data used to train generative “AI” models are appropriated without compensation. This is the steroid-fueled analogue to the link tax, under which search engines in Australia pay newspapers (primarily the Murdoch press) for including them in news search results. A similar proposal is pending in Canada.

The second is that users, satisfied with the answers they receive from these souped-up search services, will no longer bother to visit the sources – especially since few, most notably Google, seem inclined to offer citations to back up any of the things they say.

The third is outright plagiarism without credit by the chatbot’s output, which is already happening.

The fourth point of contention is whether the results of generative AI should be themselves subject to copyright. So far, the consensus appears to be no, when it comes to artwork. But some publishers who have begun using generative chatbots to create “content” no doubt claim copyright in the results. It might make more sense to copyright the *prompt*. (And some bright corporate non-soul may yet try.)

At Walled Culture, Glyn Moody discovers that the EU has unexpectedly done something right by requiring positive opt-in to copyright protection against text and data mining. I’d like to see this as a ray of hope for avoiding the worst copyright conflicts, but given the transatlantic rhetoric around privacy laws and data flows, it seems much more likely to incite another trade conflict.

It now dawns on me that the system I outlined in the first paragraph is in fact Vannevar Bush’s Memex. Not the web, which was never sufficiently curated, but this, primed full of personal intellectual history. The “AI” represents those thousands of curating secretaries he thought the future would hold. As if.

Illustrations: Stable Diffusion rendering of “stochastic parrots”, as prompted by Jon Crowcroft.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.

Performing intelligence

“Oh, great,” I thought when news broke of the release of GPT-4. “Higher-quality deception.”

Most of the Internet disagreed; having gone mad only a few weeks ago over ChatGPT, everyone’s now agog over this latest model. It passed all these tests!

One exception was the journalist Paris Marx, who commented on Twitter: “It’s so funny to me that the AI people think it’s impressive when their programs pass a test after being trained on all the answers.”

Agreed. It’s also so funny to me that they call that “AI” and don’t like it when researchers like computational linguist Emily Bender call it a “stochastic parrot”. At Marx’s Tech Won’t Save Us podcast, Goldsmiths professor Dan McQuillan, author of Resisting AI: An Anti-fascist Approach to Artificial Intelligence, calls it a “bullshit engine” whose developers’ sole goal is plausibility – plausibility that, as Bender has said, allows us imaginative humans to think we detect a mind behind it, and the result is to risk devaluing humans.

Let’s walk back to an earlier type of system that has been widely deployed: benefits scoring systems. A couple of weeks ago, Lighthouse Reports and Wired magazine teamed up on an investigation of these systems, calling them “suspicion machines”.

Their work focuses on the welfare benefits system in use in Rotterdam between 2017 and 2021, which used 315 variables to risk-score benefits recipients according to the likelihood that their claims are fraudulent. In detailed, worked case analyses, they find systemic discrimination: you lose points for being female, for being female and having children (males aren’t asked about children), for being non-white, and for ethnicity (knowing Dutch is a requirement for welfare recipients). Other variables include missing meetings, age, and “lacks organizing skills”, which was just one of 54 variables based on case workers’ subjective assessments. Any comment a caseworker adds translates to a 1 added to the risk score, even if it’s positive. The top-scoring 10% are flagged for further investigation.
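To make those mechanics concrete, here is a toy sketch in Python of the two steps described above: a caseworker comment adds 1 to the score whatever it says, and the top-scoring 10% are flagged. The function names, the sample scores, and the simple additive scoring are all hypothetical, made up for illustration; this is not Rotterdam’s actual model, which I have not seen.

```python
def risk_score(base_score, caseworker_comments):
    """Toy score: a hypothetical base score, plus 1 if any comment exists."""
    # Assumption: only the presence of a comment counts, not its content,
    # so a positive remark raises the score just as a negative one would.
    return base_score + (1 if caseworker_comments else 0)


def flag_top_decile(scores):
    """Return the ids of the top-scoring 10% (rounded up, at least one)."""
    n_flagged = max(1, (len(scores) + 9) // 10)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_flagged])


# Ten invented recipients. "A" has the highest base score, but "B" has a
# favorable caseworker comment on file.
recipients = {
    "A": risk_score(3.6, []),
    "B": risk_score(3.0, ["Very cooperative, always on time"]),
    "C": risk_score(2.9, []),
    "D": risk_score(2.5, []),
    "E": risk_score(2.1, []),
    "F": risk_score(1.8, []),
    "G": risk_score(1.5, []),
    "H": risk_score(1.2, []),
    "I": risk_score(0.9, []),
    "J": risk_score(0.4, []),
}

flagged = flag_top_decile(recipients)
print(flagged)
```

In this invented data the person flagged is “B”, whose only distinguishing feature is a compliment from a caseworker.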

This is the system that Accenture, the city’s technology partner on the early versions, said at its unveiling in 2018 was an “ethical solution” and promised “unbiased citizen outcomes”. Instead, Wired says, the algorithm “fails the city’s own test of fairness”.

The project’s point wasn’t to pick on Rotterdam; of the dozens of cities they contacted it just happened to be the only one that was willing to share the code behind the algorithm, along with the list of variables, prior evaluations, and the data scientists’ handbook. It even – after being threatened with court action under freedom of information laws – shared the mathematical model itself.

The overall conclusion: the system was so inaccurate it was little better than random sampling “according to some metrics”.

What strikes me, aside from the details of this design, is the initial choice of scoring benefits recipients for risk of fraud. Why not score them for risk of missing out on help they’re entitled to? The UK government’s figures on benefits fraud indicate that in 2021-2022 overpayment (including error as well as fraud) amounted to 4%; and *underpayment* 1.2% of total expenditure. Underpayment is a lot less, but it’s still substantial (£2.6 billion). Yes, I know, the point of the scoring system is to save money, but the point of the *benefits* system is to help people who need it. The suspicion was always there, but the technology has altered the balance.

This was the point the writer Ellen Ullman noted in her 1997 book Close to the Machine: the hard-edged nature of these systems and their ability to surveil people in new ways “infect” their owners with suspicion even of people they’ve long trusted and even when the system itself was intended to be helpful. On a societal scale, these “suspicion machines” embed increased division in our infrastructure; in his book, McQuillan warns us to watch for “functionality that contributes to violent separations of ‘us and them’.”

Along those lines, it’s disturbing that OpenAI, the owner of ChatGPT and GPT-4 (and several other generative AI gewgaws), has now decided to keep secret the details of its large language models. That is, we have no sight into what data was used in training, what software and hardware methods were used, or how energy-intensive it is. If there’s a machine loose in the world’s computer systems pretending to be human, shouldn’t we understand how it works? It would help damp down the temptation to imagine we see a mind in there.

The company’s argument appears to be that because these models could become harmful it’s bad to publish how they work because then bad actors will use them to create harm. In the cybersecurity field we call this “security by obscurity” and there is a general consensus that it does not work as a protection.

In a lengthy article at New York magazine, Elizabeth Weil quotes Daniel Dennett’s assessment of these machines: “counterfeit people” that should be seen as the same sort of danger to our system as counterfeit money. Bender suggests that rather than trying to make fake people we should be focusing more on making tools to help people.

What makes me tie the suspicion machines to the large language models behind GPT is that in both cases it’s all about mining our shared cultural history, with all its flaws and misjudgments, in response to a prompt and pretending the results have meaning and create new knowledge. And *that’s* what’s being embedded into the world’s infrastructure. Have we learned nothing from Clever Hans?

Illustrations: Clever Hans, performing in Leipzig in 1912 (by Karl Krall, via Wikimedia).

Esquivalience

The science fiction author Charles Stross had a moment of excitement on Mastodon this week: WRITER CHALLENGE!

Stross challenged writers to use the word “esquivalience” in their work. The basic idea: turn this Pinocchio word into a “real” word.

Esquivalience is the linguistic equivalent of a man-made lake. The creator, editor Christine Lindberg, invented it for the 2001 edition of the New Oxford American Dictionary and defined it as “the willful avoidance of one’s official responsibilities; the shirking of duties”. It was a trap to catch anyone republishing the dictionary rather than developing their own (a job I have actually done). This is a common tactic for protecting large compilations where it’s hard to prove copying – fake streets are added to maps, for example, and the people who rent out mailing lists add ringers whose use will alert them if the list is used outside the bounds of the contractual agreement.

There is, however, something peculiarly distasteful about fake entries in supposedly authoritative dictionaries, even though I agree with Lindberg that “esquivalience” is a pretty useful addition to the language. It’s perfect – perhaps in the obvious adjectival form “esquivalient” – for numerous contemporary politicians, though here be dragons: “willful” risks libel actions.

Probably most writers have wanted to make up words, and many have, from playwright and drama critic George S. Kaufman, often credited for coining, among other things, “underwhelmed”, to Anthony Burgess, who invented an entire futurist street language for A Clockwork Orange. Some have gone so far as to create enough words to publish dictionaries – such as the humorist Gelett Burgess, whose Burgess Unabridged (free ebook!) compiles “words you’ve always needed”. From that collection, I have always been particularly fond of Burgess’s “wox”, defined as “a state of placid enjoyment; sluggish satisfaction”. It seems particularly apt in the hours immediately following Thanksgiving dinner.

In these cases, though, the context lets you know the language is made up. The dictionary is supposed to be authoritative, admitting words only after they are well-established. The presence of fake words feels damaging in a way that a fake place on a map doesn’t. It’s comparatively easy to check whether a place exists by going there, but at some point down the echoing corridors of time *every* word was used for the first time. Pinpointing exactly when is hard unless someone ‘fesses up. I don’t like the idea that my dictionary is lying to me. Better if NOAD had planted two fake words and had them recursively point at each other for their definitions.

I had been avoiding the ChatGPT hoopla, but it seemed plausible to ask it: “Is ‘esquivalience’ a real word?” Its response started well enough: “‘Esquivalience’ is not recognized as a standard word in the English language. It is a made-up word…” And then cuckoo land arrived: “…that was created by a writer named Adam Jacot de Boinod for his book ‘The Meaning of Tingo’.” Pause to research. The book in question was written in 2006. The word “esquivalience” does not, from a quick text search, appear in it. Huh? I went on to suggest Christine Lindberg’s name to ChatGPT, and after a digression attributing the word to the singer-songwriter Christine Lavin, it appeared to find references to Lindberg’s “claim” in its corpus of data. But, it continued to warn, in every response, “it is still not recognized as a standard word in the English language”. It’s a bot. It’s not being stern. It doesn’t know what it’s saying. Getting it to agree on Christine Lindberg as the original source isn’t winning the argument. It’s just giving it a different prompt.

I ask if it has ever encountered the word “wox”. “As an AI language model, I have certainly come across the word ‘wox’.” A human reads lightly insulted pride into that. Resist. It’s a bot. It has no pride. The bot went on to speculate on possible origins (“it may be a neologism…”). I ask if it’s heard of Gelett Burgess. Oh, yes, followed by a short biography. Then, when told Burgess invented “wox”: “Gelett Burgess did indeed invent the word…” and goes on to cite the correct book…but then continues that Burgess defined it as “to make fun of, to poke fun at” which is absolutely not what Burgess says, and I know this because I have the original 1914 book right here, and the definition I cited above is right there on p112. The bot does “apologize” every time you point out a mistake, though.

This isn’t much of a sample, but based on it, I find ChatGPT quite alarming as an extraordinarily efficient way of undermining factual knowledge. The responses sound authoritative, but every point must be fact-checked. It could not be worse-suited for today’s world, where everyone wants fast answers. Coupled with search, it turns the algorithms that give us answers into even more obscure and less trustworthy black boxes. Wikipedia has many flaws, but its single biggest strength is its sourcing and curation; how every page has been changed and shaped over the years is open for inspection.

So when ChatGPT went on to say that Gelett Burgess is widely credited with coining the term “blurb”, Wikipedia is where I turned. Wikipedia agrees (asked, ChatGPT cites the Oxford English Dictionary). Burgess FTW.

Illustrations: Gelett Burgess’s 1914 Burgess Unabridged, a dictionary of made-up words.

Inappt

Recently, it took a flatwoven wool rug more than two weeks to travel from Luton, Bedfordshire to southwest London. The rug’s source – an Etsy seller – and I sent back and forth dozens of messages. It would be there tomorrow. Oh, no, the courier now says Wednesday. Um, Friday. Er, next week. I can send you a different rug, if you want to choose one. No.

In the end, the rug arrived into my life. I don’t dare decide it’s the wrong color.

I would dismiss this as a one-off aberration, except that a few weeks ago the intended recipient of a parcel sent at the beginning of November casually mentioned they had never received it. Upon chasing, the courier company replied: “Despite an extensive investigation, we have not been able to locate your parcel.”

I would dismiss those as a two-off aberration except that late last year the post office tracking on yet another item went on showing it stuck in some unidentifiable depot somewhere for two weeks. Eventually, I applied brain and logic and went down to the nearest delivery office and there it was, waiting for me to pay the customs fee specified on the card I never received. It was only a few days away from being sent back.

And I would dismiss those as a three-off aberration except that two weeks ago I was notified to expect a package from a company whose name I didn’t recognize between 7pm and 9pm. I therefore felt perfectly safe to go into the room furthest from the front door, the kitchen, and wash some dishes at 5:30. Nope. They delivered at 5:48, I didn’t hear them, and I had a hard time figuring out whom to contact to persuade them to redeliver.

The point about all this is not to yell at random couriers to get off my lawn but to note that at least this part of the app-based economy has stopped delivering the results it promised. Less than ten years since these companies set out to disrupt delivery services by providing lower prices, accurate information, on-time deliveries, and constant tracking, we’re back to waiting at home for unspecified numbers of hours wondering if they’re going to show and struggling to trace lost packages. Only this time, there’s no customer service, working conditions and pay are much worse for drivers and delivery folk, and the closure of many local outlets has left us all far more dependent on them.

***

Also falling over this week, as widely reported (because: journalists), was Twitter, which for a time on Wednesday barred posting new tweets (unless they were posted via the kind of scheduling software that the site is limiting). Many of us have been expecting outages ever since November, when Charlie Warzel at The Atlantic and Chris Stokel-Walker at MIT Technology Review interviewed Twitter engineers past and present. All of them warned that the many staff cuts and shrinking budgets have left the service undersupplied with people who can keep the site running and that outages of increasing impact should be expected.

Nonetheless, the “Apocalypse, Now!” reporting that ensued was about as sensible as the reporting earlier in the week that the Fediverse was failing to keep the Tweeters who flooded there beginning in November. In response, Mike Masnick noted at TechDirt (https://www.techdirt.com/2023/02/08/lazy-reporters-claiming-fediverse-is-slumping-despite-massive-increase-in-usage/) how silly this was. Because: 1) There’s a lot more to the Fediverse than just Mastodon, which is all these reporters looked at; 2) even then, Mastodon had lost a little from its peak but was still vastly more active than before November; 3) it’s hard for people to change their habits, and they will revert to what’s familiar if they don’t see a reason why they can’t; and 4) it’s still early days. So, meh.

However, Zeynep Tufekci reminds us that Twitter’s outage is entertainment only for the privileged; for those trying to coordinate rescue and aid efforts for Turkey, Twitter is an essential tool.

***

While we’re sniping at the failings of current journalism, it appears that yet another technology has been overhyped: DoNotPay, “the world’s first robot lawyer”, the bot written by a British university student that has supposedly been helping folks successfully contest traffic tickets. Masnick (again) and Kathryn Tewson have been covering the story for TechDirt. Tewson, a paralegal, has taken advantage of the fact that cities publish their parking ticket data in order to study DoNotPay’s claims in detail.

TechDirt almost ran a skeptical article about the service in 2017. Suffice to say that now Masnick concludes, “I wish that DoNotPay actually could do much of what it claims to do. It sounds like it could be a really useful service…”

***

The pile-up of this sort of thing – apps that disrupt and then degrade service, technology that’s overhyped (see also self-driving cars), flat-out fraud (see cryptocurrencies), breathless media reporting of nothing much – is probably why I have been unable to raise any excitement over the wow-du-jour, ChatGPT. It seems obvious that of course it can’t read, and can’t understand anything it’s typing, and that sober assessment of what it might be good for is some way off. In the New Yorker, Ted Chiang puts it in its place: think of it as a blurred JPEG. Sounds about right.

Illustrations: Drunk parrot (taken by Simon Bisson).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard – or follow on Mastodon or Twitter.

Re-centralizing

But first, a housekeeping update. Net.wars has moved – to a new address and new blogging software. For details, see here. If you read net.wars via RSS, adjust your feed to https://netwars.pelicancrossing.net. Past posts’ old URLs will continue to work, as will the archive index page, which lists every net.wars column back to November 2001. And because of the move: comments are now open for the first time in probably about ten years. I will also shortly set up a mailing list for those who would rather get net.wars by email.

***

This week the Ada Lovelace Institute held a panel discussion of ethics for researchers in AI. Arguably, not a moment too soon.

At Noema magazine, Timnit Gebru writes, as Mary L Gray and Siddharth Suri have previously, that what today passes for “AI” and “machine learning” is, underneath, the work of millions of poorly-paid marginalized workers who add labels, evaluate content, and provide verification. At Wired, Gebru adds that their efforts are ultimately directed by a handful of Silicon Valley billionaires whose interests are far from what’s good for the rest of us. That would be the “rest of us” who are being used, willingly or not, knowingly or not, as experimental research subjects.

Two weeks ago, for example, a company called Koko ran an experiment offering chatbot-written/human-overseen mental health counseling without informing the 4,000 people who sought help via the “Koko Cares” Discord server. In a Twitter thread, company co-founder Rob Morris said those users rated the bot’s responses highly until they found out a bot had written them.

People can build relationships with anything, including chatbots, as was proved in 1966 with the release of the experimental chatbot therapist Eliza. People found Eliza’s responses comforting even though they knew it was a bot. Here, however, informed consent processes seem to have been ignored. Morris’s response, when widely criticized for the unethical nature of this little experiment, was to say it was exempt from informed consent requirements because helpers could choose whether to use the chatbot’s responses and Koko had no plan to publish the results.

One would like it to be obvious that *publication* is not the biggest threat to vulnerable people in search of help. One would also like modern technology CEOs to have learned the right lesson from prior incidents such as Facebook’s 2012 experiment to study users’ moods when it manipulated their newsfeeds. Facebook COO Sheryl Sandberg apologized for *how the experiment was communicated*, but not for doing it. At the time, we thought that logic suggested that such companies would continue to do the research but without publishing the results. Though isn’t tweeting publication?

It seems clear that scale is part of the problem here, like the old saying, one death is a tragedy; a million deaths are a statistic. Even the most sociopathic chatbot owner is unlikely to enlist an experimental chatbot to respond to a friend or family member in distress. But once a screen intervenes, the thousands of humans on the other side are just a pile of user IDs; that’s part of how we get so much online abuse. For those with unlimited control over the system we must all look like ants. And who wouldn’t experiment on ants?

In that sense, the efforts of the Ada Lovelace panel to sketch out the diligence researchers should follow are welcome. But the reality of human nature is that it will always be possible to find someone unscrupulous to do unethical research – and the reality of business nature is not to care much about research ethics if the resulting technology will generate profits. Listening to all those earnest, worried researchers left me writing this comment: MBAs need ethics. MBAs, government officials, and anyone else who is in charge of how new technologies are used and whose decisions affect the lives of the people those technologies are imposed upon.

This seemed even more true a day later, at the annual activists’ gathering Privacy Camp. In a panel on the proliferation of surveillance technology at the borders, speakers noted that every new technology that could be turned to helping migrants is instead being weaponized against them. The Border Violence Monitoring Network has collected thousands of such testimonies.

The especially relevant bit came when Hope Barker, a senior policy analyst with BVMN, noted this problem with the forthcoming AI Act: accountability is aimed at developers and researchers, not users.

Granted, technology that’s aborted in the lab isn’t available for abuse. But no technology stays the same after leaving the lab; it gets adapted, altered, updated, merged with other technologies, and turned to uses the researchers never imagined – as Wendy Hall noted in moderating the Ada Lovelace panel. And if we have learned anything from the last 20 years it is that over time technology services enshittify, to borrow Cory Doctorow’s term in a rant which covers the degradation of the services offered by Amazon, Facebook, and soon, he predicts, TikTok.

The systems we call “AI” today have this in common with those services: they are centralized. They are technologies that re-advantage large organizations and governments because they require amounts of data and computing power that are beyond the capabilities of small organizations and individuals to acquire. We can only rent them or be forced to use them. The ur-evil AI, HAL in Stanley Kubrick’s 2001: A Space Odyssey, taught us to fear an autonomous rogue. But the biggest danger with “AIs” of the type we are seeing today, which are being put into decision making and law enforcement, is not the technology, nor the people who invented it, but the expanding desires of its controller.

Illustrations: HAL, in 2001.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns back to November 2001. Comment here, or follow on Mastodon or Twitter.