Beware the duck

Once upon a time, “convergence” was a buzzword. That was back in the days when audio was on stereo systems, television was on a TV, and “communications” happened on phones that weren’t computers. The word has disappeared back into its former usage pattern, but it could easily be revived to describe what’s happening to content as humans dive into using generative tools.

Put another way: roughly this time last year, the annual technology/law/pop culture conference Gikii was awash in (generative) AI. That bubble is deflating, but in the experiments that nonetheless continue, a new topic more worthy of attention is emerging: artificial content. It’s striking because what happens at this gathering, which mines all types of popular culture for cues for serious ideas, is often a good guide to what’s coming next in futurelaw.

That no one dared guess which of Zachary Cooper’s pair of near-identical audio clips was AI-generated, and which human-performed, was only a starting point. One had more static? Cooper’s main point: “If you can’t tell which clip is real, then you can’t decide which one gets copyright.” Right, because only human creations are eligible (although fake bands can still scam Spotify).

Cooper’s brief, wild tour of the “generative music underground” included using AI tools to create songs whose content is at odds with their genre, whole generated albums built by a human producer making thousands of tiny choices, and the new genre “gencore”, which exploits the qualities of generated sound (Cher and Autotune on steroids). Voice cloning, instrument cloning, audio production plugins, “throw in a bass and some drums”….

Ultimately, Cooper said, “The use of generative AI reveals nothing about the creative relationship to work; it destabilizes the international market by having different authorship thresholds; and there’s no means of auditing any of it.” Instead of uselessly trying to enforce different rights predicated on the use or non-use of a specific set of technologies, he said, we should tackle directly the challenges new modes of production pose to copyright. Precursor: the battles over sampling.

Soon afterwards, Michael Veale was showing us Civitai, an Idaho-based site offering open source generative AI tools, including fine-tuned models. “Civitai exists to democratize AI media creation,” the site explains. “Everything has a valid legal purpose,” Veale said, but the way capabilities can be retrained and chained together to create composites makes it hard to tell which tools, if any, should be taken down, even for creators (see also the puzzlement as Redditors try to work this out). Even environmental regulation can’t help, as one attendee suggested: unlike large language models, these smaller, fine-tuned models (as Jon Crowcroft and I surmised last year would be the future) are efficient; they can run on a phone.

Even without adding artificial content, there is always an inherent conflict when digital meets an analog spectrum. This is why, Andy Phippen said, the threshold of 18 for buying alcohol and cigarettes turns into a real threshold of 25 at retail checkouts: both software and humans fail at determining over- or under-18, and retailers fear liability. Online age verification as promoted in the Online Safety Act will not work.

If these blurred lines strain the limits of current legal approaches, others expose gaps in the law. Andrea Matwyshyn, for example, has been studying parallels I’ve also noticed between early 20th century company towns and today’s tech behemoths’ anti-union, surveillance-happy working practices. As a result, she believes that regulatory authorities need to start closely considering the impact of data aggregation when companies merge, and to look for company town-like dynamics.

Andelka Phillips parodied the overreach of app contracts by imagining the EULA attached to a “ThoughtReader” app. A sample clause: “ThoughtReader may turn on its service at any time. By accepting this agreement, you are deemed to accept all monitoring of your thoughts.” Well, OK, then. (I also had a go at this here, 19 years ago.)

Emily Roach toured the history of fan fiction and the law to end up at Archive of Our Own, a “fan-created, fan-run, nonprofit, noncommercial archive for transformative fanworks, like fanfiction, fanart, fan videos, and podfic”, the idea being to ensure that the work fans pour their hearts into has a permanent home where it can’t be arbitrarily deleted by corporate owners. The rules are strict: not so much as a “buy me a coffee” tip link that could lead to a court-acceptable claim of commercial use.

History, the science fiction writer Charles Stross has said, is the science fiction writer’s secret weapon. Also at Gikii: Miranda Mowbray unearthed the 18th-century “Digesting Duck” automaton built by Jacques de Vaucanson. It was a marvel that appeared to ingest grain and defecate waste, and in its day it inspired much speculation about the boundaries between real and mechanical life. Like the amazing ancient Greek automata before it, it was, of course, a purely mechanical fake – it stored the grain in a small compartment and released pellets from a different compartment – but today’s humans, confused into thinking that sentences mean sentience, could relate.

Illustrations: One onlooker’s rendering of his (incorrect) idea of the interior of Jacques de Vaucanson’s Digesting Duck (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Doom cyberfuture

Midway through this year’s gikii miniconference for pop culture-obsessed Internet lawyers, Jordan Hatcher proposed that generational differences are the key to understanding the huge gap between the Internet pioneers, who saw regulation as the enemy, and the current generation, who are generally pushing for it. While this is a bit too pat – it’s easy to think of Millennial libertarians and I’ve never thought of Boomers as against regulation, just, rationally, against bad Internet law that sticks – it’s an intriguing idea.

Hatcher, because this is gikii and no idea can be presented without a science fiction tie-in, illustrated this with 1990s movies, which spread the “DCF-84 virus” – that is, “doom cyberfuture-84”. The “84” is not chosen for Orwell but for the year William Gibson’s Neuromancer was published. Boomers – he mentioned John Perry Barlow, born 1947, and Lawrence Lessig, born 1961 – were instead infected with the “optimism virus”.

It’s not clear which 1960s movies might have seeded us with that optimism. You could certainly make the case that 1968’s 2001: A Space Odyssey ends on a hopeful note (despite an evil intelligence out to kill humans along the way), but you don’t even have to pick a different director to find dystopia: I see your 2001 and give you Dr Strangelove (1964). Even Woodstock (1970) is partly dystopian; the consciousness of the Vietnam war permeates every rain-soaked frame. But so does the belief that peace could win: so, a wash.

For younger people’s pessimism, Hatcher cited 1995’s Johnny Mnemonic (based on a Gibson short story) and Strange Days.

I tend to think that if 1990s people are more doom-laden than 1960s people it has more to do with real life. Boomers were born in a time of economic expansion and relatively affordable education and housing, and when they protested a war the government eventually listened. Millennials were born in a time when housing and education meant a lifetime of debt, and when millions of them protested a war they were ignored.

In any case, Hatcher is right about the stratification of demographic age groups. This is particularly noticeable in social media use; you can often date people’s arrival on the Internet by which communications medium they prefer. Over dinner, I commented on the nuisance of typing on a phone versus a real keyboard, and two younger people laughed at me: so much easier to type on a phone! They were among the crowd whose papers studied influencers on TikTok (Taylor Annabell, Thijs Kelder, Jacob van de Kerkhof, Haoyang Gui, and Catalina Goanta) and the privacy dangers of dating apps (Tima Otu Anwana and Paul Eberstaller), the kinds of subjects I rarely engage with because I am a creature of text, like most journalists. Email and the web feel like my native homes in a way that apps, game worlds, and video services never will. That dates me both chronologically and by my first experiences of the online world (1991).

Most years at this event there’s a new show or movie that fires many people’s imagination. Last year it was Upload with a dash of Severance. This year, real technological development overwhelmed fiction, and the star of the show was generative AI and large language models. Besides my paper with Jon Crowcroft, there was one from Marvin van Bekkum, Tim de Jonge, and Frederik Zuiderveen Borgesius that compared the science fiction risks of AI – Skynet, Roko’s basilisk, and an ordering of Asimov’s Laws that puts obeying orders above not harming humans (see XKCD, above) – to the very real risks of the “AI” we have: privacy, discrimination, and environmental damage.

Other AI papers included one from Colin Gavaghan, who asked whether it actually matters that you can’t tell whether the entity communicating with you is an AI. Is that what you really need to know? You can see his point: if you’re being scammed, the fact of the scam matters more than the nature of the perpetrator, though your feelings about it may be quite different.

A standard explanation of what put the “science” in science fiction (or the “speculative” in “speculative fiction”) used to be that the authors ask, “What if?” What if a planet had six suns whose interplay meant that darkness only came once every 1,000 years? Would the reaction really be as Ralph Waldo Emerson imagined it? (Isaac Asimov’s “Nightfall”.) What if a new link added to the increasingly complex Boston MTA accidentally turned the system into a Mobius strip (“A Subway Named Mobius”, by Armin Joseph Deutsch)? And so on.

In that sense, gikii is often speculative law, thought experiments that tease out new perspectives. What if Prime Day becomes a culturally embedded religious holiday (Megan Rae Blakely)? What if the EU’s trademark system applied in the Star Trek universe (Simon Sellers)? What if, as in Max Gladstone’s Craft Sequence books, law is practical magic (Antonia Waltermann)? In the trademark example, time travel is a problem, as competing interests can travel further and further back to get the first registration. In the latter…well, I’m intrigued by the idea that a law making dumping sewage in England’s rivers illegal could physically stop it from happening without all the pesky apparatus of law enforcement and parliamentary hearings.

Waltermann concluded by suggesting that to some extent law *is* magic in our world, too. A useful reminder: be careful what law you wish for because you just may get it. Boomer!

Illustrations: Part of XKCD‘s analysis of Asimov’s Laws of Robotics.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Small data

Shortly before this gets posted, Jon Crowcroft and I will have presented this year’s offering at Gikii, the weird little conference that crosses law, media, technology, and pop culture. This is, as I understand it, what we will (possibly) have said, with some added explanation for the slightly less technical audience I imagine will read this.

Two years ago, a team of four researchers – Timnit Gebru, Emily Bender, Margaret Mitchell (writing as Shmargaret Shmitchell), and Angelina McMillan-Major – wrote a now-famous paper called On the Dangers of Stochastic Parrots (PDF) calling into question the usefulness of the large language models (LLMs) that have caused so much ruckus this year. The “Stochastic Four” argued instead for small models built on carefully curated data: less prone to error, less exploitive of people’s data, less damaging to the planet. Gebru got fired over this paper; Google also fired Mitchell soon afterwards. Two years later, neural networks pioneer Geoff Hinton quit Google in order to voice similar concerns.

Despite the hype, LLMs have many problems. They are fundamentally an extractive technology and are resource-intensive. Building LLMs requires massive amounts of training data; so far, the companies have been unwilling to acknowledge their sources, perhaps because (as is happening already) they fear copyright suits.

More important from a technical standpoint is the issue of model collapse: models degrade when they begin to ingest synthetic AI-generated data instead of human input. We’ve seen this before with Google Flu Trends, which degraded rapidly as incoming new search data included many searches on flu-like symptoms that weren’t actually flu, and others that simply reflected the frequency of local news coverage. “Data pollution”, as LLM-generated data fills the web, will mean that the web becomes an increasingly useless source of training data for future generations of generative AI: lots more noise, drowning out the signal (in the photo above, the signal would be the parrot).
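For the technically inclined, here is a toy sketch of that feedback loop (my own illustration in Python, not anything from our paper; the Gaussian model and all the numbers are invented stand-ins for a real generative model):

    # Toy model-collapse sketch: a Gaussian stands in for a generative
    # model. Each generation is fitted only to samples produced by the
    # previous generation's fit; with finite samples, the estimated
    # spread tends to shrink and the mean drifts.
    import random
    import statistics

    random.seed(42)
    data = [random.gauss(0.0, 1.0) for _ in range(200)]  # "human" data

    for generation in range(10):
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        print(f"generation {generation}: mean={mu:+.3f} stdev={sigma:.3f}")
        # The next generation trains only on this generation's output.
        data = [random.gauss(mu, sigma) for _ in range(200)]

Real models are vastly more complicated, but the loop is the same: synthetic output gradually replaces human signal in the training data.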

Instead, if we follow the lead of the Stochastic Four, the more productive approach is small data – small, carefully curated datasets that train models to match specific goals. Far less resource-intensive, far fewer issues with copyright, appropriation, and extraction.

We know what the LLM future looks like in outline: big, centralized services, because no one else will be able to amass enough data. In that future, surveillance capitalism is an essential part of data gathering. Small language model (SLM) futures could look quite different: decentralized, with realigned incentives. At one point, we wanted to suggest that small data could bring the end of surveillance capitalism; that’s probably an overstatement. But small data could certainly create the ecosystem in which the case for mass data collection would be less compelling.

Jon and I imagined four primary alternative futures: federation, personalization, some combination of those two, and paradigm shift.

Precursors to a federated small data future already exist; these include customer service chatbots and predictive text assistants. In this future, we could imagine personalized LLM servers designed to serve specific needs.

An individualized future might look something like I suggested here in March: a model that fits in your pocket that is constantly updated with material of your own choosing. Such a device might be the closest yet to Vannevar Bush’s 1945 idea of the Memex (PDF), updated for the modern era by automating the dozens of secretary-curators he imagined doing the grunt work of labeling and selection. That future again has precursors in techniques for sharing the computation but not the data, a design we see proposed for health care, where the data is too sensitive to share unless there’s a significant public interest (as in pandemics or very rare illnesses), or in other data analysis designs intended to protect privacy.
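As a rough illustration of “sharing the computation but not the data” (again a sketch of my own, with invented numbers, not any particular health-care design): each participant fits a small model on data that never leaves them, and only the fitted parameters are pooled.

    # Minimal federated-style sketch: clients fit a tiny model on
    # private local data and share only the model parameters, which
    # are then averaged. All data here is invented for illustration.
    import statistics

    def fit_slope(xs, ys):
        """Least-squares slope through the origin for one client's data."""
        return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

    client_data = [                      # raw data never leaves the client
        ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2]),
        ([1.0, 2.0, 4.0], [1.8, 4.1, 8.3]),
        ([2.0, 3.0, 5.0], [4.2, 5.7, 9.9]),
    ]

    local_slopes = [fit_slope(xs, ys) for xs, ys in client_data]
    global_slope = statistics.fmean(local_slopes)  # only parameters pooled
    print(f"aggregated model slope: {global_slope:.3f}")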

In 2007, the science fiction writer Charles Stross suggested something like this, though he imagined it as a comprehensive life log, which he described as a “google for real life”. So this alternative future would look something like Stross’s pocket $10 life log with enhanced statistics-based data analytics.

Imagining what a paradigm shift might look like is much harder. That’s the kind of thing science fiction writers do; it’s 16 years since Stross gave that life log talk. However, in his 2016 history of advertising, The Attention Merchants, Columbia professor Tim Wu argued that industrialization was the vector that made advertising and its grab for our attention part of commerce. A hundred and fifty-odd years later, the centralizing effects of industrialization are being challenged: in energy, via renewables and local power generation, and in social media, via the fediverse. Might language models also play their part in bringing a new, more collaborative and cooperative society?

It is, in other words, just possible that the hot new technology of 2023 is simply a dead end bringing little real change. It’s happened before: as Wu recounts, there have been counter-moves and movements in the past, but they didn’t have the technological affordances of our era.

In the Q&A that followed, Miranda Mowbray pointed out that companies are trying to implement the individualized model, but that it’s impossible to do unless there are standardized data formats, and even then hard to do at scale.

Illustrations: Spot the parrot seen in a neighbor’s tree.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.