Review: The Web We Weave

The Web We Weave
By Jeff Jarvis
Basic Books
ISBN: 9781541604124

Sometime in the very early 1990s, someone came up to me at a conference and told me I should read the work of Robert McChesney. When I followed the instruction, I found a history of how radio and TV started as educational media and wound up commercially controlled. Ever since, this is the lens through which I’ve watched the Internet develop: how do we keep the Internet from following that same path? If all you look at is the last 30 years of web development, you might think we can’t.

A similar mission animates retired CUNY professor Jeff Jarvis in his latest book, The Web We Weave. In it, among other things, he advocates reanimating the open web by reviving the blogs many abandoned when Twitter came along and embracing other forms of citizen media. Phenomena such as disinformation, misinformation, and other harms attributed to social media, he writes, have precursor moral panics: novels, comic books, radio, TV, all were once new media whose evils older generations fretted about. (For my parents, it was comic books, which they completely banned while ignoring the hours of TV I watched.) With that past in mind, much of today’s online harms regulation leaves him skeptical.

As a media professor, Jarvis is interested in the broad sweep of history, setting social media into the context that began with the invention of the printing press. That has its benefits when it comes to later chapters where he’s making policy recommendations on what to regulate and how. Jarvis is emphatically a free-speech advocate.

Among his recommendations are those such advocates typically support: users should be empowered, educated, and taught to take responsibility, and we should develop business models that support good speech. Regulation, he writes, should include the following elements: transparency, accountability, disclosure, redress, and behavior rather than content.

On the other hand, Jarvis is emphatically not a technical or design expert, and therefore has little to say about the impact on user behavior of technical design decisions. Some things we know are constants. For example, the willingness of (fully identified) online communicators to attack each other was noted as long ago as the 1980s, when Sara Kiesler studied the first corporate mailing lists.

Others, however, are not. Those developing Mastodon, for example, deliberately chose not to implement the ability to quote and comment on a post because they believed that feature fostered abuse and pile-ons. Similarly, Lawrence Lessig pointed out in 1999 in Code and Other Laws of Cyberspae (PDF) that you couldn’t foment a revolution using AOL chatrooms because they had a limit of 23 simultaneous users.

Understanding the impact of technical decisions requires experience, experimentation, and, above all, time. If you doubt this, read Mike Masnick’s series at Techdirt on Elon Musk’s takeover and destruction of Twitter. His changes to the verification system alone have undermined the ability to understand who’s posting and decide how trustworthy their information is.

Jarvis goes on to suggest we should rediscover human scale and mutual obligation, both crucial as the covid pandemic progressed. The money will always favor mass scale. But we don’t have to go that way.

A hole is a hole

We told you so.

By “we” I mean thousands, of privacy advocates, human rights activists, technical experts, and information security journalists.

By “so”, I mean: we all said repeatedly over decades that there is no such thing as a magic hole that only “good guys” can use. If you build a supposedly secure system but put in a hole to give the “authorities” access to communications, that hole can and will be exploited by “bad guys” you didn’t want spying on you.

The particular hole Chinese hackers used to spy on the US is the Communications Assistance for Law Enforcement Act (1994). CALEA mandates that telecommunications providers design their equipment so that they can wiretap any customer if law enforcement presents a warrant. At Techcrunch, Zack Whittaker recaps much of the history, tracing technology giants’ new emphasis on end-to-end encryption to the 2013 Snowden revelations of the government’s spying on US citizens.

The mid-1990s were a time of profound change for telecommunications: the Internet was arriving, exchanges were converting from analog to digital, and deregulation was providing new competition for legacy telcos. In those pre-broadband years, hundreds of ISPs offered dial-up Internet access. Law enforcement could no longer just call up a single central office to place a wiretap. When CALEA was introduced, critics were clear and prolific; for an in-depth history see Susan Landau’s and Whit Diffie’s book, Privacy on the Line (originally published 1998, second edition 2007). The net.wars archive includes a compilation of years of related arguments, and at Techdirt, Mike Masnick reviews the decades of law enforcement insistence that they need access to encrypted text. “Lawful access” is the latest term of art.

In the immediate post-9/11 shock, some of those who insisted on the 1990s version of today’s “lawful access” – key escrow, took the opportunity to tell their opponents (us) that the attacks proved we’d been wrong. One such was the just-departed Jack Straw, the home secretary from 1997 to (June) 2001, who blamed BBC Radio Four and “…large parts of the industry, backed by some people who I think will now recognise they were very naive in retrospect”. That comment sparked the first net.wars column. We could now say, “Right back atcha.”

Whatever you call an encryption backdoor, building a hole into communications security was, is, and will always be a dangerous idea, as the Dutch government recently told the EU. Now, we have hard evidence.

***

The time is long gone when people used to be snobbish about Internet addresses (see net.wars-the-book, chapter three). Most of us are therefore unlikely to have thought much about the geekishly popular “.io”. It could be a new-fangled generic top-level domain – but it’s not. We have been reading linguistic meaning into what is in fact a country code. Which is all fine and good, except that the country it belongs to is the Chagos Islands, also known as the British Indian Ocean Territory, which I had never heard of until the British government announced recently that it will hand the islands back to Mauritius (instead of asking the Chagos Islanders what they want…). Gareth Edwards made the connection: when that transfer happens, .io will cease to exist (h/t Charles Arthur’s The Overspill).

Edwards goes on to discuss the messy history of orphaned country code domains: Yugoslavia, and the Soviet Union. As a result, ICANN, the naming authority, now has strict rules that mandate termination in such cases. This time, there’s a lot at stake: .io is a favorite among gamers, crypto companies, and many others, some of them substantial businesses. Perhaps a solution – such as setting .io up anew as a gTLD with its domains intact – will be created. But meantime, it’s worth noting that the widely used .tv (Tuvalu), .fm (Federated States of Micronesia), and .ai (Anguilla) are *also* country code domains.

***

The story of what’s going on with Automattic, the owner of the blogging platform WordPress.com, and WP Engine, which provides hosting and other services for businesses using WordPress, is hella confusing. It’s also worrying: WordPress, which is open source content management software overseen by the WordPress Foundation, powers a little over 40% of the Internet’s top ten million websites and more than 60% of sites overall (including this one).

At Heise Online, Kornelius Kindermann offers one of the clearer explanations: Automattic, whose CEO, Matthew Mullenweg is also a director of the WordPress Foundation and a co-creator of the software, wants WP Engine, which has been taken over by the investment company Silver Lake, to pay “trademark royalties” of 8% to the WordPress Foundation to support the software. WP Engine doesn’t wanna. Kindermann estimates the sum involved at $35 million, After the news of all that broke, 159 employees have announced they are leaving Automattic.

The more important point that, like the users of the encrypted services governments want to compromise, the owners of .io domains, or, ultimately, the Chagos Islanders themselves, WP Engine’s customers, some of them businesses worth millions, are hostages of uncertainty surrounding the decisions of others. Open source software is supposed to give users greater control. But as always, complexity brings experts and financial opportunities, and once there’s money everyone wants some of it.

Illustrations: View of the Chagos Archipelago taken during ISS Expedition 60 (NASA, via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Beware the duck

Once upon a time, “convergence” was a buzzword. That was back in the days when audio was on stereo systems, television was on a TV, and “communications” happened on phones that weren’t computers. The word has disappeared back into its former usage pattern, but it could easily be revived to describe what’s happening to content as humans dive into using generative tools.

Said another way. Roughly this time last year, the annual technology/law/pop culture conference Gikii was awash in (generative) AI. That bubble is deflating, but in the experiments that nonetheless continue a new topic more worthy of attention is emerging: artificial content. It’s striking because what happens at this gathering, which mines all types of popular culture for cues for serious ideas, is often a good guide to what’s coming next in futurelaw.

That no one dared guess which of Zachary Cooper‘s pair of near-identicalaudio clips was AI-generated, and which human-performed was only a starting point. One had more static? Cooper’s main point: “If you can’t tell which clip is real, then you can’t decide which one gets copyright.” Right, because only human creations are eligible (although fake bands can still scam Spotify).

Cooper’s brief, wild tour of the “generative music underground” included using AI tools to create songs whose content is at odds with their genre, whole generated albums built by a human producer making thousands of tiny choices, and the new genre “gencore”, which exploits the qualities of generated sound (Cher and Autotune on steroids). Voice cloning, instrument cloning, audio production plugins, “throw in a bass and some drums”….

Ultimately, Cooper said, “The use of generative AI reveals nothing about the creative relationship to work; it destabilizes the international market by having different authorship thresholds; and there’s no means of auditing any of it.” Instead of uselessly trying to enforce different rights predicated on the use or non-use of a specific set of technologies, he said, we should tackle directly the challenges new modes of production pose to copyright. Precursor: the battles over sampling.

Soon afterwards, Michael Veale was showing us Civitai, an Idaho-based site offering open source generative AI tools, including fine-tuned models. “Civitai exists to democratize AI media creation,” the site explains. “Everything has a valid legal purpose,” Veale said, but the way capabilities can be retrained and chained together to create composites makes it hard to tell which tools, if any, should be taken down, even for creators (see also the puzzlement as Redditors try to work this out). Even environmental regulation can’t help, as one attendee suggested: unlike large language models, these smaller, fine-tuned models (as Jon Crowcroft and I surmised last year would be the future) are efficient; they can run on a phone.

Even without adding artificial content there is always an inherent conflict when digital meets an analog spectrum. This is why, Andy Phippen said, the threshold of 18 for buying alcohol and cigarettes turns into a real threshold of 25 at retail checkouts. Both software and humans fail at determining over-or-under-18, and retailers fear liability. Online age verification as promoted in the Online Safety Act will not work.

If these blurred lines strain the limits of current legal approaches, others expose gaps in the law. Andrea Matwyshyn, for example, has been studying parallels I’ve also noticed between early 20th century company towns and today’s tech behemoths’ anti-union, surveillance-happy working practices. As a result, she believes that regulatory authorities need to start considering closely the impact of data aggregation when companies merge and look for company town-like dynamics”.

Andelka Phillips parodied the overreach of app contracts by imagining the EULA attached to “ThoughtReader app”. A sample clause: “ThoughtReader may turn on its service at any time. By accepting this agreement, you are deemed to accept all monitoring of your thoughts.” Well, OK, then. (I also had a go at this here, 19 years ago.)

Emily Roach toured the history of fan fiction and the law to end up at Archive of Our Own, a “fan-created, fan-run, nonprofit, noncommercial archive for transformative fanworks, like fanfiction, fanart, fan videos, and podfic”, the idea being to ensure that the work fans pour their hearts into has a permanent home where it can’t be arbitrarily deleted by corporate owners. The rules are strict: not so much as a “buy me a coffee” tip link that could lead to a court-acceptable claim of commercial use.

History, the science fiction writer Charles Stross has said, is the science fiction writer’s secret weapon. Also at Gikii: Miranda Mowbray unearthed the 18th century “Digesting Duck” automaton built by Jacques de Vauconson. It was a marvel that appeared to ingest grain and defecate waste and that in its day inspired much speculation about the boundaries between real and mechanical life. Like the amazing ancient Greek automata before it, it was, of course, a purely mechanical fake – it stored the grain in a small compartment and released pellets from a different compartment – but today’s humans confused into thinking that sentences mean sentience could relate.

Illustrations: One onlooker’s rendering of his (incorrect) idea of the interior of Jacques de Vaucanson’s Digesting Duck (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Sectioned

Social media seems to be having a late-1990s moment, raising flashbacks to the origins of platform liability and the passage of Section 230 of the Communications Decency Act (1996). It’s worth making clear at the outset: most of the people talking about S230 seem to have little understanding of what it is and does. It allows sites to moderate content without becoming liable for it. It is what enables all those trust and safety teams to implement sites’ restrictions on acceptable use. When someone wants to take an axe to it because there is vile content circulating, they have not understood this.

So, in one case this week a US appeals court is allowing a lawsuit to proceed that seeks to hold TikTok liable for users’ postings of the “blackout challenge”, the idea being to get an adrenaline rush by reviving from near-asphyxiation. Bloomberg reports that at least 20 children have died trying to accomplish this, at least 15 of them age 12 or younger (TikTok, like all social media, is supposed to be off-limits to under-13s). The people suing are the parents of one of those 20, a ten-year-old girl who died attempting the challenge.

The other case is that of Pavel Durov, CEO of the messaging service Telegram, who has been arrested in France as part of a criminal investigation. He has been formally charged with complicity in managing an online platform “in order to enable an illegal transaction in organized group”, and refusal to cooperate with law enforcement authorities and ordered not to leave France, with bail set at €5 million (is that enough to prevent the flight of a billionaire with four passports?).

While there have been many platform liability cases, there are relatively few examples of platform owners and operators being charged. The first was in 1997, back when “online” still had a hyphen; the German general manager of CompuServe, Felix Somm, was arrested in Bavaria on charges of “trafficking in pornography”. That is, German users of Columbus, Ohio-based CompuServe could access pornography and illegal material on the Internet through the service’s gateway. In 1998, Somm was convicted and given a two-year suspended sentence. In 1999 his conviction was overturned on appeal, partly, the judge wrote, because there was no technology at the time that would have enabled CompuServe to block the material.

The only other example I’m aware of came just this week, when an Arizona judge sentenced Michael Lacey, co-founder of the classified ads site Backpage.com, to five years in prison and fined him $3 million for money laundering. He still faces further charges for prostitution facilitation and money laundering; allegedly he profited from a scheme to promote prostitution on his site. Two other previously convicted Backpages executives were also sentenced this week to ten years in prison.

In Durov’s case, the key point appears to be his refusal to follow industry practice with respect to to reporting child sexual abuse material or cooperate with properly executed legal requests for information. You don’t have to be a criminal to want the social medium of your choice to protect your privacy from unwarranted government snooping – but equally, you don’t have to be innocent to be concerned if billionaire CEOs of large technology companies consider themselves above the law. (See also Elon Musk, whose X platform may be tossed out of Brazil right now.)

Some reports on the Durov case have focused on encryption, but the bigger issue appears to be failure to register to use encryption , as Signal has. More important, although Telegram is often talked about as encrypted, it’s really more like other social media, where groups are publicly visible, and only direct one-on-one messages are encrypted. But even then, they’re only encrypted if users opt in. Given that users notoriously tend to stick with default settings, that means that the percentage of users who turn that encryption on is probably tiny. So it’s not clear yet whether France is seeking to hold Durov responsible for the user-generated content on his platform (which S230 would protect in the US), or accusing him of being part of criminal activity relating to his platform (which it wouldn’t).

Returning to the Arizona case, in allowing the lawsuit to go ahead, the appeals court judgment says that S230 has “evolved away from its original intent”, and argues that because TikTok’s algorithm served up the challenge on the child’s “For You” page, the service can be held responsible. At TechDirt, Mike Masnick blasts this reasoning, saying that it overturns numerous other court rulings upholding S230, and uses the same reasoning as the 1995 decision in Stratton Oakmont v. Prodigy. That was the case that led directly to the passage of S230, introduced by then-Congressman Christopher Cox (R-CA) and Senator Ron Wyden (D-OR), who are still alive to answer questions about their intent. Rather than evolving away, we’ve evolved back full circle.

The rise of monopolistic Big Tech has tended to obscure the more important point about S230. As Cory Doctorow writes for EFF, killing S230 would kill the small federated communities (like Mastodon and Discord servers) and web boards that offer alternatives to increasing Big Tech’s pwoer. While S230 doesn’t apply outside the US (some Americans have difficulty understanding that other countries have different laws), its ethos is pervasive and the companies it’s enabled are everywhere. In the end, it’s like democracy: the alternatives are worse.

Illustrations: Drunken parrot in Putney (by Simon Bisson).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Gather ye lawsuits while ye may

Most of us howled with laughter this week when the news broke that Elon Musk is suing companies for refusing to advertise on his exTwitter platform. To be precise, Musk is suing the World Federation of Advertisers, Unilever, Mars, CVS, and Ørsted in a Texas court.

How could Musk, who styles himself a “free speech absolutist”, possibly force companies to advertise on his site? This is pure First Amendment stuff: both the right to free speech (or to remain silent) and freedom of assembly. It adds to the nuttiness of it all that last November Musk was telling advertisers to “go fuck yourselves” if they threatened him with a boycott. Now he’s mad because they responded in kind.

Does the richest man in the world even need advertisers to finance his toy?

At Techdirt, Mike Masnick catalogues the “so much stupid here”.

The WFA initiative that offends Musk is the Global Alliance for Responsible Media, which develops guidelines for content moderation – things like a standard definition for “hate speech” to help sites operate consistent and transparent policies and reassure advertisers that their logos don’t appear next to horrors like the livestreamed shooting in Christchurch, New Zealand. GARM’s site says: membership is voluntary, following its guidelines is voluntary, it does not provide a rating service, and it is apolitical.

Pre-Musk, Twitter was a member. After Musk took over, he pulled exTwitter out of it – but rejoined a month ago. Now, Musk claims that refusing to advertise on his site might be a criminal matter under RICO. So he’s suing himself? Blink.

Enter US Republicans, who are convinced that content moderation exists only to punish conservative speech. On July 10, House Judiciary Committee, under the leadership of Jim Jordan (R-OH), released an interim report on its ongoing investigation of GARM.

The report says GARM appears to “have anti-democratic views of fundamental American freedoms” and likens its work to restraint of trade Among specific examples, it says GARM’s recommended that its members stop advertising on exTwitter, threatened Spotify when podcaster Joe Rogan told his massive audience that young, healthy people don’t need to be vaccinated against covid, and considered blocking news sites such as Fox News, Breitbart, and The Daily Wire. In addition, the report says, GARM advised its members to use fact-checking services like NewsGuard and the Global Disinformation Index “which disproportionately label right-of-center news sites as so-called misinformation”. Therefore, the report concludes, GARM’s work is “likely illegal under the antitrust laws”.

I don’t know what a court would have made of that argument – for one thing, GARM can’t force anyone to follow its guidelines. But now we’ll never know. Two days after Musk filed suit, the WFA announced it’s shuttering GARM immediately because it can’t afford to defend the lawsuit and keep operating even though it believes it’s complied with competition rules. Such is the role of bullies in our public life.

I suppose Musk can hope that advertisers decide it’s cheaper to buy space on his site than to fight the lawsuit?

But it’s not really a laughing matter. GARM is just one of a number of initiatives that’s come under attack as we head into the final three months of campaigning before the US presidential election. In June, Renee DiResta, author of the new book Invisible Rulers, announced that her contract as the research manager of the Stanford Internet Observatory was not being renewed. Founding director Alex Stamos was already gone. Stanford has said the Observatory will continue under new leadership, but no details have been published. The Washington Post says conspiracy theorists have called DiResta and Stamos part of a government-private censorship consortium.

Meanwhile, one of the Observatory’s projects, a joint effort with the University of Washington called the Election Integrity Partnership, has announced, in response to various lawsuits and attacks, that it will not work on the 2024 or future elections. At the same time, Meta is shutting down CrowdTangle next week, removing a research tool that journalists and academics use to study content on Facebook and Instagram. While CrowdTangle will be replaced with Meta Content Library, access will be limited to academics and non-profits, and those who’ve seen it say it’s missing useful data that was available through CrowdTangle.

The concern isn’t the future of any single initiative; it’s the pattern of these things winking out. As work like DiResta’s has shown, the flow of funds financing online political speech (including advertising) is dangerously opaque. We need access and transparency for those who study it, and in real time, not years after the event.

In this, as so much else, the US continues to clash with the EU, which it accused in December of breaching its rules with respect to disinformation, transparency, and extreme content. Last month, it formally charged Musk’s site for violating the Digital Services Act, for which Musk could be liable for a fine of up to 6% of exTwitter’s global revenue. Among the EU’s complaints is the lack of a searchable and reliable advertisement repository – again, an important element of the transparency we need. Its handling of disinformation and calls to violence during the current UK riots may be added to the investigation.

Musk will be suing *us*, next.

Illustrations: A cartoon caricature of Christina Rossetti by her brother Dante Gabriel Rossetti 1862, showing her having a tantrum after reading The Times’ review of her poetry (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Boxed up

If the actions of the owners of streaming services are creating the perfect conditions for the return of piracy, it’s equally true that the adtech industry’s decisions continue to encourage installing ad blockers as a matter of self-defense. This is overall a bad thing, since most of us can’t afford to pay for everything we want to read online.

This week, Google abruptly aborted a change it’s been working on for four years: it will abandon its plan to replace third-party cookies with new technology it called Privacy Sandbox. From the sounds of it, Google will continue working on the Sandbox, but will continue to retain third-party cookies. The privacy consequences of this are…muddy.

To recap: there are two kinds of cookies, which are small files websites place on your computer, distinguished by their source and use. Sites use first-party cookies to give their pages the equivalent of memory. They’re how the site remembers which items you’ve put in your cart, or that you’ve logged in to your account. These are the “essential cookies” that some consent banners mention, and without them you couldn’t use the web interactively.

Third-party cookies are trackers. Once a company deposits one of these things on your computer, it can use it to follow along as you browse the web, collecting data about you and your habits the whole time. To capture the ickiness of this, Demos researcher Carl Miller has suggested renaming them slime trails. Third-party cookies are why the same ads seem to follow you around the web. They are also why people in the UK and Europe see so many cookie consent banners: the EU’s General Data Protection Regulation requires all websites to obtain informed consent before dropping them on our machines. Ad blockers help here. They won’t stop you from seeing the banners, but they can save you the time you’d have to spend adjusting settings on the many sites that make it hard to say no.

The big technology companies are well aware that people hate both ads and being tracked in order to serve ads. In 2020, Apple announced that its Safari web browser would block third-party cookies by default, continuing work it started in 2017. This was one of several privacy-protecting moves the company made; in 2021, it began requiring iPhone apps to offer users the opportunity to opt out of tracking for advertising purposes at installation. In 2022, Meta estimated Apple’s move would cost it $10 billion that year.

If the cookie seemed doomed at that point, it seemed even more so when Google announced it was working on new technology that would do away with third-party cookies in its dominant Chrome browser. Like Apple, however, Google proposed to give users greater control only over the privacy invasions of third parties without in any way disturbing Google’s own ability to track users. Privacy advocates quickly recognized this.

At Ars Technica, Ron Amadeo describes the Sandbox’s inner workings. Briefly, it derives a list of advertising topics from the websites users visits, and shares those with web pages when they ask. This is what you turn on when you say yes to Chrome’s “ad privacy feature”. Back when it was announced, EFF’s Bennett Cyphers was deeply unimpressed: instead of new tracking versus old tracking, he asked, why can’t we have *no* tracking? Just a few days ago, EFF followed up with the news that its Privacy Badger browser add-on now opts users out of the Privacy Sandbox (EFF has also published manual instructions.).

Google intended to make this shift in stages, beginning the process of turning off third-party cookies in January 2024 and finishing the job in the second half of 2024. Now, when the day of completion should be rapidly approaching, the company has said it’s over – that is, it no longer plans to turn off third-party cookies. As Thomas Claburn writes at The Register, implementing the new technology still requires a lot of work from a lot of companies besides Google. The technology will remain in the browser – and users will “get” to choose which kind of tracking they prefer; Kevin Purdy reports at Ars Technica that the company is calling this a “new experience”.

At The Drum, Kendra Barnett reports that the UK’s Information Commissioner’s Office is unhappy about Google’s decision. Even though it had also identified possible vulnerabilities in the Sandbox’s design, the ICO had welcomed the plan to block third-party cookies.

I’d love to believe that Google’s announcement might have been helped by the fact that Sandbox is already the subject of legal action. Last month the privacy-protecting NGO noyb complained to the Austrian data protection authority, arguing that Sandbox tracking still requires user consent. Real consent, not obfuscated “ad privacy feature” stuff, as Richard Speed explains at The Register. But far more likely it’s money, At the Press Gazette, Jim Edwards reports that Sandbox could cost publishers 60% of their revenue “from programmatically sold ads”. Note, however, that the figure is courtesy of adtech company Criteo, likely a loser under Sandbox.

The question is what comes next. As Cyphers said, we deserve real choices: *whether* we are tracked, not just who gets to do it. Our lives should not be the leverage big technology companies use to enhance their already dominant position.

Illustrations: A sandbox (via Wikimedia)

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

The return of piracy

In Internet terms, it’s been at least a generation since the high-profile fights over piracy – that is, the early 2000s legal actions against unauthorized sites offering music, TV, and films, and the people who used them. Visits to the news site TorrentFreak this week feel like a revival.

The wildest story concerns Z-Library, for some years the largest shadow book collection. Somewhere someone must be busily planning a true crime podcast series. Z-Library was briefly offline in 2022, when the US Department of Justice seized many of its domains. Shortly afterwards there arrived Anna’s Archive, a search engine for shadow libraries – Z-Library and many others, and the journal article shadow repository Sci-Hub. Judging from a small sampling exercise, you can find most books that have been out for longer than a month. Newer books tend to be official ebooks stripped of digital rights management.

In November 2022, the Russian nationals Anton Napolsky and Valeriia Ermakova were arrested in Argentina, alleged to be Z-Library administrators. The US requested extradition, and an Argentinian judge agreed. They appealed to the Argentinian supreme court, asking to be classed as political refugees. This week, a story in local publication La Voz, made its way north. As Ashley Belanger explains at Ars Technica, Napolsky and Ermakova decided not to wait for a judgment, escaped house arrest back in May, and vanished. The team running Z-library say the pair are innocent of copyright infringement.

Also missing in court: Anna’s Archive’s administrators. As noted here in February; the library service company OCLC sued Anna’s Archive for having exploited a security hole in its website in order to scrape 2,2TB of its book metadata. This might have gone unnoticed, except that the admins published the news on its blog. OCLC is claiming that the breach has cost millions to remediate its systems.

This week saw an update to the case: OCLC has moved for summary judgment as Anna’s Archive’s operators have failed to turn up in court. At TorrentFreak, Ernesto van der Sar reports that OCLC is also demanding millions in damages and injunctive relief barring Anna’s from continuing to publish the scraped data, though it does not ask for the site to be blocked. (The bit demanding that Anna’s Archive pay the costs of remediating OCLC’s flawed systems is puzzling; do we make burglars who climb in through open windows pay for locksmiths?)

And then there is the case of the Internet Archive’s Open Library, which claims its scanned-in books are legal under the novel theory of controlled digital lending. When the Internet Archive responded to the covid crisis by removing those controls in 2020, four major publishers filed suit. In 2023, the US District Court for the Southern District of New York ruled against the Internet Archive, saying its library enables copyright infringement. Since then, the Archive has removed 500,000 books.

This is the moment when lessons from the past of music, TV, and video piracy could be useful. Critics always said that the only workable answer to piracy is legal, affordable services, and they were right, as shown by Pandora, Spotify, Netflix, which launched its paid streaming service in 2007, and so many others.

It’s been obvious for at least two years that things are now going backwards. Last December, in one of many such stories, the Discovery/Warner Brothers merger ended a licensing agreement with Sony, leading the latter to delete from Playstation users’ libraries TV shows they had paid for in the belief that they would remain permanently available. The next generation is learning the lesson. Many friends over 40 say they can no longer play CDs or DVD; teenaged friends favor physical media because they’ve already learned that digital services can’t be trusted.

Last September, we learned that Hollywood studios were deleting finished, but unaired programs and parts of their back catalogues for tax reasons. Sometimes, shows have abruptly disappeared mid-season. This week, Paramount removed decades of Comedy Central video clips; last month it axed the MTV news archives. This is *our* culture, even if it’s *their* copyright.

Meanwhile, the design of streaming services has stagnated. The complaints people had years ago about interfaces that make it hard to find the shows they want to see are the same ones they have now. Content moves unpredictably from one service to another. Every service is bringing in ads and raising prices. The benefits that siphoned users from broadcast and cable are vanishing.

As against this, consider pirate sites: they have the most comprehensive libraries; there are no ads; you can use the full-featured player of your choice; no one other than you can delete them; and they are free. Logically, piracy should be going back up, and at least one study suggests it is. If only they paid creators…

The lesson: publishers may live to regret attacking the Internet Archive rather than finding ways to work with it – after all, it sends representatives to court hearings and obeys rulings; if so ordered, they might even pay authors. In 20 years, no one’s managed to sink The Pirate Bay; there’ll be no killing the shadow libraries either, especially since my sampling finds that the Internet Archive’s uncorrected scans are often the worst copies to read. Why let the pirate sites be the one to offer the best services?

Illustrations: The New York Public Library, built 1911 (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Hostages

If you grew up with the slow but predictable schedule of American elections, the abruptness with which a British prime minister can prorogue Parliament and hit the campaign trail is startling. Among the pieces of legislation that fell by the wayside this time is the Data Protection and Digital Information bill, which had reached the House of Lords for scrutiny. The bill had many problems. This was the bill that proposed to give the Department of Work and Pensions the right to inspect the bank accounts and financial assets of anyone receiving any government benefits and undermined aspects of the adequacy agreement that allows UK companies to exchange data with businesses in the EU.

Less famously, it also includes the legislative underpinnings for a trust framework for digital verification. On Monday, at a UCL’s conference on crime science, Sandra Peaston, director of research and development at the fraud prevention organization Cifas, outlined how all this is intended to work and asked some pertinent questions. Among them: whether the new regulator will have enough teeth; whether the certification process is strong enough for (for example) mortgage lenders; and how we know how good the relevant algorithm is at identifying deepfakes.

Overall, I think we should be extremely grateful this bill wasn’t rushed through. Quite apart from the digital rights aspects, the framework for digital identity really needs to be right; there’s just too much risk in getting it wrong.

***

At Bloomberg, Mark Gurman reports that Apple’s arrangement with OpenAI to integrate ChatGPT into the iPhone, iPad, and Mac does not involve Apple paying any money. Instead, Gurman cites unidentified sources to the effect that “Apple believes pushing OpenAI’s brand and technology to hundreds of millions of its devices is of equal or greater value than monetary payments.”

We’ve come across this kind of claim before in arguments between telcos and Internet companies like Netflix or between cable companies and rights holders. The underlying question is who brings more value to the arrangement, or who owns the audience. I can’t help feeling suspicious that this will not end well for users. It generally doesn’t.

***

Microsoft is on a roll. First there was the Recall debacle. Now come accusations by a former employee that it ignored a reported security flaw in order to win a large government contract, as Renee Dudley and Doris Burke report at Pro Publica. Result: the Russian Solarwinds cyberattack on numerous US government departments and agencies, including the National Nuclear Security Administration.

This sounds like a variant of Cory Doctorow’s enshittification at the enterprise level (see also: Boeing). They don’t have to be monopolies: these organizations’ evolving culture has let business managers override safety and security engineers. This is how Challenger blew up in 1986.

Boeing is too big and too lacking in competition to be allowed to fail entirely; it will have to find a way back. Microsoft has a lot of customer lock-in. Is it too big to fail?

***

I can’t help feeling a little sad at the news that Raspberry Pi has had an IPO. I see no reason why it shouldn’t be successful as a commercial enterprise, but its values will inevitably change over time. CEO Eben Upton swears they won’t, but he won’t be CEO forever, as even he admits. But: Raspberry Pi could become the “unicorn” Americans keep saying Europe doesn’t have.

***

At that same UCL event, I finally heard someone say something positive about AI – for a meaning of “AI” that *isn’t* chatbots. Sarah Lawson, the university’s chief information security officer, said that “AI and machine learning have really changed the game” when it comes to detecting email spam, which remains the biggest vector for attacks. Dealing with the 2% that evades the filters is still a big job, as it leaves 6,000 emails a week hitting people’s inboxes – but she’ll take it. We really need to be more specific when we say “AI” about what kind of system we mean; success at spam filtering has nothing to say about getting accurate information out of a large language model.

***

Finally, I was highly amused this week when long-time security guy Nick Selby, posted on Mastodon about a long-forgotten incident from 1999 in which I disparaged the sort of technology Apple announced this week that’s supposed to organize your life for you – tell you when it’s time to leave for things based on the traffic, juggle meetings and children’s violin recitals, that sort of thing. Selby felt I was ahead of my time because “it was stupid then and is stupid now because even if it works the cost is insane and the benefit really, really dodgy”,

One of the long-running divides in computing is between the folks who want computers to behave predictably and those who want computers to learn from our behavior what’s wanted and do that without intervention. Right now, the latter is in ascendance. Few of us seem to want the “AI features” being foisted on us. But only a small percentage of mainstream users turn off defaults (a friend was recently surprised to learn you can use the history menu to reopen a closed browser tab). So: soon those “AI features” will be everywhere, pointlessly and extravagantly consuming energy, water, and human patience. How you use information technology used to be a choice. Now, it feels like we’re hostages.

Illustrations: Raspberry Pi: the little computer that could (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

The apostrophe apocalypse

It was immediately tempting to view the absence of apostrophes on new street signs in a North Yorkshire town as a real-life example of computer systems crushing human culture. Then, near-simultaneously, Apple launched an ad (which it now regrets) showing just that process, raising the temptation even more. But no.

In fact, as Brandon Vigliarolo writes at The Register, not only is the removal of apostrophes in place names not new in the UK, but it also long precedes computers. The US Board on Geographic Names declared apostrophes unwanted as long ago as its founding year, 1890, apparently to avoid implying possession. This decision by the BGN, which has only made five exceptions in its history, was later embedded in the US’s Geographic Names Information System and British Standard 7666. When computers arrived to power databases, the practice carried on.

All that said, it’s my experience that the older British generation are more resentful of American-derived changes to their traditional language than they are of computer-driven alterations (one such neighbor complains about “sidewalk”). So campaigns to reinstate missing apostrophes seem likely to persist.

Blaming computers seemed like a coherent narrative, not least because new technology often disrupts social customs. Railways brought standardized time, and the desire to simplify things for computers led to the 2023 decision to eliminate leap seconds in 2035 (after 18 years of debate). Instead, the apostrophe apocalypse is a more ordinary story of central administrators preferencing their own convenience over local culture and custom (which may itself be contested). It still seems like people should be allowed to keep their street signs. I mean.

***

Of course language changes over time and usage. The character limits imposed by texting (and therefore exTwitter and other microblogging sites) brought us many abbreviations that are now commonplace in daily life, just as long before that the telegraph’s cost per word spawned its own compressed dialect. A new example popped up recently in Charles Arthur’s The Overspill.

Arthur highlighted an article at Level Up Coding/Medium by Fareed Khan that offered ways to distinguish between human-written and machine-generated text. It turns out that chatbots use distinctively different words than we do. Khan was able to generate a list of about 100 words that may indicate a chatbot has been at work, as well as a web app that can check a block of text or a file in one go. The word “delve” was at the top.

I had missed Khan’s source material, an earlier claim by YCombinator founder Paul Graham that “delve” used in an email pitch is a clear sign of ChatGPT-generated text. At the Guardian, Alex Hern suggests that an underlying cause may be the fact that much of the labeling necessary to train the large language models that power chatbots is carried out by badly paid people in the global South – including Africa, where “delve” is more commonly used than in Western countries.

At the Premium Times, Chiamaka Okafor argues that therefore identifying “delve” as a marker of “robotic text” penalizes African writers. “We are losing sight of an opportunity to rewrite the AI narratives that exclude people in the global majority,” she writes. A reminder: these chatbots are just math and statistics predicting the next word. They will always regress to the mean. And now they’ll penalize us for being different.

***

Just two years ago, researchers fretted that we were running out of “high-quality text” on which to train large language models. We’ve been seeing the results since, as sites hosting user-generated content strike deals with LLM owners, leading to contentious disputes between those owners and sites’ users, who feel betrayed and ripped off. Reddit began by charging for access to its API, then made a deal with Google to use its database of posts for training for an injection of cash that enabled it to go public. Yesterday, Reddit announced a similar deal with OpenAI – and the stock went up. In reality, these deals are asset-stripping a site that has consistently lost money for 18 years.

The latest site to sell its users’ content is the technical site Stack Overflow, Developers who offer mutual aid by answering each other’s questions are exactly the user base you would expect to be most offended by the news that the site’s owner, the investment group Prosus, which bought the site in 2021 for $1.8 billion, has made a deal giving OpenAI access to all its content. And so it proved: developers promptly began altering or removing their posts to protest the deal. Shortly thereafter, the site’s moderators began restoring those posts and suspending the users.

There’s no way this ends well; Internet history’s many such stories never have. The site’s original owners, who created the culture, are gone. The new ones don’t care what users *believe* their rights are if the terms and conditions grant an irrevocable license to everything they post. Inertia makes it hard to build a replacement; alienation thins out the old site. As someone posted to Twitter a few years ago, “On the Internet your home always leaves you.”

‘Twas ever thus. And so it will be until people stop taking the bait in the first place.

Illustrations: Apple’s canceled “crusher” ad.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Review: A History of Fake Things on the Internet

A History of Fakes on the Internet
By Walter J. Scheirer
Stanford University Press
ISBN 2023017876

One of Agatha Christie’s richest sources of plots was the uncertainty of identity in England’s post-war social disruption. Before then, she tells us, anyone arriving to take up residence in a village brought a letter of introduction; afterwards, old-time residents had to take newcomers at their own valuation. Had she lived into the 21st century, the arriving Internet would have given her whole new levels of uncertainty to play with.

In his recent book A History of Fake Things on the Internet, University of Notre Dame professor Walter J. Scheirer describes creating and detecting online fakes as an ongoing arms race. Where many people project doomishly that we will soon lose the ability to distinguish fakery from reality, Scheirer is more optimistic. “We’ve had functional policies in the past; there is no good reason we can’t have them again,” he concludes, adding that to make this happen we need a better understanding of the media that support the fakes.

I have a lot of sympathy with this view; as I wrote recently, things that fool people when a medium is new are instantly recognizable as fake once they become experienced. We adapt. No one now would be fooled by the images that looked real in the early days of photography. Our perceptions become more sophisticated, and we learn to examine context. Early fakes often work simply because we don’t know yet that such fakes are possible. Once we do know, we exercise much greater caution before believing. Teens who’ve grown up applying filters to the photos and videos they upload to Instagram and TikTok, see images very differently than those of us who grew up with TV and film.

Schierer begins his story with the hacker counterculture that saw computers as a source of subversive opportunities. His own research into media forensics began with Photoshop. At the time, many, especially in the military, worried that nation-states would fake content in order to deceive and manipulate. What they found, in much greater volume, was memes and what Schierer calls “participatory fakery” – that is, the cultural outpouring of fakes for entertainment and self-expression, most of it harmless. Further chapters consider cheat codes in games, the slow conversion of hackers into security practitioners, adversarial algorithms and media forensics, shock-content sites, and generative AI.

Through it all, Schierer remains optimistic that the world we’re moving into “looks pretty good”. Yes, we are discovering hundreds of scientific papers with faked data, faked results, or faked images, but we also have new analysis tools to use to detect them and Retraction Watch to catalogue them. The same new tools that empower malicious people enable many more positive uses for storytelling, collaboration, and communication. Perhaps forgetting that the computer industry relentlessly ignores its own history, he writes that we should learn from the past and react to the present.

The mention of scientific papers raises an issue Schierer seems not to worry about: waste. Every retracted paper represents lost resources – public funding, scientists’ time and effort, and the same multiplied into the future for anyone who attempts to build on that paper. Figuring out how to automate reliable detection of chatbot-generated text does nothing to lessen the vast energy, water, and human resources that go into building and maintaining all those data centers and training models (see also filtering spam). Like Scheirer, I’m largely optimistic about our ability to adapt to a more slippery virtual reality. But the amount of wasted resources is depressing and, given climate change, dangerous.