Conundrum

It took me six hours of listening to people with differing points of view discuss AI and copyright at a workshop, organized by the Sussex Centre for Law and Technology at the Sussex Humanities Lab (SHL), to come up with a question that seemed to me significant: what is all this talk about who “wins the AI race”? The US won the “space race” in 1969, and then for 50 years nothing happened.

Fretting about the “AI race”, an argument at least one participant used to oppose restrictions on using copyrighted data for training AI models, is buying into several ideas that are convenient for Big Tech.

One: there is a verifiable endpoint everyone’s trying to reach. That isn’t anything like today’s “AI”, which is a pile of math and statistics predicting the most likely answers to prompts. Instead, they mean artificial general intelligence, which would be as much like generative AI as I am like a mushroom.

Two: it’s a worthy goal. But is it? Why don’t we talk about the renewables race, the zero carbon race, or the sustainability race? All of those could be achievable. Why just this well-lobbied fantasy scenario?

Three: we should formulate public policy to eliminate “barriers” that might stop us from winning it. *This* is where we run up against copyright, a subject only a tiny minority used to care about, but that now affects everyone. And, accordingly, everyone has had time to formulate an opinion since the Internet first challenged the historical operation of intellectual property.

The law as it stands is clear: making a copy is the exclusive right of the rightsholder. This is the basis of AI-related lawsuits. For training data to escape that law, it would have to be granted an exemption: ruled fair use (as in the Anthropic and Meta cases), covered by a new exception for temporary copies, or shoehorned into existing exceptions such as parody. Even then, copyright law is administered territorially, so the US may call it fair use but the rest of the world doesn’t have to agree. This is why the esteemed legal scholar Pamela Samuelson has said copyright law poses an existential threat to generative AI.

But, as one participant pointed out, although the entertainment industry dominates these discussions, there are many other sectors with different needs. Science, for example, both uses and studies AI, and is built on massive amounts of public funding. Surely that data should be free to access?

I wanted to be at this meeting because what should happen with AI, training data, and copyright is a conundrum. You do not have to work for a technology company to believe that there is value in allowing researchers both within and outwith companies to work on machine learning and build AI tools. When people balk at the impossible scale of securing permission from every copyright holder of every text, image, or sound, they have a point. The only organizations that could afford that are the companies we’re already mad at for being too big, rich, and powerful.

At the same time, why should we allow those big, rich, powerful companies to plunder our cultural domain without compensating anyone and extract even larger fortunes while doing it? To a published author who sees years of work reflected in a chatbot’s split-second answer to a prompt, it’s lost income and readers.

So for months, as Parliament has wrangled over the Data bill, the argument narrowed to copyright. Should there be an exception for data mining? Should technology companies have to get permission from creators and rights holders? Or should use of their work be automatically allowed, unless they opt out? All answers seem equally impossible. Technology companies would have to find every copyright holder of every datum to get permission. Licensing by the billion.

If creators must opt out, does that mean one piece at a time? How will they know when they need to opt out and who they have to notify? At the meeting, that was when someone said that the US and China won’t do this. Britain will fall behind internationally. Does that matter?

And yet, we all seemed to converge on this: copyright is the wrong tool. As one person said, technologies that threaten the entertainment industry always bring demands to tighten or expand copyright. See the last 35 years, in which Internet-fueled copying spawned the Digital Millennium Copyright Act and the EU Copyright Directive, and copyright terms expanded from 28 years, renewable once, to author’s life plus 70.

No one could suggest what the right tool would be. But there are good questions. Such as: how do we grant access to information? With business models breaking, is copyright still the right way to compensate creators? One of us believed strongly in the capabilities of collection societies – but these tend to disproportionately benefit the most popular creators, who will survive anyway.

Another proposed the highly uncontroversial idea of taxing the companies. Or levies on devices such as smartphones. I am dubious on this one: we have been there before.

And again, who gets the money? Very successful artists like Paul McCartney, who has been vocal about this? Or do we have a broader conversation about how to enable people to be artists? (And then, inevitably, who gets to be called an artist.)

I did not find clarity in all this. How to resolve generative AI and copyright remains complex and confusing. But I feel better about not having an answer.

Illustrations: Drunk parrot in a Putney garden (by Simon Bisson; used by permission).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon or Bluesky.

Optioned

The UK’s public consultation on creating a copyright exception for AI model training closed on Tuesday, and it was profoundly unsatisfying.

Many, many creators and rights holders (who are usually on opposing sides when it comes to contract negotiations) have opposed the government’s proposals. Every national newspaper ran the same Make It Fair front page opposing them; musicians released a silent album. In the Guardian, the peer and independent filmmaker Beeban Kidron calls the consultation “fixed” in favor of the AI companies. Kidron’s resume includes directing Bridget Jones: The Edge of Reason (2004) and the meticulously researched 2013 study of teens online, InRealLife, and she goes on to call the government’s preferred option a “wholesale transfer of wealth from hugely successful sector that invests hundreds of millions in the UK to a tech industry that extracts profit that is not assured and will accrue largely to the US and indeed China.”

The consultation lists four options: leave the situation as it is; require AI companies to get licenses to use copyrighted work (like everyone else has to); allow AI companies to use copyrighted works however they want; and allow AI companies to use copyrighted works but grant rights holders the right to opt out.

I don’t like any of these options. I do believe that creators will figure out how to use AI tools to produce new and valuable work. I *also* believe that rights holders will go on doing their best to use AI to displace or impoverish creators. That is already happening in journalism and voice acting, and was a factor in the 2023 Hollywood writers’ strike. AI companies have already shown that they won’t necessarily abide by arrangements that lack the force of law. The UK government acknowledged this in its consultation document, saying that “more than 50% of AI companies observe the longstanding Internet convention robots.txt.” So almost half of them *don’t*.
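For readers who have not met robots.txt, a minimal sketch in Python shows why observing it is voluntary: the check runs inside the crawler, not on the site. The bot name `ExampleAIBot`, the rules, and the `example.org` URL are hypothetical, made up for illustration; the parser is the one in Python’s standard library.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: one named AI crawler is asked to stay out,
# everyone else is welcome.
rules = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

url = "https://example.org/essays/1"  # hypothetical page
ai_bot_allowed = parser.can_fetch("ExampleAIBot", url)
other_allowed = parser.can_fetch("SomeBrowser", url)

print(ai_bot_allowed)  # False
print(other_allowed)   # True
```

Nothing on the server enforces the `False`. A crawler that never asks the question, or ignores the answer, can fetch the page anyway, which is why the convention lacks the force of law.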

At Pluralistic, Cory Doctorow argued in February 2023 that copyright won’t solve the problems facing creators. His logic is simple: after 40 years of expanding copyright terms (from a maximum of 56 years in 1975 to “author’s life plus 70” now), creators are being paid *less* than they were then. Yes, I know Taylor Swift has broken records for tour revenues and famously took back control of her own work, but millions of others need, as Doctorow writes, structural market changes. Doctorow highlights what happened with sampling: the copyright maximalists won, and now musicians are required to sign away sampling rights to their labels, who pocket the resulting royalties.

For this sort of reason, the status quo, which the consultation calls “option 0”, seems likely to open the way to lots more court cases and conflicting decisions, but provide little benefit to anyone. A licensing regime (“option 1”) will likely go the way of sampling. If you think of AI companies as inevitably giant “pre-monopolized” outfits, like Vladan Joler at last year’s Computers, Privacy, and Data Protection conference, “option 2” looks like simply making them richer and more powerful at the expense of everyone else in the world. But so does “option 3”, since that *also* gives AI companies the ability to use anything they want. Large rights holders will opt out and demand licensing fees, which they will keep, and small ones will struggle to exercise their rights.

As Kidron said, the government’s willingness to take chances with the country’s creators’ rights is odd, since intellectual property is a sector in which Britain really *is* a world leader. On the other hand, as Glyn Moody says, all of it together is an anthill compared to the technology sector.

None of these choices is a win for creators or the public. The government’s preferred option 3 seems unlikely to achieve its twin goals of making Britain a world leader in AI and mainlining AI into the veins of the nation, as the government put it last month.

China and the US both have complete technology stacks *and* gigantic piles of data. The UK is likely better able to matter in AI development than many countries – see for example DeepMind, which was founded here in 2010. On the other hand, also see DeepMind for the probable future: Google bought it in 2014, and now its technology and profits belong to that giant US company.

At Walled Culture, Glyn Moody argued last May that requiring the AI companies to pay copyright industries makes no sense; he regards using creative material for training purposes as “just a matter of analysis” that should not require permission. And, he says correctly, there aren’t enough such materials anyway. Instead, he and Mike Masnick at Techdirt propose that the generative AI companies should pay creators of all types – journalists, musicians, artists, filmmakers, book authors – to provide them with material they can use to train their models, and the material so created should be placed in the public domain. In turn it could become new building blocks the public can use to produce even more new material. As a model for supporting artists, patronage is old.

I like this effort to think differently a lot better than any of the government’s options.

Illustrations: Tuesday’s papers, unprecedentedly united to oppose the government’s copyright plan.

Disharmony

When an individual user does it, it’s piracy. When a major company does it…it may just get away with it.

At TechCrunch, Kyle Wiggers reports that buried in newly unredacted documents in the copyright case Kadrey v. Meta is testimony that Meta trained its Llama language model on a dataset of ebooks it torrented from LibGen. So, two issues. First, LibGen has been sued numerous times, fined, and ordered to shut down. Second: torrent downloads simultaneously upload to others. So, allegedly, Meta knowingly pirated copyrighted books to train its language model.

Kadrey v. Meta was brought by novelist Richard Kadrey, writer Christopher Golden, and comedian Sarah Silverman, and is one of a number of cases accusing technology companies of training language models on copyrighted works without permission. Meta claims fair use. Still, not a good look.

***

Coincidentally, this week CEO Mark Zuckerberg announced changes to the company’s content moderation policies in the US (for now), a move widely seen as pandering to the incoming administration. The main changes announced in Zuckerberg’s video clip: Meta will replace fact-checkers (“too politically biased”) with a system of user-provided “community notes” as on exTwitter, remove content restrictions that “shut out people with different ideas”, dial back its automated filters to focus solely on illegal content, rely on user reports to identify material that should be taken down, bring back political content, and move its trust and safety and content moderation teams from California to Texas (“where there is less concern about the bias of our teams”). He also pledges to work with the incoming president to “push back on governments around the world that are going after American companies and pushing to censor more”.

Journalists and fact-checkers are warning that misinformation and disinformation will be rampant, and many are alarmed by the specifics of the kind of thing people are now allowed to say. Zuckerberg frames all this as a “return” to free expression while acknowledging that, “We’re going to catch less bad stuff.”

At Techdirt, Mike Masnick begins as an outlier, arguing that many of these changes are actually sensible, though he calls the reasoning behind the Texas move “stupid”, and deplores Zuckerberg’s claim that this is about “free speech” and removing “censorship”. A day later, after seeing the company’s internal guidelines unearthed by Kate Knibbs at Wired, he deplores the new moderation policy as “hateful people are now welcome”.

More interesting for net.wars purposes is the international aspect. As the Guardian says, Zuckerberg can’t bring these changes across to the EU or UK without colliding headlong with the UK’s Online Safety Act and the EU’s Digital Services Act. Both lay down requirements for content moderation on the largest platforms.

And yet, it’s possible that Zuckerberg may also think these changes help lay the groundwork to meet the EU/UK requirements. Meta will still remove illegal content, which it’s required to do anyway. But he may think there’s a benefit in dialing back users’ expectations about what else Meta will remove, in that platforms must conform to the rules they set in their terms and conditions. Notice-and-takedown is an easier standard to meet than performance indicators for automated filters. It’s also likely cheaper. This approach is, however, the opposite of what critics like Open Rights Group have predicted the law will bring; ORG believes that platforms will instead over-moderate in order to stay out of trouble, chilling free speech.

Related is an interesting piece by Henry Farrell at his Programmable Matter newsletter, who argues that the more important social media speech issue is that what we read there determines how we imagine others think rather than how we ourselves think. In other words, misinformation, disinformation, and hate speech change what we think is normal, expanding the window of what we think other people find acceptable. That has resonance for me: the worst thing about prominent trolls is they give everyone else permission to behave as badly as they do.

***

It’s now 25 years since I heard a privacy advocate predict that the EU’s then-new data protection rights could become the basis of a trade war with the US. While instead the EU and US have kept trying to find a bypass that will withstand a legal challenge from Max Schrems, the approaches seem to be continuing to diverge, and in more ways.

For example, last week in the longrunning battle over network neutrality, judges on the US Sixth Circuit Court of Appeals ruled that the Federal Communications Commission was out of line when it announced rules in 2023 that classified broadband suppliers as common carriers under Title II of the Communications Act (1934). This judgment is the result of the Supreme Court’s 2024 decision to overturn the Chevron deference, setting courts free to overrule government agencies’ expertise. And that means the end in the US (until or unless Congress legislates) of network neutrality, the principle that all data flowing across the Internet was created equal and should be transmitted without fear or favor. Network neutrality persists in California, Washington, and Colorado, whose legislatures have passed laws to protect it.

China has taught us that the Internet is more divisible by national law than many thought in the 1990s. Copyright law may be the only thing everyone agrees on.

Illustrations: Drunk parrot in a South London garden (by Simon Bisson; used by permission).

The return of piracy

In Internet terms, it’s been at least a generation since the high-profile fights over piracy – that is, the early 2000s legal actions against unauthorized sites offering music, TV, and films, and the people who used them. Visits to the news site TorrentFreak this week feel like a revival.

The wildest story concerns Z-Library, for some years the largest shadow book collection. Somewhere someone must be busily planning a true crime podcast series. Z-Library was briefly offline in 2022, when the US Department of Justice seized many of its domains. Shortly afterwards there arrived Anna’s Archive, a search engine for shadow libraries – Z-Library and many others, and the journal article shadow repository Sci-Hub. Judging from a small sampling exercise, you can find most books that have been out for longer than a month. Newer books tend to be official ebooks stripped of digital rights management.

In November 2022, the Russian nationals Anton Napolsky and Valeriia Ermakova were arrested in Argentina, alleged to be Z-Library administrators. The US requested extradition, and an Argentinian judge agreed. They appealed to the Argentinian supreme court, asking to be classed as political refugees. This week, a story in local publication La Voz made its way north. As Ashley Belanger explains at Ars Technica, Napolsky and Ermakova decided not to wait for a judgment, escaped house arrest back in May, and vanished. The team running Z-Library say the pair are innocent of copyright infringement.

Also missing in court: Anna’s Archive’s administrators. As noted here in February, the library service company OCLC sued Anna’s Archive for having exploited a security hole in its website in order to scrape 2.2TB of its book metadata. This might have gone unnoticed, except that the admins published the news on their blog. OCLC claims that remediating its systems after the breach has cost millions.

This week saw an update to the case: OCLC has moved for summary judgment as Anna’s Archive’s operators have failed to turn up in court. At TorrentFreak, Ernesto van der Sar reports that OCLC is also demanding millions in damages and injunctive relief barring Anna’s from continuing to publish the scraped data, though it does not ask for the site to be blocked. (The bit demanding that Anna’s Archive pay the costs of remediating OCLC’s flawed systems is puzzling; do we make burglars who climb in through open windows pay for locksmiths?)

And then there is the case of the Internet Archive’s Open Library, which claims its scanned-in books are legal under the novel theory of controlled digital lending. When the Internet Archive responded to the covid crisis by removing those controls in 2020, four major publishers filed suit. In 2023, the US District Court for the Southern District of New York ruled against the Internet Archive, saying its library enables copyright infringement. Since then, the Archive has removed 500,000 books.

This is the moment when lessons from the past of music, TV, and video piracy could be useful. Critics always said that the only workable answer to piracy is legal, affordable services, and they were right, as shown by Pandora, Spotify, Netflix, which launched its paid streaming service in 2007, and so many others.

It’s been obvious for at least two years that things are now going backwards. Last December, in one of many such stories, the Discovery/Warner Brothers merger ended a licensing agreement with Sony, leading the latter to delete from PlayStation users’ libraries TV shows they had paid for in the belief that they would remain permanently available. The next generation is learning the lesson. Many friends over 40 say they can no longer play CDs or DVDs; teenaged friends favor physical media because they’ve already learned that digital services can’t be trusted.

Last September, we learned that Hollywood studios were deleting finished but unaired programs and parts of their back catalogues for tax reasons. Sometimes, shows have abruptly disappeared mid-season. This week, Paramount removed decades of Comedy Central video clips; last month it axed the MTV News archives. This is *our* culture, even if it’s *their* copyright.

Meanwhile, the design of streaming services has stagnated. The complaints people had years ago about interfaces that make it hard to find the shows they want to see are the same ones they have now. Content moves unpredictably from one service to another. Every service is bringing in ads and raising prices. The benefits that siphoned users from broadcast and cable are vanishing.

As against this, consider pirate sites: they have the most comprehensive libraries; there are no ads; you can use the full-featured player of your choice; no one other than you can delete them; and they are free. Logically, piracy should be going back up, and at least one study suggests it is. If only they paid creators…

The lesson: publishers may live to regret attacking the Internet Archive rather than finding ways to work with it – after all, it sends representatives to court hearings and obeys rulings; if so ordered, it might even pay authors. In 20 years, no one’s managed to sink The Pirate Bay; there’ll be no killing the shadow libraries either, especially since my sampling finds that the Internet Archive’s uncorrected scans are often the worst copies to read. Why let the pirate sites be the ones to offer the best services?

Illustrations: The New York Public Library, built 1911 (via Wikimedia).

To tell the truth

It was toward the end of Craig Wright’s cross-examination on Wednesday when, for the first time in many days, he was lost for words. Wright is in court because the non-profit Crypto Open Patent Alliance seeks a ruling that he is not, as he claims, bitcoin inventor Satoshi Nakamoto, who was last unambiguously heard from in 2011.

Over the preceding days, Wright had repeatedly insisted “I am the real Satoshi” and disputed forensic analysis – anachronistic fonts, metadata, time stamps – pronouncing his proffered proofs forgeries. He was consistently truculent, verbose, and dismissive of everyone’s expertise but his own and of everyone’s degrees except the ones he holds. For example: “Meiklejohn has not studied cryptography in any depth,” he said of Sarah Meiklejohn, the now-professor who as a student in 2013 showed that bitcoin transactions are traceable. In a favorite moment, Jonathan Hough, KC, who did most of the cross-examination, interrupted a diatribe about the failings of the press with, “Moving on from your expertise on journalism, Dr Wright…”

Participants in a drinking game based on his saying “That is not correct” would be dead of alcohol poisoning. In between, he insisted several times that he never wanted to be outed as Satoshi, and wishes that everyone would “leave me alone and let me invent”. Any money he is awarded in court he will give to charities; he wants nothing for himself.

But at the moment we began with, he was visibly stumped. The question, regarding a variable on a GitHub page: “Do you know what unsigned means?”

Wright: “Basically, an unsigned variable…it’s not an integer with…it’s larger. I’m not sure how to say it.”

Lawyer: “Try.”

Wright: “How I’d describe it, I’m not quite sure. I’m not good with trying to do things like this.” He could explain it easily in writing… (Transcription by Norbert on exTwitter.)

The lawyer explained it thusly: an unsigned variable cannot be a negative number.

“I understand that, but would I have thought of saying it in such a simple way? No.”

Experience as a journalist teaches you that the better you understand something the more simply and easily you can explain it. Wright’s inability to answer blew the inadequately bolted door plug out of his world’s expert persona. Everything until then could be contested: the stomped hard drive, the emails he wrote, or didn’t write, or wrote only one sentence of, the allegations that he had doctored old documents to make it look like he had been thinking about bitcoin before the publication of Satoshi’s foundational 2008 paper. But there’s no disguising lack of basic knowledge. “Should have been easy,” says a security professor (tenured, chaired) friend.
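To make the lawyer’s one-line definition concrete, here is a minimal sketch in Python. Python has no unsigned type of its own, so the sketch borrows fixed-width C integers through the standard `ctypes` module. The 32-bit width is an assumption chosen for illustration; nothing in the testimony says how wide the variable in question was.

```python
import ctypes

# A signed 32-bit integer reserves one bit pattern range for negatives.
signed = ctypes.c_int32(-1)

# The unsigned version has no negatives. ctypes does no overflow
# checking, so -1 wraps around to the largest value 32 bits can hold.
unsigned = ctypes.c_uint32(-1)

print(signed.value)    # -1
print(unsigned.value)  # 4294967295, which is 2**32 - 1
```

The same 32 bits are read two ways: as -1 when signed, as 4,294,967,295 when unsigned. Giving up negative numbers roughly doubles the largest value that fits.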

Normally, cryptography removes ambiguity. This is especially true of public key cryptography and its complementary pair of public and private keys. Being able to decrypt something with a well-attested public key is clear proof that it was encrypted with the complementary private key. Contrariwise, if a specific private key decrypts it, you know that key’s owner is the intended recipient. In both cases, as a bonus, you get proof that the text has not been altered since its encryption. It *ought* to be simple for Wright to support his claim by using Satoshi’s private keys. If he can’t do that, he must present a reason and rely on weaker alternatives.
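The signing half of that description can be sketched in a few lines of Python. This is a toy under loud assumptions: it uses textbook RSA with tiny primes that could be broken by hand, whereas bitcoin itself uses elliptic-curve signatures (ECDSA). The primes, the message, and the helper names `digest`, `sign`, and `verify` are all illustrative inventions, not anything from the case.

```python
import hashlib

# Toy RSA key pair from tiny primes. Illustrative only, trivially breakable.
p, q = 61, 53
n = p * q                   # public modulus
phi = (p - 1) * (q - 1)
e = 17                      # public exponent, published with n
d = pow(e, -1, phi)         # private exponent, kept secret (Python 3.8+)

def digest(message: str) -> int:
    """Hash the message and reduce it to fit the toy modulus."""
    raw = hashlib.sha256(message.encode()).digest()
    return int.from_bytes(raw, "big") % n

def sign(message: str) -> int:
    """Only the holder of the private exponent d can compute this."""
    return pow(digest(message), d, n)

def verify(message: str, signature: int) -> bool:
    """Anyone can check a signature using only the public values e and n."""
    return pow(signature, e, n) == digest(message)

message = "I hold the private key"
signature = sign(message)

genuine_ok = verify(message, signature)
# A signature that differs at all no longer checks out against the public key.
forged_ok = verify(message, (signature + 1) % n)

print(genuine_ok)  # True
print(forged_ok)   # False
```

The point for the courtroom is the asymmetry: producing `signature` requires the private key, while checking it requires nothing secret at all.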

Courts of law, on the other hand, operate on the balance of probabilities. They don’t remove ambiguity; they study it. Wright’s case is therefore a cultural clash, with far-reaching consequences. COPA is complaining that Wright’s repeated intellectual property lawsuits against developers working on bitcoin projects are expensive in both money and time. Soon after the unsigned variable exchange, the lawyer asked Wright what he will do if the court rules against him. “Move on to patents,” Wright said. He claims thousands of patents relating to bitcoin and the blockchain, and a brief glance at Google Patents shows many filings, some granted.

However this case comes out, therefore, it seems likely Wright will continue to try to control bitcoin. Wright insists that bitcoin isn’t meant to be “digital gold”, but that its true purpose is to facilitate micropayments. I haven’t “studied bitcoin in any depth” (as he might say), but as far as I can tell it’s far too slow, too resource-intensive, and too volatile to be used that way. COPA argues, I think correctly, that it’s the opposite of the world enshrined in Satoshi’s original paper; its whole point was to use cryptography to create the blockchain as a publicly attested, open, shared database that could eliminate central authorities such as banks.

In the Agatha Christie version of this tale, most likely Wright would be an imposter, an early hanger-on who took advantage of the gap formed by Satoshi’s disappearance and the deaths of other significant candidates. Dorothy Sayers would have Lord Peter Wimsey display unexpected mathematical brilliance to improve on Satoshi’s work, find him, and persuade him to turn over his keys and documents to king and country. Sir Arthur Conan Doyle would have both Moriarty and Sherlock Holmes on the trail. Holmes would get there first and send him into protection to ensure Moriarty couldn’t take criminal advantage. And then the whole thing would be hushed up in the public interest.

The case continues.

Illustrations: The cryptographic code from “The Dancing Men”, by Sir Arthur Conan Doyle (via Wikimedia).

Nefarious

TorrentFreak is reporting that OCLC, owner of the WorldCat database of bibliographic records, is suing the “shadow library” search engine Anna’s Archive. The claim: that Anna’s Archive hacked into WorldCat, copied 2.2TB of records, and posted them publicly.

Shadow libraries are the text version of “pirate” sites. The best-known is probably Sci-Hub, which provides free access to hundreds of thousands of articles from (typically expensive) scientific journals. Others such as Library Genesis and sites on the dark web offer ebooks. Anna’s Archive indexes as many of these collections as it can find; it was set up in November 2022, shortly after the web domains belonging to the then-largest of these book libraries, Z-Library, were seized by the US Department of Justice. Z-Library has since been rebuilt on the dark web, though it remains under attack by publishers and law enforcement.

Anna’s Archive also includes some links to the unquestionably legal and long-running Project Gutenberg, which publishes titles in the public domain in a wide variety of formats.

The OCLC-Anna’s Archive case has a number of familiar elements that are variants of long-running themes, open versus gatekept being the most prominent. Like many such sites (post-Napster), Anna’s Archive does not host files itself. That’s no protection from the law; authorities in various countries have nonetheless blocked or seized the domains belonging to such sites. But OCLC is not a publisher or rights holder, although it takes large swipes at Anna’s Archive for lawlessness and copyright infringement. Instead, it makes two complaints. First, it says Anna’s Archive hacked WorldCat, violating its terms and conditions, disrupting its business arrangements, and costing it $1.4 million and 10,000 employee hours in system remediation. Second, it complains that Anna’s Archive has posted the data in the aggregate for public download, and is “actively encouraging nefarious use of the data”. Other than the use of “nefarious”, there seems little to dispute about either claim; Anna’s Archive published the details in an October 2023 blog posting.

Anna’s Archive describes this process as “preserving” the world’s books for public access. OCLC describes it as “tortious interference” with its business. It wants the court to issue injunctive relief to make the scraping and use of the data stop, compensatory damages in excess of $75,000, punitive damages, costs, and whatever else the court sees fit. The sole named defendant is a US citizen, María A. Matienzo, thought to be resident near Seattle. If the identification and location are correct, that’s a high-risk situation to be in.

In the blog posting, Anna’s Archive writes that its initial goal was to answer the question of what percentage of the world’s published books are held in shadow libraries and create a to-do list of gaps to fill. To answer these questions, they began by scraping ISBNdb, the database of publications with ISBNs, which only came into use in 1970. When the overlap with the Internet Archive’s Open Library and the seized Z-Library was less than they hoped, they turned to WorldCat. At that point, they openly say that security flaws in the fortuitously redesigned WorldCat website allowed them to grab more or less the comprehensive set of records. While scraping can be legal, exploiting security flaws to gain unauthorized access to a computer system likely violates the widely criticized Computer Fraud and Abuse Act (1986), and that could be a felony. OCLC has, however, brought a civil case.

Anna’s Archive also searches the Internet Archive’s Open Library, founded in 2006. In 2009, co-creator Aaron Swartz told me that he believed the creation of Open Library pushed OCLC into opening up greater public access to the basic tier of its bibliographic data. The Open Library currently has its own legal troubles; it lost in court in August 2023 after Hachette sued it for copyright infringement. The Internet Archive is appealing; in the meantime it is required to remove, on request of any member of the Association of American Publishers, any book commercially available in electronic format.

OCLC began life as the Ohio College Library Center; its WorldCat database is a collaboration between it and its member libraries to create a shared database of bibliographic records and enable online cataloguing. The last time I wrote about it, in 2009, critics were complaining that libraries in general were failing to bring book data onto the open web. It has gotten a lot better in the years since, and many local libraries are now searchable online and enable their card holders to borrow from their holdings of ebooks over the web.

The fact that it’s now often possible to borrow ebooks from libraries should mean there’s less reason to use unauthorized sites. Nonetheless, these still appeal: they have the largest catalogues, the most convenient access, DRM-free files, and no time limits, so you can read them at your leisure using the full-featured reader you prefer.

In my 2009 piece, an OCLC spokesperson fretted about “over-exploitation”, in which there would be no good way to maintain or update countless unknown scattered pockets of data. That seems a solvable problem.

OCLC and its member libraries are all non-profit organizations ultimately funded by taxpayers. The data they collect has one overriding purpose: to facilitate public access to libraries’ holdings by showing who holds what books in which editions. What are “nefarious” uses? Arguably, the data they collect should be public by right. But that’s not the question the courts will decide.

Illustrations: The New York Public Library, built 1911 (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Objects of copyright

Back at the beginning, the Internet was going to open up museums to those who can’t travel to them. Today…

At the Art Newspaper, Bendor Grosvenor reports that a November judgment from the UK Court of Appeal means museums can’t go on claiming copyright in photographs of public domain art works. Museums have used this claim to create costly licensing schemes. For art history books and dissertations that need the images for discussion, the costs are often prohibitive. And, it turns out, the “GLAM” (galleries, libraries, archives, and museums) sector isn’t even profiting from it.

Grosvenor cites figures: the National Gallery alone lost £31,000 on its licensing scheme in 2021-2022 (how? is left as an exercise for the reader). This figure was familiar: Douglas McCarthy, whom the article quotes, cited it at gikii 2023. As an ongoing project with Andrea Wallace, McCarthy co-runs the Open GLAM survey, which collects data to show the state of open access in this sector.

In his talk, McCarthy, an art historian by training and the head of the Library Learning Center at Delft University of Technology, showed that *most* such schemes are losing money. The National Gallery of Liverpool, for example, lost £71,873 on licensing activities between 2018 and 2023.

Like Grosvenor, McCarthy noted that the scholars whose work underpins the work of museums and libraries are finding it increasingly difficult to afford that work. One of McCarthy’s examples was St Andrews art history professor Kathryn M. Rudy, who summed up her struggles in a 2019 piece for Times Higher Education: “The more I publish, the poorer I get.”

Rudy’s problem is that publishing in art history, which is necessary for university hiring and promotions, requires the use of images of the works under discussion. In her own case, the 1,419 images she needed to use to publish six monographs and 15 articles have consumed most of her disposable income. To be fair, licensing fees are only part of this. She also lists travel to view offline collections, the costs of attending conferences, data storage, academic publishers’ production fees, and paying for the copies of books contracts require her to send the libraries supplying the images; some of this is covered by her university. But many of those extra costs come from licensing fees that add up to thousands of pounds for the material necessary for a single book: reproduction fees, charges for buying high-resolution copies for publication, and even, when institutions allow it at all, fees for photographing images in situ using her phone. Yet these institutions are publicly funded, and the works she is photographing have been bought with money provided by taxpayers.

On the face of it, THJ v. Sheridan, as explained by the law firm Penningtons Manches Cooper in a legal summary, doesn’t seem to have much to do with the GLAM sector. Instead, the central copyright claim concerned software the defendant used in webinars and presentations. However, the point, as the Kluwer Copyright blog explains, was which test to apply to decide whether a copyrighted work is original.

In court, THJ, a UK-based software development firm, claimed that Daniel Sheridan, a US options trading mentor and former licensee, had misrepresented its software as his own and had violated THJ’s copyright by using the software after his license agreement expired by including images of the software in his presentations. One of THJ’s two claims failed on the basis that the THJ logo and copyright notices were displayed throughout the presentation.

The second is the one that interests us here: THJ claimed copyright in the images of its software based on the 1988 Copyright, Designs, and Patents Act. The judge, however, ruled that while the CDPA applies to the software, images of the software’s graphical interface are not covered; to constitute infringement Sheridan would have had to communicate the images to the UK public. In analyzing the judgment, Grosvenor pulled out the requirements for copyright to apply: that making the images required some skill and labor on the part of the person or organization making the claim. By definition, this can’t be true of a photograph of a painting, which needs to be as accurate a representation as possible.

Grosvenor has been on this topic for a while. In 2017, he issued a call to arms in Art History News, arguing that image reproduction fees are “killing art history”.

In 2017, Grosvenor was hopeful, because US museums and a few European ones were beginning to do away with copyright claims and licensing fees and finding that releasing the images to the public to be used for free in any context created value in the form of increased discussion, broadcast, and visibility. Progress continues, as McCarthy’s data shows, but inconsistently: last year the incoming Italian government reversed its predecessor’s stance by bringing back reproduction fees even for scientific journals.

Granted, all of the GLAM sector is cash-strapped and is desperately seeking new sources of income. But these copyright claims seem particularly backwards. It ought to be obvious that the more widely images of an institution’s holdings are published the more people will want to see the original; greater discussion of these art works would seem to fulfill their mission of education. Opening all this up would seem to be a no-brainer. Whether the GLAM folks like it or not, the judge did them a favor.

Illustrations: “Harpist”, from antiphonal, Cambrai or Tournai c. 1260-1270, LA, Getty Museum, Ms. 44/Ludwig VI 5, p. 115 (via Discarding Images).


Property is theft

If you were to judge just by behavior, you would have to conclude that the entertainment industry’s rights holders are desperate to promote piracy.

The latest instance is that Sony has warned American Playstation owners that shows they purchased – *bought* – from Discovery will, now that Discovery has merged with Warner Brothers, be removed from their video libraries. This isn’t like Netflix losing the license to stream your current favorite show halfway through season 2, which you can maybe fix by joining whichever streaming service the show is now on (assuming there is one). No, this is you (thought you) bought and they took it away.

In other words, the entertainment industry has taken the old anarchist slogan property is theft and turned it into a business model.

This isn’t a one-time occurrence. As Timothy Geigner writes at TechDirt, in 2022 customers in Germany and Austria lost access to hundreds of movies when a deal between Sony and film distributor Studio Canal expired. As in the Warner Brothers/Discovery case, it’s not just that the movies were removed from the list available for purchase; the long, remote arm of Sony reached into individual Playstations and removed them from there, too.

If Warner Brothers sent a minion to come into my house, take a DVD from a shelf, and take it away, that would clearly be theft, even if I had given the company a key so it could come in and update my Blu-Ray player. Why is it different if it’s a digital file held on an electronic device?

This is the kind of question I used to get asked back when these copyright battles were new. “You’re a freelance writer,” said the first person I interviewed on this sort of subject, back in 1991; he was the new head of the Federation Against Software Theft. “You make your living from copyright. Why aren’t you against piracy?” (Or something close to that.)

At the time the big battle in freelance journalism was that publishers were pushing toward all-rights contracts that would let them use whatever we wrote forever without further payment. Freelances were trying to hang onto the old arrangement, under which the publisher just got the right to run the piece once (and *first*), and then the freelance could go on and resell the piece in whole or in part to others and in other markets. Columnists made money by compiling their pieces into books. Magazine writers made money by reselling to other countries or selling reworked versions to specialist publications.

By 1995 you couldn’t really make money that way any more. Today, younger freelances have little idea it was ever possible. This, again, is the future the recent SAG-AFTRA strikes were trying to avoid. The shift is more simply described like this: the old way was pay per use; the new way the studios want is pay once, use forever. This struggle is endemic to every industry, as SAG head Fran Drescher pointed out.

The exact opposite is what’s happening to consumer access. In the old way, because buying physical media conferred ownership of the media (and the fact that the content was only ever licensed was largely moot), consumers bought once and used as much as they wanted until the disc or tape wore out. Even if streaming doesn’t quite open the way for paying for every use (though I bet that’s the hope), it does grant remote control to anyone who has access to the device – even if you thought you only granted permission to put stuff there, not remove it.

If I remember correctly, the first time people realized this kind of power existed was in 2009, when Amazon deleted (irony of ironies) copies of George Orwell’s novel 1984 from thousands of Kindles because the third-party company selling the ebook did not in fact have the rights to it. In this particular case, Amazon did refund the money people had paid. Since then, there’s been a steady trickle of cases where ultimate control of the device stays with its maker and doesn’t transfer to the person who paid to buy it.

You might think that the solution is to go on (or back to) buying the entertainment you love on physical media…but that option is also under threat. Disney announced in July that it would stop selling DVDs and Blu-Ray discs in Australia. In the US, Best Buy is about to stop carrying them. Add in the recent trend for deleting even successful shows for tax reasons and the unpredictability of which streaming service might have the thing you’re looking for, and you have an extremely consumer-hostile industry.

For consumers, the perfect service looks something like this: the library is, if not complete, *very* extensive, all indexed in one place, and easily searchable using a simple but effective interface. Downloads are quick and give you a file you can move around, replay, or copy to friends at will. There are no ads. It will play on any device that can play video. Repeated viewings don’t require an Internet connection. *That* is what piracy offers. It’s not that it’s free. It’s that it gives people what they want. And the worse commercial services become, the better piracy looks. If only it paid the artists…

Illustrations: Opera Australia performing The Pirates of Penzance in 2007 (via Wikimedia).


The end of ownership

It seems no manufacturer will be satisfied until they have turned everything they make into an ongoing revenue stream. Once, it was enough to sell widgets. Then, you needed to have a line of upgrades and add-ons for your widgets and all your sales personnel were expected to “upsell” at every opportunity. Now, you need to turn some of those upgrades and add-ons into subscription services, and throw in some ads for extra revenue. All those ad-free moments in your life? To you, this is space in which to think your own thoughts. To advertisers, these are golden opportunities that haven’t been exploitable before and should be turned to their advantage. (Years ago, I remember, for example, a speaker at a lunchtime meeting convened by the Internet Advertising Bureau saying with great excitement that viral emails could bring ads into workplaces, which had previously been inaccessible.)

The immediate provocation for this musing is the Chamberlain garage door opener that blocks third-party apps in order to display ads. To be fair, I have no skin in this specific game: I have neither garage door opener nor garage door. I don’t even have a car (any more). But I have used these items, and I therefore feel comfortable in saying that this whole idea sucks.

There are three objectionable aspects. First is the ad itself and the market change it represents. I accept that some apps on my phone show ads, but I accept that because I have so far decided not to pay for them (in part because I don’t want to give my credit card information to Google in order to do so). I also accept them because I have chosen to use the apps. Here, however, the app comes with the garage door opener, which you *have* paid for, and the company is double-dipping by trying to turn it into an ongoing revenue stream; its desire to block third-party apps is entirely to protect that revenue stream. Did you even *want* an app with your garage door opener? Does a garage door need options? My friends who have them seem perfectly happy with the two choices of open or closed, and with a gizmo clipped to their sun visor that just has a physical button to push.

Second is the reported user interface design, which forces you to scroll past the ad to get to the button to open the door. This is theft: Chamberlain is stealing a sliver of your time and patience whenever you need to open your garage door. Both are limited resources.

Third is the loss of control over – ownership of – objects you have ostensibly bought. With few exceptions, it has always been understood that once you’ve bought a physical object it’s yours to do with what you want. Even in the case of physical containers of intellectual property – books, CDs, LPs – you always had the right to resell or give away the physical item and to use it as often as you wanted to. The arrival of digital media forced a clarification: you owned the physical object but not the music, pictures, film, or text encoded on it. The part-pairing discussed here a couple of weeks ago is an early example of the extension of this principle to formerly wholly-owned objects. The more software infiltrates the physical world, the more manufacturers will seek to use that software to control how we use the devices they make.

In the case we began with, Chamberlain’s decision to shut off API access to third parties to protect its own profits mirrors a recent trend in social media such as Reddit and Twitter in response to large language models built on training data scraped from their sites. The upshot in the Chamberlain case is that the garage door openers stop working with home automation systems into which the owners want to integrate them. Chamberlain has called this integration unauthorized usage and complains that said use means a tiny proportion of its customers consumed more than half of the traffic to and from its system. Seems like someone could have designed a technical solution for this.

At Pluralistic, Cory Doctorow lists four ways companies can be stopped from exerting unreasonable post-purchase control: fear of their competition, regulation, technical feasibility, and customer DIY. All four, he writes, have so far failed in this case, not least because Chamberlain is now owned by the private equity firm Blackstone, which has already bought up its competitors. Because there are so many other examples, we can’t dismiss this as a one-off; it’s a trend! Or, in Doctorow’s words, “a vast and deadly rot”.

An early example came from Tesla in 2020, when it disabled Full Self-Driving on a used Model S on the grounds that the customer hadn’t paid for it. Over-the-air software updates give companies this level of control long after purchase.

Doctorow believes a countering movement is underway. I hope so, because writing this has led me to this little imaginary future horror: the guitar that silences itself until you type in a code to verify that you have paid royalties for the song you’re trying to play. Logically, then, all interaction with physical objects could become like waiting through the ads for other shows on DVDs until you could watch the one you paid to see. Life is *really* too short.

Illustrations: Steve (Campbell Scott) shows Linda (Kyra Sedgwick) how much he likes her by offering her a garage door opener in Cameron Crowe’s 1992 film Singles.


Power cuts

In the latest example of corporate destruction, the Guardian reports on the disturbing trend in which streaming services like Disney and Warner Bros Discovery are deleting finished, even popular, shows for financial reasons. It’s like Douglas Adams’ rock star Hotblack Desiato spending a year dead for tax reasons.

Given that consumers’ budgets are stretched so thin that many are reevaluating the streaming services they’re paying for, you would think this would be the worst possible time to delete popular entertainments. Instead, the industry seems to be possessed by a death wish in which it’s making its offerings *less* attractive. Even worse, the promise they appeared to offer to showrunners was creative freedom and broad and permanent access to their work. The news that Disney+ is even canceling finished shows (Nautilus) shortly before their scheduled release in order to pay less *tax* should send a chill through every creator’s spine. No one wants to spend years of their life – for almost *any* amount of money – making things that wind up in the corporate equivalent of the warehouse at the end of Raiders of the Lost Ark.

It’s time, as the Masked Scheduler suggested recently on Mastodon, for the emergence of modern equivalents of creator-founded studios United Artists and Desilu.

***

Many of us were skeptical about Meta’s Oversight Board; it was easy to predict that Facebook would use it to avoid dealing with the PR fallout from controversial cases, but never relinquish control. And so it is proving.

This week, Meta overruled the Board’s recommendation of a six-month suspension of the Facebook account belonging to former Cambodian prime minister Hun Sen. At issue was a video of one of Sen’s speeches, which everyone agreed incited violence against his opposition. Meta has kept the video up on the grounds of “newsworthiness”; Meta also declined to follow the Board’s recommendation to clarify its rules for public figures in “contexts in which citizens are under continuing threat of retaliatory violence from their governments”.

In the Platformer newsletter Casey Newton argues that the Board’s deliberative process is too slow to matter – it took eight months to decide this case, too late to save the election at stake or deter the political violence that has followed. Newton also concludes from the list of decisions that the Board is only “nibbling round the edges” of Meta’s policies.

A company with shareholders, a business model, and a king is never going to let an independent group make decisions that will profoundly shape its future. From Kate Klonick’s examination, we know the Board members are serious people prepared to think deeply about content moderation and its discontents. But they were always in a losing position. Now, even they must know that.

***

It should go without saying that anything that requires an Internet connection should be designed for connection failures, especially when the connected devices are required to operate the physical world. The downside was made clear by the 2017 incident, when lost signal meant a Tesla-owning venture capitalist couldn’t restart his car. Or the one in 2021, when a bunch of Tesla owners found their phone app couldn’t unlock their car doors. Tesla’s solution both times was to tell car owners to make sure they always had their physical car keys. Which, fine, but then why have an app at all?

Last week, Bambu 3D printers began printing unexpectedly when they got disconnected from the cloud. The software managing the queue of printer jobs lost the ability to monitor them, causing some to be restarted multiple times. Given the heat and extruded material 3D printers generate, this is dangerous for both themselves and their surroundings.

At TechRadar, Bambu’s PR acknowledges this: “It is difficult to have a cloud service 100% reliable all the time, but we should at least have designed the system more carefully to avoid such embarrassing consequences.” As TechRadar notes, if only embarrassment were the worst risk.

So, new rule: before installation, test every new “smart” device by blocking its Internet connection to see how it behaves. Of course, companies should do this themselves, but as we’ve seen, you can’t rely on that either.

***

Finally, in “be careful what you legislate for”, Canada is discovering the downside of C-18, which became law in June and requires the biggest platforms to pay for the Canadian news content they host. Google and Meta warned all along that they would stop hosting Canadian news rather than pay for it. Experts like law professor Michael Geist predicted that the bill would merely serve to dramatically cut traffic to news sites.

On August 1, Meta began adding blocks for news links on Facebook and Instagram. A coalition of Canadian news outlets quickly asked the Competition Bureau to mount an inquiry into Meta’s actions. At TechDirt Mike Masnick notes the irony: first legacy media said Meta’s linking to news was anticompetitive; now they say not linking is anticompetitive.

However, there are worse consequences. Prime minister Justin Trudeau complains that Meta’s news block is endangering Canadians, who can’t access or share local up-to-date information about the ongoing wildfires.

In a sensible world, people wouldn’t rely on Facebook for their news, politicians would write legislation with greater understanding, and companies like Meta would wield their power responsibly. In *this* world, we have a perfect storm.

Illustrations: XKCD’s Dependency.
