Facts are scarified

The recent doctored Palace photo has done almost as much as the arrival of generative AI to raise fears that in future we will completely lose the ability to identify fakes. The royal photo was sloppily composited – no AI needed – for reasons unknown (though Private Eye has a suggestion). A lot of conspiracy theorizing could be avoided if the palace would release the untouched original(s), but as things are, the photograph is a perfect example of how to provide the fuel for spreading nonsense to 400 million people.

The most interesting thing about the incident was discovering the rules media apply to retouching photos. AP specified, for example, that it does not use altered or digitally manipulated images. It allows cropping and minor adjustments to color and tone where necessary, but bans more substantial changes, even retouching to remove red eye. As Holly Hunter’s character says, trying to uphold standards in the 1987 movie Broadcast News (written by James Brooks), “We are not here to stage the news.”

The desire to make a family photo as appealing as possible is understandable; the motives behind spraying the world with misinformation are less clear and more varied. I’ve long argued here that, for this reason, combating misinformation and disinformation is similar to cybersecurity: the problem is complex, and the actors and agendas are diverse. At last year’s Disinformation Summit in Cambridge cybersecurity was, sadly, one of the missing communities.

Just a couple of weeks ago the BBC announced its adoption of C2PA for authenticating images, developed by a group of technology and media companies including the BBC, the New York Times, Microsoft, and Adobe. The BBC says that many media organizations are beginning to adopt C2PA, and even Meta is considering it. Edits must be signed, and create a chain of provenance all the way back to the original photo. In 2022, the BBC and the Royal Society co-hosted a workshop on digital provenance, following a Royal Society report, at which C2PA featured prominently.
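
To make the idea of a signed chain of edits concrete, here is a minimal sketch in Python. It is not C2PA's actual format or API: the record fields, function names, and the use of HMAC with per-editor secret keys are all assumptions chosen to keep the example short (a real system would use certificate-based signatures). What it shows is only the principle that each edit record is signed and points back to the record before it.

```python
import hashlib
import hmac
import json


def sign_record(key: bytes, record: dict) -> str:
    """Sign a record's canonical JSON form (HMAC stands in for a real signature)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def add_edit(chain: list, key: bytes, editor: str, action: str, image: bytes) -> list:
    """Return a new chain with one more signed record linked to the previous one."""
    record = {
        "editor": editor,
        "action": action,
        "image_sha256": hashlib.sha256(image).hexdigest(),
        "previous_signature": chain[-1]["signature"] if chain else None,
    }
    return chain + [{**record, "signature": sign_record(key, record)}]


def verify_chain(chain: list, keys: dict, final_image: bytes) -> bool:
    """Check every signature and link, then check the last record matches the image."""
    previous_signature = None
    for entry in chain:
        record = {k: v for k, v in entry.items() if k != "signature"}
        key = keys.get(entry["editor"])
        if key is None or record["previous_signature"] != previous_signature:
            return False
        if not hmac.compare_digest(sign_record(key, record), entry["signature"]):
            return False
        previous_signature = entry["signature"]
    final_hash = hashlib.sha256(final_image).hexdigest()
    return bool(chain) and chain[-1]["image_sha256"] == final_hash


# Hypothetical editors and keys, for illustration only.
keys = {"camera": b"camera-secret", "picture-desk": b"desk-secret"}
original, cropped = b"raw pixels", b"raw pix"

chain = add_edit([], keys["camera"], "camera", "capture", original)
chain = add_edit(chain, keys["picture-desk"], "picture-desk", "crop", cropped)

print(verify_chain(chain, keys, cropped))      # True: the chain leads back to the capture
print(verify_chain(chain, keys, b"doctored"))  # False: the image matches no signed record
```

Even in this toy form the weak point is visible: verification is only as trustworthy as whoever holds and vouches for the keys.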

That’s potentially a valuable approach for publishing and broadcast, where the conduit to the public is controlled by one of a relatively small number of organizations. And you can see why those organizations would want it: they need, and in many cases are struggling to retain, public trust. It is, however, too complex a process for the hundreds of millions of people with smartphone cameras posting images to social media, and unworkable for citizen journalists capturing newsworthy events in real time. Ancillary issue: sophisticated phone cameras try so hard to normalize the shots we take that they falsify the image at source. In 2020, Californians attempting to capture the orange color of their smoke-filled sky were defeated by autocorrection that turned it grey. So, many images are *originally* false.

In a lengthy blog posting, Neal Krawetz analyzes difficulties with C2PA. He lists security flaws, but is also opposed to the “appeal to authority” approach, which he dubs a “logical fallacy”. In the context of the Internet, it’s worse than that; we already know what happens when a tiny handful of commercial companies (in this case, chiefly Adobe) become the gatekeeper for billions of people.

All of this was why I was glad to hear about work in progress at a workshop last week, led by Mansoor Ahmed-Rengers, a PhD candidate studying system security: Human-Oriented Proof Standard (HOPrS). The basic idea is to build an “Internet-wide, decentralised, creator-centric and scalable standard that allows creators to prove the veracity of their content and allows viewers to verify this with a simple ‘tick’.” Co-sponsoring the workshop was Open Origins, a project to distinguish between synthetic and human-created content.

It’s no accident that HOPrS’ mission statement echoes the ethos of the original Internet; as security researcher Jon Crowcroft explains, it’s part of long-running work on redecentralization. Among HOPrS’ goals, Ahmed-Rengers listed: minimal centralization; the ability for anyone to prove their content; Internet-wide scalability; open decision making; minimal disruption to workflow; and easy interpretability of proof/provenance. The project isn’t trying to cover all bases – that’s impossible. Given the variety of motivations for fakery, there will have to be a large ecosystem of approaches. Rather, HOPrS is focusing specifically on the threat model of an adversary determined to sow disinformation, giving journalists and citizens the tools they need to understand what they’re seeing.

Fakes are as old as humanity. In a brief digression, we were reminded that the early days of photography were full of fakery: the Cottingley Fairies, the Loch Ness monster, many dozens of spirit photographs. The Cottingley Fairies, cardboard cutouts photographed by Elsie Wright, 16, and Frances Griffiths, 9, were accepted as genuine by Sherlock Holmes creator Sir Arthur Conan Doyle, famously a believer in spiritualism. To today’s eyes, trained on millions of photographs, they instantly read as fake. Or take Ireland’s Knock apparitions, flat, unmoving, and, philosophy professor David Berman explained in 1979, magic lantern projections. Our generation, who’ve grown up with movies and TV, would I think have instantly recognized that as fake, too. Which I believe tells us something: yes, we need tools, but we ourselves will get better at detecting fakery, as unlikely as it seems right now. The speed with which the royal photo was dissected showed how much we’ve learned just since generative AI became available.

Illustrations: The first of the Cottingley Fairies photographs (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Borderlines

Think back to the year 2000. New York’s World Trade Center still stood. Personal digital assistants were a niche market. There were no smartphones (the iPhone arrived in 2007) or tablets (the iPad took until 2010). Social media was nascent; Facebook first opened in 2004. The Good Friday agreement was just two years old, and for many in Britain “terrorists” were still “Irish”. *That* was when the UK passed the Terrorism Act (2000).

Usually when someone says the law can’t keep up with technological change they mean that technology can preempt regulation at speed. What the documentary Phantom Parrot shows, however, is that technological change can profoundly alter the consequences of laws already on the books. The film’s worked example is Schedule 7 of the 2000 Terrorism Act, which empowers police to stop, question, search, and detain people passing through the UK’s borders. They do not need prior authority or suspicion, but may only stop and question people for the purpose of determining whether the individual may be or have been concerned in the commission, preparation, or instigation of acts of terrorism.

Today this law means that anyone arriving at the UK border may be compelled to unlock access to data charting their entire lives. The Hansard record of the debate on the bill shows clearly that lawmakers foresaw problems: the classification of protesters as terrorists, the uselessness of fighting terrorism by imprisoning the innocent (Jeremy Corbyn), the reversal of the presumption of innocence. But they could not foresee how far-reaching the powers the bill granted would become.

The film’s framing story begins in November 2016, when Muhammed Rabbani arrived at London’s Heathrow Airport from Doha and was stopped and questioned by police under Schedule 7. They took his phone and laptop and asked for his passwords. He refused to supply them. On previous occasions, when he had similarly refused, they’d let him go. This time, he was arrested. Under Schedule 7, the penalty for such a refusal can be up to three months in jail.

Rabbani is managing director of CAGE International, a human rights organization that began by focusing on prisoners seized under the war on terror and expanded its mission to cover “confronting other rule of law abuses taking place under UK counter-terrorism strategy”. Rabbani’s refusal to disclose his passwords was, he said later, because he was carrying 30,000 confidential documents relating to a client’s case. A lawyer can claim client confidentiality, but not NGOs. In 2018, the appeals court ruled the password demands were lawful.

In September 2017, Rabbani was convicted. He was given a 12-month conditional discharge and ordered to pay £620 in costs. As Rabbani says in the film, “The law made me a terrorist.” No one suspected him of being a terrorist or placing anyone in danger; but the judge made clear she had no choice under the law and so he nonetheless has been convicted of a terrorism offense. On appeal in 2018, his conviction was upheld. We see him collect his returned devices – five years on from his original detention.

Britain is not the only country that regards him with suspicion. Citing his conviction, in 2023 France banned him, and, he claims, Poland deported him.

Unsurprisingly, CAGE is on the first list of groups that may be dubbed “extremist” under the new definition of extremism released last week by communities secretary Michael Gove. The direct consequence of this designation is a ban on participation in public life – chiefly, meetings with central and local government. The expansion of the meaning of “extremist”, however, is alarming activists on all sides.

Director Kate Stonehill tells the story of Rabbani’s detention partly through interviews and partly through a reenactment using wireframe-style graphics and a synthesized voice that reads out questions and answers from the interview transcripts. A cello of doom provides background ominance. Laced through this narrative are others. A retired law enforcement officer teaches a class to use extraction and analysis tools, in which we see how extensive the information available to them really is. Ali Al-Marri and his lawyer review his six years of solitary detention as an enemy combatant in Charleston, South Carolina. Lastly, Stonehill calls on Ryan Gallagher’s reporting, which exposed the titular Phantom Parrot, the program to exploit the data retained under Schedule 7. There are no records of how many downloads have been taken.

The retired law enforcement officer’s class is practically satire. While saying that he himself doesn’t want to be tracked for safety reasons, he tells students to grab all the data they can when they have the opportunity. They are in Texas: “Consent’s not even a problem.” Start thinking outside of the box, he tells them.

What the film does not stress is this: rights are largely suspended at all borders. In 2022, the UK extended Schedule 7 powers to include migrants and refugees arriving in boats.

The movie’s future is bleak. At the Chaos Communication Congress, a speaker warns that gait recognition, eye movement detection, speech analysis (accents, emotion), and other types of analysis will be much harder to escape and will enable watchers to do far more with the ever-vaster stores of data collected from and about each of us.

“These powers are capable of being misused,” said Douglas Hogg in the 1999 Commons debate. “Most powers that are capable of being misused will be misused.” The bill passed 210-1.

Illustrations: Still shot from the wireframe reenactment of Rabbani’s questioning in Phantom Parrot.

Review: The Bill Gates Problem

The Bill Gates Problem: Reckoning with the Myth of the Good Billionaire
By Tim Schwab
Metropolitan Books
ISBN: 978-1-25085009-6

Thirty years ago, the Federal Trade Commission began investigating one of the world’s largest technology companies on antitrust grounds. Was it leveraging its monopoly in one area to build dominance in others? Did it bully smaller competitors into disclosing their secrets, which it then copied? And so on. That company was Microsoft; Windows was giving it leverage over office productivity software, web browsers, and media players; and its leader was Bill Gates. In 1999, the courts ruled Microsoft a monopoly.

At the time, it was relatively commonplace for people to complain that Gates was insufficiently charitable. Why wasn’t he more philanthropic, given his vast and increasing wealth? (Our standards for billionaire wealth were lower back then.) Be careful what you wish for…

The transition from monopolist mogul to beneficent social entrepreneur is where Tim Schwab starts in The Bill Gates Problem: Reckoning with the Myth of the Good Billionaire. In Schwab’s view, the reason is well-executed PR, in which category he includes the many donations the foundation makes to journalism organizations.

I have heard complaints for years that the Bill and Melinda Gates Foundation’s approach to philanthropy favors expensive technological interventions over cheaper, well-established ones. In education that might mean laptops and edtech software rather than training teachers; in medicine that might mean vaccine research rather than clean water. Schwab’s investigative work turns up dozens of such stories in the areas BMGF works in: family planning, education, health. Yet, Schwab writes, citing numerous sources for his figures, for all the billions BMGF has poured into these areas, it has failed to meet its stated objectives.

You can argue that case, but Schwab moves on from there to examine the damaging effects of depending on a billionaire, no matter how smart and well-intentioned, to finance services that might more properly be the business of the state. No one elected Gates, and no one has voted on the priorities he has chosen to set. The covid pandemic provides a particularly good example. One of the biggest concerns as efforts to produce vaccines got underway was ensuring that access would not be limited to rich countries. Many believed that the most efficient way of doing this was to refrain from patenting the vaccines, and help poorer countries build their own production facilities. Gates was one of those who opposed this approach, arguing that patents were necessary to reward pharmaceutical companies for the investment they poured into research, and also that few countries had the expertise to make the vaccines. Gates gave in to pressure and reversed his position in May 2021 to support a “narrow waiver”. Reading that BMGF is the biggest funder of the WHO and remembering his preference for technological interventions made me wonder: how much do we have Gates to thank for the emphasis on vaccines and the reluctance to push cheaper non-pharmaceutical interventions like masks, HEPA filters, and ventilation in countries like the UK?

Schwab goes into plenty of detail about all this. But his wider point is to lay out the power Gates’s massive wealth – both the foundation’s and his own – gives him over the charitable sector and, through public-private partnerships, many of the nations in which he operates. Schwab also calls Gates’s approach “philanthropic colonialism” because the bulk of his donations go to organizations based in the West, rather than directly to their counterparts elsewhere.

Pointing out the amount of taxpayer subsidy the foundation gets through the tax exemptions charities get, Schwab asks if we’re really getting value for our money. Wouldn’t we be better off collecting taxes and setting our own agendas? Is there really any such thing as a “good” billionaire?

Competitive instincts

This week – Wednesday, March 6 – saw the EU’s Digital Markets Act come into force. As The Verge reminds us, the law is intended to give users more choice and control by forcing technology’s six biggest “gatekeepers” to embrace interoperability and avoid preferencing their own offerings across 22 specified services. The six: Alphabet, Amazon, Apple, ByteDance, Meta, and Microsoft. Alphabet’s covered list is the longest: advertising, app store, search engine, maps, and shopping, plus Android, Chrome, and YouTube. For Apple, it’s the app store, operating system, and web browser. Meta’s list includes Facebook, WhatsApp, and Instagram, plus Messenger, Ads, and Facebook Marketplace. Amazon: third-party marketplace and advertising business. Microsoft: Windows and internal features. ByteDance just has TikTok.

The point is to enable greater competition by making it easier for us to pick a different web browser, uninstall unwanted features (like Cortana), or refuse the collection and use of data to target us with personalized ads. Some companies are haggling. Meta, for example, is trying to get Messenger and Marketplace off the list, while Apple has managed to get iMessage removed from the list. More notably, though, the changes Apple is making to support third-party app stores have been widely criticized as undermining any hope of success for independents.

Americans visiting Europe are routinely astonished at the number of cookie consent banners that pop up as they browse the web. Comments on Mastodon this week have reminded us that this was the companies’ churlish choice: to implement the 2009 Cookie Directive and 2018 General Data Protection Regulation in user-hostile ways. It remains to be seen how grown-up the technology companies will be in this new round of legal constraints. Punishing users won’t get the EU law changed.

***

The last couple of weeks have seen a few significant outages among Internet services. Two weeks ago, AT&T’s wireless service went down for many hours across the US after a failed software update. On Tuesday, while millions of Americans were voting in the presidential primaries, it was Meta’s turn, when a “technical issue” took out both Facebook and Instagram (and with the latter, Threads) for a couple of hours. Concurrently but separately, users of Ad Manager had trouble logging in at Google, and users of Microsoft Teams and exTwitter also reported some problems. Ironically, Meta’s outage could have been fixed faster if the engineers trying to fix it hadn’t had trouble gaining remote access to the servers they needed to fix (and couldn’t gain access to the physical building because their passes didn’t work either).

Outages like these should serve as reminders not to put all your login eggs in one virtual container. If you use Facebook to log into other sites, besides the visibility you’re giving Meta into your activities elsewhere, those sites will be inaccessible any time Facebook goes down. In the case of AT&T, one reason this outage was so disturbing – the FCC is formally investigating it – is that the company has applied to get rid of its landlines in California. While lots of people no longer have landlines, they’re important in rural areas where cell service can be spotty, some services such as home alarm systems and other equipment depend on them, and they function in emergencies when electric power fails.

But they should also remind us that the infrastructure we’re deprecating in favor of “modern” Internet stuff was more robust than the new systems we’re increasingly relying on. A home with smart devices that cannot function without an uninterrupted Internet connection is far more fragile and has more points of failure than one without them, just as you can read a paper map when your phone is dead. At The Verge, Jennifer Pattison Tuohy tests a bunch of smart kitchen appliances including a faucet you can operate via Alexa or Google voice assistants. As in digital microwave ovens, telling the faucet the exact temperature and flow rate you want…seems unnecessarily detailed. “Connect with your water like never before,” the faucet manufacturer’s website says. Given the direction of travel of many companies today, I don’t want new points of failure between me and water.

***

It has – already! – been three years since Australia’s News Media Bargaining Code led to Facebook and Google signing three-year deals that have primarily benefited Rupert Murdoch’s News Corporation, owner of most of Australia’s press. A week ago, Meta announced it will not renew the agreement. At The Conversation, Rod Sims, who chaired the commission that formulated the law, argues it’s time to force Meta into the arbitration the code created. At ABC Science, however, James Purtill counters that the often “toxic” relationship between Facebook and news publishers means that forcing the issue won’t solve the core problem of how to pay for news, since advertising has found elsewheres it would rather be. (Separately, in Europe, 32 media organizations covering 17 countries have filed a €2.1 billion lawsuit against Google, matching a similar one filed last year in the UK, alleging that the company abused its dominant position to deprive them of advertising revenue.)

Purtill predicts, I think correctly, that attempting to force Meta to pay up will instead bring Facebook to ban news, as in Canada, following the passage of a similar law. Facebook needed news once; it doesn’t now. But societies do. Suddenly, I’m glad to pay the BBC’s license fee.

Illustrations: Red deer (via Wikimedia.)

Review: The Oracle

The Oracle
by Ari Juels
Talos Press
ISBN: 978-1-945863-85-1
Ebook ISBN: 978-1-945863-86-8

In 1994, a physicist named Timothy C. May posited the idea of an anonymous information market he called blacknet. With anonymity secured by cryptography, participants could trade government secrets. And, as he had written in 1988’s Crypto-Anarchist Manifesto, “An anonymous computerized market will even make possible abhorrent markets for assassinations and extortion.” In May’s time, the big thing missing to enable such a market was a payment system. Then, in 2008, came bitcoin and the blockchain.

In 2015, Ari Juels, now the Weill Family Foundation and Joan and Sanford I. Weill Professor at Cornell Tech but previously chief scientist at the cryptography company RSA, saw blacknet potential in Ethereum’s adoption of “smart contracts”, an idea that had been floating around since the 1990s. Smart contracts are computer programs that automatically execute transactions when specified conditions are met without the need for a trusted intermediary to provide guarantees. Among other possibilities, they can run on blockchains – the public, tamperproof, shared ledger that records cryptocurrency transactions.
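
For readers who want to see the "executes automatically when conditions are met" idea in miniature, here is a toy escrow sketched in plain Python. It is a simulation, not Ethereum code: the `Ledger` and `EscrowContract` classes and all the names in it are hypothetical, and a real contract would run on a blockchain rather than in one process. It also glosses over who gets to report the delivery; in a real system that answer has to come from somewhere the contract can trust.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """A stand-in for a shared ledger: just account balances."""
    balances: dict = field(default_factory=dict)

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient funds")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


@dataclass
class EscrowContract:
    """Holds the buyer's money and pays out by rule, with no human intermediary."""
    ledger: Ledger
    buyer: str
    seller: str
    price: int
    address: str = "escrow"
    settled: bool = False

    def deposit(self) -> None:
        # The buyer locks the price into the contract's own account.
        self.ledger.transfer(self.buyer, self.address, self.price)

    def settle(self, delivered: bool) -> None:
        # The rule is fixed in code: pay the seller on delivery, else refund.
        if self.settled:
            raise RuntimeError("contract already settled")
        recipient = self.seller if delivered else self.buyer
        self.ledger.transfer(self.address, recipient, self.price)
        self.settled = True


ledger = Ledger({"alice": 100, "bob": 0})
contract = EscrowContract(ledger, buyer="alice", seller="bob", price=40)
contract.deposit()
contract.settle(delivered=True)
print(ledger.balances)  # {'alice': 60, 'bob': 40, 'escrow': 0}
```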

In the resulting research paper on criminal smart contracts (PDF), Juels and co-authors Ahmed Kosba and Elaine Shi wrote: “We show how what we call criminal smart contracts (CSCs) can facilitate leakage of confidential information, theft of cryptographic keys, and various real-world crimes (murder, arson, terrorism).”

It’s not often a research paper becomes the basis for a techno-thriller novel, but Juels has prior form. His 2009 novel Tetraktys imagined that members of an ancient Pythagorean cult had figured out how to factor the products of large prime numbers, thereby busting the widely-used public key cryptography on which security on the Internet depends. Juels’ hero in that book was uniquely suited to help the NSA track down the miscreants because he was both a cryptographer and the well-schooled son of an expert on the classical world. Juels could almost be describing himself: before turning to cryptography he studied classical literature at Amherst and Oxford.

Juels’ new book, The Oracle, has much in common with his earlier work. His alter-ego here is a cryptographer working on blockchains and smart contracts. Links to the classical world – in this case, a cult derived from the oracle at Delphi – are provided by an FBI agent and art crime investigator who enlists his help when a rogue smart contract is discovered that offers $10,000 to kill an archeology professor, soon followed by a second contract offering $700,000 for a list of seven targets. Soon afterwards, our protagonist discovers he’s first on that list, and he has only a few days to figure out who wrote the code and save his own life. That quest also includes helping the FBI agent track down some Delphian artifacts that we learn from flashbacks to classical times were removed from the oracle’s temple and hidden.

The Delphi oracle, Juels writes, “revealed divine truth in response to human questions”. The oracles his cryptographer is working on are “a source of truth for questions asked by smart contracts about the real world”. In Juels’ imagining, the rogue assassination contract is issued with trigger words that could be expected to appear in a death announcement. When someone tries to claim the bounty, the smart contract checks news sources for those words, only paying out if it finds them. Juels has worked hard to make the details of both classical and cryptographic worlds comprehensible. They remain stubbornly complex, but you can follow the story easily enough even if you panic at the thought of math.

The tension is real, both within and without the novel. Juels’ idea is credible enough that it’s a relief when he says the contracts as described are not feasible with today’s technology, and may never become so (perhaps especially because the fictional criminal smart contract is written in flawless computer code). The related paper also notes that some details of their scheme have been left out so as not to enable others to create these rogue contracts for real. Whew. For now.

Anachronistics

“In my mind, computers and the Internet arrived at the same time,” my twenty-something companion said, delivering an entire mindset education in one sentence.

Just a minute or two earlier, she had asked in some surprise, “Did bulletin board systems predate the Internet?” Well, yes: BBSs were a software package running on a single back room computer with a modem users dialed into, whereas the Internet is this giant sprawling mess of millions of computers connected together…simple first, complex later.

Her confusion is understandable: from her perspective, computers and the Internet did arrive at the same time, since her first conscious encounters with them were simultaneous.

But still, speaking as someone who first programmed a (mainframe, with punch cards) computer in 1972 as a student, who got her first personal computer in 1982, and got online in 1991 by modem and 1999 by broadband and to whom the sequence of events is memorable: wow.

A 25-year-old today was born in 1999 (the year I got broadband). Her counterpart 15 years hence (born 2014, the year a smartphone replaced my personal digital assistant) may think smartphones and the Internet were simultaneous. And sometime around 2045 *her* counterpart born in 2020 (two years before ChatGPT was released) might think generative text and image systems were contemporaneous with the first computers.

I think this confusion must have something to do with the speed of change in a relatively narrow sector. I’m sure that even though they all entered my life simultaneously, by the time I was 25 I knew that radio preceded TV (because my parents grew up with radio), bicycles preceded cars, and that handwritten manuscripts predated printed books (because medieval manuscripts). But those transitions played out over multiple lifetimes, if not centuries, and all those memories were personal. Few of us reminisce about the mainframes of the 1960s because most of us didn’t have access to them.

And yet, understanding the timeline of earlier technologies probably matters less than understanding the sequence of events in information technology. Jumbling the arrival dates of the pieces of information technology means failing to understand dependencies. What currently passes for “AI” could not exist without being able to train models on giant piles of data that the Internet and the web made possible, and that took 20 years to build. Neural networks pioneer Geoff Hinton came up with the ideas for convolutional neural networks as long ago as the 1980s, but it took until the last decade for them to become workable. That’s because it took that long to build sufficiently powerful computers and to amass enough training data. How do you understand the ongoing battle between those who wish to protect privacy via data protection laws and those who want data to flow freely without hindrance if you do not understand what those masses of data are important for?

This isn’t the only such issue. A surprising number of people who should know better seem to believe that the solution to all our ills with social media is to destroy Section 230, apparently believing that if S230 allowed Big Tech to get big, it must be wrong. The reality is that it also allows small sites to exist, and it is the legal framework that allows content moderation. Improve it by all means, but understand its true purpose first.

Reviewing movies and futurist projections such as Vannevar Bush’s 1945 essay As We May Think (PDF) and Alan Turing’s 1950 paper Computing Machinery and Intelligence (PDF) doesn’t really help because so many ideas arrive long before they’re feasible. The crew in the original 1966 Star Trek series (to say nothing of secret agent Maxwell Smart in 1965) were talking over wireless personal communicators. A decade earlier, Arthur C. Clarke (in The Nine Billion Names of God) and Isaac Asimov (in The Last Question) were putting computers – albeit analog ones – in their stories. Asimov in particular imagined a sequence that now looks prescient, beginning with something like a mainframe, moving on to microcomputers, and finishing up with a vast fully interconnected network that can only be held in hyperspace. (OK, it took trillions of years, starting in 2061, but still...) Those writings undoubtedly inspired the technologists of the last 50 years when they decided what to invent.

This all led us to fakes: as the technology to create fake videos, images, and texts continues to improve, she wondered if we will ever be able to keep up. Just about every journalism site is asking some version of that question; they’re all awash in stories about new levels of fakery. My 25-year-old discussant believes the fakes will always be improving faster than our methods of detection – an arms race like computer security, to which I’ve compared problems of misinformation / disinformation before.

I’m more optimistic. I bet even a few years from now today’s versions of generative “AI” will look as primitive to us as the special effects in a 1963 episode of Dr Who or the magic lantern used to create the Knock apparitions do to generations raised on movies, TV, and computer-generated imagery. Humans are adaptable; we will find ways to identify what is authentic that aren’t obvious in the shock of the new. We might even go back to arguing in pubs.

Illustrations: Secret agent Maxwell Smart (Don Adams) talking on his shoe phone (via Wikimedia).

The bridge

Seven months ago, Mastodon was fretting about Meta’s newly-launched Threads. The issue: Threads, which was built on top of Instagram’s user database, had said it complied with the ActivityPub protocol, which allows Mastodon servers (“instances”) to federate with any other service that also uses that protocol. The threat that Threads would become interoperable, and that potentially millions of Threads users would swamp Mastodon, ignoring its existing social norms and culture, created an existential dilemma: to federate or not to federate?

Today, Threads’ integration is still just a plan.

Instead, the first disruptive arrival looks set to be Bluesky, created by a team backed by Twitter co-founder Jack Dorsey, with its arrival facilitated by a third party. Bluesky wrote a new open source protocol, AT, so the proposal isn’t federation with Mastodon but a bridge, as Amanda Silberling reports at TechCrunch. According to Silberling’s numbers, year-old Bluesky stands at 4.8 million users to Mastodon’s 8.7 million. Anyone familiar with the history of AOL’s gateway to Usenet will tell you that’s big enough to disrupt existing social norms. The AOL exercise was known as Eternal September (because every September Usenet had to ingest a new generation of incoming university freshmen).

There are two key differences, however. First, a third of those Bluesky users are new to that system, only joining last week, when the service opened fully to the public. They will bring challenges to the culture Bluesky has so far developed. Second, AOL’s gateway was unidirectional: AOLers could read and post to Usenet newsgroups, but Usenet posters could not read anything on AOL without paying for access. The Bluesky-Mastodon bridge is planned to be bidirectional, so anything posted publicly on one service would be accessible to both – or to outsiders using BridgyFed to connect via website feeds.

I haven’t spent a lot of time on Bluesky, but it’s clear it and Mastodon have different cultures. Friends who spend more time there say Bluesky has a “weirdness” they like and is less “scoldy” than Mastodon, where long-time users tended to school incoming ex-Twitter users in 2022 on their mistakes. That makes sense, when you consider that Mastodon has had time since its 2016 founding to develop an existing culture that newcomers are joining, where Bluesky has been a closed beta group until last week, and its users to date were the ones defining its culture for the future. The newcomers of the past week may have a very different experience.

Even if they don’t, there’s a fundamental economic difference that no technology can bridge: Mastodon is a non-profit cooperative endeavor, while Bluesky has venture capital funding, although the list of investors is not the usual suspects. Social media users have often been burned by corporate business decisions. It’s therefore easy to believe that the $8 million in seed funding will lead inevitably to user data exploitation, no matter what they say now about being determined to find a different and more sustainable business model based on selling ancillary services. Even if that strategy works, later owners or the dictates of shareholders may demand higher profits via a pivot to advertising, just as the Netflix and Amazon Prime streaming services are doing now.

Designing any software involves making rules for how it will operate and setting defaults. Here’s where the project hit trouble: should it be opt-out, so that users who don’t want their posts to be visible outside their home system have to specifically turn it off, or opt-in, so that users who want their posts published far and wide have to turn it on? BridgyFed’s creator, Ryan Barrett, chose opt-out. It was immediately divisive: privacy versus openness.
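To see why the default matters so much, here is a minimal sketch of the two policies. Everything in it is hypothetical: the `Account` class, the `is_bridged` function, and the policy names are invented for illustration and are not taken from BridgyFed's code or settings. The only thing that differs between the policies is what happens to users who never touch the setting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    """Hypothetical user record; not a real BridgyFed structure."""
    handle: str
    # None means the user never touched the setting;
    # True or False records an explicit choice.
    bridge_choice: Optional[bool] = None


def is_bridged(account: Account, policy: str) -> bool:
    """Return whether public posts cross the bridge under a default policy."""
    if policy not in ("opt-in", "opt-out"):
        raise ValueError(f"unknown policy: {policy}")
    if account.bridge_choice is not None:
        # An explicit choice always wins, whatever the default is.
        return account.bridge_choice
    # With no explicit choice, opt-out bridges the account and opt-in does not.
    return policy == "opt-out"


accounts = [
    Account("alice"),                       # never changed the setting
    Account("bob", bridge_choice=True),     # explicitly wants to be bridged
    Account("carol", bridge_choice=False),  # explicitly declined
]

bridged_opt_out = [a.handle for a in accounts if is_bridged(a, "opt-out")]
bridged_opt_in = [a.handle for a in accounts if is_bridged(a, "opt-in")]
print(bridged_opt_out)
print(bridged_opt_in)
```

In this sketch the users who made a choice get the same result either way; only the silent one, alice, is swept in under opt-out and left out under opt-in. Since most people never change a default, that is the group the argument is really about.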

Silberling reports that Barrett has fashioned a solution, giving users warning pop-ups and a chance to decline if someone from another service tries to follow them, and is thinking more carefully about the risks to safety his bridge might bring.

That’s great, but the next guy may not be so willing to reconsider. As we’ve observed before, there is no way to restrict the use of open protocols without closing them and putting them under centralized control – which is the opposite of the federated, decentralized systems Mastodon and Bluesky were created to build.

In a federated system anything one person can open another can close. Individual admins will decide for their users how their instances will operate. Those who don’t like their choice will be told they can port their accounts to one whose policies they prefer. That’s true, but unsatisfying as an answer. As the “Fediverse” grows, it must accommodate millions of mainstream users for whom moving servers is too complicated.

The key point, however, is that the illusion of control Mastodon seemed to offer is being punctured. Usenet users could have warned them: from its creation in 1979, users believed their postings were readable for a few weeks before expiring and being expunged. Then, in 1995, Steve Madere created the Deja News archive from scattered collections. Overnight, those “ephemeral” postings became permanent and searchable – and even more so, after 2001, when Google bought the archive (see groups.google.com).

The upshot: privacy in public networks is only ever illusory. Assume you have no control over anything you post, no matter how cozy and personal the network seems. As we’ve said before, the privacy-in-public afforded by the physical world has no online counterpart.

Illustrations: A mastodon by Heinrich Harder (public domain, via Wikimedia).

To tell the truth

It was toward the end of Craig Wright’s cross-examination on Wednesday when, for the first time in many days, he was lost for words. Wright is in court because the non-profit Crypto Open Patent Alliance seeks a ruling that he is not, as he claims, bitcoin inventor Satoshi Nakamoto, who was last unambiguously heard from in 2011.

Over the preceding days, Wright had repeatedly insisted “I am the real Satoshi” and disputed forensic analysis – anachronistic fonts, metadata, time stamps – pronouncing his proffered proofs forgeries. He was consistently truculent, verbose, and dismissive of everyone’s expertise but his own and of everyone’s degrees except the ones he holds. For example: “Meiklejohn has not studied cryptography in any depth,” he said of Sarah Meiklejohn, the now-professor who as a student in 2013 showed that bitcoin transactions are traceable. In a favorite moment, Jonathan Hough, KC, who did most of the cross-examination, interrupted a diatribe about the failings of the press with, “Moving on from your expertise on journalism, Dr Wright…”

Participants in a drinking game based on his saying “That is not correct” would be dead of alcohol poisoning. In between, he insisted several times that he never wanted to be outed as Satoshi, and wishes that everyone would “leave me alone and let me invent”. Any money he is awarded in court he will give to charities; he wants nothing for himself.

But at the moment we began with he was visibly stumped. The question, regarding a variable on a GitHub page: “Do you know what unsigned means?”

Wright: “Basically, an unsigned variable…it’s not an integer with…it’s larger. I’m not sure how to say it.”

Lawyer: “Try.”

Wright: “How I’d describe it, I’m not quite sure. I’m not good with trying to do things like this.” He could explain it easily in writing… (Transcription by Norbert on exTwitter.)

The lawyer explained it thusly: an unsigned variable cannot be a negative number.

“I understand that, but would I have thought of saying it in such a simple way? No.”

Experience as a journalist teaches you that the better you understand something the more simply and easily you can explain it. Wright’s inability to answer blew the inadequately bolted door plug out of his world’s expert persona. Everything until then could be contested: the stomped hard drive, the emails he wrote, or didn’t write, or wrote only one sentence of, the allegations that he had doctored old documents to make it look like he had been thinking about bitcoin before the publication of Satoshi’s foundational 2008 paper. But there’s no disguising lack of basic knowledge. “Should have been easy,” says a security professor (tenured, chaired) friend.
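The lawyer's one-line definition is easy to demonstrate. What follows is a minimal sketch in Python, which has no unsigned type of its own, so it borrows C's fixed-width integers through the standard `ctypes` module. The 32-bit width and the name `fits_unsigned_32` are arbitrary choices for illustration and have nothing to do with the variable discussed in court.

```python
import ctypes

# The same four bytes, read two ways.
bits = b"\xff\xff\xff\xff"
as_signed = int.from_bytes(bits, "little", signed=True)     # top bit means "negative"
as_unsigned = int.from_bytes(bits, "little", signed=False)  # top bit is just more magnitude

# Storing -1 in C's fixed-width types: the unsigned one cannot hold a
# negative number, so the value wraps around to the largest 32-bit value.
wrapped = ctypes.c_uint32(-1).value
kept = ctypes.c_int32(-1).value


def fits_unsigned_32(n: int) -> bool:
    """Return True if n can be stored in 32 unsigned bits without wrapping."""
    try:
        n.to_bytes(4, "little", signed=False)
        return True
    except OverflowError:
        return False


print(as_signed, as_unsigned)
print(kept, wrapped)
print(fits_unsigned_32(-1), fits_unsigned_32(0), fits_unsigned_32(2**32 - 1))
```

The trade-off is the one Wright was groping for with "it's larger": giving up negative numbers frees the top bit, so an unsigned 32-bit variable runs from 0 to 4,294,967,295 instead of from -2,147,483,648 to 2,147,483,647.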

Normally, cryptography removes ambiguity. This is especially true of public key cryptography and its complementary pair of public and private keys. Being able to decrypt something with a well-attested public key is clear proof that it was encrypted with the complementary private key. Contrariwise, if a specific private key decrypts it, you know that key’s owner is the intended recipient. In both cases, as a bonus, you get proof that the text has not been altered since its encryption. It *ought* to be simple for Wright to support his claim by using Satoshi’s private keys. If he can’t do that, he must present a reason and rely on weaker alternatives.
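A toy version of that proof fits in a few lines. This is a sketch under loud assumptions: it uses textbook RSA with two small, well-known primes and no padding, which is hopelessly insecure, and it is not the scheme bitcoin uses (that is ECDSA over the secp256k1 curve). The names `sign`, `verify`, and `digest` are invented for the example. The shape of the argument is the same, though: only the private key can produce the value, and anyone holding the public key can check it.

```python
import hashlib

# Toy key pair built from two well-known Mersenne primes. Far too small and
# too structured for real use; chosen so the example needs no extra library.
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range the toy key can handle."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only someone who knows the private exponent D can compute this."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public values N and E can run this check."""
    return pow(signature, E, N) == digest(message)


message = b"A fresh message chosen by the court on the day"
signature = sign(message)

genuine = verify(message, signature)
altered_text = verify(message + b"!", signature)
forged_signature = verify(message, signature + 1)
print(genuine, altered_text, forged_signature)
```

The check fails if either the text or the signature is changed, which is the "bonus" of tamper evidence. It is also why signing a new message, rather than presenting an old signed one, is the convincing demonstration: it shows possession of the private key now.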

Courts of law, on the other hand, operate on the balance of probabilities. They don’t remove ambiguity; they study it. Wright’s case is therefore a cultural clash, with far-reaching consequences. COPA is complaining that Wright’s repeated intellectual property lawsuits against developers working on bitcoin projects are expensive in both money and time. Soon after the unsigned variable exchange, the lawyer asked Wright what he will do if the court rules against him. “Move on to patents,” Wright said. He claims thousands of patents relating to bitcoin and the blockchain, and a brief glance at Google Patents shows many filings, some granted.

However this case comes out, therefore, it seems likely Wright will continue to try to control bitcoin. Wright insists that bitcoin isn’t meant to be “digital gold”, but that its true purpose is to facilitate micropayments. I haven’t “studied bitcoin in any depth” (as he might say), but as far as I can tell it’s far too slow, too resource-intensive, and too volatile to be used that way. COPA argues, I think correctly, that it’s the opposite of the world enshrined in Satoshi’s original paper; its whole point was to use cryptography to create the blockchain as a publicly attested, open, shared database that could eliminate central authorities such as banks.

In the Agatha Christie version of this tale, most likely Wright would be an imposter, an early hanger-on who took advantage of the gap formed by Satoshi’s disappearance and the deaths of other significant candidates. Dorothy Sayers would have Lord Peter Wimsey display unexpected mathematical brilliance to improve on Satoshi’s work, find him, and persuade him to turn over his keys and documents to king and country. Sir Arthur Conan Doyle would have both Moriarty and Sherlock Holmes on the trail. Holmes would get there first and send him into protection to ensure Moriarty couldn’t take criminal advantage. And then the whole thing would be hushed up in the public interest.

The case continues.

Illustrations: The cryptographic code from “The Dancing Men”, by Sir Arthur Conan Doyle (via Wikimedia).

Review: Virtual You

Virtual You: How Building Your Digital Twin Will Revolutionize Medicine and Change Your Life
By Peter Coveney and Roger Highfield
Princeton University Press
ISBN: 978-0-691-22327-8

Probably the quickest way to appreciate how much medicine has changed in a lifetime is to pull out a few episodes of TV medical series over the years: the bloodless 1960s Dr Kildare; the 1980s St Elsewhere, which featured a high-risk early experiment in now-routine cardiac surgery; the growing panoply of machines and equipment of the 2000s series E.R. (1994-2009). But there are always more improvements to be made, and around 2000, when the human genome was being sequenced, we heard a lot about the promise of personalized medicine it was supposed to bring. Then we learned over time that, as so often with scientific advances, knowing more merely served to show us how much more we *didn’t* know – in the genome’s case, about epigenetics, proteomics, and the microbiome. With some exceptions such as cancers that can be tested for vulnerability to particular drugs, the dream of personalized medicine so far mostly remains just that.

Growing alongside all that have been computer models, most famously used for meteorology and climate change predictions. As Peter Coveney and Roger Highfield explain in Virtual You, models are expected to play a huge role in medicine, too. The best-known use is in drug development, where modeling can help suggest new candidates. But the use that interests Coveney and Highfield is on the personal level: a digital twin for each of us that can be used to determine the right course of treatment by spotting failures in advance, or help us make better lifestyle choices tailored to our particular genetic makeup.

This is not your typical book of technology hype. Instead, it’s a careful, methodical explanation of the mathematical and scientific basis for how this technology will work and its state of development from math and physics to biology. As they make clear, developing the technology to create these digital twins is a huge undertaking. Each of us is a massively complex ecosystem generating masses of data and governed by masses of variables. Modeling our analog selves requires greater complexity than may even be possible with classical digital computers. Coveney and Highfield explain all this meticulously.

It’s not as clear to me as it is to them that virtual twins are the future of mainstream “retail” medicine, especially if, as they suggest, they will be continually updated as our bodies produce new data. Some aspects will be too cost-effective to ignore; ensuring that the most expensive treatments are directed only to those who can benefit will be a money saver to any health service. But the vast amount of computational power and resources likely required to build and maintain a virtual twin for each individual seems prohibitive for all but billionaires. As in engineering, where virtual twins are used for prototyping, or meteorology, where simulations have led to better and more detailed forecasts, the primary uses seem likely to be at the “wholesale” level. That still leaves room for plenty of revolution.

Nefarious

TorrentFreak is reporting that OCLC, owner of the WorldCat database of bibliographic records, is suing the “shadow library” search engine Anna’s Archive. The claim: that Anna’s Archive hacked into WorldCat, copied 2.2TB of records, and posted them publicly.

Shadow libraries are the text version of “pirate” sites. The best-known is probably Sci-Hub, which provides free access to hundreds of thousands of articles from (typically expensive) scientific journals. Others such as Library Genesis and sites on the dark web offer ebooks. Anna’s Archive indexes as many of these collections as it can find; it was set up in November 2022, shortly after the web domains belonging to the then-largest of these book libraries, Z-Library, were seized by the US Department of Justice. Z-Library has since been rebuilt on the dark web, though it remains under attack by publishers and law enforcement.

Anna’s Archive also includes some links to the unquestionably legal and long-running Project Gutenberg, which publishes titles in the public domain in a wide variety of formats.

The OCLC-Anna’s Archive case has a number of familiar elements that are variants of long-running themes, open versus gatekept being the most prominent. Like many such sites (post-Napster), Anna’s Archive does not host files itself. That’s no protection from the law; authorities in various countries have nonetheless blocked or seized the domains belonging to such sites. But OCLC is not a publisher or rights holder, although it takes large swipes at Anna’s Archive for lawlessness and copyright infringement. Instead, it makes two complaints. First, it says Anna’s Archive hacked WorldCat, violating its terms and conditions, disrupting its business arrangements, and costing it $1.4 million and 10,000 employee hours in system remediation. Second, it complains that Anna’s Archive has posted the data in the aggregate for public download, and is “actively encouraging nefarious use of the data”. Other than the use of “nefarious”, there seems little to dispute about either claim; Anna’s Archive published the details in an October 2023 blog posting.

Anna’s Archive describes this process as “preserving” the world’s books for public access. OCLC describes it as “tortious interference” with its business. It wants the court to issue injunctive relief to make the scraping and use of the data stop, compensatory damages in excess of $75,000, punitive damages, costs, and whatever else the court sees fit. The sole named defendant is a US citizen, María A. Matienzo, thought to be resident near Seattle. If the identification and location are correct, that’s a high-risk situation to be in.

In the blog posting, Anna’s Archive writes that its initial goal was to answer the question of what percentage of the world’s published books are held in shadow libraries and create a to-do list of gaps to fill. To answer these questions, they began by scraping ISBNdb, the database of publications with ISBNs, which only came into use in 1970. When the overlap with the Internet Archive’s Open Library and the seized Z-Library was less than they hoped, they turned to WorldCat. At that point, they openly say that security flaws in the fortuitously redesigned WorldCat website allowed them to grab more or less the comprehensive set of records. While scraping can be legal, exploiting security flaws to gain unauthorized access to a computer system likely violates the widely criticized Computer Fraud and Abuse Act (1986), which could be a felony. OCLC has, however, brought a civil case.

Anna’s Archive also searches the Internet Archive’s Open Library, founded in 2006. In 2009, co-creator Aaron Swartz told me that he believed the creation of Open Library pushed OCLC into opening up greater public access to the basic tier of its bibliographic data. The Open Library currently has its own legal troubles; it lost in court in August 2023 after Hachette sued it for copyright infringement. The Internet Archive is appealing; in the meantime it is required to remove, on request of any member of the Association of American Publishers, any book commercially available in electronic format.

OCLC began life as the Ohio College Library Center; its WorldCat database is a collaboration between it and its member libraries to create a shared database of bibliographic records and enable online cataloguing. The last time I wrote about it, in 2009, critics were complaining that libraries in general were failing to bring book data onto the open web. It has gotten a lot better in the years since, and many local libraries are now searchable online and enable their card holders to borrow from their holdings of ebooks over the web.

The fact that it’s now often possible to borrow ebooks from libraries should mean there’s less reason to use unauthorized sites. Nonetheless, these still appeal: they have the largest catalogues, the most convenient access, DRM-free files, and no time limits, so you can read them at your leisure using the full-featured reader you prefer.

In my 2009 piece, an OCLC spokesperson fretted about “over-exploitation”, in which there would be no good way to maintain or update countless unknown scattered pockets of data. That seems a solvable problem.

OCLC and its member libraries are all non-profit organizations ultimately funded by taxpayers. The data they collect has one overriding purpose: to facilitate public access to libraries’ holdings by showing who holds what books in which editions. What are “nefarious” uses? Arguably, the data they collect should be public by right. But that’s not the question the courts will decide.

Illustrations: The New York Public Library, built 1911 (via Wikimedia).
