Surveillance machines on wheels

After much wrangling and with just a few days of legislative time between the summer holidays and the party conference season, on Tuesday night the British Parliament passed the Online Safety bill, which will become law as soon as it gets royally signed (assuming they can find a pen that doesn’t leak). The government announcement brims with propagandist ecstasy, while the Open Rights Group’s statement offers the reality: Britons’ online lives will be less secure as a result. Which means everyone’s will.

Parliament – and the net.wars archive – dates the current version of this bill to 2022, and the online harms white paper on which it’s based to 2020. But it *feels* like it’s been a much longer slog; I want to say six years.

This is largely because the fight over two key elements – access to encrypted messaging and age verification – *is* that old. Age verification was enshrined in the Digital Economy Act (2017), and we reviewed the contenders to implement it in 2016. If it’s ever really implemented, age verification will make Britain the most frustrating place in the world to be online.

Fights over strong encryption have been going on for 30 years. In that time, no new mathematics has appeared to change the fact that it’s not possible to create a cryptographic hole that only “good guys” can use. Nothing will change about that; technical experts will continue to try to explain to politicians that you can have secure communications or you can have access on demand, but you can’t have both.

***

At the New York Times, Farhad Manjoo writes that while almost every other industry understands that the huge generation of aging Boomers is a business opportunity, outside of health care Silicon Valley is still resolutely focused on under-30s. This, even though the titans themselves age; boy-king Mark Zuckerberg is almost 40. Hey, it’s California; they want to turn back aging, not accept it.

Manjoo struggles to imagine the specific directions products might take, but I like his main point: where’s the fun? What is this idea that after 65 you’re just something to send a robot to check up on? Yes, age often brings impairments, but why not build for them? You would think that given the right affordances, virtual worlds and online games would have a lot to offer people whose lives are becoming more constrained.

It’s true that by the time you realize that ageism pervades our society you’re old enough that no one’s listening to you any more. But even younger people must struggle with many modern IT practices: the pale, grey type that pervades the web, the picklists, the hidden passwords you have to type twice… And captchas, which often display on my desktop too small to see clearly and are resistant to resizing upwards. Bots are better at captchas than humans anyway, so what *is* the point?

We’re basically back where we were 30 years ago, when the new discipline of human-computer interaction fought to convince developers that if the people who struggle to operate their products look stupid the problem is bad design. And all this is coming much more dangerously to cars; touch screens that can’t be operated by feel are Exhibit A.

***

But there is much that’s worse about modern cars. A few weeks ago, the Mozilla Foundation published a report reviewing the privacy of modern cars. Tl;dr: “Cars are the worst product category we have ever reviewed for privacy.”

The problems are universal across the 25 brands Mozilla researchers Jen Caltrider, Misha Rykov, and Zoë MacDonald reviewed: “Modern cars are surveillance-machines on wheels souped-up with sensors, radars, cameras, telematics, and apps that can detect everything we do inside.” Cars can collect all the data that phones and smart home devices can. But unlike phones, space is a non-issue, and unlike smart speakers, video cameras, and thermostats, cars move with you and watch where you go. Drivers, passengers, passing pedestrians…all are fodder for data collection in the new automotive industry, where heated seats and unlocking extra battery range are subscription add-ons, and the car you buy isn’t any more yours than the £6-per-hour Zipcar in the designated space around the corner.

Then there are just some really weird clauses in the companies’ privacy policies. Some collect “genetic data” (here the question that arises is not only “why?” but “how?”). Nissan says it can collect information about owners’ “sexual activity” for use in “direct marketing” or to share with marketing partners. The researchers ask, “What on earth kind of campaign are you planning, Nissan?”

Still unknown: whether the data is encrypted while held on the car; how securely it’s held; and whether the companies will resist law enforcement requests at all. We do know that car companies share and sell the masses of intimate information they collect, especially the cars’ telematics, with insurance companies.

The researchers also note that new features allow unprecedented levels of control. VW’s Car-Net, for example, allows parents – or abusers – to receive a phone alert if the car is driven outside of set hours or in or near certain locations. Ford has filed a patent on a system for punishing drivers who miss car payments.

“I got old at the right time,” a friend said in 2019. You can see his point.

Illustrations: Artist Dominic Wilcox’s imagined driverless sleeper car of the future, as seen at the Science Museum in 2019.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon

Review: Sorry, Sorry, Sorry

Sorry, Sorry, Sorry: The Case for Good Apologies
By Marjorie Ingalls and Susan McCarthy
Gallery Books
ISBN: 978-1-9821-6349-5

Years ago, a friend of mine deplored apologies: “People just apologize because they want you to like them,” he observed.

That’s certainly true at least some of the time, but as Marjorie Ingalls and Susan McCarthy argue at length in their book Sorry, Sorry, Sorry, well-constructed and presented apologies can make the world a better place. For the recipient, they can remove the sting of old wrongs; for the giver, they can ease the burden of old shames.

What you shouldn’t do, when apologizing, is what self-help groups sometimes describe as “plan the outcome”. That is, you present your apology and you take your chances. Follow Ingalls’ and McCarthy’s six steps to construct your apology, then hope for, but do not demand, forgiveness, and don’t mess the whole thing up by concluding with, “So, we’re good?”

Their six steps to a good apology:
1. Say you’re sorry.
2. For what you did.
3. Show you understand why it was bad.
4. Only explain if you need to; don’t make excuses.
5. Say why it won’t happen again.
6. Offer to make up for it.
Six and a half. Listen.

It’s certainly true that many apologies don’t have the desired effect. Often, it’s because the apology itself is terrible. Through their Sorry Watch blog, Ingalls and McCarthy have been collecting and analyzing bad public apologies for years (obDisclosure: I send in tips on apologies in tennis and British politics). Many of these appear in the book, organized into chapters on apologies from doctors and medical establishments, large corporations, and governments and nation-states. Alongside these are chapters on the psychology of apologies, teaching children to apologize, and practical realities relating to gender, race, and other disparities. Women, for example, are more likely to apologize well, but take greater risk when they do – and are less likely to be forgiven.

Some templates for *bad* apologies when you’ve done something hurtful (do not try this at home!): “I’m sorry if…”, “I’m sorry that you felt…”, “I regret…”, and, of course, the often-used classic, “This is not who we are.”

These latter are, in Ingalls’ and McCarthy’s parlance, “apology-shaped objects”, but not actually apologies. They explain this in detail with plenty of wit – and no fewer than five Bad Apology bingo cards.

Even for readers of the blog, there’s new information. I was particularly interested to learn that malpractice lawyers are likely wrong in telling doctors not to apologize because admitting fault invites a lawsuit. A 2006 Harvard hospital system report found little evidence for this contention – as long as the apologies are good ones. It’s the failure to communicate and the refusal to take responsibility that are much more anger-provoking. In other words, the problem there, as everywhere else, is *bad* apologies.

A lot of this ought to be common sense. But as Ingalls and McCarthy make plain, it may be sense but it’s not as common as any of us would like.

Doom cyberfuture

Midway through this year’s gikii miniconference for pop culture-obsessed Internet lawyers, Jordan Hatcher proposed that generational differences are the key to understanding the huge gap between the Internet pioneers, who saw regulation as the enemy, and the current generation, who are generally pushing for it. While this is a bit too pat – it’s easy to think of Millennial libertarians and I’ve never thought of Boomers as against regulation, just, rationally, against bad Internet law that sticks – it’s an intriguing idea.

Hatcher, because this is gikii and no idea can be presented without a science fiction tie-in, illustrated this with 1990s movies, which spread the “DCF-84 virus” – that is, “doom cyberfuture-84”. The “84” is not chosen for Orwell but for the year William Gibson’s Neuromancer was published. Boomers – he mentioned John Perry Barlow, born 1947, and Lawrence Lessig, born 1961 – were instead infected with the “optimism virus”.

It’s not clear which 1960s movies might have seeded us with that optimism. You could certainly make the case that 1968’s 2001: A Space Odyssey ends on a hopeful note (despite an evil intelligence out to kill humans along the way), but you don’t even have to pick a different director to find dystopia: I see your 2001 and give you Dr Strangelove (1964). Even Woodstock (1970) is partly dystopian; the consciousness of the Vietnam war permeates every rain-soaked frame. But so does the belief that peace could win: so, wash.

For younger people’s pessimism, Hatcher cited 1995’s Johnny Mnemonic (based on a Gibson short story) and Strange Days.

I tend to think that if 1990s people are more doom-laden than 1960s people it has more to do with real life. Boomers were born in a time of economic expansion, relatively affordable education and housing, and when they protested a war the government eventually listened. Millennials were born in a time when housing and education meant a lifetime of debt, and when millions of them protested a war they were ignored.

In any case, Hatcher is right about the stratification of demographic age groups. This is particularly noticeable in social media use; you can often date people’s arrival on the Internet by which communications medium they prefer. Over dinner, I commented on the nuisance of typing on a phone versus a real keyboard, and two younger people laughed at me: so much easier to type on a phone! They were among the crowd whose papers studied influencers on TikTok (Taylor Annabell, Thijs Kelder, Jacob van de Kerkhof, Haoyang Gui, and Catalina Goanta) and the privacy dangers of dating apps (Tima Otu Anwana and Paul Eberstaller), the kinds of subjects I rarely engage with because I am a creature of text, like most journalists. Email and the web feel like my native homes in a way that apps, game worlds, and video services never will. That dates me both chronologically and by my first experiences of the online world (1991).

Most years at this event there’s a new show or movie that fires many people’s imagination. Last year it was Upload with a dash of Severance. This year, real technological development overwhelmed fiction, and the star of the show was generative AI and large language models. Besides my paper with Jon Crowcroft, there was one from Marvin van Bekkum, Tim de Jonge, and Frederik Zuiderveen Borgesius that compared the science fiction risks of AI – Skynet, Roko’s basilisk, and an ordering of Asimov’s Laws that puts obeying orders above not harming humans (see XKCD, above) – to the very real risks of the “AI” we have: privacy, discrimination, and environmental damage.

Other AI papers included one by Colin Gavaghan, who asked if it actually matters if you can’t tell whether the entity that’s communicating with you is an AI? Is that what you really need to know? You can see his point: if you’re being scammed, the fact of the scam matters more than the nature of the perpetrator, though your feelings about it may be quite different.

A standard explanation of what put the “science” in science fiction (or the “speculative” in “speculative fiction”) used to be that the authors ask, “What if?” What if a planet had six suns whose interplay meant that darkness only came once every 1,000 years? Would the reaction really be as Ralph Waldo Emerson imagined it? (Isaac Asimov’s Nightfall). What if a new link added to the increasingly complex Boston MTA accidentally turned the system into a Mobius strip? (A Subway Named Mobius, by Armin Joseph Deutsch). And so on.

In that sense, gikii is often speculative law, thought experiments that tease out new perspectives. What if Prime Day becomes a culturally embedded religious holiday (Megan Rae Blakely)? What if the EU’s trademark system applied in the Star Trek universe (Simon Sellers)? What if, as in Max Gladstone’s Craft Sequence books, law is practical magic (Antonia Waltermann)? In the trademark example, time travel is a problem, as competing interests can travel further and further back to get the first registration. In the latter…well, I’m intrigued by the idea that a law making dumping sewage in England’s rivers illegal could physically stop it from happening without all the pesky apparatus of law enforcement and parliamentary hearings.

Waltermann concluded by suggesting that to some extent law *is* magic in our world, too. A useful reminder: be careful what law you wish for because you just may get it. Boomer!

Illustrations: Part of XKCD’s analysis of Asimov’s Laws of Robotics.

Small data

Shortly before this gets posted, Jon Crowcroft and I will have presented this year’s offering at Gikii, the weird little conference that crosses law, media, technology, and pop culture. This is what we will possibly may have said, as I understand it, with some added explanation for the slightly less technical audience I imagine will read this.

Two years ago, a team of four researchers – Timnit Gebru, Emily Bender, Margaret Mitchell (writing as Shmargaret Shmitchell), and Angelina McMillan-Major – wrote a now-famous paper called On the Dangers of Stochastic Parrots (PDF) calling into question the usefulness of the large language models (LLMs) that have caused so much ruckus this year. The “Stochastic Four” argued instead for small models built on carefully curated data: less prone to error, less exploitive of people’s data, less damaging to the planet. Gebru got fired over this paper; Google also fired Mitchell soon afterwards. Two years later, neural networks pioneer Geoff Hinton quit Google in order to voice similar concerns.

Despite the hype, LLMs have many problems. They are fundamentally an extractive technology and are resource-intensive. Building LLMs requires massive amounts of training data; so far, the companies have been unwilling to acknowledge their sources, perhaps because (as is happening already) they fear copyright suits.

More important from a technical standpoint is the issue of model collapse; that is, models degrade when they begin to ingest synthetic AI-generated data instead of human input. We’ve seen this before with Google Flu Trends, which degraded rapidly as incoming new search data included many searches on flu-like symptoms that weren’t actually flu, and others that simply reflected the frequency of local news coverage. “Data pollution”, as LLM-generated data fills the web, will mean that the web will be an increasingly useless source of training data for future generations of generative AI. Lots more noise, drowning out the signal (in the photo above, the signal would be the parrot).

Instead, if we follow the lead of the Stochastic Four, the more productive approach is small data – small, carefully curated datasets that train models to match specific goals. Far less resource-intensive, far fewer issues with copyright, appropriation, and extraction.

We know what the LLM future looks like in outline: big, centralized services, because no one else will be able to amass enough data. In that future, surveillance capitalism is an essential part of data gathering. Small language model (SLM) futures could look quite different: decentralized, with realigned incentives. At one point, we wanted to suggest that small data could bring the end of surveillance capitalism; that’s probably an overstatement. But small data could certainly create the ecosystem in which the case for mass data collection would be less compelling.

Jon and I imagined four primary alternative futures: federation, personalization, some combination of those two, and paradigm shift.

Precursors to a federated small data future already exist; these include customer service chatbots and predictive text assistants. In this future, we could imagine personalized LLM servers designed to serve specific needs.

An individualized future might look something like I suggested here in March: a model that fits in your pocket that is constantly updated with material of your own choosing. Such a device might be the closest yet to Vannevar Bush’s 1945 idea of the Memex (PDF), updated for the modern era by automating the dozens of secretary-curators he imagined doing the grunt work of labeling and selection. That future again has precursors in techniques for sharing the computation but not the data, a design we see proposed for health care, where the data is too sensitive to share unless there’s a significant public interest (as in pandemics or very rare illnesses), or in other data analysis designs intended to protect privacy.

In 2007, the science fiction writer Charles Stross suggested something like this, though he imagined it as a comprehensive life log, which he described as a “google for real life”. So this alternative future would look something like Stross’s pocket $10 life log with enhanced statistics-based data analytics.

Imagining what a paradigm shift might look like is much harder. That’s the kind of thing science fiction writers do; it’s 16 years since Stross gave that life log talk. However, in his 2016 history of advertising, The Attention Merchants, Columbia professor Tim Wu argued that industrialization was the vector that made advertising and its grab for our attention part of commerce. A hundred and fifty-odd years later, the centralizing effects of industrialization are being challenged, starting with energy, via renewables and local power generation, and social media, via the fediverse. Might language models also play their part in bringing a new, more collaborative and cooperative society?

It is, in other words, just possible that the hot new technology of 2023 is simply a dead end bringing little real change. It’s happened before. There have been, as Wu recounts, counter-moves and movements before, but they didn’t have the technological affordances of our era.

In the Q&A that followed, Miranda Mowbray pointed out that companies are trying to implement the individualized model, but that it’s impossible to do unless there are standardized data formats, and even then hard to do at scale.

Illustrations: Spot the parrot seen in a neighbor’s tree.

Power cuts

In the latest example of corporate destruction, the Guardian reports on the disturbing trend in which streaming services like Disney and Warner Bros Discovery are deleting finished, even popular, shows for financial reasons. It’s like Douglas Adams’ rock star Hotblack Desiato spending a year dead for tax reasons.

Given that consumers’ budgets are stretched so thin that many are reevaluating the streaming services they’re paying for, you would think this would be the worst possible time to delete popular entertainments. Instead, the industry seems to be possessed by a death wish in which it’s making its offerings *less* attractive. Even worse, the promise they appeared to offer to showrunners was creative freedom and broad and permanent access to their work. The news that Disney+ is even canceling finished shows (Nautilus) shortly before their scheduled release in order to pay less *tax* should send a chill through every creator’s spine. No one wants to spend years of their life – for almost *any* amount of money – making things that wind up in the corporate equivalent of the warehouse at the end of Raiders of the Lost Ark.

It’s time, as the Masked Scheduler suggested recently on Mastodon, for the emergence of modern equivalents of creator-founded studios United Artists and Desilu.

***

Many of us were skeptical about Meta’s Oversight Board; it was easy to predict that Facebook would use it to avoid dealing with the PR fallout from controversial cases, but never relinquish control. And so it is proving.

This week, Meta overruled the Board’s recommendation of a six-month suspension of the Facebook account belonging to former Cambodian prime minister Hun Sen. At issue was a video of one of Sen’s speeches, which everyone agreed incited violence against his opposition. Meta has kept the video up on the grounds of “newsworthiness”; Meta also declined to follow the Board’s recommendation to clarify its rules for public figures in “contexts in which citizens are under continuing threat of retaliatory violence from their governments”.

In the Platformer newsletter Casey Newton argues that the Board’s deliberative process is too slow to matter – it took eight months to decide this case, too late to save the election at stake or deter the political violence that has followed. Newton also concludes from the list of decisions that the Board is only “nibbling round the edges” of Meta’s policies.

A company with shareholders, a business model, and a king is never going to let an independent group make decisions that will profoundly shape its future. From Kate Klonick’s examination, we know the Board members are serious people prepared to think deeply about content moderation and its discontents. But they were always in a losing position. Now, even they must know that.

***

It should go without saying that anything that requires an Internet connection should be designed for connection failures, especially when the connected devices are required to operate the physical world. The downside was made clear by the 2017 incident, when lost signal meant a Tesla-owning venture capitalist couldn’t restart his car. Or the one in 2021, when a bunch of Tesla owners found their phone app couldn’t unlock their car doors. Tesla’s solution both times was to tell car owners to make sure they always had their physical car keys. Which, fine, but then why have an app at all?

Last week, Bambu 3D printers began printing unexpectedly when they got disconnected from the cloud. The software managing the queue of printer jobs lost the ability to monitor them, causing some to be restarted multiple times. Given the heat and extruded material 3D printers generate, this is dangerous for both themselves and their surroundings.

At TechRadar, Bambu’s PR acknowledges this: “It is difficult to have a cloud service 100% reliable all the time, but we should at least have designed the system more carefully to avoid such embarrassing consequences.” As TechRadar notes, if only embarrassment were the worst risk.

So, new rule: before installation, test every new “smart” device by blocking its Internet connection to see how it behaves. Of course, companies should do this themselves, but as we’ve seen, you can’t rely on that either.

***

Finally, in “be careful what you legislate for”, Canada is discovering the downside of C-18, which became law in June and requires the biggest platforms to pay for the Canadian news content they host. Google and Meta warned all along that they would stop hosting Canadian news rather than pay for it. Experts like law professor Michael Geist predicted that the bill would merely serve to dramatically cut traffic to news sites.

On August 1, Meta began adding blocks for news links on Facebook and Instagram. A coalition of Canadian news outlets quickly asked the Competition Bureau to mount an inquiry into Meta’s actions. At TechDirt Mike Masnick notes the irony: first legacy media said Meta’s linking to news was anticompetitive; now they say not linking is anticompetitive.

However, there are worse consequences. Prime minister Justin Trudeau complains that Meta’s news block is endangering Canadians, who can’t access or share local up-to-date information about the ongoing wildfires.

In a sensible world, people wouldn’t rely on Facebook for their news, politicians would write legislation with greater understanding, and companies like Meta would wield their power responsibly. In *this* world, we have a perfect storm.

Illustrations: XKCD’s Dependency.

Guarding the peace

Police are increasingly attempting to prevent crime by using social media targeting tools to shape public behavior, says a new report from the Scottish Institute for Policing Research (PDF) written by a team of academic researchers led by Ben Collier at the University of Edinburgh. There is no formal regulation of these efforts, and the report found many examples of what it genteelly calls “unethical practice”.

On the one hand, “behavioral change marketing” seems an undeniably clever use of new technological tools. If bad actors can use targeted ads to scam, foment division, and incite violence, why shouldn’t police use them to encourage the opposite? The tools don’t care whether you’re a Russian hacker targeting 70-plus white pensioners with anti-immigrant rhetoric or a charity trying to reach vulnerable people to offer help. Using them is a logical extension of the drive toward preventing, rather than solving, crime. Governments have long used PR techniques to influence the public, from benign health PSAs on broadcast media to Theresa May’s notorious, widely criticised, and unsuccessful 2013 campaign of van ads telling illegal immigrants to go home.

On the other hand, it sounds creepy as hell. Combining police power with poorly-evidenced assumptions about crime and behavior and risk and the manipulation and data gathering of surveillance capitalism…yikes.

The idea of influence policing derives at least in part from Cass R. Sunstein’s and Richard H. Thaler’s 2008 book Nudge. The “nudge theory” it promoted argued that the use of careful design (“choice architecture”) could push people into making more desirable decisions.

The basic contention seems unarguable; using design to push people toward decisions they might not make by themselves is the basis of many large platforms’ design decisions. Dark patterns are all about that.

Sunstein and Thaler published their theory at the post-financial crisis moment when governments were looking to reduce costs. As early as 2010, the UK’s Cabinet Office set up the Behavioural Insights Team to improve public compliance with government policies. The “Nudge Unit” has been copied widely across the world.

By 2013, it was being criticized for forcing job seekers to fill out a scientifically invalid psychometric test. In 2021, Observer columnist Sonia Sodha called its record “mixed”, deploring the expansion of nudge theory into complex, intractable social problems. In 2022, new research cast doubt on the whole idea, finding that nudges have little effect on personal behavior.

The SIPR report cites the Government Communications Service, the outgrowth of decades of government work to gain public compliance with policy. The GCS itself notes its incorporation of marketing science and other approaches common in the commercial sector. Its 7,000 staff work in departments across government.

This has all grown up alongside the increasing adoption of digital marketing practices across the UK’s public sector, including the tax authorities (HMRC), the Department for Work and Pensions, and, especially, the Home Office – and alongside the rise of sophisticated targeting tools for online advertising.

The report notes: “Police are able to develop ‘patchwork profiles’ built up of multiple categories provided by ad platforms and detailed location-based categories using the platform targeting categories to reach extremely specific groups.”

The report’s authors used the Meta Ad Library to study the ads, the audiences and profiles police targeted, and the cost. London’s Metropolitan Police, which a recent scathing report found endemically racist and misogynist, was an early adopter and is the heaviest studied user of digitally targeted ads on Meta.

Many of the campaigns these organizations run sound mostly harmless. Campaigns intended to curb domestic violence, for example, may aim at encouraging bystanders to come forward with concerns. Others focus on counter-radicalisation and security themes or, increasingly, preventing online harms and violence against women and girls.

As a particular example of the potential for abuse, the report calls out the Home Office Migrants on the Move campaign, a collaboration with a “migration behavior change” agency called Seefar. This targeted people in France seeking asylum in the UK and attempted to frighten them out of trying to cross the Channel in small boats. The targeting was highly specific, with many ads aimed at as few as 100 to 1,000 people, chosen for their language and recent travel in or through Brussels and Calais.

The report’s authors raise concerns: the harm implicit in frightening already-extremely vulnerable people, the potential for damaging their trust in authorities to help them, and the privacy implications of targeting such specific groups. In the report’s example, Arabic speakers in Brussels might see the Home Office ads but their French neighbors would not – and those Arabic speakers would be unlikely to be seeking asylum. The Home Office’s digital equivalent of May’s van ads, therefore, would be seen only by a selection of microtargeted individuals.

The report concludes: “We argue that this campaign is a central example of the potential for abuse of these methods, and the need for regulation.”

The report makes a number of recommendations including improved transparency, formalized regulation and oversight, better monitoring, and public engagement in designing campaigns. One key issue is coming up with better ways of evaluating the results. Surprise, surprise: counting clicks, which is what digital advertising largely sells as a metric, is not a useful way to measure social change.

All of these arguments make sense. Improving transparency in particular seems crucial, as does working with the relevant communities. Deterring crime doesn’t require tricks and secrecy; it requires collaboration and openness.

Illustrations: Theresa May’s notorious van ad telling illegal immigrants to go home.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

Review: Should You Believe Wikipedia?

Should You Believe Wikipedia? Online Communities and the Construction of Knowledge
By Amy S. Bruckman
Publisher: Cambridge
Print publication year: 2022
ISBN: 978-1-108780-704

Every Internet era has had its new-thing obsession. For a time in the mid-1990s, it was “community”. Every business, some industry thinkers insisted, would need to build a community of customers, suppliers, and partners. Many tried, and the next decade saw the proliferation of blogs, web boards, and, soon, multi-player online games. We learned that every such venture of any size attracts abuse that requires human moderators to solve. We learned that community does not scale. Then came Facebook and other modern social media, fueled by mobile phones, and the business model became data collection to support advertising.

Back at the beginning, Amy S. Bruckman, now a professor at Georgia Tech but then a student at MIT, set up the education-oriented MOOSE Crossing, in which children could collaborate on building objects as a way of learning to code. For 20 years, she has taught a course on designing communities. In Should You Believe Wikipedia?, Bruckman distills the lessons she’s learned over all that time, combining years of practical online experience with readable theoretical analysis based on sociology, psychology, and epistemology. Whether or not to trust Wikipedia is just one chapter in her study of online communities and the issues they pose.

Like pubs, cafes, and town squares, online communities are third places – that is, neutral ground where people can meet on equal terms. Clearly not neutral: many popular blogs, which tend to be personal or promotional, or the X formerly known as Twitter. Third places also need to be enclosed but inviting, visible from surrounding areas, and offering affordances for activity. In that sense, two of the most successful online communities are Wikipedia and OpenStreetMap, both of which pursue a common enterprise that contributors can feel is of global value. Facebook is home to probably hundreds of thousands of communities – families, activists, support groups, and so on – but itself is too big, too diffuse, and too lacking in shared purpose to be a community. Bruckman also cites as examples of productive communities open source software projects and citizen science.

Bruckman’s book has arrived at a moment that we may someday see as a watershed. Numerous factors – Elon Musk’s takeover and remaking of Twitter, debates about regulation and antitrust, increased privacy awareness – are making many people reevaluate what they want from online social spaces. It is a moment when new experiments might thrive.

Something like that is needed, Bruckman concludes: people are not being well served by the free market’s profit motives and current business models. She would like to see more of the Internet populated by non-profits, but elides the key hard question: what are the sustainable models for supporting such endeavors? Mozilla, one of the open source software-building communities she praises, is sustained by payments from Google, making it still vulnerable to the dictates of shareholders, albeit at one remove. It remains an open question if the Fediverse, currently chiefly represented by Mastodon, can grow and prosper in the long term under its present structure of volunteer administrators running their own servers and relying on users’ donations to pay expenses. Other established commercial community hosts, such as Reddit, where Bruckman is a moderator, have long failed to find financial sustainability.

Bruckman never quite answers the question in the title. It reflects the skepticism at Wikipedia’s founding that an encyclopedia edited by anyone who wanted to participate could be any good. As she explains, however, the fact that every page has its Talk page that details disputes and exposes prior versions provides transparency the search engines don’t offer. It may not be clear if we *should* believe Wikipedia, whose quality varies depending on the subject, but she does make clear why we *can* when we do.

Five seconds

Careful observers posted to Hacker News this week – and the Washington Post reported – that the X formerly known as Twitter (XFKAT?) appeared to be deliberately introducing a delay in loading links to sites the owner is known to dislike or views as competitors. These would be things like the New York Times and selected other news organizations, and rival social media and publishing services like Facebook, Instagram, Bluesky, and Substack.

The 4.8 seconds users clocked doesn’t sound like much until you remember, as the Post does, that a 2016 Google study found that 53% of mobile users will abandon a website that takes longer than three seconds to load. Not sure whether desktop users are more or less patient, but it’s generally agreed that delay is the enemy.

The mechanism by which XFKAT was able to do this is its built-in link shortener, t.co, through which it routes all the links users post. You can see this for yourself if you right-click on a posted link and copy the results. You can only find the original link by letting the t.co links resolve and copying the real link out of the browser address bar after the page has loaded.
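
The copy-it-out-of-the-address-bar routine can also be automated. What follows is a minimal Python sketch, not a description of how t.co actually behaves: it assumes the shortener answers with an ordinary HTTP redirect carrying a Location header, which a given service may or may not do for a given client. The function names `http_location` and `resolve` are invented for illustration.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def http_location(url, timeout=10):
    """Return the Location header for one hop, or None if there is none."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # With redirects disabled, a 3xx answer surfaces as an HTTPError.
        return err.headers.get("Location")


def resolve(url, get_location=http_location, max_hops=10):
    """Follow a chain of redirects and return every URL visited, in order."""
    chain = [url]
    for _ in range(max_hops):
        target = get_location(chain[-1])
        if not target:
            break
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], target))
    return chain
```

With the default `http_location`, `resolve` makes live network requests, and the last entry in the returned list is the address worth saving. Passing a different `get_location`, such as a dictionary's `get` method, lets the same logic run offline.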

Whether or not the company was deliberately delaying these connections, the fact is that it *can* – as can Meta’s platforms and many others. This in itself is a problem; essentially it’s a failure of network neutrality. This is the principle that a telecoms company should treat all traffic equally, and it is the basis of the egalitarian nature of the Internet. Regulatory insistence on network neutrality is why you can run a voice over Internet Protocol connection over broadband supplied by a telco or telco-owned ISP even though the services are competitors. Social media platforms are not subject to these rules, but the delaying links story suggests maybe they should be once they reach a certain size.

Link shorteners have faded into the landscape these days, but they were controversial for years after the first such service – TinyURL – was launched in 2002 (per Wikipedia). Critics cited several main issues: privacy, persistence, and obscurity. The last refers to users’ inability to know where their clicks are taking them; I feel strongly about this myself. The privacy issue is that the link shorteners-in-the-middle are in a position to collect traffic data and exploit it (bad actors could also divert links from their intended destination). The ability to collect that data and chart “impact” is, of course, one reason shorteners were widely adopted by media sites of all types. The persistence issue is that intermediating links in this way creates one or more central points of failure. When the link shortener’s server goes down for any reason – failed Internet connection, technical fault, bankrupt owner company – the URL the shortener encodes becomes unreachable, even if the page itself is available as normal. You can’t go directly to the page, or even locate a cached copy at the Internet Archive, without the original URL.

Nonetheless, shortened links are still widely used, for the same reasons they were invented. Many URLs are very long and complicated. In print publications, they are visually overwhelming, and unwieldy to copy into a web address bar; they are near-impossible to proofread in footnotes and citations. They’re even worse to read out on broadcast media. Shortened links solve all that. No longer germane is the 140-character limit Twitter had in its early years; because the URL counted toward that maximum, short was crucial. Since then, the character count has gotten bigger, and URLs aren’t included in the count any more.

If you do online research of any kind you have probably long since internalized the routine of loading the linked content and saving the actual URL rather than the shortened version. This turns out to be one of the benefits of moving to Mastodon: the link you get is the link you see.

So to network neutrality. Logically, its equivalent for social media services ought to include the principle that users can post whatever content or links they choose (law and regulation permitting), whether that’s reposted TikTok videos, a list of my IDs on other systems, or a link to a blog advocating that all social media companies be forced to become public utilities. Most have in fact operated that way until now, infected just enough with the early Internet ethos of openness. Changing that unwritten social contract is very bad news even though no one believed XFKAT’s CEO when he insisted he was a champion of free speech and called the now-his site the “town square”.

If that’s what we want social media platforms to be, someone’s going to have to force them, especially if they begin shrinking and their owners start to feel the chill wind of an existential threat. You could even – though no one is, to the best of my knowledge – make the argument that swapping in a site-created shortened URL is a violation of the spirit of data protection legislation. After all, no one posts links on a social media site with the view that their tastes in content should be collected, analyzed, and used to target ads. Librarians have long been stalwarts in resisting pressure to disclose what their patrons read and access. In the move online in general, and to corporate social media in particular, we have utterly lost sight of the principle of the right to our own thoughts.

Illustrations: The New York City public library in 2006.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. She is a contributing editor for the Plutopia News Network podcast. Follow on Mastodon.

The data grab

It’s been a good week for those who like mocking flawed technology.

Numerous outlets have reported, for example, that “AI is getting dumber at math”. The source is a study conducted by researchers at Stanford and the University of California Berkeley comparing GPT-3.5’s and GPT-4’s output in March and June 2023. The researchers found that, among other things, GPT-4’s success rate at identifying prime numbers dropped from 84% to 51%. In other words, in June 2023 ChatGPT-4 did little better than chance at identifying prime numbers. That’s psychic level.
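
For a sense of why roughly 50% counts as chance, here is a minimal Python sketch. It assumes, purely for illustration, a test set split evenly between primes and composites; the helper names and the sample numbers are made up, and nothing here reproduces the study's actual method.

```python
def is_prime(n):
    """Deterministic trial division: exact for any non-negative integer."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def accuracy(guesser, numbers):
    """Fraction of numbers for which guesser agrees with is_prime."""
    hits = sum(guesser(n) == is_prime(n) for n in numbers)
    return hits / len(numbers)


# Assumed balanced sample: five primes followed by five composites.
sample = [101, 103, 107, 109, 113, 111, 115, 117, 119, 121]

always_yes = accuracy(lambda n: True, sample)  # a guesser that never thinks
exact = accuracy(is_prime, sample)             # the boring, correct method
```

On a balanced sample like this, a guesser that always answers "prime" scores 0.5 without doing any arithmetic, while a few lines of trial division score 1.0.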

The researchers blame “drift”, the problem that improving one part of a model may have unhelpful knock-on effects in other parts of the model. At Ars Technica, Benj Edwards is less sure, citing qualified critics who question the study’s methodology. It’s equally possible, he suggests, that as the novelty fades, people’s attempts to do real work surface problems that were there all along. With no access to the algorithm itself and limited knowledge of the training data, we can only conduct such studies by controlling inputs and observing the outputs, much like diagnosing allergies by giving a child a series of foods in turn and waiting to see which ones make them sick. Edwards advocates greater openness on the part of the companies, especially as software developers begin building products on top of their generative engines.

Unrelated, the New Zealand discount supermarket chain Pak’nSave offered an “AI” meal planner that, set loose, promptly began turning out recipes for “poison bread sandwiches”, “Oreo vegetable stir-fry”, and “aromatic water mix” – which turned out to be a recipe for highly dangerous chlorine gas.

The reason is human-computer interaction: humans, told to provide a list of available ingredients, predictably became creative. As for the computer…anyone who’s read Janelle Shane’s 2019 book, You Look Like a Thing and I Love You, or her Twitter reports on AI-generated recipes could predict this outcome. Computers have no real-world experience against which to judge their output!

Meanwhile, the San Francisco Chronicle reports, Waymo and Cruise driverless taxis are making trouble at an accelerating rate. The cars have gotten stuck in low-hanging wires after thunderstorms, driven through caution tape, blocked emergency vehicles and emergency responders, and behaved erratically enough to endanger cyclists, pedestrians, and other vehicles. If they were driven by humans they’d have lost their licenses by now.

In an interesting side note that reminds of the cars’ potential as a surveillance network, Axios reports that in a ten-day study in May Waymo’s driverless cars found that human drivers in San Francisco speed 33% of the time. A similar exercise in Phoenix, Arizona observed human drivers speeding 47% of the time on roads with a 35mph speed limit. These statistics of course bolster the company’s main argument for adoption: improving road safety.

The study should – but probably won’t – be taken as a warning of the potential for the cars’ data collection to become embedded in both law enforcement and their owners’ business models. The frenzy surrounding ChatGPT-* is fueling an industry-wide data grab as everyone tries to beef up their products with “AI” (see also previous such exercises with “meta”, “nano”, and “e”), consequences to be determined.

Among the newly-discovered data grabbers is Intel, whose graphics processing unit (GPU) drivers are collecting telemetry data, including how you use your computer, the kinds of websites you visit, and other data points. You can opt out, assuming you a) realize what’s happening and b) are paying attention at the right moment during installation.

Google announced recently that it would scrape everything people post online to use as training data. Again, an opt-out can be had if you have the knowledge and access to follow the 30-year-old robots.txt protocol. In practical terms, I can configure my own site, pelicancrossing.net, to block Google’s data grabber, but I can’t stop it from scraping comments I leave on other people’s blogs or anything I post on social media sites or that’s professionally published (though those sites may block Google themselves). This data repurposing feels like it ought to be illegal under data protection and copyright law.
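
As a rough illustration of what that opt-out involves, the sketch below uses Python's standard `urllib.robotparser` module to check what a robots.txt file permits. The crawler name `ExampleAIBot` and the example.org address are hypothetical placeholders, not the token of any real crawler; the name to block is whatever the crawler's operator documents.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler, leave the site open to others.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt, user_agent, url):
    """Report whether robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.org/2023/08/post"
blocked_bot = allowed(ROBOTS_TXT, "ExampleAIBot", page)
other_bot = allowed(ROBOTS_TXT, "SomeOtherBot", page)
```

Here `blocked_bot` comes out `False` and `other_bot` comes out `True`. The file is only a request that well-behaved crawlers choose to honor, and it covers only the site that publishes it.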

In Australia, Gizmodo reports that the company has asked the Australian government to relax copyright laws to facilitate AI training.

Soon after Google’s announcement the law firm Clarkson filed a class action lawsuit against Google to join its action against OpenAI. The suit accuses Google of “stealing” copyrighted works and personal data.

“Google does not own the Internet,” Clarkson wrote in its press release. Will you tell it, or shall I?

Whatever has been going on until now with data slurping in the interests of bombarding us with microtargeted ads is small stuff compared to the accelerating acquisition for the purpose of feeding AI models. Arguably, AI could be a public good in the long term as it improves, and therefore allowing these companies to access all available data for training is in the public interest. But if that’s true, then the *public* should own the models, not the companies. Why should we consent to the use of our data so they can sell it back to us and keep the proceeds for their shareholders?

It’s all yet another example of why we should pay attention to the harms that are clear and present, not the theoretical harm that someday AI will be general enough to pose an existential threat.

Illustrations: IBM Watson, Jeopardy champion.

Wendy M. Grossman is the 2013 winner of the Enigma Award and contributing editor for the Plutopia News Network podcast. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon.

Watching YouTube

One of the reasons it’s so difficult to figure out what to do about misinformation, malinformation, and disinformation online is the difficulty of pinpointing how online interaction translates to action in the real world. The worst content on social media has often come from traditional media or been posted by an elected politician.

At least, that’s how it seems to text-based people like me. This characteristic, along with the quick-hit compression of 140 (later 280) characters, was the (minority) appeal of Twitter. It’s also why legacy media pay so little attention to what’s going on in game worlds, struggle with TikTok, and underestimate the enormous influence of YouTube. The notable exception is the prolific Chris Stokel-Walker, who’s written books about both YouTube and TikTok.

Stokel-Walker has said he decided to write YouTubers because the media generally only notices YouTube when there’s a scandal. Touring those scandals occupies much of filmmaker Alex Winter’s now-showing biography of the service, The YouTube Effect.

The film begins by interviewing co-founder Steve Chen, who giggles a little uncomfortably to admit that he and co-founders Chad Hurley and Jawed Karim thought it could be a video version of Hot or Not?. In 2006, Google bought the year-old site for $1.65 billion in Google stock, to derision from financial commentators certain it had overpaid.

Winter’s selection of clips from early YouTube reminds of early movies, which pulled people into theaters with little girls having a pillow fight. Winter moves on through pioneering stars like Smosh and K-Pop, 2010’s Arab spring, the arrival of advertising and monetization, the rise of alt-right channels, Gamergate, the 2016 US presidential election, the Christchurch shooting, the horrors lurking in YouTube Kids, George Floyd, the multimillion-dollar phenomenon of Ryan Kaji, January 6, and the 2020 Congressional hearings. Somewhere in the middle is the arrival of the Algorithm that eliminated spontaneous discovery in favor of guided user experience, and a brief explanation of the role of Section 230 of the Communications Decency Act in protecting platforms from liability for third-party content.

These stories are told by still images and video clips interlaced with interviews with talking heads like Caleb Cain, who was led into right-wing extremism and found his way back out; Andy Parker, father of Alison Parker, footage of whose murder he has been unable to get expunged; successful YouTuber (“ContraPoints”) Natalie Wynn; technology writer and video game developer Brianna Wu; Jillian C. York, author of Silicon Values; litigator Carrie Goldberg, who works to remediate online harms one lawsuit at a time; Anthony Padilla, co-founder of Smosh; and YouTube then-CEO Susan Wojcicki.

Not included among the interviewees: political commentators (though we see short clips of Alex Jones) or free speech fundamentalists. In addition, Winter sticks to user-generated content, ignoring the large percentage of YouTube’s library that is copies of professional media, many otherwise unavailable. Countries outside the US are mentioned only by York, who studies censorship around the world. Also missing is anyone from Google who could explain how YouTube fits into its overall business model.

The movie concludes by asking commentators to recommend changes. Parker wants families of murder victims to be awarded co-copyright and therefore standing to get footage of victims’ deaths removed. Hany Farid, a UC Berkeley professor who studies deepfakes, thinks it’s essential to change the business model from paying with data and engagement to paying with money – that is, subscriptions. Goldberg is afraid we will all become captives of Big Tech. A speaker whose name is illegible in my notes mentions antitrust law. Cain notes that there’s nothing humans have built that we can’t destroy. Wojcicki says only that technology offers “a tremendous opportunity to do good in the long-term”. York notes the dual-use nature of these technologies; their effects are both good and bad, so what you change “depends what you’re looking for”.

Cain gets the last word. “What are we speeding towards?” he asks, as the movie’s accelerating crescendo of images and clips stops on a baby’s face.

Unlike predecessors Coded Bias (2021) and The Great Hack (2019), The YouTube Effect is unclear about what it intends us to understand about YouTube’s impact on the world beyond the sheer size of audience a creator can assemble via the platform. The array of scandals, all of them familiar from mainstream headlines, makes a persuasive case that YouTube deserves Facebook and Twitter-level scrutiny. What’s missing, however, is causality. In fact, the film is wrongly titled: there is no one YouTube effect. York had it right: “fixing” YouTube requires deciding what you’re trying to change. My own inclination is to force change to the business model. The algorithm distorts our interactions, but it’s driven by the business model.

Perhaps this was predictable. Seven years on, we still struggle to pinpoint exactly how social media affected the 2016 US presidential election or the UK’s EU referendum vote. Letting it ride is dangerous, but so is government regulation. Numerous governments are leaning toward the latter.

Even the experts assembled at last week’s Cambridge Disinformation Summit reached no consensus. Some saw disinformation as an existential threat; others argued that disinformation has always been with us and humanity finds a way to live through it. It wouldn’t be reasonable to expect one filmmaker to solve a conundrum that is vexing so many. And yet it’s still disappointing not to have found greater clarity.

Illustrations: YouTube CEO (2014-2023) Susan Wojcicki (via The YouTube Effect).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon.