The data grab

It’s been a good week for those who like mocking flawed technology.

Numerous outlets have reported, for example, that “AI is getting dumber at math”. The source is a study conducted by researchers at Stanford and the University of California, Berkeley comparing GPT-3.5’s and GPT-4’s output in March and June 2023. The researchers found that, among other things, GPT-4’s success rate at identifying prime numbers dropped from 84% to 51%. In other words, in June 2023 GPT-4 did little better than chance at identifying prime numbers. That’s psychic level.
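For perspective, primality testing is exactly the kind of task conventional code settles deterministically in a few lines, which is what makes near-chance performance so striking. A minimal sketch (illustrative only; the numbers below are arbitrary examples, not the study’s actual probes):

```python
def is_prime(n: int) -> bool:
    """Deterministic trial-division primality test."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even prime
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

print(is_prime(17077))  # → True
print(is_prime(17079))  # → False (divisible by 3)
```

Unlike a language model, this answers correctly every time, because the question has a mechanical answer.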

The researchers blame “drift”, the problem that improving one part of a model may have unhelpful knock-on effects in other parts of the model. At Ars Technica, Benj Edwards is less sure, citing qualified critics who question the study’s methodology. It’s equally possible, he suggests, that as the novelty fades, people’s attempts to do real work surface problems that were there all along. With no access to the algorithm itself and limited knowledge of the training data, we can only conduct such studies by controlling inputs and observing the outputs, much like diagnosing allergies by giving a child a series of foods in turn and waiting to see which ones make them sick. Edwards advocates greater openness on the part of the companies, especially as software developers begin building products on top of their generative engines.

Unrelated, the New Zealand discount supermarket chain Pak’nSave offered an “AI” meal planner that, set loose, promptly began turning out recipes for “poison bread sandwiches”, “Oreo vegetable stir-fry”, and “aromatic water mix” – which turned out to be a recipe for highly dangerous chlorine gas.

The reason is human-computer interaction: humans, told to provide a list of available ingredients, predictably became creative. As for the computer…anyone who’s read Janelle Shane’s 2019 book, You Look Like a Thing and I Love You, or her Twitter reports on AI-generated recipes could have predicted this outcome. Computers have no real-world experience against which to judge their output!

Meanwhile, the San Francisco Chronicle reports, Waymo and Cruise driverless taxis are making trouble at an accelerating rate. The cars have gotten stuck in low-hanging wires after thunderstorms, driven through caution tape, blocked emergency vehicles and emergency responders, and behaved erratically enough to endanger cyclists, pedestrians, and other vehicles. If they were driven by humans they’d have lost their licenses by now.

In an interesting side note that reminds us of the cars’ potential as a surveillance network, Axios reports that in a ten-day study in May, Waymo’s driverless cars found that human drivers in San Francisco speed 33% of the time. A similar exercise in Phoenix, Arizona observed human drivers speeding 47% of the time on roads with a 35mph speed limit. These statistics of course bolster the company’s main argument for adoption: improving road safety.

The study should – but probably won’t – be taken as a warning of the potential for the cars’ data collection to become embedded in both law enforcement and their owners’ business models. The frenzy surrounding ChatGPT-* is fueling an industry-wide data grab as everyone tries to beef up their products with “AI” (see also previous such exercises with “meta”, “nano”, and “e”), consequences to be determined.

Among the newly-discovered data grabbers is Intel, whose graphics processing unit (GPU) drivers are collecting telemetry data, including how you use your computer, the kinds of websites you visit, and other data points. You can opt out, assuming you a) realize what’s happening and b) are paying attention at the right moment during installation.

Google announced recently that it would scrape everything people post online to use as training data. Again, an opt-out can be had if you have the knowledge and access to follow the 30-year-old robots.txt protocol. In practical terms, I can configure my own site, pelicancrossing.net, to block Google’s data grabber, but I can’t stop it from scraping comments I leave on other people’s blogs or anything I post on social media sites or that’s professionally published (though those sites may block Google themselves). This data repurposing feels like it ought to be illegal under data protection and copyright law.
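For site owners, the opt-out boils down to a couple of lines in a robots.txt file at the site root. A minimal sketch, assuming Google’s Google-Extended user-agent token, the one it subsequently published for AI training crawls (blocking Googlebot itself would also drop the site from search results):

```
# e.g. at https://pelicancrossing.net/robots.txt
User-agent: Google-Extended
Disallow: /
```

Note that compliance is voluntary on the crawler’s part; robots.txt is a request, not an enforcement mechanism.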

In Australia, Gizmodo reports that the company has asked the Australian government to relax copyright laws to facilitate AI training.

Soon after Google’s announcement the law firm Clarkson filed a class action lawsuit against Google to join its action against OpenAI. The suit accuses Google of “stealing” copyrighted works and personal data.

“Google does not own the Internet,” Clarkson wrote in its press release. Will you tell it, or shall I?

Whatever has been going on until now with data slurping in the interests of bombarding us with microtargeted ads is small stuff compared to the accelerating acquisition for the purpose of feeding AI models. Arguably, AI could be a public good in the long term as it improves, and therefore allowing these companies to access all available data for training is in the public interest. But if that’s true, then the *public* should own the models, not the companies. Why should we consent to the use of our data so they can sell it back to us and keep the proceeds for their shareholders?

It’s all yet another example of why we should pay attention to the harms that are clear and present, not the theoretical harm that someday AI will be general enough to pose an existential threat.

Illustrations: IBM Watson, Jeopardy champion.

Wendy M. Grossman is the 2013 winner of the Enigma Award and contributing editor for the Plutopia News Network podcast. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon.

Watson goes to Wimbledon

The launch of the Fediverse-compatible Meta app Threads seems to have slightly overshadowed the European Court of Justice’s ruling, earlier in the week. This ruling deserves more attention: it undermines the basis of Meta’s targeted advertising. In noyb’s initial reaction, data protection legal bulldog Max Schrems suggests the judgment will make life difficult for not just Meta but other advertising companies.

As Alex Scroxton explains at Computer Weekly, the ruling rejects several different claims by Meta that all attempt to bypass the requirement enshrined in the General Data Protection Regulation that where there is no legal basis for data processing users must actively consent. Meta can’t get by with claiming that targeted advertising is a part of its service users expect, or that it’s technically necessary to provide its service.

More interesting is the fact that the original complaint was not filed by a data protection authority but by Germany’s antitrust body, which sees Meta’s our-way-or-get-lost approach to data gathering as abuse of its dominant position – and the CJEU has upheld this idea.

All this is presumably part of why Meta decided to roll out Threads in many countries but *not* the EU. In February, as a consequence of Brexit, Meta moved UK users to its US agreements. The UK’s data protection law is a clone of GDPR and will remain so until and unless the British Parliament changes it via the pending Data Protection and Digital Information bill. Still, it seems the move makes Meta ready to exploit such changes if they do occur.

Warning to people with longstanding Instagram accounts who want to try Threads: if your plan is to try it and (maybe) delete it, set up a new Instagram account for the purpose. Otherwise, you’ll be sad to discover that deleting your new Threads account means vaporizing your old Instagram account along with it. It’s the Hotel California method of Getting Big Fast.

***

Last week the Irish Council for Civil Liberties warned that a last-minute amendment to the Courts and Civil Law (Miscellaneous) bill will allow Ireland’s Data Protection Commissioner to mark any of its proceedings “confidential” and thereby bar third parties from publishing information about them. Effectively, it blocks criticism. This is a muzzle not only for the ICCL and other activists and journalists but for aforesaid bulldog Schrems, who has made a career of pushing the DPC to enforce the law it was created to enforce. He keeps winning in court, too, which I’m sure must be terribly annoying.

The Irish DPC is an essential resource for everyone in Europe because Ireland is the European home of so many of American Big Tech’s subsidiaries. So this amendment – which reportedly passed the Oireachtas (Ireland’s parliament) – is an alarming development.

***

Over the last few years Canadian law professor Michael Geist has had plenty of complaints about Canada’s Online News Act, aka C-18. Like the Australian legislation it emulates, C-18 requires intermediaries like Facebook and Google to negotiate and pay for licenses to link to Canadian news content. The bill became law on June 22.

Naturally, Meta and Google have warned that they will block links to Canadian news media from their services when the bill comes into force six months hence. They also intend to withdraw their ongoing programs to support the Canadian press. In response, the Canadian government has pulled its own advertising from Meta platforms Facebook and Instagram. Much hyperbolic silliness is taking place.

Pretty much everyone who is not the Canadian government thinks the bill is misconceived. Canadian publishers will lose traffic, not gain revenues, and no one will be happy. In Australia, the main beneficiary appears to be Rupert Murdoch, with whom Google signed a three-year agreement in 2021 and who is hardly the sort of independent local media some hoped would benefit. Unhappily, the state of California wants in on this game; its in-progress Journalism Preservation Act also seeks to require Big Tech to pay a “journalism usage fee”.

The result is to continue to undermine the open Internet, in which the link is fundamental to sharing information. If things aren’t being (pay)walled off, blocked for copyright/geography, or removed for corporate reasons – the latest announced casualty is the GIF hosting site Gfycat – they’re being withheld to avoid compliance requirements or withdrawn for tax reasons. None of us are better off for any of this.

***

Those with long memories will recall that in 2011 IBM’s giant computer, Watson, beat the top champions at the TV game show Jeopardy. IBM predicted a great future for Watson as a medical diagnostician.

By 2019, that projected future was failing. “Overpromised and underdelivered,” ran an IEEE Spectrum headline. IBM is still trying, and is hoping for success with cancer diagnosis.

Meanwhile, Watson has a new (marketing) role: analyzing the draw and providing audio and text commentary for back-court tennis matches at Wimbledon and for highlights clips. For each match, Watson also calculates the competitors’ chances of winning and the favorability of their draw. For a veteran tennis watcher, it’s unsatisfying, though: IBM offers only a black box score, and nothing to show how that number was reached. At least human commentators tell you – albeit at great, repetitive length – the basis of their reasoning.

Illustrations: IBM’s Watson, which beat two of Jeopardy‘s greatest champions in 2011.


Own goals

There’s no point in saying I told you so when the people you’re saying it to got the result they intended.

At the Guardian, Peter Walker reports the Electoral Commission’s finding that at least 14,000 people were turned away from polling stations in May’s local elections because they didn’t have the right ID as required under the new voter ID law. The Commission thinks that’s a huge underestimate; 4% of people who didn’t vote said it was because of voter ID – which Walker suggests could mean 400,000 were deterred. Three-quarters of those lacked the right documents; the rest opposed the policy. The demographics of this will be studied more closely in a report due in September, but early indications are that the policy disproportionately deterred people with disabilities, people from certain ethnic groups, and people who are unemployed.

The fact that the Conservatives, who brought in this policy, lost big time in those elections doesn’t change its wrongness. But it did lead the MP Jacob Rees-Mogg (Con-North East Somerset) to admit that this was an attempt to gerrymander the vote that backfired because older voters, who are more likely to vote Conservative, also disproportionately don’t have the necessary ID.

***

One of the more obscure sub-industries is the business of supplying ad services to websites. One such little-known company is Criteo, which provides interactive banner ads that are generated based on the user’s browsing history and behavior using a technique known as “behavioral retargeting”. In 2018, Criteo was one of seven companies listed in a complaint Privacy International and noyb filed with three data protection authorities – the UK, Ireland, and France. In 2020, the French data protection authority, CNIL, launched an investigation.

This week, CNIL issued Criteo with a €40 million fine over failings in how it gathers user consent, a ruling noyb calls a major blow to Criteo’s business model.

It’s good to see the legal actions and fines beginning to reach down into adtech’s underbelly. It’s also worth noting that the CNIL was willing to fine a *French* company to this extent. It makes it harder for the US tech giants to claim that the fines they’re attracting are just anti-US protectionism.

***

Also this week, the US Federal Trade Commission announced it’s suing Amazon, claiming the company enrolled millions of US consumers into its Prime subscription service through deceptive design and sabotaged their efforts to cancel.

“Amazon used manipulative, coercive, or deceptive user-interface designs known as ‘dark patterns’ to trick consumers into enrolling in automatically-renewing Prime subscriptions,” the FTC writes.

I’m guessing this is one area where data protection laws have worked. In my UK-based ultra-brief Prime outings to watch the US Open tennis, canceling has taken at most two clicks. I don’t recognize the tortuous process Business Insider documented in 2022.

***

It has long been no secret that the secret behind AI is human labor. In 2019, Mary L. Gray and Siddharth Suri documented this in their book Ghost Work. Platform workers label images and other content, annotate text, and solve CAPTCHAs to help train AI models.

At MIT Technology Review, Rhiannon Williams reports that platform workers are using ChatGPT to speed up their work and earn more. A team of researchers from the Swiss Federal Institute of Technology found in a study (PDF) that between 33% and 46% of the 44 workers they tested with a request to summarize 16 extracts from medical research papers had used AI models to complete the task.

It’s hard not to feel a little gleeful that today’s “AI” is already eating itself via a closed feedback loop. It’s not good news for platform workers, though, because the most likely consequence will be increased monitoring to force them to show their work.

But this is yet another case in which computer people could have learned from their own history. In 2008, researchers at Google published a paper suggesting that Google search data could be used to spot flu outbreaks. Sick people searching for information about their symptoms could provide real-time warnings ten days earlier than the Centers for Disease Control could.

This actually worked, some of the time. However, as early as 2009, Kaiser Fung reported at Harvard Business Review in 2014, Google Flu Trends missed the swine flu pandemic; in 2012, researchers found that it had overestimated the prevalence of flu for 100 out of the previous 108 weeks. More data is not necessarily better, Fung concluded.

In 2013, as David Lazer and Ryan Kennedy reported for Wired in 2015 in discussing their investigation into the failure of this idea, GFT missed by 140% (without explaining what that means). Lazer and Kennedy found that Google’s algorithm was vulnerable to poisoning by unrelated seasonal search terms and by search terms that were correlated purely by chance, and that it failed to take into account changing user behavior, as when Google introduced autosuggest and added health-related search terms. The “availability” cognitive bias also played a role: when flu is in the news, searches go up whether or not people are sick.
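The correlated-purely-by-chance failure is easy to demonstrate: screen enough unrelated series against a target and some will correlate impressively. A toy sketch with random data (nothing to do with GFT’s actual methodology):

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

rng = random.Random(2013)
flu = [rng.gauss(0, 1) for _ in range(100)]  # stand-in "flu" series

# Screen 1,000 random, unrelated candidate "search term" series
# and keep the strongest correlation found.
best = max(
    abs(pearson(flu, [rng.gauss(0, 1) for _ in range(100)]))
    for _ in range(1000)
)
print(round(best, 2))  # noticeably nonzero, by luck alone
```

None of the candidate series has any relationship to the target, yet the best of a thousand looks like a real signal. That signal evaporates out of sample, which is roughly what happened to GFT’s hand-picked search terms.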

While the parallels aren’t exact, large language modelers could have drawn the lesson that users can poison their models. ChatGPT’s arrival for widespread use will inevitably thin out the proportion of text that is human-written – and taint the well from which LLMs drink. Everyone imagines the next generation’s increased power. But it’s equally possible that the next generation will degrade as the percentage of AI-generated data rises.

Illustrations: Drunk parrot seen in a Putney garden (by Simon Bisson).


Snowden at ten

As almost every media outlet has headlined this week, it is now ten years since Edward Snowden alerted the world to the real capabilities of the spy agencies, chiefly but not solely the US National Security Agency. What is the state of surveillance now? most of the stories ask.

Some samples: at the Open Rights Group executive director Jim Killock summarizes what Snowden revealed; Snowden is interviewed; the Guardian’s editor at the time, Alan Rusbridger, recounts events at the Guardian, which co-published Snowden’s discoveries with the Washington Post; journalist Heather Brooke warns of the increasing sneakiness of government surveillance; and Jessica Lyons Hardcastle outlines the impact. Finally, at The Atlantic, Ewen MacAskill, one of the Guardian journalists who worked on the Snowden stories, says only about 1% of Snowden’s documents were ever published.

As has been noted here recently, it seems as though everywhere you look surveillance is on the rise: at work, on privately controlled public streets, and everywhere online by both government and commercial actors. As Brooke writes and the Open Rights Group has frequently warned, surveillance that undermines the technical protections we rely on puts us all in danger.

The UK went on to pass the Investigatory Powers Act, which basically legalized what the security services were doing, but at least did add some oversight. US courts found that the NSA had acted illegally and in 2015 Congress made bulk collection of Americans’ phone records illegal. But, as Bruce Schneier has noted, Snowden’s cache of documents was aging even in 2013; now they’re just old. We have no idea what the secret services are doing now.

The impact in Europe was significant: in 2016 the EU adopted the General Data Protection Regulation. Until Snowden, data protection reform looked like it might wind up watering down data protection law in response to an unprecedented amount of lobbying by the technology companies. Snowden’s revelations raised the level of distrust and also gave Max Schrems some additional fuel in bringing his legal actions against EU-US data deals and US corporate practices that leave EU citizens open to NSA snooping.

The really interesting question is this: what have we done *technically* in the last decade to limit government’s ability to spy on us at will?

Work on this started almost immediately. In early 2014, the World Wide Web Consortium and the Internet Engineering Task Force teamed up on a workshop called Strengthening the Internet Against Pervasive Monitoring (STRINT). Observing the proceedings led me to compare the size of the task ahead to boiling the ocean. The mood of the workshop was united: the NSA’s actions as outlined by Snowden constituted an attack on the Internet and everyone’s privacy, a view codified in RFC 7258, which outlined the plan to mitigate pervasive monitoring. The workshop also published an official report.

Digression for non-techies: “RFC” stands for “Request for Comments”. The thousands of RFCs since 1969 include technical specifications for Internet protocols, applications, services, and policies. The title conveys the process: they are published first as drafts and incorporate comments before being finalized.

The crucial point is that the discussion was about *passive* monitoring, the automatic, ubiquitous, and suspicionless collection of Internet data “just in case”. As has been said so many times about backdoors in encryption, the consequence of poking holes in security is to make everyone much more vulnerable to attacks by criminals and other bad actors.

So a lot of that workshop was about finding ways to make passive monitoring harder. Obviously, one method is to eliminate vulnerabilities, especially those the NSA planted. But it’s equally effective to make monitoring more expensive. Given the law of truly large numbers, even a tiny extra cost per user creates unaffordable friction. They called it a ten-year project, which takes us to…almost now.

Some things have definitely improved, largely through the expanded use of encryption to protect data in transit. On the web, Let’s Encrypt, now ten years old, makes it easy and cheap to obtain a certificate for any website. Search engines contribute by favoring encrypted (that is, HTTPS) web links over unencrypted ones (HTTP). Traffic between email servers has gone from being transmitted in cleartext to being almost all encrypted. Mainstream services like WhatsApp have added end-to-end encryption to the messaging used by billions. Other efforts have sought to reduce the use of fixed long-term identifiers such as MAC addresses that can make tracking individuals easier.

At the same time, even where there are data protection laws, corporate surveillance has expanded dramatically. And, as has long been obvious, governments, especially democratic governments, have little motivation to stop it. Data collection by corporate third parties does not appear in the public budget, does not expose the government to public outrage, and is available via subpoena any time government officials want. If you are a law enforcement or security service person, this is all win-win; the only data you can’t get is the data that isn’t collected.

In an essay reporting on the results of the work STRINT began as part of the ten-year assessment currently circulating in draft, STRINT convenor Stephen Farrell writes, “So while we got a lot right in our reaction to Snowden’s revelations, currently, we have a ‘worse’ Internet.”

Illustrations: Edward Snowden, speaking to Glenn Greenwald in a screenshot from Laura Poitras’ film Prism from Praxis Films (via Wikimedia).


Microsurveillance

“I have to take a photo,” the courier said, raising his mobile phone to snap a shot of the package on the stoop in front of my open doorway.

This has been the new thing. I guess the spoken reason is to ensure that the package recipient can’t claim that it was never delivered, protecting all three of the courier, the courier company, and the shipper from fraud. But it feels like the unspoken reason is to check that the delivery guy has faithfully completed his task and continued on his appointed round without wasting time. It feels, in other words, like the delivery guy is helping the company monitor him.

I say this, and he agrees. I had, in accordance with the demands of a different courier, pinned a note to my door authorizing the deliverer to leave the package on the doorstep in my absence. “I’d have to photograph the note,” he said.

I mentioned American truck drivers, who are pushing back against in-cab cameras and electronic monitors. “They want to do that here, too,” he said. “They want to put in dashboard cameras.” Since then, in at least some cases – for example, Amazon – they have.

Workplace monitoring was growing in any case, but, as noted in 2021, the explosion in remote working brought by the pandemic normalized a level of employer intrusion that might have been more thoroughly debated in less fraught times. The Trades Union Congress reported in 2022 that 60% of employees had experienced being tracked in the preceding years. And once in place, the habit of surveillance is very hard to undo.

When I was first thinking about this piece in 2021, many of these technologies were just being installed. Two years later, there’s been time for a fight back. One such story comes from the France-based company Teleperformance, one of those obscure, behind-the-scenes suppliers to the companies we’ve all heard of. In this case, the company in the shadows supplies remote customer service workers whose UK clients alone include the government’s health and education departments, NHS Digital, the RAF and Royal Navy, and the Student Loans Company, as well as Vodafone, eBay, Aviva, Volkswagen, and the Guardian itself; some of Teleperformance’s Albanian workers provide service to Apple UK.

In 2021, Teleperformance demanded that remote workers in Colombia install in-home monitoring and included a contract clause requiring them to accept AI-powered cameras with voice analytics in their homes and allowing the company to store data on all members of the worker’s family. An earlier attempt at the same thing in Albania failed when the Information and Data Protection Commissioner stepped in.

Teleperformance tried this in the UK, where the unions warned about the normalization of surveillance. The company responded that the cameras would only be used for meetings, training, and scheduled video calls so that supervisors could check that workers’ desks were free of devices deemed to pose a risk to data security. Even so, in August 2021 Teleperformance told Test and Trace staff to limit breaks to ten minutes in a six-hour shift and to select “comfort break” on their computers (so they wouldn’t be paid for that time).

Other stories from the pandemic’s early days show office workers being forced to log in with cameras on for a daily morning meeting or stay active on Slack. Amazon has plans to use collected mouse movements and keystrokes to create worker profiles to prevent impersonation. In India, the government itself demanded that its accredited social health activists install an app that tracks their movements via GPS and monitors their uses of other apps.

More recently, Politico reports that Uber drivers must sign in with a selfie; they will be banned if the facial recognition verification software fails to find a match.

This week at the Guardian, Clea Skopeliti updated the state of surveillance at work. In one of her examples, monitoring software calculates “activity scores” based on typing and mouse movements – so participating in Zoom meetings, watching work-related video clips, and thinking don’t count. Young people, women, and minority workers are more likely to be surveilled.

One employee she interviews takes unpaid breaks to carve out breathing space in which to work; another reports having to explain the length of his toilet breaks. A third, an English worker in social housing, reports that his vehicle is tracked so closely that a manager phones if they think he’s not in the right place or is taking too long.

This is a surveillance-breeds-distrust-breeds-more-surveillance cycle. As Ellen Ullman long ago observed, systems infect their owners with the desire to do more and more with them. It will take time for employers to understand the costs in worker burnout, staff turnover, and absenteeism.

One way out is through enforcing the law: in 2020, the ICO investigated Barclays Bank, which was accused of spying on staff via software that tracked how they spent their time; the bank dropped it. In many of these stories, however, the surveillance suppliers say they operate within the law.

The more important way out is worker empowerment. In Colombia, Teleperformance has just guaranteed its 40,000 workers the right to form a union.

First, crucially, we need to remember that surveillance is not normal.

Illustrations: The boss tells Charlie Chaplin to get back to work in Modern Times (1936).


The arc of surveillance

“What is the point of introducing contestability if the system is illegal?” a questioner asked, more or less, at this year’s Computers, Privacy, and Data Protection conference.

This question could have been asked in any number of sessions where tweaks to surface problems leave the underlying industry undisturbed. In fact, the questioner raised it during the panel on enforcement, GDPR, and the newly-in-force Digital Markets Act. Maria Luisa Stasi explained the DMA this way: it’s about business models. It’s a step into a deeper layer.
The key question: will these new laws – the DMA, the recent Digital Services Act, which came into force in November, the in-progress AI Act – be enforced better than GDPR has been?

The frustration has been building all five years of GDPR’s existence. Even though this week, Meta was fined €1.2 billion for transferring European citizens’ data to the US, Noyb reports that 85% of its 800-plus cases remain undecided, 58% of them for more than 18 months. Even that €1.2 billion decision took ten years, €10 million, and three cases against the Irish Data Protection Commissioner to push through – and will now be appealed. Noyb has an annotated map of the various ways EU countries make litigation hard. The post-Snowden political will that fueled GDPR’s passage has had ten years to fade.

It’s possible to find the state of privacy circa 2023 depressing. In the 30ish years I’ve been writing about privacy, numerous laws have been passed, privacy has become a widespread professional practice and area of study in numerous fields, and the number of activists has grown from a literal handful to tens of thousands around the world. But overall the big picture is one of escalating surveillance of all types and by all sorts of players. At the 2000 Computers, Freedom, and Privacy conference, Neal Stephenson warned not to focus on governments. Watch the “Little Brothers”, he said. Google was then a tiny self-funded startup, and Mark Zuckerberg was 16. Stephenson was prescient.

And yet, that surveillance can be weirdly patchy. In a panel on children online, Leanda Barrington-Leach noted platforms’ selective knowledge: “How do they know I like red Nike trainers but don’t know I’m 12?” A partial answer came later: France’s CNIL has looked at age verification technologies and concluded that none are “mature enough” to both do the job and protect privacy.

In a discussion of deceptive practices, paraphrasing his recent paper, Mark Leiser pinpointed a problem: “We’re stuck with a body of law that looks at online interface as a thing where you look for dark patterns, but there’s increasing evidence that they’re being embedded in the systems architecture underneath and I’d argue we’re not sufficiently prepared to regulate that.”

As a response, Woody Hartzog and Neil Richards have proposed the concept of “data loyalty”. Similar to a duty of care, the “loyalty” in this case is owed by the platform to its users. “Loyalty is the requirement to make the interests of the trusted party [the platform] subservient to those of the trustee or vulnerable one [the user],” Hartzog explained. And the more vulnerable you are the greater the obligation on the powerful party.

The tone was set early with a keynote from Julie Cohen that highlighted structural surveillance and warned against accepting the Big Tech mantra that more technology naturally brings improved human social welfare.

“What happens to surveillance power as it moves into the information infrastructure?” she asked. Among other things, she concluded, it disperses accountability, making it harder to challenge but easier to embed. And once embedded, well…look how much trouble people are having just digging Huawei equipment out of mobile networks.

Cohen’s comments resonate. A couple of years ago, when smart cities were the hot emerging technology, it became clear that many of the hyped ideas were only really relevant to large, dense urban areas. In smaller cities, there’s no scope for plotting more efficient delivery routes, for example, because there aren’t enough options. As a result, congestion is worse in a small suburban city than in Manhattan, where parallel routes draw off traffic. But even a small town has scope for surveillance, and so some of us concluded that this was the technology that would trickle down. This is exactly what’s happening now: the Fusus technology platform even boasts openly of bringing the surveillance city to the suburbs.

Laws will not be enough to counter structural surveillance. In a recent paper, Cohen wrote, “Strategies for bending the arc of surveillance toward the safe and just space for human wellbeing must include both legal and technical components.”

New approaches were also on display in an unusual panel on sustainability, prompted by the computational and environmental costs of today’s AI. The discussion suggested a new convergence: the intersection, as Katrin Fritsch put it, of digital rights, climate justice, infrastructure, and sustainability.

In the deception panel, Rosamunde van Brakel similarly said we need to adopt a broader conception of surveillance harm, one that includes social harms, risks to society and democracy, and the climate impact of using all these technologies. Surveillance, in other words, has environmental costs that everyone has ignored.

I find this convergence hopeful. The arc of surveillance won’t bend without the strength of allies.

Illustrations: CCTV camera at 22 Portobello Road, London, where George Orwell lived.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.

Appropriate privacy

At a workshop this week, one of the organizers posed a question that included the term “appropriate”. As in: “lawful access while maintaining appropriate user privacy”. We were there to think about approaches that could deliver better privacy and security over the next decade, with privacy defined as “the embedding of encryption or anonymization in software or devices”.

I had to ask: What work is “appropriate” doing in that sentence?

I had to ask because last weekend’s royal show was accompanied by preemptive arrests well before events began – at 7:30 AM. Most of those arrested were anti-monarchy protesters armed with luggage straps and placards, climate change protesters whose T-shirts said “Just Stop Oil”, and volunteers for the Night Stars, arrested on suspicion that the rape whistles they hand out to vulnerable women might be used to disrupt the parading horses. All of these groups had coordinated with the Metropolitan Police in advance or actually worked with them…which made no difference. All were held for many hours. Since then, the news has broken that an actual monarchist was arrested, DNA-sampled, fingerprinted, and held for 13 hours just for standing *near* some protesters.

It didn’t help the look of the thing that several days before the Big Show, the Met tweeted a warning: “Our tolerance for any disruption, whether through protest or otherwise, will be low.”

The arrests were facilitated by the last-minute passage of the Public Order Act, just days before, with the goal of curbing “disruptive” protests. Among the now-banned practices is “locking on” – that is, locking oneself to a physical structure – a tactic the suffragettes, among many others, used in campaigning for women’s right to vote. Because that right is now so thoroughly accepted, we tend to forget how radical and militant the suffragettes had to be to get their point across and how brutal the response was. A century from now, the mainstream may look back and marvel at the treatment meted out to climate change activists. We all know they’re *right*, whether or not we like their tactics.

Since the big event, the House of Lords has published its report on the current legislation. The government is seeking to expand the Public Order Act even further by lowering the bar for “serious disruption” from “significant” and “prolonged” to “more than minor”, and may include the cumulative impact of repeated protests in the same area. The House of Lords is unimpressed by these amendments, made via secondary legislation: first because of their nature, and second because they were rejected during the scrutiny of the original bill, which itself is only days old. Secondary legislation gets looked at less closely; the Lords suggest that using this route to bring back rejected provisions “raises possible constitutional issues”. All very polite for accusing the government of abusing the system.

In the background, we’re into the fourth decade of the same argument between governments and technical experts over encryption. Technical experts by and large take the view that opening a hole for law enforcement access to encrypted content fatally compromises security; law enforcement by and large longs for the old days when they could implement a wiretap with a single phone call to a major national telephone company. One of the technical experts present at the workshop phrased all this gently by explaining that providing access enlarges the attack surface, and the security of such a system will always be weaker because there are more “moving parts”. Adding complexity always makes security harder.

This is, of course, a live issue because of the Online Safety bill, a sprawling mess of 262 pages that includes a requirement to scan public and private messaging for child sexual abuse material, whether or not the communications are encrypted.

None of this is the fault of the workshop we began with, which is part of a genuine attempt to find a way forward on a contentious topic, and whose organizers didn’t have any of this in mind when they chose their words. But hearing “appropriate” in that way at that particular moment raised flags: you can justify anything if the level of disruption that’s allowed to trigger action is vague and you’re allowed to use “on suspicion of” indiscriminately as an excuse. “Police can do what they want to us now,” George Monbiot writes at the Guardian of the impact of the bill.

Lost in the upset about the arrests was the Met’s decision to scan the crowds with live facial recognition. It’s impossible to overstate the impact of this technology. There will be no more recurring debates about ID cards because our faces will do the job. Nothing has been said about how the Met used it on the day, whether its use led to arrests (or on what grounds), or what the Met plans to do with the collected data. The police – and many private actors – have certainly inhaled the Silicon Valley ethos of “ask forgiveness, not permission”.

In this direction of travel, many things we have taken for granted as rights become privileges that can be withdrawn at will, and what used to be public spaces open to all become restricted like an airport or a small grocery store in Whitley Bay. This is the sliding scale in which “appropriate user privacy” may be defined.

Illustrations: Protesters at the coronation (by Alisdair Hickson at Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.

The privacy price of food insecurity

One of the great unsolved questions continues to be: what is my data worth? Context is always needed: worth to whom, under what circumstances, for what purpose? Still, supermarkets may give us a clue.

At Novara Media, Jake Hurfurt, who runs investigations for Big Brother Watch, has been studying supermarket loyalty cards. He finds that increasingly only loyalty card holders have access to special offers, which used to be open to any passing customer.

Tesco now and Sainsburys soon, he says, “are turning the cost-of-living crisis into a cost-of-privacy crisis”.

Neat phrasing, but I’d say it differently: these retailers are taking advantage of the cost-of-living crisis to extort desperate people into giving up their data. The average value of the discounts might – for now – give a clue to the value supermarkets place on that data.

But not for long, since the pattern going forward is a predictable one of monopoly power: as the remaining supermarkets follow suit and smaller independent shops thin out under the weight of rising fuel bills and shrinking margins, and people have fewer choices, the savings from the loyalty card-only special offers will shrink. Not so much that they won’t be worth having, but it seems obvious they’ll be more generous with the discounts – if “generous” is the word – in the sign-up phase than they will once they’ve achieved customer lock-in.

The question few shoppers are in a position to answer while they’re trying to lower the cost of filling their shopping carts is what the companies do with the data they collect. BBW took the time to analyze Tesco’s and Sainsburys’ privacy policies, and found that besides identity data they collect detailed purchase histories as well as bank account and payment information…which they share with “retail partners, media partners, and service providers”. In Tesco’s case, these include Facebook, Google, and, for those who subscribe to them, Virgin Media and Sky. Hyper-targeted personal ads right there on your screen!

All that sounds creepy enough. But consider what could well come next. Also this week, a cross-party group of 50 MPs and peers, in a letter co-signed by BBW, Privacy International, and Liberty, wrote to Frasers Group deploring that company’s use of live facial recognition in its stores, which include Sports Direct and the department store chain House of Fraser. Frasers Group’s purpose, like that of the retailers and pub chains trialing the technology a decade ago, is effectively to keep out people suspected of shoplifting and bad behavior. Note that’s “suspected”, not “convicted”.

What happens as these different privacy invasions start to combine?

A store equipped with your personal shopping history and financial identity plus live facial recognition cameras knows the instant you walk in who you are, what you like to buy, and how valuable a customer you are. Such a system, equipped with some sort of scoring, could make very fine judgments. Such as: this customer is suspected of stealing another customer’s handbag, but they’re highly profitable to us, so we’ll let that go. Or: this customer isn’t suspected of anything much, but they look scruffy and although they browse they never buy anything – eject! Or even: this journalist wrote a story attacking our company. Show them the most expensive personalized prices. One US entertainment company is already using live facial recognition to bar entry to its venues to anyone who works for any law firm involved in litigation against it. Britain’s data protection laws should protect us against that sort of abuse, but will they survive the upcoming bonfire of retained EU law?

And, of course, what starts with relatively anodyne product advertising becomes a whole lot more sinister when it starts getting applied to politics, voter manipulation and segmentation, and “pre-crime” systems.

Add the possibility of technology that allows retailers to display personalized pricing in-store, just as an online retailer can do in the privacy of your own browser. Could we get to a scenario where a retailer, able to link your real-world identity and purchasing power to your online and offline movements, could perform a detailed calculation of what you’d be willing to pay for a particular item? What would surge pricing for the last remaining stock of the year’s hottest toy on Christmas Eve look like?

This idea allows me to imagine shopping partnerships, where the members compare prices and the partner with the cheapest prices buys that item for the whole group. In this dystopian future, I imagine such gambits would be banned.

Most of this won’t affect people rich enough to grandly refuse to sign up for loyalty cards, and none of it will affect people rich and eccentric enough to source everything from local, independent shops – and, if they’re allowed, pay cash.

Four years ago, Jaron Lanier toured with the proposal that we should be paid for contributing to commercial social media sites. The problem with this idea was and is that payment creates a perverse incentive for users to violate their own privacy even more than they do already, and that fair payment can’t be calculated when the consequences of disclosure are perforce unknown.

The supermarket situation is no different. People need food security and affordability. They should not have to pay for that with their privacy.

Illustrations: London supermarket checkout, 2006 (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.

Breaking badly

This week, the Online Safety Bill reached the House of Lords, which will consider 300 amendments. There are lots of problems with this bill, but the one that continues to have the most campaigning focus is the age-old threat to require access to end-to-end encrypted messaging services.

At his blog, security consultant Alec Muffett predicts the bill will fail in implementation if it passes. For one thing, he cites the argument made by Richard Allan, Baron Allan of Hallam, that the UK government wants the power to order decryption but will likely only ever use it as a threat to force the technology companies to provide other useful data. Meanwhile, the technology companies have pushed back with an open letter saying they will withdraw their encrypted products from the UK market rather than weaken them.

In addition, Muffett believes the legally required secrecy when a service provider is issued with a Technical Capability Notice to provide access to communications, which was devised for the legacy telecommunications world, is impossible in today’s world of computers and smartphones. Secrecy is no longer possible, given the many researchers and hackers who make it their job to study changes to apps, and who would surely notice and publicize new decryption capabilities. The government will be left with the choice of alienating the public or failing to deliver its stated objectives.

At Computer Weekly, Bill Goodwin points out that undermining encryption will affect anyone communicating with anyone in Britain, including the Ukrainian military communicating with the UK’s Ministry of Defence.

Meanwhile, this week Ed Caesar reports at The New Yorker on law enforcement’s successful efforts to penetrate communications networks protected by Encrochat and Sky ECC. It’s a reminder that there are other choices besides opening up an entire nation’s communications to attack.

***

This week also saw the disappointing damp-squib settlement of the lawsuit brought by Dominion Voting Systems against Fox News. Disappointing, because it leaves Fox and its hosts free to go on wreaking daily havoc across America by selling their audience rage-enhanced lies without even an apology. The payment that Fox has agreed to – $787 million – sounds like a lot, but a) the company can afford it given the size of its cash pile, and b) most of it will likely be covered by insurance.

If Fox’s major source of revenues were advertising, these defamation cases – still to come is a similar case brought by Smartmatic – might make their mark by alienating advertisers, as has been happening with Twitter. But it’s not; instead, Fox is supported by the fees cable companies pay to carry the channel. Even subscribers who never watch it are paying monthly for Fox News to go on fomenting discord and spreading disinformation. And Fox is seeking a raise to $3 per subscriber, which would mean more than $1.8 billion a year just from affiliate revenue.

All of that insulates the company from boycotts, alienated advertisers, and even the next tranche of lawsuits. The only feedback loop in play is ratings – and Fox News remains the most-watched basic cable network.

This system could not be more broken.

***

Meanwhile, an era is ending: Netflix will mail out its last rental DVD in September. As Chris Stokel-Walker writes at Wired, the result will be to shrink the range of content available by tens of thousands of titles because the streaming library is a fraction of the size of the rental library.

This reality seems backwards: surely streaming services ought to have the most complete libraries. But licensing and lockups mean that Netflix can only host for streaming what content owners decree it may, whereas with the mail rental service, once Netflix had paid the commercial rental rate to buy a DVD, it could stay in the catalogue until the disc wore out.

The upshot is yet another data point that makes pirate services more attractive: no ads, easy access to the widest range of content, and no licensing deals to get in the way.

***

In all the professions people have been suggesting are threatened by large language model-based text generation – journalism, in particular – no one to date has listed fraudulent spiritualist mediums. And yet…

The family of Michael Schumacher is preparing legal action against the German weekly Die Aktuelle for publishing an interview with the seven-time Formula 1 champion. Schumacher has been out of the public eye since suffering a brain injury while skiing in 2013. The “interview” is wholly fictitious, the quotes created by prompting an “AI” chat bot.

Given my history as a skeptic, my instinctive reaction was to flash on articles in which mediums produced supposed quotes from dead people, all of which tended to be anodyne representations bereft of personality. Dressing this up in the trappings of “AI” makes such fakery no less reprehensible.

An article in the Washington Post examines Google’s C4 data set, scraped from 15 million websites and used to train several of the highest-profile large language models. The Post has provided a search engine, which tells us that my own pelicancrossing.net, first set up in 1996, has contributed 160,000 words or phrases (“tokens”), or 0.0001% of the total. The obvious implication is that LLM-generated fake interviews with famous people can draw on things they’ve actually said in the past, mixing falsity and truth into a wasteland that will be difficult to parse.

Illustrations: The House of Lords in 2011 (via Wikimedia).

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Twitter.

Unclear and unpresent dangers

Monthly computer magazines used to fret that their news pages would be out of date by the time the new issue reached readers. This week in AI, a blog posting is out of date before you hit send.

This – Friday – morning, the Italian data protection authority, Il Garante, ordered ChatGPT to stop processing the data of Italian users until it complies with the General Data Protection Regulation. Il Garante’s objections, per Apple’s translation posted by Ian Brown: ChatGPT provides no legal basis for collecting and processing the massive store of personal data used to train the model, and it fails to filter out users under 13.

This may be the best possible answer to the complaint I’d been writing below.

On Wednesday, the Future of Life Institute published an open letter calling for a six-month pause on developing systems more powerful than Open AI’s current state of the art, GPT4. Apart from Elon Musk, Steve Wozniak, and Skype co-founder Jaan Tallinn, most of the signatories are unfamiliar names to most of us, though the companies and institutions they represent aren’t – Pinterest, the MIT Center for Artificial Intelligence, UC Santa Cruz, Ripple, ABN-Amro Bank. Almost immediately, there was a dispute over the validity of the signatures.

My first reaction was on the order of: huh? The signatories are largely people who are inventing this stuff. They don’t have to issue a call. They can just *stop*, work to constrain the negative impacts of the services they provide, and lead by example. Or isn’t that sufficiently performative?

A second reaction: what about all those AI ethics teams that Silicon Valley companies are disbanding? Just in the last few weeks, these teams have been axed or cut at Microsoft and Twitch; Twitter, of course, ditched such fripperies last November in Musk’s inaugural wave of cost-cutting. The letter does not call for those teams to be reinstated.

The problem, as familiar critics such as Emily Bender pointed out almost immediately, is that the threats the letter focuses on are distant not-even-thunder. As she went on to say in a Twitter thread, the artificial general intelligence of the Singularitarians’ rapture is nowhere in sight. By focusing on distant threats – longtermism – we ignore the real and present problems whose roots are being embedded ever more deeply into the infrastructure now being built: exploited workers, culturally appropriated data, lack of transparency around the models and algorithms used to build these systems…basically, all the ways they impinge upon human rights.

This isn’t the first time such a letter has been written and circulated. In 2015, Stephen Hawking, Musk, and about 150 others similarly warned of the dangers of the rise of “superintelligences”. Just a year later, in 2016, ProPublica investigated the algorithm behind COMPAS, a risk-scoring criminal justice system in use in US courts in several states. Under Julia Angwin’s scrutiny, the algorithm failed at both accuracy and fairness; it was heavily racially biased. *That*, not some distant fantasy, was the real threat to society.

“Threat” is the key issue here. This is, at heart, a letter about a security issue, and solutions to security issues are – or should be – responses to threat models. What is *this* threat model, and what level of resources to counter it does it justify?

Today, I’m far more worried by the release onto public roads of Teslas running Full Self Drive helmed by drivers with an inflated sense of the technology’s reliability than I am about all of human work being wiped away any time soon. This matters because, as Jessie Singer, author of There Are No Accidents, keeps reminding us, what we call “accidents” are the results of policy decisions. If we ignore the problems we are presently building in favor of fretting about a projected fantasy future, that, too, is a policy decision, and the collateral damage is not an accident. Can’t we do both? I imagine people saying. Yes. But only if we *do* both.

In a talk this week for a French international research group, the subject was the EU’s in-progress AI Act. This effort began well before today’s generative tools exploded into public consciousness, and isn’t likely to conclude before 2024. It is, therefore, much more focused on the kinds of risks attached to public sector scandals like COMPAS and those documented in Cathy O’Neil’s 2017 book Weapons of Math Destruction, which laid bare the problems with algorithmic scoring with little to tether it to reality.

With or without a moratorium, what will “AI” look like in 2024? It has changed out of recognition just since the last draft text was published. Prediction from this biological supremacist: it still won’t be sentient.

All this said, as Edwards noted, even if the letter’s proposal is self-serving, a moratorium on development is not necessarily a bad idea. It’s just that if the risk is long-term and existential, what will six months do? If the real risk is the hidden continued centralization of data and power, then those six months could be genuinely destructive. So far, it seems like its major function is as a distraction. Resist.

Illustrations: IBM’s Watson, which beat two of Jeopardy‘s greatest champions in 2011. It has since failed to transform health care.

Wendy M. Grossman is the 2013 winner of the Enigma Award. Her Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Follow on Mastodon or Twitter.