Goodhart’s Law (of AI)
If you’d like an essay-formatted version of this post to read or share, here’s a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:
https://pluralistic.net/2025/08/10/elite-disunity/#awoken-giants
One way to think about AI’s unwelcome intrusion into our lives can be summed up with Goodhart’s Law: “When a measure becomes a target, it ceases to be a good measure”:
https://en.wikipedia.org/wiki/Goodhart%27s_law
Goodhart’s Law is a harsh mistress. It’s incredibly exciting to discover a new way of measuring aspects of a complex system in a way that lets you understand (and thus control) it. In 1998, Sergey Brin and Larry Page realized that all the links created by everyone who’d ever made a webpage represented a kind of latent map of the value and authority of every website. We could infer that pages that had more links pointing to them were considered more noteworthy than pages that had fewer inbound links. Moreover, we could treat those heavily linked-to pages as authoritative and infer that when they linked to another page, it, too, was likely to be important.
This insight, called “PageRank,” was behind Google’s stunning entry into the search market, which was easily one of the most exciting technological developments of the decade, as the entire web just snapped into place as a useful system for retrieving information that had been created by a vast, uncoordinated army of web-writers, hosted in a distributed system without any central controls.
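To make the idea concrete, here is a minimal sketch of the link-scoring scheme described above: pages pass a share of their own score along each outbound link, so a link from a heavily linked-to page counts for more. This is a textbook-style power iteration, not Google's production code. The damping value of 0.85, the iteration count, and the five page names in `web` are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by inbound links, weighted by the score of the linker.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are assumed defaults.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly across its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# A made-up five-page web, purely for illustration.
web = {
    "hub": ["recipe", "news"],
    "news": ["recipe"],
    "blog": ["recipe", "hub"],
    "recipe": ["hub"],
    "obscure": ["hub"],
}

scores = pagerank(web)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy web, nobody links to "blog" or "obscure", so they end up with only the baseline score, while "news" does better than either because its single inbound link comes from a well-linked page.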
Then came the revenge of Goodhart’s Law. Before Google became the dominant mechanism for locating webpages, the only reason for anyone to link to a given page or site was because there was something there they thought you should see. Google aggregated all those “I think you should see this” signals and turned them into a map of the web’s relevance and authority.
But making a link to a webpage is easy. Once there was another reason to make a link between two webpages – to garner traffic, which could be converted into money and/or influence – then bad actors made a lot of spurious links between websites. They created linkfarms, they spammed blog comments, they hacked websites for the sole purpose of adding a bunch of human-invisible, Google-scraper-readable links to pages.
The metric (“how many links are there to this page?”) became a target (“make links to this page”) and ceased to be a useful metric.
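Here is a rough sketch of that failure, using the crudest possible version of the metric: a raw count of inbound links. Real search ranking is far more elaborate than this, and every site name below is hypothetical, as is the 50-page link farm. The point is only to show how cheap it is to move a number once the number is worth money.

```python
from collections import Counter


def inbound_counts(links):
    """Count inbound links per page; `links` is a list of (source, target) pairs."""
    return Counter(target for _source, target in links)


# Hypothetical links made for the old reason: "I think you should see this."
honest_links = [
    ("cook_blog", "rigorous_reviews"),
    ("forum", "rigorous_reviews"),
    ("newspaper", "rigorous_reviews"),
    ("forum", "spam_shop"),
]

# Hypothetical link farm: 50 throwaway pages that exist only to link to one site.
farm_links = [(f"farm_page_{i}", "spam_shop") for i in range(50)]

before = inbound_counts(honest_links)
after = inbound_counts(honest_links + farm_links)

top_before = before.most_common(1)[0][0]
top_after = after.most_common(1)[0][0]

print("top site before the link farm:", top_before)
print("top site after the link farm:", top_after)
```

Nothing about either site changed between the two measurements. The only thing that changed was someone's incentive to manufacture links.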
Goodhart’s Law is still a plague on Google search quality. “Reputation abuse” is a webcrime committed by venerable sites like Forbes, Fortune and Better Homes and Gardens, who abuse the authority imparted by tons of inbound links accumulated over decades by creating spammy, fake product-review sites stuffed with affiliate links, that Google ranks more highly than real, rigorous review sites because of all that accumulated googlejuice:
https://pluralistic.net/2024/05/03/keyword-swarming/#site-reputation-abuse
Goodhart’s Law is 50 years old, but policymakers are woefully ignorant of it and continue to operate as though it doesn’t apply to them. This is especially pronounced when policymakers are determined to Do Something about a public service that has been starved of funding and kicked around as a political football to the point where it has degraded and started to outrage the public. When this happens, policymakers are apt to blame public servants – rather than themselves – for this degradation, and then set out to Bring Accountability to those public employees.
The NHS did this with ambulance response times, which are very bad, and that fact is, in turn, very bad. The reason ambulance response times suck isn’t hard to winkle out: there’s not enough money being spent on ambulances, drivers, and medics. But that’s not a politically popular conclusion, especially in the UK, which has been under brutal and worsening austerity since the Blair years (don’t worry, eventually they’ll do enough austerity and things will really turn around, because, as the old saying goes, “Good policymaking consists of doing the same thing over and over and expecting a different outcome”).
Instead of blaming inadequate funding for poor ambulance response times, politicians blamed “inefficiency,” driven by poor motivation. So they established a metric: ambulances must arrive within a certain number of minutes (and they set a consequence: massive cuts to any ambulance service that didn’t meet the metric).
Now, “an ambulance where it’s needed within a set amount of time” may sound like a straightforward metric, and it was – retrospectively. As in, we could tell that the ambulance service was in trouble because ambulances were taking half an hour or more to arrive. But prospectively, after that metric became a target, it immediately ceased to be a good metric. That’s because ambulance services, faced with the impossible task of improving response times without spending money, started to dispatch ambulance motorbikes that couldn’t carry 95% of the stuff needed to respond to a medical emergency, and had no way to get patients back to hospitals. These motorbikes were able to meet the response-time targets…without improving the survival rates of people who summoned ambulances:
https://timharford.com/2014/07/underperforming-on-performance/
Goodhart’s Law seems like a guideline for another kind of enshittification. Just establish a metric that enforceably measures “goodness”, and watch the system wreck itself trying to conform to that metric.
The main difference between the weather being uncomfortably cold, and the weather being uncomfortably hot, is that the things you can do in the cold to warm yourself up (hot food/beverage, blankets, cuddles, nice clothes like sweaters, thick scarves and snazzy jackets, getting exercise) are very pleasant and very effective, and the things you can do in the heat to cool yourself down don’t do shit.
Religious extremists don’t find god. They find racism and misogyny.
Important things about the nature of reality:
Lemmings in real life do not randomly bolt to their deaths like lemmings do in cartoons. Small children randomly bolt to their deaths like lemmings in cartoons.
Piranhas in real life do not flock to pick anything to the bone in a span of seconds like piranhas do in cartoons. Domestic chickens pick anything to the bone in a span of seconds like piranhas do in cartoons.
Solomon Island Prehensile-tailed Skink (Corucia zebrata), family Scincidae, endemic to the Solomon Islands archipelago
- Arboreal, herbivorous, crepuscular.
- The largest known species of skink, they can grow to a total length of 32 inches (81 cm).
- Live-bearing, they provide parental care for the young, after birth, as well. Females are known to be fiercely protective of the young.
- They are actually social, and live in extended family groups.
- This has been scientifically proven to be one of the best lizards, and I love them.
photograph by Tara Biron
(via bogleech)
Jacob Epstein, Torso in Metal from ‘The Rock Drill’, 1913-15, London, Tate Britain. Photo from July 2025.
(via mostlysignssomeportents)