Comment by MR_Bulldops
2 days ago
Do you have an example that doesn't involve an objective metric? Of course objective metrics won't turn bad. They're more measurements than metrics, really.
I'd like to push back on this a little, because I think it's important for understanding why Goodhart's Law shows up so frequently.
*There are no /objective/ metrics*, only proxies.
You can't measure a meter directly; you have to use a proxy like a tape measure. Similarly, you can't measure time directly; you have to use a stopwatch. In a normal conversation I wouldn't nitpick like this, because those proxies are so well aligned with our intended measures that the lack of precision is generally inconsequential. But once you start measuring anything with precision, you cannot ignore the fact that you're limited to proxies.
The situation when our goals get more abstract is not too dissimilar; our measuring tools are just far less precise. So we have to take great care to understand the meaning of our metrics and their limits, just as we would if we were doing high-precision measurements of something more "mundane" like distance.
I think this is something most people don't have to contend with because, frankly, very few people do high-precision work. And unfortunately we often use algorithms as black boxes. But the more complex a subject is, the more important an expert is. It may look like they're just throwing data into a black box and reading off the answer, but that's a naive interpretation.
This isn't what Goodhart's law is about.
Sure, a ruler you get from the store might be off by a fraction of a percent in a way that usually doesn't matter and occasionally does, but even if you could measure distance exactly, that wouldn't get you out of it.
Because what Goodhart's law is really about is bureaucratic cleavage. People care about lots of diverging and overlapping things, but bureaucratic rules don't. As soon as you make something a target, you've created the incentive to make that number go up at the expense of all the other things you're not targeting but still care about.
You can take something which is clearly what you actually want. Suppose you're commissioning a spaceship to take you to Alpha Centauri; it's important that it go fast, because otherwise the trip will take too long. We don't even need to get into exactly how fast it needs to go, or how to measure a meter, or anything like that; we can just say that going fast is a target. And it's a valid target: the ship actually needs to do that.
Which already leaves you in trouble. If your organization solicits bids for the spaceship and that's the only target, you'd better not accept one before noticing that you also need things like "has the ability to carry occupants" and "doesn't kill the occupants" and "doesn't cost 999 trillion dollars", or else those are all on the chopping block in the interest of going fast.
So you add those things as targets too and then people come up with new and fascinating ways to meet them by sacrificing other things you wanted but didn't require.
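To make that concrete, here's a toy simulation of the bidding story (pure Python, every number invented for illustration): each bid splits a fixed effort budget across speed, safety, and affordability; the announced target scores only speed; and what you actually want is limited by the weakest attribute, since a fast ship that kills its occupants is worthless.

    import random

    random.seed(0)

    # Toy sketch: each bid splits a fixed effort budget of 10 points
    # across three attributes of the spaceship.
    def random_bid():
        cuts = sorted(random.uniform(0, 10) for _ in range(2))
        return {"speed": cuts[0],
                "safety": cuts[1] - cuts[0],
                "affordability": 10 - cuts[1]}

    def proxy(bid):
        # The announced target: "go fast".
        return bid["speed"]

    def true_value(bid):
        # What you actually want: no attribute can be sacrificed,
        # so value is capped by the weakest one.
        return min(bid.values())

    bids = [random_bid() for _ in range(10_000)]

    best_by_proxy = max(bids, key=proxy)       # what the rule selects
    best_by_value = max(bids, key=true_value)  # what you wanted

    print("selected on the target:", best_by_proxy,
          "-> true value", round(true_value(best_by_proxy), 2))
    print("selected on true value:", best_by_value,
          "-> true value", round(true_value(best_by_value), 2))

The bid that wins on the target puts nearly all 10 points into speed and nearly nothing into safety or affordability, so its true value is close to zero; the balanced bid the rule never selects is worth far more.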
What's really happening here is that if you set targets and then require someone else to meet them, they will meet the targets in ways that you will not like. It's the principal-agent problem. The only real way out of it is for principals to be their own agents, which is exactly the thing a bureaucracy isn't.
I agree with you, in a way.
I've just taken another step to try to understand the philosophy of those bureaucrats. Clearly they have some logic, right? So we have to understand why they think they can organize and regulate from a spreadsheet. Ultimately it comes down to a belief that the measurements (or numbers) are "good enough" and that they have a good understanding of how to interpret them. With many bureaucracies, that becomes the belief that no interpretation is needed at all. But we also see the same behavior with armchair experts who use data as evidence for a predetermined conclusion rather than interpreting the data and concluding from that interpretation.
Goodhart focused on the incentive structure of the rule, but that alone doesn't tell us how this all happens or why the rule is so persistent. I think you're absolutely right that there is a problem with agents, and it's no surprise that when people introduce the concept of "reward hacking" they reference Goodhart's Law. Yes, humans can typically see past the metric and infer the intended outcome, but some ignore this because they don't care, and fixate on the measurement because that is what delivers the reward. Bureaucracies no doubt amplify this behavior, as they are well known to be soul-crushing.
But we should also ask ourselves whether the same effect can apply in settings where we have the best of intentions and all the agents are acting in good faith, trying to interpret the measure instead of just gaming it. The answer is yes. Idk, call it Godelski's Corollary if you want (I wouldn't), but this relates to Goodhart's Law at a fundamental level. You can still have metric hacking even when agents aren't aware of it or intending it. Bureaucracy is not required.
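Here's a minimal sketch of that good-faith failure mode (all numbers hypothetical): every candidate has identical true skill, every evaluation is honest and unbiased, and the selection procedure simply keeps the best measured score. Nobody games anything, yet the selected score systematically overstates reality, because maximizing over a noisy proxy inflates the proxy all by itself.

    import random

    random.seed(1)

    # Toy sketch: all candidates are genuinely identical in skill;
    # each evaluation is skill plus honest, unbiased noise.
    TRUE_SKILL = 0.70
    NOISE = 0.05

    def evaluate():
        return TRUE_SKILL + random.gauss(0, NOISE)

    # Good-faith procedure: measure 100 candidates, keep the best score.
    scores = [evaluate() for _ in range(100)]
    winner = max(scores)

    print("mean measured score   :", round(sum(scores) / len(scores), 3))
    print("score of the 'winner' :", round(winner, 3))
    print("true skill of winner  :", TRUE_SKILL)

The mean score stays near 0.70, but the winning score overshoots it by a couple of standard deviations even though no candidate is actually better than any other. The act of selecting on the measurement did the hacking on its own; no bad faith, and no bureaucracy, required.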