Jimmc414 1 day ago Goodhart’s Law in reverse: what can’t be gamed gets rejected.
stephen_cagle 19 hours ago You've almost buffer overrun Goodhart's Law into the https://en.wikipedia.org/wiki/McNamara_fallacy . :]
cbg0 20 hours ago SWE-bench Verified was created in collaboration with OpenAI. It's also an open dataset, so it's prone to contamination, meaning it can be gamed.