Comment by BoiledCabbage

1 day ago

> Tests that a "do nothing" AI can pass aren't intrinsically invalid but they should certainly be only a very small number of the tests. I'd go with low-single-digit percentage, not 38%. But I would say it should be above zero; we do want to test for the AI being excessively biased in the direction of "doing something", which is a valid failure state.

There is a simple improvement here: give the agent a "do nothing" button. That way it at least needs to understand the task well enough to know it should press the do nothing button.

An agent that presses it by default on every task still shouldn't score 38%, but that's at least better than a NOP agent scoring 38%.
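To make the difference concrete, here's a rough sketch of how a grader could tell the two agents apart. Everything in it is made up for illustration: the `NO_ACTION` token, the four toy tasks, and the `score` function are assumptions, not how any real benchmark works.

```python
from typing import Callable, Optional

# Hypothetical explicit "do nothing" button the agent can press.
NO_ACTION = "no_action"

# Toy task list (invented for illustration): (prompt, expected action).
TASKS = [
    ("fix the failing test", "edit"),
    ("nothing is broken, leave it alone", NO_ACTION),
    ("rename the variable", "edit"),
    ("already done, no changes needed", NO_ACTION),
]


def score(agent: Callable[[str], Optional[str]]) -> float:
    """Fraction of tasks where the agent's explicit answer matches the expected one."""
    correct = 0
    for prompt, expected in TASKS:
        answer = agent(prompt)
        # No response at all (None) never counts, even when the right
        # outcome is "do nothing": the agent has to press the button.
        if answer is not None and answer == expected:
            correct += 1
    return correct / len(TASKS)


def nop_agent(prompt: str) -> Optional[str]:
    """Does literally nothing and returns no answer."""
    return None


def always_button_agent(prompt: str) -> Optional[str]:
    """Presses the "do nothing" button on every task without reading it."""
    return NO_ACTION


print(score(nop_agent))            # 0.0
print(score(always_button_agent))  # 0.5
```

With the button, the NOP agent drops to zero, while the always-press agent still collects every task where doing nothing is correct. That's why the share of such tasks still needs to stay small.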