
Comment by godelski

8 hours ago

  >  Anyone understand what's going on with the contradictory results between the text and tables?

Well, Figure 1 would also disagree. It shows an FPR of 47.5%.

From Sec. 3, end of the second-to-last paragraph:

  | The protocol is deterministic given fixed RNG seeds, caches model outputs by program hash, and *bounds false positives via the chosen percentile and gap parameters.*

I believe the high FPR is a deliberate choice of those parameters, though I find it suspect that the FPR is pushed this high to get the TP results.
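To make the trade-off concrete, here is a minimal sketch of how a percentile threshold caps the FPR. This is not the paper's protocol: the gap parameter is left out because I don't know how it is defined, the Gaussian scores are synthetic, and `percentile` and `rates` are hypothetical helper names.

```python
import random

def percentile(values, pct):
    """Nearest-rank percentile of values (pct in 0..100)."""
    ordered = sorted(values)
    # Smallest value with at least pct% of the data at or below it.
    rank = max(1, -(-len(ordered) * pct // 100))  # ceiling division
    return ordered[int(rank) - 1]

def rates(negatives, positives, pct):
    """Flag scores strictly above the pct-th percentile of the negatives."""
    threshold = percentile(negatives, pct)
    fpr = sum(s > threshold for s in negatives) / len(negatives)
    tpr = sum(s > threshold for s in positives) / len(positives)
    return fpr, tpr

# Assumption: synthetic scores, not data from the paper.
rng = random.Random(0)
negatives = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
positives = [rng.gauss(1.0, 1.0) for _ in range(10_000)]

for pct in (95, 52.5):
    fpr, tpr = rates(negatives, positives, pct)
    print(f"percentile={pct}: FPR={fpr:.3f}, TPR={tpr:.3f}")
```

With this construction the FPR on the reference scores can never exceed 1 - pct/100, so the bound is real, but it is only as tight as the percentile you pick. Lowering the percentile raises the TPR and the FPR together.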

Disclaimer: I only gave this a very cursory skim, so don't rely on me too much.