Comment by steve_avery
22 days ago
I'd be interested, but they don't list a single Anthropic model on their code review benchmark, so I feel like they haven't really tested it against SOTA models.
Whenever I see this, I make the (almost always correct) assumption that the SOTA models had an advantage. The alternative explanation is a complete lack of awareness of the state of AI, which is very rare for a tool like this.
With SOTA models missing, it is also a strong indicator that someone like you is not the target audience.