Comment by swalsh

1 day ago

I built something similar for my own workflow. It works okay. The hard part is that as you scale, you end up with compounded false affirmatives: the model adds some fallback mechanism that makes things appear to work, the tests pass, etc. The nice part is that you can ask models to review code from others and call out fallbacks, hard-coding, stuff like that. They do a good job of identifying buried bodies. But if you dig up a buried body, I'd manually confirm it was properly disposed of, because the models usually hid the body in the first place when they needed some input they didn't have, got confused, or ran into an issue.
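
To make the "buried body" idea concrete, here's a minimal sketch of the kind of fallback I mean. Everything in it is made up for illustration (the `load_rate_silent` and `load_rate_strict` names, the `rate` key, the 0.05 default), not taken from any real project.

```python
import json

DEFAULT_RATE = 0.05  # hypothetical hard-coded value, for illustration only


def load_rate_silent(raw: str) -> float:
    """The buried body: any failure is swallowed and a default comes back."""
    try:
        return float(json.loads(raw)["rate"])
    except Exception:
        # Missing or malformed input is hidden here, so callers still get a number.
        return DEFAULT_RATE


def load_rate_strict(raw: str) -> float:
    """Fails loudly so a missing input surfaces instead of being masked."""
    try:
        return float(json.loads(raw)["rate"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"could not read 'rate' from input: {raw!r}") from exc
```

A test that only checks the result is a float between 0 and 1 passes with either version, which is how the silent one survives until something reviews for it. And swapping in the strict version is the easy part. You still have to figure out why the input was missing in the first place.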