Comment by gruez

16 hours ago

That's not really a fair test because you're leading the model pretty hard, even if the prompt doesn't specifically say there's a bug to be found. It's basically the same objection that people raised in the thread where someone claimed current models are just as good as mythos.

4 comments


naruhodo 4 hours ago

I don't agree, and I'd like to understand your point of view.

To me, asking if a function has something wrong with it is just a very basic code review - something that should happen with every function. A competent, security-conscious engineer would respond the same way as the model, unsurprisingly, since the model is... modelling competence.

saagarjha 1 hour ago

Code review that finds problems in all code is useless.

shay_ker 16 hours ago

right exactly, but clearly it's possible to elicit the behavior we want in the model, which means the capabilities are there!

Matumio 14 hours ago

The more interesting question is, how many issues will this prompt report to you in random code that is perfectly fine?
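
One way to put a number on that question is to run the same prompt over a mix of known-buggy and known-clean functions and count how often it flags each. The sketch below is only an illustration: `review_metrics` is a hypothetical helper, and the labels are made up for the example, not measurements of any model.

```python
def review_metrics(results):
    """Summarize review outcomes.

    results: list of (has_real_bug, model_flagged) boolean pairs,
    one pair per function that was shown to the reviewer.
    """
    tp = sum(1 for bug, flagged in results if bug and flagged)
    fp = sum(1 for bug, flagged in results if not bug and flagged)
    fn = sum(1 for bug, flagged in results if bug and not flagged)
    tn = sum(1 for bug, flagged in results if not bug and not flagged)
    return {
        # Share of real bugs that were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of flags that pointed at a real bug.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of clean functions that were flagged anyway.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Made-up labels: 2 buggy and 8 clean functions.
# A reviewer that flags everything finds every bug but also flags all clean code.
always_flags = [(True, True)] * 2 + [(False, True)] * 8
# A more selective reviewer misses one bug but flags only one clean function.
selective = [(True, True), (True, False), (False, True)] + [(False, False)] * 7

print(review_metrics(always_flags))
print(review_metrics(selective))
```

In this toy data the flag-everything reviewer has perfect recall and a false positive rate of 1.0, which is the "finds problems in all code" case: the flags carry no information about which functions to look at.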