Comment by varispeed

1 day ago

Are the models used for apps like Codex designed to mimic human behaviour? As in, do they deliberately create errors in code that you then have to spend time debugging and fixing, or is it a natural flaw, and the fact that humans also do it just a coincidence?

This keeps bothering me: why do they need several iterations to arrive at a correct solution instead of getting it right the first time? Prompts like "repeat solving it until it is correct" don't help.

> as in they deliberately create errors in code that then you have to spend time debugging and fixing

No, all the models are designed to be "helpful", but different companies interpret that in different ways.

If you're seeing the model deliberately create errors so you have something to fix, then that sounds like something is fundamentally wrong with your prompt.

Besides that, I'm guessing "repeat solving it until it is correct" is a condensed version of your actual prompt, or is that verbatim what you send the model? If so, you need to give it more detail for it to actually execute something like that.
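[Editor's note: to make "repeat until correct" actionable, the retry loop usually has to live outside the model: run the code, feed the concrete failure back, and ask again. A minimal sketch of that pattern; `ask_model` is a hypothetical stand-in for whatever LLM API you actually call, here stubbed with a fixed snippet so the loop is runnable:]

```python
import subprocess
import sys
import tempfile

def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call.
    # Returns a fixed snippet so this sketch runs end to end.
    return "def add(a, b):\n    return a + b\n"

def passes_tests(code: str, test_code: str):
    """Write code plus tests to a temp file and run it; pass iff exit code is 0."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + test_code + "\n")
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def solve(task: str, test_code: str, max_iters: int = 5):
    prompt = task
    for _ in range(max_iters):
        code = ask_model(prompt)
        ok, errors = passes_tests(code, test_code)
        if ok:
            return code
        # Feed the concrete failure back instead of just saying "try again".
        prompt = f"{task}\n\nYour last attempt failed with:\n{errors}\nFix it."
    return None  # gave up after max_iters attempts

code = solve("Write add(a, b).", "assert add(2, 3) == 5")
print(code)
```

The point of the sketch: "until it is correct" only works when "correct" is something the loop can check mechanically (tests, a compiler, a linter), which is the extra detail the bare prompt is missing.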

  • > then that sounds like something is fundamentally wrong in your prompt.

So I'm holding it wrong?

Some things take a bit of skill to use, yes. Not everyone can pick up a guitar and play music; you need to practise a bit before it sounds OK.

  • > If you're seeing the model deliberately creating errors so you have something to fix, then that sounds like something is fundamentally wrong in your prompt.

No, all these models are just bad at anything they weren't RLed for, and decent at the things they were. Only decent, because the people who evaluate them aren't experts.

    • > No, all these models are just bad for anything that they weren't RLed for, and decent for things they were

Are you claiming that the models are RLed to intentionally add errors to our programs, or what argument are you trying to make here? Otherwise I don't see how it's relevant to what I said.
