Comment by redox99

3 months ago

There are some misconceptions here.

It's incorrect to think that because it is trained on buggy human code, it will make these mistakes. It predicts the most likely token. Let's say 100 programmers write a function, most (unless it's something very tricky), won't forget to free that particular function. So the most likely tokens are those which do not leak.
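
To make the "most likely token" argument concrete, here is a toy sketch. The counts are invented (92 of 100 hypothetical programmers writing `free(buf);` next), and a real model works with learned probabilities rather than a literal tally, so treat this only as an illustration of greedy picking over a distribution.

```python
from collections import Counter

# Hypothetical tally: what 100 programmers wrote right after using `buf`.
# The numbers are invented for illustration, not measured from any corpus.
next_line_counts = Counter({
    "free(buf);": 92,    # releases the allocation
    "return 0;": 6,      # returns without freeing, so buf leaks
    "buf = NULL;": 2,    # drops the pointer without freeing, also a leak
})


def to_probabilities(counts):
    """Normalize raw counts into a probability distribution."""
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}


def greedy_pick(probabilities):
    """Return the single most likely continuation (greedy decoding)."""
    return max(probabilities, key=probabilities.get)


probs = to_probabilities(next_line_counts)
print(greedy_pick(probs))   # free(buf);
print(probs["free(buf);"])  # 0.92
```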

In addition, this is not GPT-3. There's a massive amount of reinforcement learning at play, which reinforces good code, particularly verifiably good (which includes no leaks). There is also a massive amount of synthetic data, which can be generated in a way that is provably correct.

Capricorn2481 3 months ago

> Let's say 100 programmers write a function, most (unless it's something very tricky), won't forget to free that particular function. So the most likely tokens are those which do not leak.

You don't free a function.

And this would only be true if the functions were the same content with minor variations, which is why LLMs are better suited to very small examples: bigger examples are less likely to be semantically similar, so there is less data to determine the "correct" next token.
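
A rough sketch of that sparsity point, using a made-up toy corpus and a hypothetical helper `count_matches`: the longer the context you condition on, the fewer examples match it exactly. Real models generalize beyond exact matches, so this only shows the direction of the effect, not its size.

```python
# Toy corpus of tokenized C-like snippets. Entirely made up for illustration.
corpus = [
    "buf = malloc ( n ) ; fill ( buf ) ; free ( buf ) ;".split(),
    "buf = malloc ( n ) ; send ( buf ) ; free ( buf ) ;".split(),
    "buf = malloc ( n ) ; fill ( buf ) ; log ( buf ) ; free ( buf ) ;".split(),
    "tmp = malloc ( n ) ; fill ( tmp ) ; return tmp ;".split(),
]


def count_matches(corpus, context):
    """Count places where `context` occurs and is followed by a next token."""
    k = len(context)
    hits = 0
    for tokens in corpus:
        # Stop early enough that a token always follows the matched context.
        for i in range(len(tokens) - k):
            if tokens[i:i + k] == context:
                hits += 1
    return hits


short_context = "malloc ( n ) ;".split()
long_context = "buf = malloc ( n ) ; fill ( buf ) ;".split()

print(count_matches(corpus, short_context))  # 4
print(count_matches(corpus, long_context))   # 2
```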

> There's a massive amount of reinforcement learning at play, which reinforces good code, particularly verifiably good (which includes no leaks)

This is a really dubious claim. Where are you getting this? Do you have some information on how these models are trained on C code specifically? How do you know whether the code they train on has no leaks?

There are huge projects that everyone depends on that have memory bugs in them right now, and those bugs are being missed by actual experts. What makes you think the people at OpenAI are creating safer data than the people whose livelihoods actually depend on it?

This thread is full of people sharing how easy it is to make memory bugs with an LLM, and that has been my experience as well.