
Comment by tuhlatte

5 days ago

Now I'm confused -- you're claiming you meant "good enough code" when your previous definition was such that even mathematical proofs could be "terrible"? That doesn't make sense to me. In software engineering, "good enough" has reasonably clear criteria: passes tests, performs adequately, follows conventions, etc. While these are imperfect proxies, they're sufficient for most real-world applications, and crucially -- measurable. And my claim is that they will be more than adequate to get LLMs to produce good code.

And again, diffusion models aren't relevant here. The original comment was about LLMs producing buggy code -- not RL's general limitations in other domains. Diffusion models' tensors aren't written by hand.

  > Now I'm confused ... that even mathematical proofs could be "terrible"? That doesn't make sense to me.

You know there are plenty of ways to prove things, right? There isn't a single proof. Here are a few proofs that pi is irrational[0]. The list is not comprehensive.

Think about that the way you would about code. They all arrive at the same final result. They're all correct. But is one better than another? Yes, yes it is. But which one that is depends on context.
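To make the code analogy concrete, here's a minimal sketch. The function names are made up for illustration; nothing here comes from a real codebase. Both versions pass the same checks, and which one is "better" depends on whether you care more about obviousness or speed.

```python
def sum_to_n_loop(n: int) -> int:
    """Add 1..n one term at a time: obvious to read, but O(n)."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_n_formula(n: int) -> int:
    """Closed form n(n+1)/2: O(1), but the reader has to know why it works."""
    return n * (n + 1) // 2


# The same measure (these asserts) accepts both implementations.
for f in (sum_to_n_loop, sum_to_n_formula):
    assert f(0) == 0
    assert f(10) == 55
```

The asserts can't tell the two apart, so the test alone says nothing about which one you'd rather maintain.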

  > and crucially -- measurable

This is probably a point of contention. Measuring is far more difficult than people think. A lot of work goes into creating measurements, and we get a nice ruler at the end. The problem isn't just that initial complexity; it's that every measure is a proxy. Even your meter stick doesn't measure exactly a meter. What distinguishes the engineer from the hobbyist is the knowledge of alignment.

  How well does my measure align with what I intend to measure?

That's a very hard problem. How often do you ask yourself that? I'm betting not enough. Frankly, most things aren't measurable.

[0] https://proofwiki.org/wiki/Pi_is_Irrational#:~:text=Hence%20...