
Comment by csallen · 4 days ago

This applies to AI, too, albeit in different ways:

1. You can iteratively improve the rules and prompts you give to the AI when coding. I do this a lot. My process is constantly improving, and the AI makes fewer mistakes as a result.

2. AI models get smarter. Just in the past few months, the LLMs I use for coding have started making significantly fewer mistakes.

1. But my gripe with your first point is that by the time I've written an exact, detailed, step-by-step prompt, I could have written the code by hand. Like, there is a reason we don't use fuzzy human language in math/coding: it is ambiguous. I always feel like I'm in one of those funny videos where you have to write exact instructions for making a peanut butter sandwich and they get deliberately misinterpreted. Except it is not fun at all when you are the one writing the instructions.

2. It's very questionable that they will get any smarter; we have hit the plateau of diminishing returns. They will get more optimized, and we can run them more times with more context (e.g. chain of thought), but they fundamentally won't get better at reasoning.

  • > by the time I write an exact detailed step-by-step prompt for them, I could have written the code by hand

    The improved prompt or project documentation guides every future line of code written, whether by a human or an AI. It pays dividends for any long-term project.

    > Like there is a reason we are not using fuzzy human language in math/coding

    Math proofs are mostly in English.

That you don't know when it will make a mistake, and that the mistakes are getting harder to find, are not exactly encouraging signs to me.

  • Do you mean something by "getting harder to find them" that is different from "they are making fewer dumb errors"?

    • There are definitely dumb errors that are hard for human reviewers to find because nobody expects them.

      One concrete example is confusing value and pointer types in C. I've seen people try to cast a `uuid` variable into a `char` buffer, for example to memset it, by doing `(const char *)&uuid`. It turned out, however, that `uuid` was not a value type but a pointer, so this ended up blasting the stack: instead of taking the address of the uuid storage, it takes the address of the pointer to the storage. If you're hundreds of lines deep and looking for more complex functional issues, it's very easy to overlook.
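
      To make that failure mode concrete, here's a minimal sketch (the `my_uuid_t` type and `zero_uuid` helper are hypothetical, just to illustrate the shape of the bug):

      ```c
      #include <string.h>

      /* Hypothetical 16-byte uuid type, purely for illustration. */
      typedef struct { unsigned char bytes[16]; } my_uuid_t;

      void zero_uuid(my_uuid_t *uuid) {
          /* BUG: `uuid` is already a pointer, so `&uuid` is the address of the
             pointer itself (typically 8 bytes on the stack), not the 16-byte
             storage it points to. This write runs past the pointer and
             corrupts the stack. */
          memset((char *)&uuid, 0, sizeof(my_uuid_t));

          /* Intended: the pointer already addresses the storage. */
          /* memset(uuid, 0, sizeof(*uuid)); */
      }
      ```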

And you can build automatic checks that reinforce correct behavior for the cases where the lessons haven't been learned, whether by bot or by human.
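
For the uuid example above, such a check could be as simple as a unit test that asserts the storage actually gets zeroed. A minimal sketch, reusing the hypothetical `my_uuid_t` and `zero_uuid` names from that example (the buggy cast-the-pointer version would fail or crash this test):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical uuid type and helper, matching the sketch above. */
typedef struct { unsigned char bytes[16]; } my_uuid_t;
void zero_uuid(my_uuid_t *uuid) { memset(uuid, 0, sizeof(*uuid)); }

int main(void) {
    my_uuid_t u;
    memset(u.bytes, 0xAB, sizeof(u.bytes));  /* fill with a known pattern */

    zero_uuid(&u);

    /* If zero_uuid had memset the address of its pointer argument instead,
       the 0xAB pattern would survive (or the run would crash), and this
       check would flag the mistake automatically. */
    for (size_t i = 0; i < sizeof(u.bytes); i++)
        assert(u.bytes[i] == 0);
    return 0;
}
```

Run in CI, a check like this catches the mistake regardless of whether a human or a bot wrote the code.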