
Comment by conradev

5 days ago

I’m learning as I go how to use these things better, and I haven’t seen practical guides that cover things like:

- Split things into small files; today’s model harnesses struggle with massive files.

- Write lots of tests. When the language model messes up the code (it will), it can use the tests to climb out. Tests are the best way to communicate behavior (see the sketch after this list).

- Write guides and documentation for complex tasks in complex codebases. Use a language model for the first pass if you’re too lazy. Useful for both humans and LLMs.
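
A minimal sketch of what “tests communicate behavior” means in practice; the `slugify` function and the test names are hypothetical, but the point is that a failing test tells an agent exactly what contract it broke:

```python
# Hypothetical example: the intended behavior lives in the tests.
# If an agent breaks slugify() while editing, the assertion failure
# tells it precisely what to restore.

def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated URL slug."""
    return "-".join(title.lower().split())

def test_slugify_collapses_whitespace():
    assert slugify("Hello   World") == "hello-world"

def test_slugify_lowercases():
    assert slugify("LLM Agents") == "llm-agents"
```

Run with `pytest`; each assertion doubles as a one-line spec of the behavior.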

It’s really: make your codebase welcoming for junior engineers

> it can use the tests to climb out

Or not. I watched Copilot's agent mode spend most of an hour stuck in a loop trying to make a test pass (to be fair, I let it keep going to see how it handles this failure case).

  • Yeah! When that happens I usually stop it and tap in a bigger model to “think” its way out of the loop (or fix it myself).

    I’m impressed with this latest generation of models: they reward hack a lot less. Previously they’d edit a failing unit test to make it pass; now they just look for reasonable but easy ways out in the code itself.

    I call it reward hacking; “laziness” is not the right word. But “knowing what needs to be done and not doing it” is the general issue here, and I see it in junior engineers occasionally, too.