
Comment by bluefirebrand

7 days ago

Can we all agree that "mentoring" LLMs is actually a waste of time, please?

The reason we invest this time in junior devs is so that they improve. LLMs do not

I had a fascinating conversation about this the other day. An engineer was telling me about his LLM process, which is effectively this:

1. Collaborate on a detailed spec

2. Have it implement that spec

3. Spend a lot of time on review and QA - is the code good? Does the feature work well?

4. Take lessons from that process and write them down for the LLM to use next time - using CLAUDE.md or similar

That last step is the interesting one. You're right: humans improve, LLMs don't... but that means it's on us as their users to manage the improvement cycle by using every feature iteration as an opportunity to improve how they work.

I've heard similar things from a few people now: by constantly iterating on their CLAUDE.md - adding extra instructions every time the bot makes a mistake, telling it to do things like always write the tests first, run the linter, reuse the BaseView class when building a new application view, etc - they get wildly better results over time.
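
For illustration, the kind of CLAUDE.md that accumulates from that last step might look something like this (the rules here are hypothetical examples, not anyone's actual file):

```markdown
# CLAUDE.md: project conventions accumulated from past sessions

## Workflow
- Always write the tests first, then the implementation; run the full test suite before calling a task done.
- Run the linter and fix any warnings before presenting a diff.

## Code conventions
- New application views should reuse the existing BaseView class rather than duplicating its setup logic.

## Gotchas
- Add an entry here each time the agent repeats a mistake, so the next session starts with that lesson.
```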

  • I don't buy your last sentence at all.

    AGENTS.md is just a place to put stuff you don't want to tell LLMs over and over again. They're not magical instructions LLMs follow 100% of the time, and they don't carry any additional importance over what you put into the prompt manually. Your carefully curated AGENTS.md is only really useful at the very beginning of the conversation, but the longer the conversation gets, the less important those tokens at the top are. Somewhere around 100k tokens AGENTS.md might as well not exist; I constantly have to "remind it" of the very first paragraph there.

    Go start a conversation and contradict what's written in AGENTS.md halfway through the problem. Which of the two contradicting statements will take precedence? The latter one! Therefore, all the time you've spent curating your AGENTS.md is time you've wasted thinking you're "teaching" LLMs anything.

    • Whether the tokens are created manually or programmatically isn't really relevant here. The order and amount of tokens is, in combination with the ingestion -> output logic which the LLM API / inference engine operates on. Many current models definitely have the tendency to start veering off after 100k tokens, which makes context pruning important as well.

      What if you just automatically append the .md file at the end of the context, instead of prepending at the start, and add a note that the instructions in the .md file should always be prioritized?
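
      As a rough sketch of what I mean (hypothetical harness code, not how any particular tool actually does it):

      ```python
      def build_messages(instructions_md: str, history: list[dict]) -> list[dict]:
          """Repeat the instructions file at the end of the context instead of
          relying only on the copy at the top, so the rules are also the most
          recent tokens the model sees."""
          reminder = {
              "role": "user",
              "content": "Reminder: the project instructions below always take priority.\n\n"
                         + instructions_md,
          }
          # Conventional layout: instructions first, then the conversation so far,
          # then the appended reminder.
          return [{"role": "system", "content": instructions_md}] + history + [reminder]
      ```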

    • > Your carefully curated AGENTS.md is only really useful at the very beginning of the conversation, but the longer the conversation gets, the less important those tokens on the top are.

      If that's genuinely causing you problems, you can restart your session frequently to avoid context rot.


  • Totally agree on this. It has delivered substantial value for me in my projects. The models are always going to give back results optimized to use minimal computing resources in the provider's infrastructure. To overcome this, I've seen some people use or suggest running the AI in self-correction loops, the upside being the least human intervention.

    However, I have personally gotten very good results by using the AI with continuous interaction, and by allowing implementation only after a good amount of time deliberating on design/architecture. I almost always append 'do not implement before we discuss and finalize the design' or 'clarify your assumptions, doubts or queries before implementation'. (A rough sketch of scripting that habit is at the end of this comment.)

    When I asked Gemini to give a name for such an interaction, it suggested 'Dialog Driven Development' and contrasted it against 'vibe coding'. Transcript summary and AI disclaimer written by Gemini below:

    https://gingerhome.github.io/gingee-docs/docs/ai-disclaimer.... https://gingerhome.github.io/gingee-docs/docs/ai-transcript/...
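
    If you wanted to bake that habit into a script instead of typing it every time, it could be as simple as this (hypothetical helper, not part of any tool):

    ```python
    DESIGN_FIRST_GUARDRAIL = (
        "Do not implement anything yet. Clarify your assumptions, doubts or queries, "
        "and wait until we discuss and finalize the design."
    )

    def design_first_prompt(request: str) -> str:
        """Append the design-first guardrail to every feature request."""
        return f"{request}\n\n{DESIGN_FIRST_GUARDRAIL}"

    # Example:
    # prompt = design_first_prompt("Add CSV export to the reports page")
    ```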

  • I’m finding that whether this process works well is a measure (and a function) of how well-factored and disciplined a codebase is in the first place. Funnily enough, LLMs do seem to have a better time extending systems that are well-engineered for extensibility.

    That’s the part which gives me optimism, and even more enjoyment of the craft — that quality pays back so immediately, makes it that much easier to justify the extra effort, and having these tools at our disposal reduces the ‘activation energy’ for necessary re-work that may before have just seemed too monumental.

    If a codebase is in good shape for people to produce high-quality work, then so can the machines. Clear, up-to-date, close-to-the-code, low-redundancy documentation; self-documenting code and tests that prioritize expression of intent over cleverness; consistent patterns of abstraction that don’t necessitate jarring context switches from one area to the next; etc.

    All this stuff is so much easier to lay down with an agent loaded up on the relevant context too.

    Edit: oh, I see you said as much in the article :)

  • > but that means it's on us as their users to manage the improvement cycle by using every feature iteration as an opportunity to improve how they work

    This doesn't interest me at all honestly

    And every change to the model might invalidate all of this work?

    No thank you

> Can we all agree that "mentoring" LLMs is actually a waste of time, please?

Sorry, we can't. While it's true that you can't really modify the underlying model, updating your AGENTS.md (or whatever) with your expected coding style, best practices, common gotchas etc is a type of mentoring.

  • > updating your AGENTS.md (or whatever) with your expected coding style, best practices, common gotchas etc is a type of mentoring

    We'll have to agree to disagree, because I don't think that has anything remotely in common with mentoring

    • > We'll have to agree to disagree

      Fair enough. But don't you think giving a junior a handbook you wrote is mentoring? They may not be able to memorise it, but they now have a handbook they can look things up in.

> LLMs do not

Maybe not within the session you interact with. However, we are in a 'learning' phase now, where I'm confident that enough usage of AI coding agents is tracked and analyzed by their developers; this feedback cycle can, in theory, produce newer and better generations of AI coding agents.

"AI" has been so inconsistent. On one day it anticipates almost every line I am coding, the next day it's like we've never worked together before.