
Comment by bluefirebrand

7 days ago

Can we all agree that "mentoring" LLMs is actually a waste of time, please?

The reason we invest this time in junior devs is so that they improve. LLMs do not

I had a fascinating conversation about this the other day. An engineer was telling me about his LLM process, which is effectively this:

1. Collaborate on a detailed spec

2. Have it implement that spec

3. Spend a lot of time on review and QA - is the code good? Does the feature work well?

4. Take lessons from that process and write them down for the LLM to use next time - using CLAUDE.md or similar

That last step is the interesting one. You're right: humans improve, LLMs don't... but that means it's on us as their users to manage the improvement cycle by using every feature iteration as an opportunity to improve how they work.

I've heard similar things from a few people now: by constantly iterating on their CLAUDE.md - adding extra instructions every time the bot makes a mistake, telling it to do things like always write the tests first, run the linter, reuse the BaseView class when building a new application view, etc - they get wildly better results over time.
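
For illustration, the kind of CLAUDE.md that accumulates from that last step might look something like this (the rules here are hypothetical examples, not anyone's actual file):

```markdown
# CLAUDE.md: project conventions accumulated from past sessions

## Workflow
- Always write the tests first, then the implementation; run the full test suite before calling a task done.
- Run the linter and fix any warnings before presenting a diff.

## Code conventions
- New application views should reuse the existing BaseView class rather than duplicating its setup logic.

## Gotchas
- Add an entry here each time the agent repeats a mistake, so the next session starts with that lesson.
```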

  • I don't buy your last sentence at all.

    AGENTS.md is just a place to put stuff you don't want to tell LLMs over and over again. They're not magical instructions LLMs follow 100% of the time, and they don't carry any additional importance over what you put into the prompt manually. Your carefully curated AGENTS.md is only really useful at the very beginning of the conversation, but the longer the conversation gets, the less important those tokens at the top are. Somewhere around 100k tokens AGENTS.md might as well not exist; I constantly have to "remind it" of the very first paragraph there.

    Go start a conversation and contradict what's written in AGENTS.md halfway through the problem. Which of the two contradicting statements will take precedence? The latter one! Therefore, all the time you've spent curating your AGENTS.md is time you've wasted thinking you're "teaching" LLMs anything.

    • Whether the tokens are created manually or programmatically isn't really relevant here. The order and amount of tokens is, in combination with the ingestion -> output logic which the LLM API / inference engine operates on. Many current models definitely have the tendency to start veering off after 100k tokens, which makes context pruning important as well.

      What if you just automatically append the .md file at the end of the context, instead of prepending at the start, and add a note that the instructions in the .md file should always be prioritized?
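
      As a rough sketch of what I mean (hypothetical harness code, not how any particular tool actually does it):

      ```python
      def build_messages(instructions_md: str, history: list[dict]) -> list[dict]:
          """Repeat the instructions file at the end of the context instead of
          relying only on the copy at the top, so the rules are also the most
          recent tokens the model sees."""
          reminder = {
              "role": "user",
              "content": "Reminder: the project instructions below always take priority.\n\n"
                         + instructions_md,
          }
          # Conventional layout: instructions first, then the conversation so far,
          # then the appended reminder.
          return [{"role": "system", "content": instructions_md}] + history + [reminder]
      ```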

    • > Your carefully curated AGENTS.md is only really useful at the very beginning of the conversation, but the longer the conversation gets, the less important those tokens on the top are.

      If that's genuinely causing you problems, you can restart your session frequently to avoid context rot.


  • Totally agree on this. It has delivered substantial value for me in my projects. The models are always going to give back results optimized to use minimal computing resources in the provider's infrastructure. To overcome this, I've seen some people use or suggest running the AI in self-correction loops, the upside being the least human intervention.

    However, I have personally gotten very good results by using the AI with continuous interaction, and by allowing implementation only after a good amount of time deliberating on design/architecture. I almost always append 'do not implement before we discuss and finalize the design' or 'clarify your assumptions, doubts or queries before implementation'. (A rough sketch of scripting that habit is at the end of this comment.)

    When I asked Gemini to give a name for such an interaction, it suggested 'Dialog Driven Development' and contrasted it against 'vibe coding'. Transcript summary and AI disclaimer written by Gemini below:

    https://gingerhome.github.io/gingee-docs/docs/ai-disclaimer.... https://gingerhome.github.io/gingee-docs/docs/ai-transcript/...
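
    If you wanted to bake that habit into a script instead of typing it every time, it could be as simple as this (hypothetical helper, not part of any tool):

    ```python
    DESIGN_FIRST_GUARDRAIL = (
        "Do not implement anything yet. Clarify your assumptions, doubts or queries, "
        "and wait until we discuss and finalize the design."
    )

    def design_first_prompt(request: str) -> str:
        """Append the design-first guardrail to every feature request."""
        return f"{request}\n\n{DESIGN_FIRST_GUARDRAIL}"

    # Example:
    # prompt = design_first_prompt("Add CSV export to the reports page")
    ```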

  • I’m finding that whether this process works well is a measure (and a function) of how well-factored and disciplined a codebase is in the first place. Funnily enough, LLMs do seem to have a better time extending systems that are well-engineered for extensibility.

    That’s the part which gives me optimism, and even more enjoyment of the craft — that quality pays back so immediately, makes it that much easier to justify the extra effort, and having these tools at our disposal reduces the ‘activation energy’ for necessary re-work that may before have just seemed too monumental.

    If a codebase is in good shape for people to produce high-quality work, then so can the machines. Clear, up-to-date, close-to-the-code, low-redundancy documentation; self-documenting code and tests that prioritize expression of intent over cleverness; consistent patterns of abstraction that don’t necessitate jarring context switches from one area to the next; etc.

    All this stuff is so much easier to lay down with an agent loaded up on the relevant context too.

    Edit: oh, I see you said as much in the article :)

  • > but that means it's on us as their users to manage the improvement cycle by using every feature iteration as an opportunity to improve how they work

    This doesn't interest me at all honestly

    And every change to the model might invalidate all of this work?

    No thank you

> Can we all agree that "mentoring" LLMs is actually a waste of time, please?

Sorry, we can't. While it's true that you can't really modify the underlying model, updating your AGENTS.md (or whatever) with your expected coding style, best practices, common gotchas etc is a type of mentoring.

  • > updating your AGENTS.md (or whatever) with your expected coding style, best practices, common gotchas etc is a type of mentoring

    We'll have to agree to disagree, because I don't think that has anything remotely in common with mentoring

    • > We'll have to agree to disagree

      Fair enough. But don't you think giving a junior a handbook you wrote is mentoring? They may not be able to memorise it, but they now have a handbook they can look things up in.

> LLMs do not

Maybe not within the session you interact with. However, we are in a 'learning' phase now, where I'm confident that enough usage of AI coding agents is tracked and analyzed by their developers; this feedback cycle can, in theory, produce newer and better generations of AI coding agents.

"AI" has been so inconsistent. On one day it anticipates almost every line I am coding, the next day it's like we've never worked together before.