Comment by energy123

1 day ago

I find the hardest thing is explaining what you want to the LLM. Even when you think you've done it well, you probably haven't. It's like a genie: be careful what you wish for.

I put great effort into maintaining a markdown file with my world model (use cases x principles x requirements x ...) for the project, with every guardrail tightened as much as possible and every ambiguity and interaction with the user or wider world explained. This situates the project in all applicable contexts. That 15k-token file goes into every prompt.
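For concreteness, a minimal sketch of that pattern. The file name and `build_prompt` are hypothetical placeholders, not from any specific tool:

```python
# Minimal sketch: prepend a persistent "world model" doc to every prompt.
# PROJECT_CONTEXT.md and build_prompt() are invented names for illustration.
from pathlib import Path

# ~15k tokens of use cases, principles, requirements, and guardrails.
WORLD_MODEL = Path("PROJECT_CONTEXT.md").read_text()

def build_prompt(task: str) -> str:
    # Every inference sees the full project context before the task itself.
    return f"{WORLD_MODEL}\n\n## Task\n{task}"
```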

> It's like a genie: be careful what you wish for.

I used to be stuck on this thought. But then I came across a delightful documentation RAG project and got to chat with the devs. The idea was that people could ask natural-language questions and be shown the relevant chunk of the docs for their query. If I understood it right, they were effectively pleading with a genie. Worse yet, the genie/LLM kept updating weekly on the cloud platform they were using.

But the devs were engineers. They had a sample set of docs and a sample set of questions with known intended chunks. After every model update they ran the system through this test matrix and used the results as feedback for tuning the system prompt. They said they had been doing this for a few months with good results, with search staying capable over time despite the model changes.
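That approach amounts to a small regression harness. Here is a sketch under assumed names: `TEST_MATRIX` and `retrieve()` stand in for their question set and RAG pipeline, and the sample pairs are invented placeholders:

```python
# Fixed eval set: questions paired with the chunk they should retrieve.
TEST_MATRIX = [
    ("How do I rotate API keys?", "auth.md#key-rotation"),
    ("What are the rate limits?", "limits.md#request-rates"),
]

def evaluate(retrieve) -> float:
    """Fraction of queries whose top retrieved chunk is the intended one."""
    hits = sum(1 for query, expected in TEST_MATRIX
               if retrieve(query) == expected)
    return hits / len(TEST_MATRIX)

# After each weekly model update: if evaluate(pipeline) drops,
# tune the system prompt and re-run until the score recovers.
```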

While these AGENTS.md files and the like appear to be useful, I'm not sure they're going to be the key to long-term success. Maybe a model change makes one much less effective, and the hours previously spent on it turn out to be wasted.

I think something more verifiable/strict is going to be the secret sauce for LLM agents. Engineering. I've heard Claude Code has decent scaffolding, though I haven't gotten the chance to play with it myself.

I liked the headline from some time ago: 'What if LLMs are just another piece of technology?'

>I find the hardest thing is explaining what you want to the LLM.

Honestly, this isn't that much different than explaining things to human programmers. Quite often we assume the programmer will automatically figure out the ambiguous parts, but that commonly leads to undefined behavior or bugs in the product.

Most of my work is as a support engineer, working directly with the client to identify bugs, needed features, and shortcomings in the application. After a few of my reports went terribly wrong once the feature shipped, I've learned to be overly detailed and precise.

> That 15k-token file goes into every prompt.

Same here. Large AGENTS.md file in current project.

Today I started experimenting with splitting it into smaller SKILL.md files, but I'm wary that the agent might mistakenly decide not to load some of them.
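One hedged workaround for that worry (my own sketch, not a feature of any agent framework): build the context deterministically at prompt time instead of trusting the agent to pick files. The `skills/` path is hypothetical:

```python
from pathlib import Path

# Concatenate every SKILL.md in a stable order so none can be skipped.
skill_files = sorted(Path("skills").rglob("SKILL.md"))
context = "\n\n".join(p.read_text() for p in skill_files)
```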

Do I read correctly that your md file is 15k tokens? How many words is that? That's a lot!

  • About 11k words by the 0.75 words/token rule of thumb (15,000 tokens × 0.75 ≈ 11,250 words).

    It's a lot, but I don't do this for quick projects. Only for one important project that I've owned for over a year.

    Maintaining it has been worth it. It makes the codebase more stable: it's as if the codebase slowly converges to what I want (as defined in the doc) the more inferences I run, rather than devolving into spaghetti.