Comment by btown

1 day ago

I'm unsure of its accuracy/provenance/outdatedness, but this purportedly extracted system prompt for Claude Code provides a lot more detail about TODO iteration and how powerful it can be:

https://gist.github.com/wong2/e0f34aac66caf890a332f7b6f9e2ba...

I find it fascinating that while in theory one could just append these as reasoning tokens to the context, and trust the attention mechanism to find the most recent TODO list and attend actively to it... in practice, creating explicit tools that essentially act as single-key storage is far more effective and predictable. It makes me wonder how much other low-hanging fruit there is in tool creation for storing language that requires emphasis and structure.
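
A minimal sketch of what such a single-key storage tool might look like, assuming an Anthropic-style tool schema; the names (`todo_write`, `todo_read`), fields, and store here are invented for illustration, not Claude Code's actual TodoWrite implementation:

```python
# Hypothetical sketch of a "single-key storage" TODO tool: the model overwrites
# one well-known slot instead of relying on attention to rediscover the latest
# list in a long context. Names and schema are invented for illustration.
from dataclasses import dataclass, field


@dataclass
class TodoStore:
    """Single-key store: there is exactly one current TODO list."""
    items: list[dict] = field(default_factory=list)

    def write(self, items: list[dict]) -> str:
        # Whole-list replacement keeps the state unambiguous for the model.
        self.items = items
        return f"Stored {len(items)} todo item(s)."

    def read(self) -> list[dict]:
        return self.items


STORE = TodoStore()

# Tool schemas an agent loop could expose to the model (invented names).
TOOLS = [
    {
        "name": "todo_write",
        "description": "Replace the entire TODO list with the given items.",
        "input_schema": {
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "content": {"type": "string"},
                            "status": {
                                "type": "string",
                                "enum": ["pending", "in_progress", "completed"],
                            },
                        },
                        "required": ["content", "status"],
                    },
                }
            },
            "required": ["items"],
        },
    },
    {
        "name": "todo_read",
        "description": "Return the current TODO list.",
        "input_schema": {"type": "object", "properties": {}},
    },
]
```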

I find in coding + investigating there's a lot of mileage in being fancier with the todo list. E.g., we make sure timestamps, branches, outcomes, etc. are represented (rough sketch below). It's impressive how far they get with so little!

For coding, I actually fully take over the todo list in Codex + Claude: https://github.com/graphistry/pygraphistry/blob/master/ai/pr...

In Louie.ai, for investigations, we're experimenting with exposing more control over it, so you can go with the grain rather than doing that kind of whole-cloth replacement.
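
Purely for illustration, a minimal sketch of a per-task work-history helper along those lines (timestamps, branches, outcomes); the helper name, fields, and layout are invented here, not the actual pygraphistry plan format:

```python
# Hypothetical helper for a per-task work history with timestamps, branches,
# and outcomes; illustrative only, not the actual pygraphistry plan format.
from datetime import datetime, timezone
from pathlib import Path
import subprocess


def append_plan_entry(task: str, note: str, outcome: str, root: str = "plans") -> Path:
    """Append a timestamped entry (with current git branch) to <root>/<task>/plan.md."""
    branch = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=False,
    ).stdout.strip() or "unknown"
    plan = Path(root) / task / "plan.md"
    plan.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with plan.open("a", encoding="utf-8") as f:
        f.write(f"\n## {stamp} ({branch})\n- note: {note}\n- outcome: {outcome}\n")
    return plan


# e.g. append_plan_entry("fix-login-bug", "reproduced failure in test_auth", "in progress")
```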

  • Ooh, am I reading correctly that you're using the filesystem as the storage for a "living system prompt" that also includes a living TODO list? That's pretty cool!

    And on a separate note - it looks like you're making a system for dealing with graph data at scale? Are you using LLMs primarily to generate code for new visualizations, or also to reason directly about each graph in question? To tie it all together, I've long been curious whether tools can adequately translate things from "graph space" to "language space" in the context of agentic loops. There seems to be tremendous opportunity in representing e.g. physical spaces as graphs, and if LLMs can "imagine" what would happen if they interacted with them in structured ways, that might go a long way towards autonomous systems that can handle truly novel environments.

    • yep! So all repos get a (.gitignore'd) folder of `plans/<task>/plan.md` work histories. That ends up being quite helpful in practice: calculating billable hours of work, forking/auditing/retrying, easier replanning, etc. At the same time, I'd rather be with-the-grain of the agentic coder's native systems for plans + todos, e.g., for alignment with the models & prompts. We've been doing it this way because we find the native systems to be weaker than what these achieve, and too hard to add these kinds of things to.

      Re: the other note, yes, we have 2 basic goals:

      1. Louie to make graphs / Graphistry easier, especially when connected to operational databases (Splunk, Kusto, Elastic, BigQuery, ...). V1 was generating Graphistry viz & GFQL queries. We're now working on Louie inside of Graphistry, for more dynamic control of the visual analysis environment ("filter to X and color Y as Z"), and, as you say, to go straight to the answer too ("what's going on with account/topic X"). We spent years trying to bring Jupyter notebooks etc. to operational teams as a way to get graph insights out of their various data, and while that was good for a few "data 1%'ers", it was too hard for most; Louie has been a chance to rethink that.

      2. Louie has been seeing wider market interest beyond graph: basically "AI that investigates" across those operational DBs (& live systems). You can think of it as: vibe coding is code-oriented, while Louie is vibe investigating, which is more data-oriented. E.g., native plans don't think in unit tests but in cross-validation, and instead of grepping 1,000 files, we get back a dataframe of 1M query results and pass that between the agents for localized agentic retrieval, vs. re-hammering the DB. The CCC talk gives a feel for this in the interactive setting.
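
      Roughly, that dataframe-handoff idea could be sketched like this (function names invented, not Louie's actual internals): query the operational DB once, then let agents narrow and summarize the shared in-memory frame instead of re-querying.

      ```python
      # Hypothetical sketch of "localized agentic retrieval": pull a large result
      # set once, then let agents work against the shared in-memory frame rather
      # than re-hammering the operational DB. Function names are invented.
      import pandas as pd

      def fetch_once(run_query) -> pd.DataFrame:
          # One expensive query, e.g. ~1M rows from Splunk/Kusto/Elastic/BigQuery.
          return pd.DataFrame(run_query())

      def agent_slice(df: pd.DataFrame, account: str) -> pd.DataFrame:
          # A downstream agent narrows the shared frame locally.
          return df[df["account"] == account]

      def agent_summarize(df: pd.DataFrame) -> pd.DataFrame:
          # Another agent aggregates the slice it was handed.
          return df.groupby("topic").size().rename("events").reset_index()
      ```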

Isn't the system prompt of Claude public in the doc at https://platform.claude.com/docs/en/release-notes/system-pro... ?

  • The system prompt of Claude Code changes constantly. I use this site to see what has changed between versions: https://cchistory.mariozechner.at/

    It is a bit weird that Anthropic doesn't make that available more openly. Depending on your preferences, there is stuff in the default system prompt that you may want to change.

    I personally have a list of phrases that I patch out of the system prompt after each update by running sed on cc's main.js.
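
    For flavor, a rough Python equivalent of that kind of patch step (the bundle path and phrases below are placeholders, not the real ones):

    ```python
    # Hypothetical equivalent of "run sed on cc's main.js after each update";
    # the bundle path and the phrases are placeholders, not the real ones.
    from pathlib import Path

    MAIN_JS = Path("/path/to/claude-code/main.js")  # wherever your install puts the bundle
    PATCHES = {
        # "phrase as it appears in the bundled prompt": "replacement",
        "Some default instruction you want gone.": "",
    }

    def patch_bundle(path: Path = MAIN_JS) -> int:
        src = path.read_text(encoding="utf-8")
        patched = src
        for old, new in PATCHES.items():
            patched = patched.replace(old, new)
        if patched != src:
            path.with_name(path.name + ".bak").write_text(src, encoding="utf-8")  # keep a backup
            path.write_text(patched, encoding="utf-8")
        return sum(old in src for old in PATCHES)  # how many phrases were found

    # Re-run after each update, since updates overwrite the bundle.
    ```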

From elsewhere in that prompt:

> Only use emojis if the user explicitly requests it. Avoid adding emojis to files unless asked.

When did they add this? Real shame because the abundance of emojis in a readme was a clear signal of slop.