Comment by adithyareddy
9 hours ago
You can also show context in the statusline within claude code: https://code.claude.com/docs/en/statusline#context-window-us...
Follow-up Q: what are you supposed to do when the context becomes too large? Start a new conversation/context window and let Claude start from scratch?
Context filling up is sort of the Achilles' heel of CLI agents. The main remedy is to have it output some type of handoff document and then run /compact, which leaves you with a summary of the latest task. It sort of works, but by definition it loses information, and you often find yourself having to re-explain or re-generate details to continue the work.
I made a tool[1] that lets you start a new session and injects the original session's file path into it, so you can extract arbitrary details of prior work from that file using sub-agents.
[1] aichat tool https://github.com/pchalasani/claude-code-tools?tab=readme-o...
Start in plan mode, have it generate a markdown file with the plan, and keep that file up to date as the plan is executed. After each iteration, commit, clear the context, and tell it to read the plan and execute the next step.
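The plan file can be as simple as a checklist it updates after each step; a minimal sketch (the task and steps here are purely illustrative):

```
# Plan: add rate limiting to the API

- [x] 1. Add a rate-limiter middleware skeleton
- [x] 2. Wire it into the request pipeline behind a config flag
- [ ] 3. Add unit tests for burst and steady-state behaviour
- [ ] 4. Update the README

## Notes
- Record decisions and gotchas here as each step is executed,
  so a fresh session can pick up where the last one left off.
```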
It’s a good idea to have Claude write down the execution plan (including todos). Or you can use something like Linear / GH Issues to track the big items; then the small, tactical todos are what you track in the session todos.
This approach means you can just kill the session and restart if you hit limits.
(If you hit context limits, you probably also want to look into sub-agents to help prevent context bloat. For example, any time you are running and debugging unit tests, it's usually best to start with a sub-agent to handle the easy errors.)
Either have Claude /compact or have it output things to a file it can read in on the next session. That file would be a summary of progress on a spec or something similar. It's also good to prime it again with the README or any other higher-level context.
It feels like one could produce a digest of the context that works very similarly but fits in the available context window, not just by getting the LLM to use succinct language, but also mathematically, like reducing a sparse matrix.
There might be an input that would produce that sort of effect. Perhaps it looks like nonsense (like reading zipped data), but when the LLM attempts to interact with it, the outcome is close to that of consuming the full context?
```
§CONV_DIGEST§
T1:usr_query@llm-ctx-compression→math-analog(sparse-matrix|zip)?token-seq→nonsense-input→semantic-equiv-output?
T2:rsp@asymmetry_problem:compress≠decompress|llm=predict¬decode→no-bijective-map|soft-prompts∈embedding-space¬token-space+require-training|gisting(ICAE)=aux-model-compress→memory-tokens|token-compress-fails:nonlinear-distributed-mapping+syntax-semantic-entanglement|works≈lossy-semantic-distill@task-specific+finetune=collapse-instruction→weights
§T3:usr→design-full-python-impl§
T4:arch_blueprint→
DIR:src/context_compressor/{core/(base|result|pipeline)|compressors/(extractive|abstractive|semantic|entity_graph|soft_prompt|gisting|hybrid)|embeddings/(providers|clustering)|evaluation/(metrics|task_performance|benchmark)|models/(base|openai|anthropic|local)|utils/(tokenization|text_processing|config)}
CLASSES:CompressionMethod=Enum(EXTRACTIVE|ABSTRACTIVE|SEMANTIC_CLUSTERING|ENTITY_GRAPH|SOFT_PROMPT|GISTING|HYBRID)|CompressionResult@(original_text+compressed_text+original_tokens+compressed_tokens+method+compression_ratio+metadata+soft_vectors?)|TokenCounter=Protocol(count|truncate_to_limit)|EmbeddingProvider=Protocol(embed|embed_single)|LLMBackend=Protocol(generate|get_token_limit)|ContextCompressor=ABC(token_counter+target_ratio=0.25+min_tokens=50+max_tokens?→compress:abstract)|TrainableCompressor(ContextCompressor)+(train+save+load)
COMPRESSORS:extractive→(TextRank|MMR|LeadSentence)|abstractive→(LLMSummary|ChainOfDensity|HierarchicalSummary)|semantic→(ClusterCentroid|SemanticChunk|DiversityMaximizer)|entity→(EntityRelation|FactList)|soft→(SoftPrompt|PromptTuning)|gist→(GistToken|Autoencoder)|hybrid→(Cascade|Ensemble|Adaptive)
EVAL:EvaluationResult@(compression_ratio+token_reduction+embedding_similarity+entailment_score+entity_recall+fact_recall+keyword_overlap+qa_accuracy?+reconstruction_bleu?)→composite_score(weights)|CompressionEvaluator(embedding_provider+llm?+nli?)→evaluate|compare_methods
PIPELINE:CompressionPipeline(steps:list[Compressor])→sequential-apply|AdaptiveRouter(compressors:dict+classifier?)→content-based-routing
DEPS:numpy|torch|transformers|sentence-transformers|tiktoken|networkx|sklearn|spacy|openai|anthropic|pandas|pydantic+optional(accelerate|peft|datasets|sacrebleu|rouge-score)
```
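For concreteness, here is a rough Python sketch of what the base class and one of the extractive compressors in that blueprint might look like. It's only an illustration: the whitespace token counter and naive sentence splitting are stand-ins for a real tokenizer and sentence splitter, and the class/field names simply follow the digest above.

```python
# Minimal sketch of the compressor skeleton described in the digest above.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum


class CompressionMethod(Enum):
    EXTRACTIVE = "extractive"
    ABSTRACTIVE = "abstractive"
    HYBRID = "hybrid"


@dataclass
class CompressionResult:
    original_text: str
    compressed_text: str
    original_tokens: int
    compressed_tokens: int
    method: CompressionMethod
    metadata: dict = field(default_factory=dict)

    @property
    def compression_ratio(self) -> float:
        return self.compressed_tokens / max(self.original_tokens, 1)


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: whitespace-separated tokens.
    return len(text.split())


class ContextCompressor(ABC):
    def __init__(self, target_ratio: float = 0.25, min_tokens: int = 50):
        self.target_ratio = target_ratio
        self.min_tokens = min_tokens

    @abstractmethod
    def compress(self, text: str) -> CompressionResult:
        ...


class LeadSentenceCompressor(ContextCompressor):
    """Keep leading sentences until the token budget is reached."""

    def compress(self, text: str) -> CompressionResult:
        budget = max(int(count_tokens(text) * self.target_ratio), self.min_tokens)
        kept, used = [], 0
        for sentence in text.replace("\n", " ").split(". "):
            n = count_tokens(sentence)
            if kept and used + n > budget:
                break
            kept.append(sentence)
            used += n
        compressed = ". ".join(kept)
        return CompressionResult(
            original_text=text,
            compressed_text=compressed,
            original_tokens=count_tokens(text),
            compressed_tokens=count_tokens(compressed),
            method=CompressionMethod.EXTRACTIVE,
        )


if __name__ == "__main__":
    result = LeadSentenceCompressor(target_ratio=0.3, min_tokens=5).compress(
        "First point. Second point. Third point. Fourth point. Fifth point."
    )
    print(result.compressed_text, result.compression_ratio)
```

A CompressionPipeline in the blueprint would then just apply a list of such compressors in sequence, and the evaluator would score the result against the metrics listed under EVAL.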
I ask it to write a markdown file describing how it should go about performing the task, then have it read that file next time. This works well for things like creating tests for controller methods, where there is a procedure it should follow that was probably developed over a session of several prompts and feedback on its output.