Comment by austinbaggio

16 hours ago

Appreciate it - yeah, you're right, models don't work well when you just give them a giant dump of memory. We store memories in a small DB - think key/value pairs with embeddings. Every time you ask Claude something, the skill does three things (rough sketch in code after the list):

1. Embeds the current request.

2. Runs a semantic + timestamp-weighted search over your past sessions. Returns only the top N items that look relevant to this request.

3. Those get injected into the prompt as context (like extra system/user messages), so Claude sees just enough to stay oriented without blowing context limits.
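In Python, that loop could look something like this. This is a minimal sketch under assumptions: a sentence-transformers embedder, an in-memory list standing in for the small DB, and an exponential recency decay. The `Memory` record, the `recall` helper, and the 14-day half-life are all illustrative, not the skill's actual implementation.

```python
import time
from dataclasses import dataclass

import numpy as np
from sentence_transformers import SentenceTransformer

# Embedder choice is an assumption; any text-embedding model would do.
model = SentenceTransformer("all-MiniLM-L6-v2")

@dataclass
class Memory:
    key: str
    value: str
    embedding: np.ndarray
    created_at: float  # unix timestamp

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recall(store: list[Memory], request: str, top_n: int = 5,
           half_life_days: float = 14.0) -> list[Memory]:
    """Step 1: embed the request. Step 2: semantic + timestamp-weighted
    search. Returns the top N memories to inject as context (step 3)."""
    q = model.encode(request)
    now = time.time()

    def score(m: Memory) -> float:
        # Exponential recency decay: a memory loses half its weight
        # every `half_life_days` days (the half-life is an assumption).
        age_days = (now - m.created_at) / 86400
        recency = 0.5 ** (age_days / half_life_days)
        return cosine(q, m.embedding) * recency

    return sorted(store, key=score, reverse=True)[:top_n]

# The returned memories would then be formatted as extra system/user
# messages and prepended to the prompt (step 3 above).
```

Multiplying cosine similarity by a recency factor is one simple way to do the "timestamp-weighted" part; the real skill may combine the two signals differently.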

Think of it like attention over your historical work rather than brute-force recall. Context on demand, basically an infinite context window. Bookmark + semantic grep + temporal rank. It doesn’t “know everything all the time.” It just knows how to ask its own past: “What from memory might matter for this?”

When you try it, I’d love to hear where the mechanism breaks for you.