Show HN: Atom – open-source AI agent with "visual" episodic memory
Hey HN,
I’ve been building Atom (https://github.com/rush86999/atom), an open-source, self-hosted AI automation platform.
I built this because, while tools like OpenClaw are excellent for one-off scripts and personal tasks, I found them difficult to use for complex business workflows (e.g., managing invoices or SaaS ops). The main issue was state blindness: the agent would fire a command and assume it had worked, without "seeing" whether the UI or application state actually updated.
I just shipped a new architecture to solve this called Canvas AI Accessibility.
The Technical Concept: Instead of relying on token-heavy screenshots or raw HTML, I built a hidden semantic layer: essentially a "screen reader" for the LLM.
Hidden Visual Description: When the agent works, the system generates a structured, hidden description of the visual state.
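To make this concrete, here's a minimal sketch of what such a structured description could look like. All names here (CanvasElement, CanvasSnapshot, to_prompt) are my own illustrations, not Atom's actual API; the idea is just a compact, token-cheap rendering of UI state that the LLM reads instead of pixels:

```python
import json
from dataclasses import dataclass, field

@dataclass
class CanvasElement:
    role: str                                   # "button", "table", "status", ...
    label: str                                  # accessible name
    state: dict = field(default_factory=dict)   # e.g. {"enabled": True}

@dataclass
class CanvasSnapshot:
    url: str
    elements: list

    def to_prompt(self) -> str:
        """Render a compact, screen-reader-style description for the LLM."""
        lines = [f"page: {self.url}"]
        for el in self.elements:
            state = ", ".join(f"{k}={v}" for k, v in el.state.items())
            lines.append(f"- {el.role} '{el.label}' [{state}]")
        return "\n".join(lines)

# Example: the state the agent would read after clicking "Send invoice"
snap = CanvasSnapshot(
    url="https://billing.example.com/invoices",
    elements=[
        CanvasElement("button", "Send invoice", {"enabled": False}),
        CanvasElement("status", "Invoice #1042", {"text": "Sent"}),
    ],
)
print(snap.to_prompt())
```

A few dozen tokens of this per step is far cheaper than a screenshot, and it's diffable, so the agent can compare before/after states directly.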
Episodic Memory: The agent "reads" this layer to verify its actions. Crucially, it snapshots this state into a vector database (LanceDB).
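The snapshot-and-recall loop can be sketched as below. Atom uses LanceDB for this; to keep the sketch dependency-free I use an in-memory list with cosine similarity as a stand-in, and the class/method names are illustrative, not Atom's:

```python
import math

class EpisodicMemory:
    """In-memory stand-in for a vector store of past visual states."""

    def __init__(self):
        self.episodes = []  # list of (embedding, description) pairs

    def snapshot(self, embedding, description):
        """Persist the visual state the agent just verified."""
        self.episodes.append((embedding, description))

    def recall(self, query, k=1):
        """Return the k past visual states most similar to the query."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self.episodes, key=lambda e: cos(query, e[0]), reverse=True)
        return [desc for _, desc in ranked[:k]]

mem = EpisodicMemory()
mem.snapshot([1.0, 0.0], "invoice page: 'Send' disabled, status 'Sent'")
mem.snapshot([0.0, 1.0], "settings page: toggle 'Webhooks' off")
print(mem.recall([0.9, 0.1]))  # nearest episode is the invoice page
```

Before acting, the agent queries this store with an embedding of the current state, so "have I seen this screen before, and what happened last time?" becomes a retrieval step rather than a guess.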
Maturity/Governance: Before an agent is promoted from "Student" to "Autonomous," it must demonstrate it can recall these past visual states to avoid repeating errors.
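The promotion gate could look something like this. The function name, thresholds, and data shape are all hypothetical, just to illustrate the check: the agent passes only if it recalls the correct past episode for a high enough fraction of its evaluation tasks:

```python
def ready_for_promotion(recalled_by_task, min_recall_rate=0.8):
    """Gate the "student" -> "autonomous" promotion on episodic recall.

    recalled_by_task maps each task id to a pair of
    (expected_episode, episodes_the_agent_actually_recalled).
    """
    hits = sum(
        1 for expected, recalled in recalled_by_task.values()
        if expected in recalled
    )
    return hits / len(recalled_by_task) >= min_recall_rate

# Example evaluation run: 4 of 5 tasks recalled correctly -> promoted.
results = {
    "send_invoice": ("invoice: 'Send' disabled", ["invoice: 'Send' disabled"]),
    "cancel_sub":   ("subs: plan row removed",   ["billing: card updated"]),
    "export_csv":   ("report: download shown",   ["report: download shown"]),
    "update_card":  ("billing: card updated",    ["billing: card updated"]),
    "close_ticket": ("ticket: status 'Closed'",  ["ticket: status 'Closed'"]),
}
print(ready_for_promotion(results))  # 4/5 = 0.8 >= 0.8 -> True
```

The point of gating on recall rather than raw task success is that it directly tests the failure mode above: an agent that can't retrieve what it saw last time will repeat the same mistakes.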
Atom vs. OpenClaw: I view them as complementary. OpenClaw is the "Hands" (great for raw execution/terminal), while Atom is the "Brain" (handling state, memory, and audit trails). Atom uses Python/FastAPI vs OpenClaw's Node.js, and focuses heavily on this governance/memory layer.
The repo is self-hosted and includes the new Canvas architecture. I'd love feedback on the implementation of the hidden accessibility layer. Is anyone else using "synthetic accessibility trees" for agent grounding?