Show HN: Atom – open-source AI agent with "visual" episodic memory

5 hours ago (github.com)

Hey HN,

I’ve been building Atom (https://github.com/rush86999/atom), an open-source, self-hosted AI automation platform.

I built this because while tools like OpenClaw are excellent for one-off scripts and personal tasks, I found them difficult to use for complex business workflows (e.g., managing invoices or SaaS ops). The main issue was State Blindness: the agent would fire a command and assume it worked, without "seeing" if the UI or state actually updated.
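The fix for that state blindness is conceptually simple: never trust a fired command until a fresh read of the state confirms the change. A minimal sketch of that loop, where `run_verified`, `read_state`, and `expect` are all illustrative stand-ins rather than Atom's actual API:

```python
import time

def run_verified(action, read_state, expect, retries=3, delay=0.5):
    """Fire an action, then re-read the state to verify it actually changed.

    Hypothetical interface: `action()` fires the command, `read_state()`
    returns the current UI/app state, and `expect(state)` returns True
    once the state reflects the intended change.
    """
    action()
    for _ in range(retries):
        state = read_state()
        if expect(state):
            return state  # verified: the state really updated
        time.sleep(delay)  # let the UI settle, then check again
    raise RuntimeError("action fired, but the state never reflected the change")
```

The key design point is that success is defined by the observed state, not by the command returning without an exception.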

To solve this, I just shipped a new architecture called Canvas AI Accessibility.

The Technical Concept: Instead of relying on token-heavy screenshots or raw HTML, I built a hidden semantic layer—essentially a "Screen Reader" for the LLM.

Hidden Visual Description: As the agent works, the system generates a structured, hidden description of the current visual state.
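To make the idea concrete, here is a toy renderer for that hidden description. The element schema (`role`, `label`, `visible`, etc.) is my illustrative guess, not the real Canvas format; the point is that the LLM consumes a compact, structured text view instead of pixels or raw HTML:

```python
def describe_canvas(elements):
    """Render a text-only 'screen reader' view of the visual state.

    `elements` is a hypothetical list of dicts describing on-screen
    widgets; the real Canvas layer's schema may differ.
    """
    lines = []
    for el in elements:
        # Collect only the state flags that are actually set.
        state = ",".join(k for k in ("visible", "enabled", "focused") if el.get(k))
        lines.append(f"{el['role']} '{el.get('label', '')}' [{state}] value={el.get('value')!r}")
    return "\n".join(lines)
```

For example, a disabled "Pay" button would render as `button 'Pay' [visible] value=None`, which is a handful of tokens instead of a screenshot.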

Episodic Memory: The agent "reads" this layer to verify its actions. Crucially, it snapshots this state into a vector database (LanceDB).
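A minimal sketch of that snapshot/recall loop. This uses a toy hashing embedding and an in-memory list where Atom uses a real embedding model and LanceDB; every name here is illustrative, not Atom's actual code:

```python
import hashlib
import math

def embed(text, dim=64):
    """Toy deterministic bag-of-words embedding (stand-in for a real model)."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class EpisodicMemory:
    """In-memory sketch of snapshot/recall; the real system persists the
    same shape of record (vector + description + outcome) to LanceDB."""

    def __init__(self):
        self.episodes = []

    def snapshot(self, description, outcome):
        # Store the visual-state description alongside its embedding.
        self.episodes.append({"vector": embed(description),
                              "description": description,
                              "outcome": outcome})

    def recall(self, description, k=3):
        # Nearest neighbors by cosine similarity (vectors are unit-norm).
        q = embed(description)
        scored = sorted(self.episodes,
                        key=lambda e: -sum(a * b for a, b in zip(q, e["vector"])))
        return scored[:k]
```

Before acting on a screen, the agent can `recall()` similar past states and see whether a comparable action previously failed.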

Maturity/Governance: Before an agent is promoted from "Student" to "Autonomous," it must demonstrate it can recall these past visual states to avoid repeating errors.
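The promotion gate can be sketched as a recall test over past failures. This assumes a memory object exposing a `recall(description, k)` method that returns episodes with an `outcome` field; the threshold, names, and probe format are all illustrative, not Atom's actual governance config:

```python
def promotion_check(memory, probe_states, min_recall=0.8):
    """Gate the 'Student' -> 'Autonomous' promotion on episodic recall.

    `probe_states` is a hypothetical list of (description, expected_outcome)
    pairs drawn from past episodes. The agent passes only if its top recall
    matches the known outcome for enough of the probes.
    """
    if not probe_states:
        return False  # nothing to test against: stay a Student
    hits = 0
    for description, expected in probe_states:
        top = memory.recall(description, k=1)
        if top and top[0]["outcome"] == expected:
            hits += 1
    return hits / len(probe_states) >= min_recall
```

The design choice is that autonomy is earned by demonstrated memory, not granted by configuration.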

Atom vs. OpenClaw: I view them as complementary. OpenClaw is the "Hands" (great for raw execution/terminal), while Atom is the "Brain" (handling state, memory, and audit trails). Atom uses Python/FastAPI vs OpenClaw's Node.js, and focuses heavily on this governance/memory layer.

The repo is self-hosted and includes the new Canvas architecture. I’d love feedback on the implementation of the hidden accessibility layer—is anyone else using "synthetic accessibility trees" for agent grounding?