
Comment by 9cb14c1ec0

18 hours ago

Ok, real question. What products are people actually building with agent frameworks? I get the utility of AI coding tools and generic chat apps, but that is the extent of utility that I've been able to get from AI. I'm looking for examples that are real businesses, not toys.

I use a custom framework for creating basic but useful tools that work with sensitive data. There are cases in my organization where I like the idea of people using Claude or similar to assist with a process, but Claude Desktop or Claude Code doesn't offer the safety or security we need (in part because the people using it are unconstrained, in part because the harnesses aren't perfect and the LLMs can make bad choices).

The framework provides a harness that's a state machine with very explicit directives, and it uses Deno as the runtime to constrain network, filesystem, environment, and other types of access as needed.

Kind of like using skills in Claude Code to teach it how to do something, but with extremely tight guard rails. Like, you can only write a specific file when in a specific state, otherwise that tool isn't even callable.
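To make the "tool isn't even callable" idea concrete, here is a minimal sketch of state-gated tools. It is not the actual framework: every name in it (`State`, `Tool`, `Harness`, `write_schema_file`) is made up for illustration, and a real harness would pair this with Deno's permission flags (for example `--allow-read` and `--allow-write` scoped to specific paths) rather than rely on the in-process check alone.

```typescript
// Hypothetical sketch: none of these names come from the original framework.
type State = "inspect" | "transform" | "write_output" | "done";

interface Tool {
  name: string;
  allowedIn: readonly State[]; // states in which the model may call this tool
  run: (args: Record<string, string>) => string;
}

class Harness {
  private state: State = "inspect";
  private readonly tools: readonly Tool[];
  // Explicit, one-directional transitions: the model cannot skip steps.
  private readonly transitions: Record<State, readonly State[]> = {
    inspect: ["transform"],
    transform: ["write_output"],
    write_output: ["done"],
    done: [],
  };

  constructor(tools: readonly Tool[]) {
    this.tools = tools;
  }

  currentState(): State {
    return this.state;
  }

  // Only tools allowed in the current state are advertised to the model.
  callableTools(): string[] {
    return this.tools
      .filter((t) => t.allowedIn.includes(this.state))
      .map((t) => t.name);
  }

  // Calls are re-checked here, so a tool that was never advertised still fails.
  call(name: string, args: Record<string, string>): string {
    const tool = this.tools.find((t) => t.name === name);
    if (!tool || !tool.allowedIn.includes(this.state)) {
      throw new Error(`tool "${name}" is not callable in state "${this.state}"`);
    }
    return tool.run(args);
  }

  advance(next: State): void {
    if (!this.transitions[this.state].includes(next)) {
      throw new Error(`illegal transition ${this.state} -> ${next}`);
    }
    this.state = next;
  }
}

// Stand-in tools: the write tool records the path instead of touching disk.
const written: string[] = [];
const harness = new Harness([
  { name: "read_sample", allowedIn: ["inspect", "transform"], run: () => "sample rows" },
  {
    name: "write_schema_file",
    allowedIn: ["write_output"],
    run: (args) => {
      written.push(args["path"] ?? "");
      return "ok";
    },
  },
]);
```

With this setup, `write_schema_file` is rejected until the harness has been advanced through `transform` into `write_output`, which is the kind of guard rail described above.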

It requires understanding the problem that's being solved quite well. This often leads to realizing it can be automated without a harness. Finding cases where an LLM is genuinely crucial to enabling the automation is difficult.

A good recent example was getting a local LLM to define schemas for an internal tool based on existing research data. It looks at the data, figures out its semantics and relationships, and works out how that maps to the target schema. This step is impossible to automate without semantic inference. It then uses duckdb to transform the raw data into the appropriate schema and, finally, tests the schema in the validator with the data. It makes a very complex, often unappealing and confusing process very easy. Once it's done, the data is in better shape than we ever got it to by hand. This is partially because of a validator I created, but also because the LLM can identify patterns really well and retain a massive spec while it works.
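For a rough idea of what the validation step could look like, here is a small sketch. The real validator isn't described in this comment, so the schema shape, the `validateRows` function, and the field names are all assumptions made for illustration. The point is only that the model's output gets checked against the spec mechanically, and the error list can be fed back to it.

```typescript
// Hypothetical sketch: the schema format and all names are assumptions.
type FieldType = "string" | "number" | "date";

interface FieldSpec {
  type: FieldType;
  required: boolean;
}

type Schema = Record<string, FieldSpec>;
type Row = Record<string, unknown>;

function checkValue(value: unknown, type: FieldType): boolean {
  switch (type) {
    case "string":
      return typeof value === "string";
    case "number":
      return typeof value === "number" && Number.isFinite(value);
    case "date":
      // Dates are expected as parseable strings, e.g. ISO 8601.
      return typeof value === "string" && !Number.isNaN(Date.parse(value));
  }
}

// Returns a list of human-readable errors; an empty list means the rows conform.
function validateRows(schema: Schema, rows: Row[]): string[] {
  const errors: string[] = [];
  rows.forEach((row, i) => {
    for (const [field, spec] of Object.entries(schema)) {
      const value = row[field];
      if (value === undefined || value === null) {
        if (spec.required) errors.push(`row ${i}: missing required field "${field}"`);
        continue;
      }
      if (!checkValue(value, spec.type)) {
        errors.push(`row ${i}: field "${field}" is not a valid ${spec.type}`);
      }
    }
    // Columns the target schema does not know about are flagged too.
    for (const key of Object.keys(row)) {
      if (!(key in schema)) errors.push(`row ${i}: unexpected field "${key}"`);
    }
  });
  return errors;
}

// Made-up target schema and rows, standing in for transformed research data.
const targetSchema: Schema = {
  site_id: { type: "string", required: true },
  sampled_on: { type: "date", required: true },
  salinity_psu: { type: "number", required: false },
};

const transformedRows: Row[] = [
  { site_id: "A1", sampled_on: "2024-03-01", salinity_psu: 34.2 },
  { site_id: "A2", sampled_on: "2024-03-02", salinity_psu: "high", notes: "x" },
  { sampled_on: "2024-03-03" },
];

const validationErrors = validateRows(targetSchema, transformedRows);
```

Here the first row passes, the second is flagged for a non-numeric `salinity_psu` and an unexpected `notes` column, and the third for a missing `site_id`.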

You could do it with all kinds of existing harnesses, but this one lets us comfortably define processes we trust and lets us operate on data our partners would never allow into the cloud, or onto OpenAI's or Anthropic's servers in particular.

> I'm looking for examples that are real businesses, not toys.

These tools are used within a real business (specifically a coastal science NGO) and they aren't toys, so hopefully that's useful information. Based on my experience so far, and it could be my lack of imagination, I have no idea how you'd use these as the foundation for a business. I find more cases that can be automated without an LLM than I do with one, and they tend to be so niche and strange that no one else would ever need them and they can't be generalized.

We're building https://brooked.io/. In the same way that Cursor provides a lot of features on top of the base agents, we want to do the same for spreadsheets. There are many workflows that benefit from having an agent available - resolving cell values from a prompt, writing functions, sheet insights, alerting, debugging.