Comment by _pdp_

5 days ago

Our agentic builder has a single tool.

It is called graphql.

The agent writes a query and executes it. If the agent does not know how to do a particular type of query, it can use GraphQL introspection. The agent receives only the minimal amount of data specified by the GraphQL query, saving valuable tokens.
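A minimal sketch of what such a single-tool setup could look like (the tool schema and query shapes here are illustrative assumptions, not our exact implementation):

```python
import json

# Hypothetical single-tool definition; names and parameter shapes are
# illustrative, not the actual production tool.
GRAPHQL_TOOL = {
    "name": "graphql",
    "description": "Execute a GraphQL query or mutation. Use introspection "
                   "(__schema, __type) to discover unknown types.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "A GraphQL document"},
            "variables": {"type": "object"},
        },
        "required": ["query"],
    },
}

# A minimal introspection query the agent can fall back to when it does
# not know how to express a particular request.
INTROSPECT_TYPE = """
query ($name: String!) {
  __type(name: $name) {
    name
    fields { name type { name kind ofType { name kind } } }
  }
}
"""

def build_tool_call(query, variables=None):
    """Package the agent's query as the single tool's payload."""
    return json.dumps({"query": query, "variables": variables or {}})
```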

It works better!

Not only do we not need to load 50+ tools (our entire SDK), but it also solves the N+1 problem you hit with traditional REST APIs. You also don't need to fall back to writing code, especially for queries and mutations. And if you do need to, the SDK is always available and follows the GraphQL typed schema - which helps agents write better code!
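To make the N+1 point concrete, here's a toy comparison (the post/author shape and call counts are hypothetical, just to illustrate the round-trip difference):

```python
# Illustration of the N+1 problem: fetching 10 posts and each post's
# author. Endpoint structure is hypothetical.
posts = [{"id": i, "author_id": i % 3} for i in range(10)]

# REST style: one call for the list, then one call per post for its author.
rest_calls = 1 + len(posts)

# GraphQL style: one query that resolves posts and authors server-side,
# returning only the requested fields.
GRAPHQL_QUERY = """
{
  posts {
    id
    author { name }
  }
}
"""
graphql_calls = 1
```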

While I was never a big fan of GraphQL before, considering the state of MCP, I strongly believe it is one of the best technologies for AI agents.

I wrote more about this here if you are interested: https://chatbotkit.com/reflections/why-graphql-beats-mcp-for...

Whoa there, you don't need to be so sadistic to your team. It's not GraphQL that's important, but having a document describing how your API works, including types.

I expect you could achieve the same with a comprehensive OpenAPI specification. If you want something a bit stricter I guess SOAP would work too, LLMs love XML after all.

  • We have well described OpenAPI and GraphQL specifications already. :)

    Being AI-first means we are naturally aligned with that kind of structured documentation. It helps both humans and robots.

One of my agents is kinda like this too. The only operation is SPARQL query, and the only accessible state is the graph database.

Since most of the ontologies I'm using are public, I just have to name-drop them in the prompt; no schemas and little structure introspection needed. At worst, it can walk and dump triples to figure out the structure; it's all RDF triples and URIs.

One nice property: using structured outputs, you can constrain outputs of certain queries to only generate valid RDF to avoid syntax errors. Probably can do similar stuff with GraphQL.
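As a rough sketch of that validation idea (a deliberately coarse pattern, not a full N-Triples grammar; a real constrained-decoding setup would hand an equivalent grammar to the model's structured-output machinery):

```python
import re

# Coarse pattern for simple <subject> <predicate> <object> . triples,
# where the object is a URI or a quoted literal. An assumed simplification
# of the real N-Triples grammar, for illustration only.
TRIPLE = re.compile(r'^<[^ >]+> <[^ >]+> (?:<[^ >]+>|"[^"]*") \.$')

def valid_triples(lines):
    """Keep only lines matching the coarse triple pattern."""
    return [ln for ln in lines if TRIPLE.match(ln)]
```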

Isn't the challenge that introspecting GraphQL will lead to either (a) a very long set of definitions consuming many tokens, or (b) many calls to drill into the introspection?

  • In my experience, this was the limitation we ran into with this approach. If you have a large API this will blow up your context.

    I have had the best luck with hand-crafted tools that pre-digest your API so you don't have to waste tokens or deal with context rot bugs.

  • Well, either that, or stuff tool-usage examples into the prompt for every single request. If you have only 2-3 tools, GraphQL is certainly not necessary - but it won't blow up the context either. If you have 50+ tools, I honestly don't see any other way, unless you create your own tool-discovery solution - which is what GraphQL does really well, with the caveat that whatever custom scheme you come up with is certainly not natural to these LLMs.

    Keep in mind that all LLMs are trained on many GraphQL examples, because the technology has existed since 2015. Anything custom might just work, but it is certainly not part of the model's training set unless you fine-tune.

    So yes, if I need to decide on formats I will go for GraphQL, SQL and Markdown.
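One way to keep introspection from blowing up the context, as discussed in the thread above, is incremental discovery: list type names first, then drill into one type at a time instead of dumping the full schema. A sketch of that assumed approach (not any particular library's API):

```python
# Cheap first pass: type names only, no fields.
LIST_TYPES = "{ __schema { types { name } } }"

def drill_query(type_name):
    # One small query per type keeps each tool result short.
    return '{ __type(name: "%s") { name fields { name } } }' % type_name

def discovery_plan(wanted_types):
    """Return the sequence of queries the agent would issue."""
    return [LIST_TYPES] + [drill_query(t) for t in wanted_types]
```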

This is actually a really good use of GraphQL!

IMO the biggest pain points of GraphQL are authorization/rate limiting, caching, and mutations... but for selective context loading, none of those actually matter. Pretty cool!

1000%

2 years ago I gave a talk on Vector DB's and LLM use.

https://www.youtube.com/watch?v=U_g06VqdKUc

TL;DR: it shows how you could teach an LLM your GraphQL query language to let it selectively load context into what were, at the time, very small context windows.

After that, the MCP specification came out - which, from my vantage point, is a poor, half-implemented version of what GraphQL already is.

Your use case is NOT everyone's use case (working in depth across one codebase or API, versus sampling dozens of abilities across the web or with other systems). That's the thing.

How is that going to work with my use case: do a web search, do a local API call, do a GraphQL search, do an integration with Slack, send a message, etc.?

  • Does it matter? If it's well defined, each of those would be a node in the graph - or can you elaborate? Dozens doesn't seem like that many for a graph where a higher-level node would be Slack, and the agent only loads further if it needs anything Slack-related. Or maybe I'm not understanding.

I do think that using GraphQL will solve a lot of problems for people, but it's super surprising how many people absolutely hate it.

  • GraphQL is just a typed schema (good) with a server capable of serving any subset of the entire schema at a time (pain in the ass).

    • It doesn’t actually require that second part. Every time I’ve used it in a production system, we had an approved list of query shapes that were accepted. If the client wanted to use a new kind of query, it was performance tested and sometimes needed to be optimized before approval for use.

      If you open it up for any possible query, then give that to uncontrolled clients, it’s a recipe for disaster.

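The "approved list of query shapes" idea above is essentially persisted queries, a common GraphQL hardening pattern. A minimal sketch (hypothetical helper names, not tied to any commenter's actual system):

```python
import hashlib

# Registry of approved query shapes, keyed by hash. Clients send only the
# hash; the server refuses anything not on the list.
APPROVED = {}

def approve(query):
    """Performance-test offline, then register the query shape."""
    h = hashlib.sha256(query.encode()).hexdigest()
    APPROVED[h] = query
    return h

def execute(query_hash):
    if query_hash not in APPROVED:
        raise PermissionError("query shape not approved")
    # The approved document would be handed to the GraphQL engine here.
    return APPROVED[query_hash]
```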

> It works better!

> I strongly believe it is one of the best technologies for AI agents

Do you have any quantitative evidence to support this?

Sincere question. I feel it would add some much needed credibility in a space where many folks are abusing the hype wave and low key shilling their products with vibes instead of rigor.

  • I have thought about this for all of thirty seconds, but it wouldn't shock me if this was the case. The intuition here is about types, and the ability to introspect them. Agents really love automated guardrails. It makes sense to me that this would work better than RESTish stuff, even with OpenAPI.

    • Better than REST is a low bar, though. Ultimately agents should rarely be calling raw REST and GraphQL APIs, which are meant for programmatic use.

      Agents should be calling one level of abstraction higher.

      E.g. calling a function to “find me relevant events in this city according to this user's preferences” instead of “list all events in this city”.

    • Same in terms of time spent. The hypothesis that GraphQL is superior passes the basic sniff test. Assuming GraphQL does what it says on the tin - which, based on my work with Ent, I understand it does - the claim that it's better for tool and API use by agents follows from common sense.

    • This is a task I think is suited to a small sub-agent. It can take the context beating of querying for relevant tools and return only what is necessary to the main agent thread.

  • I've seen a similar setup with an LLM loop integrated with Clojure. In Clojure, code is data, so the LLM can query, execute, and modify the program directly.

  • If you know GraphQL, you may see it immediately: you ask for a specific nested structure of data, which can span many joins across different related collections. That is not the case with a common REST API or CLI, for example. And introspection is another good reason.

Reading this was such an immediate "aha" for me. Of course we should be using GraphQL for this. Damn. Where was this comment three months ago!