
Comment by IgorPartola

15 hours ago

There is an LLM API. You send it a system prompt and the conversation history. If the last message is a user message, the model sends back a response. It can also send back a “thinking” message before the response, and it can send back a structured message with one or more function calls for functions you defined in your API request (things like “ls(): list files”).
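A rough sketch of what that request and response can look like. The field names here are illustrative, not any specific vendor's API:

```python
# Hypothetical request shape: system prompt, history, and tool definitions.
request = {
    "system": "You are a coding agent.",
    "messages": [{"role": "user", "content": "What files are here?"}],
    "tools": [{"name": "ls", "description": "list files", "parameters": {}}],
}

# A model reply may contain plain text, a "thinking" block, or a
# structured function call like this one:
response = {
    "role": "assistant",
    "content": [
        {"type": "thinking", "text": "The user wants a directory listing."},
        {"type": "tool_call", "name": "ls", "arguments": {}},
    ],
}

tool_calls = [b for b in response["content"] if b["type"] == "tool_call"]
print(tool_calls[0]["name"])  # -> ls
```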

The harness is the part that makes the API calls, interacts with the user, makes the function calls, and keeps track of the conversation memory.
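That loop can be sketched in a few lines. `fake_llm` is a stub standing in for the real API call, and the message shapes are assumptions, but the control flow is the point: call the model, execute any tool it asks for, feed the result back, repeat until you get a text answer:

```python
import subprocess

def fake_llm(messages):
    """Stub for the LLM API call. A real harness would POST `messages`
    (plus the system prompt and tool definitions) to the API."""
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_call", "name": "ls", "arguments": {"path": "."}}
    return {"type": "text", "text": "Here are your files."}

def run_tool(name, arguments):
    # The harness, not the model, actually executes the function.
    if name == "ls":
        return subprocess.run(["ls", arguments.get("path", ".")],
                              capture_output=True, text=True).stdout
    raise ValueError(f"unknown tool: {name}")

def harness(user_message):
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = fake_llm(messages)
        if reply["type"] == "text":  # final answer: show it to the user
            return reply["text"]
        result = run_tool(reply["name"], reply["arguments"])
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "tool", "content": result})  # feed result back

print(harness("What files are here?"))  # -> Here are your files.
```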

You can also use the LLM to summarize the conversation into a single, shorter message, which gives you compaction. And instead of statically defining which functions are available to the LLM, you can run an MCP server, which lets the LLM auto-discover which functions it can call and what they do.
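Compaction is just a transform on the message list. In this sketch the summarizer is a plain function standing in for the LLM call, and the message shape is an assumption:

```python
def compact(messages, summarize, keep_last=2):
    """Replace older history with a single summary message, keeping
    the most recent messages verbatim. In a real harness, `summarize`
    would itself be an LLM call ("summarize this conversation")."""
    old, recent = messages[:-keep_last], messages[-keep_last:]
    if not old:
        return messages
    summary = summarize(old)
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + recent

# Stub summarizer standing in for the LLM:
history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
compacted = compact(history, lambda msgs: f"{len(msgs)} earlier messages omitted")
print(len(compacted))  # -> 3 (one summary message + the last two verbatim)
```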

That’s the whole magic of something like Claude Code. The rest is details.