Comment by TZubiri

12 hours ago

Nothing yet; agents analyze code, which is textual.

The way they analyze binaries now is through the textual interfaces of command-line tools, and the tools used are mostly the ones the foundation models supported at training time. For the most part you can't teach a model new tools at inference; they have to be supported at training. So most providers focus on the same tools and benchmark against them, and binary analysis isn't in the zeitgeist right now; the focus is on producing code rather than understanding it.
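To illustrate, here is roughly what that textual interface looks like: the agent shells out to a standard command-line tool and reads plain text back. A minimal sketch in Python; the choice of objdump and the file name are just examples, not any particular agent's harness:

  import subprocess

  # The agent never parses the binary itself: it runs a textual tool
  # and feeds the text output back into the model's context window.
  def disassemble(path: str) -> str:
      result = subprocess.run(
          ["objdump", "-d", path],
          capture_output=True, text=True, check=True,
      )
      return result.stdout

  # Truncate so the listing fits in the context window.
  print(disassemble("./a.out")[:2000])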

The entire MCP ecosystem disagrees with your assertion that “you can’t teach it new tools at inference.” Sorry, you’re just wrong.

  • Nono, of course you CAN teach tool use at inference, but it's different from doing so at training time, and right now the models are trained to call specific tools.
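    To be concrete about what inference-time tool teaching looks like: you pass a name, a description, and a JSON schema along with the request. A minimal sketch with the Anthropic Python SDK; the list_symbols tool is my own illustration, not something the model was trained on:

      import anthropic

      client = anthropic.Anthropic()

      # The tool exists only in this request: a name, a description,
      # and a JSON schema. The model never saw it during training.
      response = client.messages.create(
          model="claude-sonnet-4-20250514",  # any recent tool-use model
          max_tokens=1024,
          tools=[{
              "name": "list_symbols",
              "description": "List the symbol table of a binary using nm.",
              "input_schema": {
                  "type": "object",
                  "properties": {"filename": {"type": "string"}},
                  "required": ["filename"],
              },
          }],
          messages=[{"role": "user",
                     "content": "What symbols does ./a.out export?"}],
      )
      print(response.content)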

    Also, MCP is not an agent protocol; it's used in a different category: the user has a chatbot, sends a message, and gets a response. Here we're talking about the category of products we'd describe as code agents, including Claude Code and ChatGPT Codex, and the specific models trained for use in such contexts.

    The idea is that you can of course tell it about certain tools at inference, but in code-production tasks the LLM is trained to use string-based tools such as grep, not language-specific tools like Go To Definition.
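    To make the contrast concrete, here's a sketch (not any agent's actual harness) of the two shapes of tool: the string-based call the model is tuned to make, versus the structured, language-aware request it generally isn't:

      import json
      import subprocess

      # 1. String-based, grep-style: text in, text out. This is the
      # shape of tool coding agents are trained to reach for.
      hits = subprocess.run(
          ["grep", "-rn", "parseConfig", "src/"],
          capture_output=True, text=True,
      ).stdout

      # 2. Language-aware Go To Definition via LSP: a structured
      # JSON-RPC request keyed on file positions, not on strings.
      lsp_request = json.dumps({
          "jsonrpc": "2.0",
          "id": 1,
          "method": "textDocument/definition",
          "params": {
              "textDocument": {"uri": "file:///repo/src/config.go"},
              "position": {"line": 41, "character": 12},
          },
      })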

    My source on this is Dax, who is developing an open-source clone of Claude Code called OpenCode.

    • Claude Code and Cursor's agent and all the coding agents can and do run MCP just fine. MCP is effectively just a prompt that says “if you want to convert a binary to hex, call the ‘hexdump’ tool, passing in the filename,” plus a promise to treat specially formatted responses differently. Any modern LLM that can reason and solve math problems will understand and use the tools you give it. Heck, I’ve even seen LLMs that were never trained to reason make tool calls.
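      And the hexdump example above is about all it takes to write. A minimal sketch of such a server using the official MCP Python SDK's FastMCP helper (the server and tool names just follow this thread's running example):

        import subprocess
        from mcp.server.fastmcp import FastMCP

        mcp = FastMCP("binary-tools")

        # The docstring and type hints become the "prompt" the model
        # sees at inference time; none of this was baked in at training.
        @mcp.tool()
        def hexdump(filename: str) -> str:
            """Convert a binary file to a hex dump listing."""
            return subprocess.run(
                ["hexdump", "-C", filename],
                capture_output=True, text=True, check=True,
            ).stdout

        if __name__ == "__main__":
            mcp.run()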

      You say they’re better with the tools they’re trained on. Maybe? But if so, not by much. And maybe not at all, because custom tools are passed as part of the prompt, and prompts go a long way toward overriding training.

      LLMs reason in text. (Except for the ones that reason in latent space.) But they can work with data in any file format as long as they’re given tools to do so.