Comment by oofbey

9 hours ago

What’s the state of the art of reverse engineering source code from binaries in the age of agentic coding? Seems like something agents should be pretty good at, but haven’t read anything about it.

8 comments

oofbey

roughly 9 hours ago

I think there’s a good possibility that the technology that is LLMs could be usefully trained to decode binaries as a sort of squint-and-you-can-see-it translation problem, but I can’t imagine, eg, pre-trained GPT being particularly good at it.

JasonADrury 9 hours ago

I've been working on this, the results are pretty great when using the fancier models. I have successfully had gpt5.2 complete fairly complex matching decompilation projects, but also projects with more flexible requirements.

TZubiri 9 hours ago

Nothing yet, agents analyze code which is textual.

The way they analyze binaries now is by using textual interfaces of command tools, and the tools used are mostly the ones supported by Foundation Models at training time, mostly you can't teach it new tools at inference, they must be supported at training. So most providers are focused on the same tools and benchmarking against them, and binary analysis is not in the zeitgeist right now, it's about production more than understanding.

oofbey 9 hours ago
The entire MCP ecosystem disagrees with your assertion that “you can’t teach it new tools at inference.” Sorry you’re just wrong.
- TZubiri 4 hours ago
  
  Nono, you of course CAN teach tool use at inference, but it's different than doing so at training time, and the models are trained to call specific tools right now.
  Also MCP is not an Agent protocol, it's used in a different category. MCP is used when the user has a chatbot, sends a message, gets a response. Here we are talking about the category of products we would describe as Code Agents, including Claude Code, ChatGPT Codex, and the specific models that are trained for use in such contexts.
  The idea is that of course you can tell it about certain tools in inference, but in code production tasks the LLM is trained to use string based tools such as grep, and not language specific tools like Go To Definition.
  My source on this is Dax who is developing an Open Source clone of Claude Code called OpenCode
  
  1 reply →

refulgentis 9 hours ago

Agents are sort of irrelevant to this discussion, no?

Like, it's assuredly harder for an agent than having access to the code, if only because there's a theoratical opportunity to misunderstand the decompile.

Alternatively, it's assuredly easier for an agent because given execution time approaches infinity, they can try all possible interpretations.

oofbey 8 hours ago

Agents meaning an AI iteratively trying different things to try to decompile the code. Presumably in some kind of guess and check loop. I don’t expect a typical LLM to be good at this on its first attempt. But I bet Cursor could make a good stab at it with the right prompt.