Comment by jdefr89
1 month ago
Vulnerability Researcher/Reverse Eng here... Aspects about it generating an API for read/write primitives are simply it regurgitating tons of APIs that exist already. Its still cool, but its not like it invented the primitives or any novel technique. Also, this toy JS is similar to binaries you'd find in a CTF. Of course it will be able to solve majority of those. I am curious though.. Latest OpenAI models don't seem to want to generate any real exploit code. Is there a prompt jail break or something being used here?
I had similar questions when reading the original article. I’m also interested in how the agent is constructed. From my experience, it can be very difficult to implement exploits without access to debugging tools, so I’m curious whether pwndbg or similar tools are included in the agent’s toolset and, if so, how they are integrated. Existing open-source GDB MCPs don’t work very well unless further optimized, at least the last time I checked.