
Comment by OsrsNeedsf2P

3 days ago

Dumb question about reverse engineering binaries: is there a way to do it piecemeal? I'm waiting for LLMs and harnesses to eventually get good enough to reverse engineer BFME (an old Lord of the Rings game that still has an active modding community), but it's a multi-GB game, so it would have to be done in bite-sized pieces.

Basically: can you reverse engineer a binary in bite-sized pieces, and recompile or customize the behavior of each piece, without needing to do it all at once?

Most decomp projects (that I know of) are Ship of Theseus style projects where the minimum unit is a function, give or take alignment requirements and quirks of the compiler. On the MIPS side, tools like splat and spimdisasm can help identify function and even source file boundaries, generate inline ASM C files[0], and write linker scripts to build a matching binary. You can then go through and replace the ASM functions one at a time until you just have C left.

0 - for example: https://github.com/Xeeynamo/sotn-decomp/blob/master/src/boss...

I've just recently finished replaying BFME1's campaigns. The installation process was "fun" in its own way. There were some quirks in game that could definitely be fixed. I haven't used any unofficial patches though. But I would like to see and maybe even help with reverse engineering both of them. The second game gave me even more trouble yesterday. Would be nice to have community versions of the exes.

Yes, quite easily. It requires some setup, but the basic idea is that you create a DLL and a simple loader program which injects it into your target process. You can then use a hooking library like MinHook to replace individual functions with your own implementations. If the target application is written in C++, you can additionally do vtable hooking, which makes replacing virtual functions even easier (though in practice you will always end up combining the two techniques).
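A rough sketch of the vtable half of that idea, as a self-contained toy rather than anything taken from BFME: `Unit`, `hook_damage`, `query_damage` and the slot numbers are all made up for illustration. It assumes a 64-bit GCC, Clang or MSVC build where the vtable pointer is the first word of the object and a member function can be called through a plain function pointer taking `this` as its first argument. None of that is guaranteed by the C++ standard, and in a real game you would have to recover the slot index from the disassembly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a class that lives inside the target process.
struct Unit {
    virtual int damage() const { return 10; }  // vtable slot 0 (assumed)
    virtual int armor() const { return 3; }    // vtable slot 1 (assumed)
};

constexpr std::size_t kSlots = 2;       // slots we copy; known only because Unit is ours
constexpr std::size_t kDamageSlot = 0;  // slot we replace

static void* g_patched_table[kSlots];      // our private copy of the vtable
static void* g_original_damage = nullptr;  // saved so the hook can call through

// Replacement for Unit::damage, with the implicit `this` made explicit.
static int hooked_damage(const Unit* self) {
    using Fn = int (*)(const Unit*);
    const int original = reinterpret_cast<Fn>(g_original_damage)(self);
    return original + 5;  // customize behavior on top of the original
}

// Point one object at a patched copy of its vtable.
// Only the function slots are copied, not the RTTI data stored in front of
// them, so typeid/dynamic_cast on a hooked object is not supported here.
void hook_damage(Unit* u) {
    assert(u != nullptr);
    void** vtable = nullptr;
    std::memcpy(&vtable, static_cast<const void*>(u), sizeof vtable);  // read vptr
    std::memcpy(g_patched_table, vtable, sizeof g_patched_table);
    g_original_damage = g_patched_table[kDamageSlot];
    g_patched_table[kDamageSlot] = reinterpret_cast<void*>(&hooked_damage);
    void** patched = g_patched_table;
    std::memcpy(static_cast<void*>(u), &patched, sizeof patched);      // write vptr
}

// Stands in for game code that calls the virtual function. The volatile
// copy keeps the optimizer from devirtualizing the call in this toy;
// already-compiled game code goes through the vtable anyway.
int query_damage(const Unit* u) {
    const Unit* volatile opaque = u;
    return opaque->damage();
}
```

With this in place, calling `hook_damage(&unit)` makes `query_damage(&unit)` go through `hooked_damage`, while other `Unit` objects keep the original behavior. Swapping the pointer per object avoids having to make the real vtable writable; patching the shared table in place would affect every instance but needs a page-protection change first.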

Have you tried? I haven't tried anything huge, but I've had LLMs decompile SNES ROMs for me.

Most of those GB are probably data rather than executable code, so it might not be quite as bad as you're imagining.
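If you want to check that rather than guess, one option is to add up the sections marked executable in the exe's PE header. A minimal sketch, assuming a normal Windows PE file already loaded into memory; `section_totals` and `SectionTotals` are names made up here, and it only looks at the section table, so it says nothing about packed or encrypted code.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct SectionTotals {
    std::uint64_t executable = 0;  // raw bytes in sections flagged executable
    std::uint64_t other = 0;       // raw bytes in all remaining sections
};

// Walks the PE section table and sums SizeOfRawData per category.
// Returns std::nullopt if the buffer does not look like a PE image.
std::optional<SectionTotals> section_totals(const std::vector<std::uint8_t>& img) {
    auto fits = [&](std::size_t off, std::size_t n) {
        return off <= img.size() && n <= img.size() - off;
    };
    // Little-endian readers; callers check bounds with fits() first.
    auto u16 = [&](std::size_t off) {
        return std::uint32_t(img[off]) | std::uint32_t(img[off + 1]) << 8;
    };
    auto u32 = [&](std::size_t off) { return u16(off) | u16(off + 2) << 16; };

    // DOS header: "MZ" magic, offset of the PE header stored at 0x3C.
    if (!fits(0, 0x40) || img[0] != 'M' || img[1] != 'Z') return std::nullopt;
    const std::size_t pe = u32(0x3C);

    // "PE\0\0" signature followed by the 20-byte COFF file header.
    if (!fits(pe, 24) || img[pe] != 'P' || img[pe + 1] != 'E' ||
        img[pe + 2] != 0 || img[pe + 3] != 0) {
        return std::nullopt;
    }
    const std::size_t count = u16(pe + 6);              // NumberOfSections
    const std::size_t table = pe + 24 + u16(pe + 20);   // skip the optional header

    constexpr std::uint32_t kMemExecute = 0x20000000u;  // IMAGE_SCN_MEM_EXECUTE
    SectionTotals totals;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t s = table + i * 40;           // 40-byte section headers
        if (!fits(s, 40)) return std::nullopt;
        const std::uint32_t raw_size = u32(s + 16);     // SizeOfRawData
        const std::uint32_t flags = u32(s + 36);        // Characteristics
        if (flags & kMemExecute) {
            totals.executable += raw_size;
        } else {
            totals.other += raw_size;
        }
    }
    return totals;
}
```

Read the exe into a `std::vector<std::uint8_t>` (for example with `std::ifstream` in binary mode), pass it in, and compare `executable` against the size of the whole install to see how much of it is actually code.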