
Comment by uneven9434

3 days ago

More modern choices are JADX (https://github.com/skylot/jadx) or Vineflower (https://github.com/Vineflower/vineflower). If you want a paid, higher-quality option, try JEB (https://www.pnfsoftware.com/).

I wanted to suggest Fernflower. I have a lot of experience with it, because it's what JetBrains uses in IntelliJ. I have only ever seen it generate sensible code.

I took a quick peek at Vineflower first, and it's a fork of Fernflower. So I would recommend that to anyone who stumbles on this in the future looking for a decompiler.

Do any of these modern choices include features that use LLMs to further refine the decompiled code? Seems like an obvious direction, even just to infer variable names.

  • >Seems like an obvious direction, even just to infer variable names.

When debug symbols are included (which is more or less the default), the local variable names are already present; an LLM would be the last thing I'd consider.

    • Yeah, I mean duh, of course? Why infer when you have the proper names? I don't understand what you're trying to point out here...

I have no idea why nobody is doing it - it is such an obvious use case for LLMs. I guess the reveng market is much smaller than most people realized?

Then again, who needs reveng when you can use said LLMs to write new software "just in time" against the same API.

Reveng also was one of those industries that always had a very peculiar crowd of people - I don't mean malicious, I mean... a lot of them drew a disturbing amount of pleasure from doing incredibly laborious work, sort of like someone who enjoys putting together an Airfix model over many months with a microscopic brush and tweezers.

So I wonder if a lot of them perversely enjoy staring at reams of bytes and putting together this 10,000-piece puzzle, and having an LLM solve it for them is a deep affront to their tastes.

Is it really an obvious use case for LLMs? Traditional bytecode-to-source decompilers are faster, use less memory, and are deterministic. Using an LLM to decompile code makes as much sense as using an LLM to compile code.

That said, there are probably ways an LLM could improve a decompiler without affecting its correctness - like deriving class and variable names from context when symbols are missing or obfuscated.

... until you realize that the LLM-generated code doesn't even compile, or that you need a PhD to write all the prompts needed to get a prototype instead of the real thing.

      3 replies →
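The correctness-preserving split suggested above (model proposes names, tool applies them deterministically) can be sketched in a few lines. This is an illustrative sketch, not any real decompiler's feature: the name mapping here is a hard-coded stand-in for whatever an LLM might suggest, and the example "decompiled" snippet is invented.

```python
import re

# Stand-in for LLM-suggested names for obfuscated identifiers.
# In a real tool this mapping would come from a model; applying
# it stays deterministic regardless of how it was produced.
SUGGESTED_NAMES = {
    "a": "width",
    "b": "height",
    "c": "area",
}

def rename_identifiers(source: str, mapping: dict) -> str:
    """Apply a rename map using whole-word matches, so substrings
    inside longer identifiers are left untouched."""
    # Collect identifiers already present, to avoid collisions.
    taken = set(re.findall(r"\b[A-Za-z_]\w*\b", source))
    out = source
    for old, new in mapping.items():
        if new in taken:
            # Skip a rename that would shadow an existing name.
            continue
        out = re.sub(rf"\b{re.escape(old)}\b", new, out)
        taken.add(new)
    return out

decompiled = "int a = 3;\nint b = 4;\nint c = a * b;"
print(rename_identifiers(decompiled, SUGGESTED_NAMES))
```

The point of the design is that a bad suggestion can only produce an odd name, never incorrect code: the rewrite itself is a mechanical, collision-checked substitution.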