More modern choices are JADX (https://github.com/skylot/jadx) or Vineflower (https://github.com/Vineflower/vineflower). If you want a paid, higher-quality option, try JEB (https://www.pnfsoftware.com/).
I wanted to suggest Fernflower. I have a lot of experience with it, because it's what JetBrains uses in IntelliJ. I have only seen it generate sensible code.
I took a quick peek at Vineflower first, and it's a fork of Fernflower. So I would recommend Vineflower to anyone who stumbles on this in the future while looking for a decompiler.
Any of these modern choices include features using LLMs to further decompile the decompiled code? Seems like an obvious direction, even just to infer variable names.
>Seems like an obvious direction, even just to infer variable names.
When debugging symbols are included (sort of the default), the local variable names are already present; an LLM would be the last thing I'd consider.
i have no idea why nobody is doing it - it is such an obvious use case of LLMs. i guess the reveng market is much smaller than most people realized?
then again, who needs reveng when you can use said LLMs to write new software "just in time" with the same API.
reveng also was one of those industries that always had a very suspicious crowd of people - i don't mean malicious, i mean... a lot of them drew a disturbing amount of pleasure from doing incredibly laborious work, sort of like someone who enjoys putting together an airfix model over many months with a microscopic brush and tweezers.
so i wonder if a lot of them perversely enjoy staring at reams of bytes and putting together this 10,000 piece puzzle, and having an llm solve it for them is a deep affront to their tastes.
If you want an online Java decompiler for a quick analysis, I recommend https://slicer.run/. It has a sleek UI and supports a variety of decompilers (including the likes of Vineflower, CFR, JASM, Procyon). For more in-depth analysis, https://github.com/Col-E/Recaf is probably my first choice.
How do you rate Procyon vs these?
A great tool for digging into obscure jar and class files. I used it many times to track down very obscure bugs in Java based products. Often you will have a vendor saying that your issue is not real or not reproducible on their end. But with this kind of tool you can peek behind the curtains and figure out how to trigger some condition 100% of the time.
It had better be really old Java code. This decompiler supports only through Java 8. We're on Java 24 now.
Java 8 is your everyday corporate code ...
Sadly it's not maintained anymore, and even the IntelliJ IDEA-derived decompilers are better nowadays (they used to be horrible until a few years ago).
In addition to the limitation to class files built for Java 8, it sadly has a hard time decompiling new language features even if compiled for a Java 8 target. And then there is the well-known bug that decompiling full jars in bulk does not get you the same output you see in the UI, but something orders of magnitude worse... JD was great while it lasted, and it helped me solve a lot of issues with vendors over the years.
The most annoying thing in IntelliJ (which uses Fernflower) is that the decompiled output does not keep the original line numbers, so when debugging there is a divergence between what runs and what you see. You still need to download the sources, but they are not always available.
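For anyone wondering where that divergence comes from: the JVM and the debugger take line numbers from the line number table stored in the class file, which refers to the original source, not to whatever text a decompiler prints. A minimal sketch, assuming the class is compiled with javac's default debug settings (`LineNumberDemo` and `currentLine` are hypothetical names made up for this example):

```java
// Hypothetical demo class; the names are illustrative only.
class LineNumberDemo {
    // Returns the source line the JVM reports for this frame.
    // The value is read from debug info in the class file and is
    // negative when that info was stripped (e.g. compiled with -g:none).
    static int currentLine() {
        StackTraceElement frame = new Throwable().getStackTrace()[0];
        return frame.getLineNumber();
    }

    public static void main(String[] args) {
        System.out.println("line recorded in the class file: " + currentLine());
    }
}
```

If the decompiled text puts this method on a different line than the number reported here, a debugger stepping through the decompiled view will highlight the wrong statement.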
I've only seen that with transitive dependencies that are instantiated via Reflections.
Vineflower is probably what you want nowadays
I think this is popping up on Hacker News because the concept of decompilers has become a bit more acceptable recently. (strokes beard) Time was, decompilation was said to be Impossible (as my wise friend syke said: most things people say are impossible are just tedious). Then, it just became "something you could only do in a targeted, single-application fashion."
Somewhere in there, Alan Kay laughed and handed everyone dynamic code.
These days, with AI in tow, decompilation is becoming the sort of thing that could be in the toolchain, replacing IDA and such. Why debug and examine when you can literally decompile?!
So, maybe, that idea being considered to be newly on the table, someone felt the need to post a counter-point, proving once again that everything old is new again.
Hats off for decompiling Java apps that mostly predate generics and annotations... both of which were added in Java 5.
I'm not sure you lived the same history I did. Decompiling for intermediate languages has always been a thing. Hell, back in college as an intern I was looking at the assembly of a decompiled C# binary, and back in high school I was using IntelliJ's Java decompiler to poke at some game applets to see if there were hacking opportunities. This was back when RuneScape didn't have a paid version.
Is there anything especially hard about decompiling (to) Java?
.NET/C# decompilers are widespread and generally work well (there is one built into Visual Studio nowadays, JetBrains have their own, and there were a bunch of stand-alone tools too back in the day).
< disclaimer - I wrote CFR, which is one of the original set of 'modern' java decompilers >
Generic erasure is a giant pain in the rear. C# doesn't do this. You don't actually keep any information about generics in the bytecode itself; however, some of it is present as metadata. BUT IT COULD BE FULL OF LIES.
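To make the erasure point concrete, here is a small sketch (not taken from CFR; `ErasureDemo` and its members are hypothetical names). It shows that two differently parameterized lists share one runtime class, while the declared generic type of a field can still be read back from class-file metadata through reflection. As far as I know, the JVM does not check that metadata against the actual bytecode, which is why a decompiler cannot fully trust it.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ErasureDemo {
    // The generic declaration of this field survives only as metadata.
    static List<String> names = new ArrayList<>();

    // Both lists have the same runtime class: the type arguments are erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // Reads the declared generic type of the field back via reflection.
    static String declaredFieldType() {
        try {
            Field field = ErasureDemo.class.getDeclaredField("names");
            Type type = field.getGenericType();
            return type.getTypeName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("same runtime class: " + sameRuntimeClass());
        System.out.println("declared field type: " + declaredFieldType());
    }
}
```

Local variables inside method bodies get no such help unless debug info is present, so a decompiler has to infer their type arguments from how the values are used.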
There's also a huge amount of syntactic sugar in later java versions - take for example switch expressions.
https://www.benf.org/other/cfr/switch_expressions.html
and OH MY GOD FINALLY
https://www.benf.org/other/cfr/finally.html
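As a rough illustration of the sugar problem with switch expressions, the two methods below behave the same, and a decompiler has to decide which source form to reconstruct from the compiled code. This is only a sketch, assuming Java 14 or later; `SwitchSugarDemo` and both method names are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class SwitchSugarDemo {
    // Modern form: a switch expression with arrow labels (Java 14+).
    static int daysModern(String month) {
        return switch (month) {
            case "feb" -> 28;
            case "apr", "jun", "sep", "nov" -> 30;
            default -> 31;
        };
    }

    // Classic form: a switch statement with fall-through labels and breaks.
    static int daysClassic(String month) {
        int days;
        switch (month) {
            case "feb":
                days = 28;
                break;
            case "apr":
            case "jun":
            case "sep":
            case "nov":
                days = 30;
                break;
            default:
                days = 31;
        }
        return days;
    }

    public static void main(String[] args) {
        for (String month : new String[] {"jan", "feb", "apr"}) {
            System.out.println(month + ": " + daysModern(month) + " / " + daysClassic(month));
        }
    }
}
```

Switching on a String is itself sugar: javac lowers it to a switch over hash codes followed by `equals` checks, so the decompiler has two layers to undo here.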
My personal experience with both is that decompilers work great for easy code. I still have both Java and C# projects that I wish I could decompile into even the worst possible, but almost compilable, code. Instead I just get decompiler errors, or code where all variables got the same letter/name and, of course, different types...
I think I've tried all the available free tools, and some paid ones in the Java case. In the end I just deduced the logic and reverse engineered the most important path.
One of the use cases of decompilation is bug hunting / vulnerability research. And that’s still one of the use cases where AI isn’t that good, because you must be precise.
I’m not saying that won’t change but I still see a bright future for reversing tools, with or without AI sidekicks (like the BN plugin)
Yesterday I pointed Codex 5.1 at a firmware blob and let it extract and explore it, targeting a specific undisclosed vulnerability. After floundering for a bit, it managed to read the Lua bytecode, identify the vuln, and exploit it on a device running the firmware.
If anything, vulnerability research should be a good target for AI, because failure to find an exploit isn't costly (and easily verified) but 1 in N success is very useful.
>Hats off for decompiling Java apps that mostly predate generics and annotations... both of which were added in Java 5.
The first very famous and good decompiler was written in C. Other than that, generics and annotations did not make the work any easier, decompilation-wise.
Is AI really useful in decompiling or does it just create similar code that does the same as the original?
Or you can use https://jar.tools/ - online java decompiler I built some time ago. Runs in your browser
[flagged]
yes, because you don't need to install anything on your machine
This one hasn't been updated for 5 years and does not support any newer Java features.
Which new features are not supported?
There is a maintained fork of this called jd-gui-duo which includes more features and more decompilers (JADX, Vineflower, Fernflower, CFR, Procyon, ...)
https://github.com/nbauma109/jd-gui-duo
next, add a feature that does a pass with an llm that makes local variable names more realistic and adds comments.
What is the use of decompiling? Is there any real-world use case?
Or when you're too lazy to hunt down the sources, both for internal and external dependencies. Just Ctrl+click the method and have a quick look at the decompiled implementation, usually good enough.
Anytime you have to debug closed source libraries. Or even make your own implementation.
I wish I could use it to recompile itself