Comment by snake42

5 months ago

You can always lift machine code to assembly. It's a 1 to 1 process.

2 comments

snake42

No, you cannot. While it is 1 to 1, you still need to know where to start: if you start at the wrong place, data will be interpreted as an asm instruction and things will decode legally, but invalidly. It is worse on CISC (like x86), where instructions have different lengths, so you can jump into the middle of a long instruction and decode a shorter instruction. (RISC sometimes starts to get CISC features as well, as more instructions are added.)
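To make the "wrong starting byte" problem concrete, here is a minimal sketch in Python. The `decode` function is a hypothetical toy, not a real disassembler: it assumes 32-bit x86 and only knows three encodings (`90` = nop, `C3` = ret, `B8 imm32` = mov eax, imm32). Everything else is emitted as a raw `db` byte.

```python
import struct

def decode(code: bytes, start: int = 0) -> list[str]:
    """Toy linear-sweep decoder for a tiny subset of 32-bit x86 (illustrative only)."""
    out = []
    i = start
    while i < len(code):
        op = code[i]
        if op == 0x90:                          # 1-byte instruction
            out.append("nop")
            i += 1
        elif op == 0xC3:                        # 1-byte instruction
            out.append("ret")
            i += 1
        elif op == 0xB8 and i + 5 <= len(code): # 5 bytes: opcode + little-endian imm32
            (imm,) = struct.unpack_from("<I", code, i + 1)
            out.append(f"mov eax, {imm:#010x}")
            i += 5
        else:                                   # unknown to this toy decoder
            out.append(f"db {op:#04x}")
            i += 1
    return out

CODE = bytes.fromhex("b8909090c3c3")

print(decode(CODE, 0))  # decoding from the real instruction boundary
print(decode(CODE, 1))  # decoding from one byte too late
```

Starting at offset 0 the six bytes decode as one `mov` followed by one `ret`. Starting at offset 1 the same bytes decode as three `nop`s and two `ret`s. Both listings are legal, and nothing in the bytes themselves says which one was intended.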

If the code was written reasonably, you can usually find enough clues to figure out where to start decoding and thus get reasonable assembly output. Even then you often need to restart the decoding several times, because the decoder can get confused at function boundaries depending on what other data gets embedded and where it is embedded. Be glad self-modifying code was going out of style in the 1980s and is mostly a memory today, as that will kill any disassembly attempt. All the other tricks that Mel used (https://en.wikipedia.org/wiki/The_Story_of_Mel) also make your attempts at lifting machine code to assembly impossible.

Akronymus 5 months ago

It definitely isn't a 1:1 process, as there are multiple ways to encode the same instruction (possibly even with subtle side effects depending on the encoding).

https://youtu.be/eunYrrcxXfw
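A small sketch of the "multiple encodings" point, again in Python. The helper `decode_mov_rr` is hypothetical and assumes 32-bit operand size with no prefixes; it handles only the register-to-register forms of the two x86 MOV opcodes `89` (MOV r/m32, r32) and `8B` (MOV r32, r/m32).

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_mov_rr(code: bytes) -> str:
    """Decode a 2-byte register-to-register MOV (opcode 0x89 or 0x8B) to text."""
    opcode, modrm = code[0], code[1]
    if modrm >> 6 != 0b11:
        raise ValueError("only register-direct ModRM (mod == 11) is handled")
    reg = (modrm >> 3) & 7   # ModRM.reg field
    rm = modrm & 7           # ModRM.rm field
    if opcode == 0x89:       # MOV r/m32, r32: destination is rm, source is reg
        dst, src = rm, reg
    elif opcode == 0x8B:     # MOV r32, r/m32: destination is reg, source is rm
        dst, src = reg, rm
    else:
        raise ValueError("not one of the two MOV opcodes handled here")
    return f"mov {REGS[dst]}, {REGS[src]}"

a = decode_mov_rr(bytes.fromhex("89d8"))
b = decode_mov_rr(bytes.fromhex("8bc3"))
print(a, "|", b)
```

Both byte sequences come out as the same assembly text, so an assembler given that text has to pick one encoding, and the round trip back to the original bytes is not guaranteed.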