Comment by snake42

2 days ago

You can always lift machine code to assembly. It's a 1 to 1 process.

No, you cannot. While it is 1 to 1, you still need to know where to start: if you start at the wrong place, data will be interpreted as asm instructions and things will decode legally, but invalidly. It is worse on CISC (like x86), where instructions have different lengths, so you can jump into the middle of a long instruction and decode a shorter one. (RISC sometimes starts to pick up CISC features as more instructions get added, as well.)
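To make that concrete, here is a rough sketch of the same six x86 bytes decoded from two different starting offsets. The `decode` helper and the `CODE` blob are made up for illustration, and the decoder only knows four real encodings (`B8 imm32`, `31 C0`, `C3`, `90`), so treat it as a toy and not a real disassembler.

```python
import struct

def decode(code, start):
    """Linear decode of a tiny x86 subset, beginning at byte offset `start`.

    Hypothetical helper for illustration: it only recognizes four encodings
    and emits `db` for anything else.
    """
    out = []
    i = start
    while i < len(code):
        op = code[i]
        if op == 0xB8 and i + 5 <= len(code):            # mov eax, imm32 (5 bytes)
            imm = struct.unpack_from("<I", code, i + 1)[0]
            out.append((i, f"mov eax, 0x{imm:08x}"))
            i += 5
        elif op == 0x31 and code[i + 1:i + 2] == b"\xc0":  # xor eax, eax (2 bytes)
            out.append((i, "xor eax, eax"))
            i += 2
        elif op == 0xC3:                                  # ret (1 byte)
            out.append((i, "ret"))
            i += 1
        elif op == 0x90:                                  # nop (1 byte)
            out.append((i, "nop"))
            i += 1
        else:                                             # unknown byte, emit as data
            out.append((i, f"db 0x{op:02x}"))
            i += 1
    return out

CODE = bytes.fromhex("b831c0c390c3")

for start in (0, 1):
    print(f"start at offset {start}:")
    for offset, text in decode(CODE, start):
        print(f"  {offset}: {text}")
```

Starting at offset 0 gives a `mov` followed by a `ret`. Starting one byte later, inside the `mov` immediate, gives `xor eax, eax`, `ret`, `nop`, `ret`. Both listings are legal instruction streams, and nothing in the bytes themselves says which one the author meant.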

If the code was written reasonably, you can usually find enough clues to figure out where to start decoding and thus get a reasonable assembly output. Even then you often need to restart the decoding several times, because the decoder can get confused at function boundaries depending on what other data gets embedded and where. Be glad self-modifying code was going out of style in the 1980s and is mostly a memory today, as that will kill any disassembly attempt. All the other tricks that Mel used (https://en.wikipedia.org/wiki/The_Story_of_Mel) also make lifting machine code to assembly impossible.
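A rough sketch of the "find clues and restart" idea, again with a made-up toy decoder (`decode_one`, `linear_sweep`, `follow_flow`, and `BLOB` are all hypothetical names): two bytes of data sit right after a short jump. Decoding straight through swallows the real code into a bogus `mov`, while following the jump from a known entry point recovers it. This only handles an unconditional `jmp rel8`; real tools also have to deal with conditional branches, calls, and indirect jumps.

```python
import struct

def decode_one(code, i):
    """Decode one instruction of a tiny x86 subset.

    Returns (text, length, jump_target); jump_target is None unless it is a jmp.
    """
    op = code[i]
    if op == 0xEB and i + 2 <= len(code):              # jmp rel8 (2 bytes)
        rel = struct.unpack_from("<b", code, i + 1)[0]
        target = i + 2 + rel                           # relative to the next instruction
        return f"jmp 0x{target:x}", 2, target
    if op == 0xB8 and i + 5 <= len(code):              # mov eax, imm32 (5 bytes)
        imm = struct.unpack_from("<I", code, i + 1)[0]
        return f"mov eax, 0x{imm:08x}", 5, None
    if op == 0x31 and code[i + 1:i + 2] == b"\xc0":    # xor eax, eax (2 bytes)
        return "xor eax, eax", 2, None
    if op == 0xC3:                                     # ret (1 byte)
        return "ret", 1, None
    return f"db 0x{op:02x}", 1, None                   # unknown byte, emit as data

def linear_sweep(code):
    """Decode every byte in order, ignoring control flow."""
    out, i = [], 0
    while i < len(code):
        text, size, _ = decode_one(code, i)
        out.append((i, text))
        i += size
    return out

def follow_flow(code, entry):
    """Decode from `entry`, following unconditional jumps and stopping at ret."""
    out, i, seen = [], entry, set()
    while 0 <= i < len(code) and i not in seen:
        seen.add(i)
        text, size, target = decode_one(code, i)
        out.append((i, text))
        if text == "ret":
            break
        i = target if target is not None else i + size
    return out

# jmp over two data bytes (b8 31), then xor eax, eax; ret
BLOB = bytes.fromhex("eb02b83131c0c3")

print(linear_sweep(BLOB))
print(follow_flow(BLOB, 0))
```

The linear pass reports a `jmp` and then a single `mov` that eats the `xor` and the `ret`. The flow-following pass skips the embedded data and lands on the real instructions, but only because it was told that offset 0 is a valid entry point.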