Comment by joexbayer
1 day ago
I think the two best places to learn about it are the osdev forum and the osdev wiki. There are so many useful resources there to get you started.
There are some tutorials out there too, but a lot of them have bugs, and you basically end up recreating their OS.
Regarding studying opcodes, I never went that deep. The closest I got was looking them up for my C compiler, so I know the most common basic ones.
https://forum.osdev.org/ https://wiki.osdev.org/Expanded_Main_Page
An x86 disassembler is not that hard, as long as you stick to a single mode and ignore the SIMD alphabet soup.
You have a short loop that scans through the prefixes, checks for a REX prefix (if you handle 64-bit mode), reads the opcode (1-3 bytes), reads the MOD/RM byte if there is one (use a table lookup), reads the SIB byte if there is one (table), reads the offset (displacement) if there is one (table), and reads the immediate if there is one (table).
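To make that loop concrete, here is a rough sketch of an instruction-length decoder. It is only an illustration under several assumptions: 32-bit mode only, no REX, no address-size prefix (0x67), one-byte opcodes only, and a tiny hand-picked opcode subset. The names insn_length, opcode_flags and the OP_* flags are made up for this example, and the MOD/RM handling is computed with a few comparisons instead of the table lookup suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flags for a small, illustrative subset of one-byte opcodes (hypothetical names). */
enum { OP_VALID = 1, OP_MODRM = 2, OP_IMM8 = 4,
       OP_IMMZ = 8 /* 4-byte immediate, or 2 bytes with a 0x66 prefix */ };

static int is_prefix(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:             /* LOCK, REPNE, REP */
    case 0x26: case 0x2E: case 0x36: case 0x3E:  /* ES, CS, SS, DS overrides */
    case 0x64: case 0x65:                        /* FS, GS overrides */
    case 0x66:                                   /* operand-size override */
        return 1;
    default:                                     /* 0x67 is deliberately unsupported */
        return 0;
    }
}

static unsigned opcode_flags(uint8_t op)
{
    if (op >= 0x50 && op <= 0x5F) return OP_VALID;            /* PUSH/POP reg */
    if (op >= 0xB8 && op <= 0xBF) return OP_VALID | OP_IMMZ;  /* MOV reg, imm */
    switch (op) {
    case 0x90: case 0xC3: return OP_VALID;                    /* NOP, RET */
    case 0x01: case 0x03:                                     /* ADD */
    case 0x89: case 0x8B: return OP_VALID | OP_MODRM;         /* MOV */
    case 0x83: return OP_VALID | OP_MODRM | OP_IMM8;          /* group 1, imm8 */
    case 0x81: return OP_VALID | OP_MODRM | OP_IMMZ;          /* group 1, imm32 */
    case 0xE8: case 0xE9: return OP_VALID | OP_IMMZ;          /* CALL/JMP rel32 */
    case 0xEB: return OP_VALID | OP_IMM8;                     /* JMP rel8 */
    default:   return 0;                                      /* not in this subset */
    }
}

/* Returns the instruction length in bytes, or 0 if unsupported or truncated. */
size_t insn_length(const uint8_t *code, size_t size)
{
    assert(code != NULL || size == 0);
    size_t i = 0;
    int opsize16 = 0;

    /* 1. Scan prefixes. */
    while (i < size && is_prefix(code[i])) {
        if (code[i] == 0x66) opsize16 = 1;
        i++;
    }
    if (i >= size) return 0;

    /* 2. Opcode (one byte only in this sketch). */
    unsigned flags = opcode_flags(code[i++]);
    if (!(flags & OP_VALID)) return 0;

    /* 3. MOD/RM, optional SIB, optional displacement (32-bit addressing). */
    if (flags & OP_MODRM) {
        if (i >= size) return 0;
        uint8_t modrm = code[i++];
        uint8_t mod = modrm >> 6, rm = modrm & 7;
        if (mod != 3) {
            uint8_t base = rm;
            if (rm == 4) {                   /* SIB byte follows */
                if (i >= size) return 0;
                base = code[i++] & 7;
            }
            if (mod == 1)       i += 1;      /* disp8 */
            else if (mod == 2)  i += 4;      /* disp32 */
            else if (base == 5) i += 4;      /* mod 0 with no base: disp32 */
        }
    }

    /* 4. Immediate. */
    if (flags & OP_IMM8) i += 1;
    if (flags & OP_IMMZ) i += opsize16 ? 2 : 4;

    /* x86 instructions are at most 15 bytes long. */
    return (i <= size && i <= 15) ? i : 0;
}
```

For example, the bytes 8B 44 24 08 (mov eax, [esp+8]) come out as opcode + MOD/RM + SIB + disp8, so insn_length returns 4. A real disassembler would replace opcode_flags with a full table and record the fields it reads instead of just counting them.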
It's probably easiest if you use an "expanded/normalized" opcode internally so the 1-3 opcode bytes + the 3 extra bits from some MOD/RM bytes + prefix info (for certain SIMD instructions) map to a single 16-bit opcode (likely around a couple hundred to a thousand opcodes in total).
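One possible way to build such a normalized opcode is plain bit packing. The layout below is an assumption made up for this sketch, not a standard encoding: bits 0-7 hold the last opcode byte, bits 8-9 the opcode map (one-byte, 0F, 0F 38, 0F 3A), bits 10-12 the MOD/RM reg field when it acts as an opcode extension, and bits 13-14 the mandatory prefix. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Which escape sequence preceded the last opcode byte. */
enum op_map { MAP_1BYTE = 0, MAP_0F = 1, MAP_0F38 = 2, MAP_0F3A = 3 };

/* Mandatory prefix that selects between SIMD variants of one opcode. */
enum op_pfx { PFX_NONE = 0, PFX_66 = 1, PFX_F3 = 2, PFX_F2 = 3 };

/*
 * Pack everything that identifies an instruction into 16 bits.
 * Pass reg_ext = 0 for opcodes whose MOD/RM reg field is a real operand.
 */
uint16_t normalize_opcode(enum op_map map, uint8_t last_byte,
                          unsigned reg_ext, enum op_pfx pfx)
{
    assert(reg_ext < 8);
    return (uint16_t)(last_byte
                      | ((unsigned)map << 8)
                      | (reg_ext << 10)
                      | ((unsigned)pfx << 13));
}

/* Accessors to take a normalized opcode apart again. */
uint8_t     opcode_byte(uint16_t op)    { return (uint8_t)(op & 0xFF); }
enum op_map opcode_map(uint16_t op)     { return (enum op_map)((op >> 8) & 3); }
unsigned    opcode_reg_ext(uint16_t op) { return (op >> 10) & 7; }
enum op_pfx opcode_prefix(uint16_t op)  { return (enum op_pfx)((op >> 13) & 3); }
```

With this layout, 83 /0 (ADD r/m, imm8) and 83 /5 (SUB r/m, imm8) get different values, and so do 0F 10 (MOVUPS) and F3 0F 10 (MOVSS). The packed values are sparse, so if you want a small dense table you would still map them to consecutive indices in a second step.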
You have a table that maps those to mnemonics + operand info. You have some tables that map 0-7 (or 0-15) to AL/AH/... and AX/BX/CX/... and CS/DS/ES/... and various system registers.
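The register tables could look roughly like this. The sketch assumes no REX prefix, so only indices 0-7 exist, and it leaves out the system registers. Note that the entries are in hardware encoding order (AX, CX, DX, BX, ...), not alphabetical order. The function and table names are made up for the example; gpr_index is the reverse lookup an assembler would need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum reg_size { REG8 = 0, REG16 = 1, REG32 = 2 };

/* General-purpose register names, indexed by [size][3-bit register number]. */
static const char *const gpr_names[3][8] = {
    { "al",  "cl",  "dl",  "bl",  "ah",  "ch",  "dh",  "bh"  },
    { "ax",  "cx",  "dx",  "bx",  "sp",  "bp",  "si",  "di"  },
    { "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi" },
};

/* Segment registers; encodings 6 and 7 are not assigned. */
static const char *const sreg_names[6] = { "es", "cs", "ss", "ds", "fs", "gs" };

/* Disassembler direction: register number -> name, NULL if out of range. */
const char *gpr_name(enum reg_size size, unsigned index)
{
    assert(size <= REG32);
    return index < 8 ? gpr_names[size][index] : NULL;
}

const char *sreg_name(unsigned index)
{
    return index < 6 ? sreg_names[index] : NULL;
}

/* Assembler direction: name -> register number, -1 if unknown. */
int gpr_index(enum reg_size size, const char *name)
{
    assert(size <= REG32);
    for (int i = 0; i < 8; i++)
        if (strcmp(gpr_names[size][i], name) == 0)
            return i;
    return -1;
}
```

If you add 64-bit mode later, keep in mind that with a REX prefix the 8-bit encodings 4-7 mean SPL/BPL/SIL/DIL instead of AH/CH/DH/BH, so the 8-bit table needs a second variant.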
Not that much code all in all. Several tables. You can squeeze them and bit pack them to your heart's content if you want.
Once you have that, a simple assembler isn't so hard.
Understood. Thank you very much for the information. It's definitely very helpful to me.