Comment by renox

1 year ago

Whoa, someone else who doesn't believe that the RISC-V ISA is 'perfect'! I'm curious: how have the discussions on bitfield extract been going? It really does seem like an obvious oversight and something to add as a 'standard extension'.

What's your take on

1) unaligned 32-bit instructions with the C extension?

2) lack of 'trap on overflow' for arithmetic instructions? MIPS had it.

IMHO they made a mistake by not allowing immediate data to follow instructions. You could encode 8-bit constants within the opcode, but anything larger should be properly supported with immediate data. As for the C extension, I think it also ended up inferior because it was added afterward. I'd like to see a re-encoding of the entire ISA in about 10 years, once things are really stable.

  • The main problem with what you’re saying is that none of the lessons learned are new. They were all well-known before this ISA was designed, so if the designers had any intention of learning from the past, they had every opportunity to do so.

The handling of misaligned loads/stores in RISC-V can also be considered a disappointment: https://github.com/riscv/riscv-isa-manual/issues/1611 It oozes a preference for the convenience of hardware developers and "flexibility" over the practical guarantees software developers need. It looks like the MIPS patent on misaligned load/store instructions played a negative role here. The patent expired in 2019, but it seems we are stuck with the status quo nevertheless.

1. aarch64 does this right. RISC-V tries to be too many things at once, and predictably ends up sucking at everything. Fast big cores should just stick to fixed size instrs for faster decode. You always know where instrs start, and every cacheline has an integer number of instrs. Microcontroller cores can use compressed instrs, since code size matters there, while trying to decode instrs in parallel does not. Trying to have one arch cover it all is idiotic.

2. nobody uses it on mips either, so it is likely of no use.

  • > Fast big cores should just stick to fixed size instrs for faster decode.

    How much faster, though? RISC-V decode is not crazy like x86: you only need to look at the first byte to know how long the instruction is (the first two bits if you limit yourself to 16- and 32-bit instructions, 5 bits if you support 48-bit instructions, 6 bits if you support 64-bit instructions). Which means the serial part of the decoder is very, very small.

    The bigger complaint about variable-length instructions is potentially misaligned instructions, which do not play well with cache lines (a single instruction may start in one cache line and end in the next, making the hardware a bit more hairy).

    And there’s an advantage to compressed instructions even on big cores: less pressure on the instruction cache, and correspondingly fewer cache misses.

    Thus, it’s not clear to me that fixed size instructions are the obvious way to go for big cores.
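To make the "look at the first few bits" point concrete, here is a minimal sketch of the length check. It assumes the length-encoding convention described in the RISC-V unprivileged spec (low two bits not `11` means 16-bit, bits [4:2] not `111` means 32-bit, and so on); the encodings beyond 32 bits are not frozen, so treat those branches as illustrative. The function name `insn_length` is made up for this example.

```python
def insn_length(low16: int) -> int:
    """Return the instruction length in bytes, given its first 16-bit parcel.

    Returns 0 for encodings that are longer than 64 bits or reserved.
    The 48/64-bit cases follow the spec's suggested convention, which is
    an assumption here since those formats are not frozen.
    """
    if low16 & 0b11 != 0b11:
        return 2          # compressed (C extension) instruction
    if low16 & 0b11100 != 0b11100:
        return 4          # standard 32-bit instruction
    if low16 & 0b111111 == 0b011111:
        return 6          # 48-bit instruction
    if low16 & 0b1111111 == 0b0111111:
        return 8          # 64-bit instruction
    return 0              # longer or reserved encoding


# 0x4501 is the parcel for c.li a0, 0; 0x0513 is the low half of addi a0, x0, 0.
print(insn_length(0x4501), insn_length(0x0513))
```

The only serial dependency is that a decoder must know where instruction N ends before it knows where instruction N+1 starts, and each of those decisions is a handful of bit tests like the ones above.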

    • Another argument against the C extension is that it uses a big chunk of the opcode space, which may be better used for other extensions with 32-bit instructions.

    • Frankly, there is no advantage to compressed instructions in a high performance CPU core as a misaligned instruction can span a memory page boundary, which will generate a memory fault, potentially a TLB flush, and, if the memory page is not resident in memory, will require an I/O operation. Which is much worse than crossing a cache line. It is a double whammy when both occur simultaneously.

      One suggested solution has been to fill the gaps with NOPs, but then the compiler would have to track page alignment, which would not work anyway if a system supports pages of varying sizes (ordinary vs huge pages).

      The best solution is perhaps to ignore compressed instructions when targeting high performance cores and confine their usage to where they belong: power efficient or low performance microcontrollers.
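For anyone who wants to see when the straddling case actually happens, here is a small sketch. It assumes 4 KiB pages and 64-byte cache lines purely for illustration (both vary by system), and the helper name `straddles` is hypothetical.

```python
def straddles(addr: int, length: int, block: int) -> bool:
    """True if the bytes [addr, addr + length) touch two different blocks.

    `block` is the block size in bytes, e.g. a cache line or a page.
    """
    return addr // block != (addr + length - 1) // block


PAGE = 4096   # assumed page size
LINE = 64     # assumed cache line size

# A 32-bit instruction at a 2-byte-aligned address just before a page end
# spans both the cache line boundary and the page boundary.
print(straddles(0x0FFE, 4, LINE), straddles(0x0FFE, 4, PAGE))

# The same instruction, naturally aligned, spans neither.
print(straddles(0x0FFC, 4, LINE), straddles(0x0FFC, 4, PAGE))
```

With the C extension, a 32-bit instruction only has to be 2-byte aligned, so the first case is possible; with naturally aligned 32-bit instructions it cannot occur, because the block size is a multiple of the instruction size.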

  • Fixed size instructions are not absolutely necessary, but keeping them naturally aligned is just better even if that means using C instructions a bit less often. It's especially messy that 32-bit instructions can span a page.

  • >2. nobody uses it on mips either, so it is likely of no use.

    Sure, but at the time Rust and Zig didn't exist; both of these languages have a mode which detects integer overflow.
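For context on what "detecting overflow" costs without a trapping add, here is a rough sketch of the kind of check that has to run in software after a signed 32-bit addition. It is written in Python only to show the logic; the names `to_signed32` and `checked_add32` are invented for this example, and the exact instruction sequence a real compiler emits will differ.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def checked_add32(a: int, b: int) -> tuple[int, bool]:
    """Add two signed 32-bit values; return (wrapped_sum, overflowed).

    Overflow happened exactly when both operands have the same sign and
    the wrapped result has the opposite sign, i.e. when both (a ^ s) and
    (b ^ s) have their sign bit set.
    """
    s = to_signed32(a + b)
    overflowed = ((a ^ s) & (b ^ s)) < 0
    return s, overflowed


print(checked_add32(2**31 - 1, 1))
print(checked_add32(1, 2))
```

On an ISA with a trapping add, the hardware does this comparison for free; without one, every checked addition needs a few extra instructions plus a branch.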