Comment by newpavlov

2 months ago

>Misaligned loads and stores are Zicclsm

Nope. See https://github.com/llvm/llvm-project/issues/110454 which was linked in the first issue. The spec authors have managed to make a mess even here.

Now they want to introduce yet another (sic!) extension Oilsm... It maaaaaay become part of RVA30, so in the best case scenario it will be decades before we will be able to rely on it widely (especially considering that RVA23 is likely to become heavily entrenched as "the default").

IMO the spec authors should've mandated that the base load/store instructions work only with aligned pointers and introduced misaligned instructions in a separate early extension. (After all, passing a misaligned pointer where your code does not expect it is a correctness issue.) But I would've been fine as well if they had mandated that misaligned pointers should always be accepted. Instead we have to deal with the terrible middle ground.
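To make the aligned/misaligned distinction concrete, here is a minimal Rust sketch of the two usual ways to express a possibly misaligned 32-bit load in source code. The function names are made up for illustration; whether either one lowers to a single native load or to a sequence of byte loads is up to the compiler and the target settings, which is exactly the part the spec leaves murky.

```rust
use std::ptr;

/// Hypothetical helper: reads a little-endian u32 at `offset`, which may be
/// misaligned. The value is assembled from individual bytes, so the source
/// never asks for a misaligned access; the compiler picks the lowering.
fn read_u32_at(buf: &[u8], offset: usize) -> Option<u32> {
    let bytes = buf.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Hypothetical helper: the same load written as a raw pointer read.
/// `read_unaligned` tells the compiler the pointer may be misaligned.
fn read_u32_unaligned(buf: &[u8], offset: usize) -> Option<u32> {
    let bytes = buf.get(offset..offset.checked_add(4)?)?;
    // SAFETY: `bytes` is exactly 4 readable bytes, and `read_unaligned`
    // has no alignment requirement on its pointer argument.
    let raw = unsafe { ptr::read_unaligned(bytes.as_ptr() as *const u32) };
    // Interpret the in-memory bytes as little-endian on any host.
    Some(u32::from_le(raw))
}
```

By contrast, a plain dereference of a misaligned `*const u32` is undefined behavior in Rust, which is the "correctness issue" case: the code promised alignment and did not deliver it.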

>atomic memory operations are made mandatory in Ziccamoa

In other words, forget about potential performance advantages of load-link/store-conditional instructions. `compare_exchange` and `compare_exchange_weak` will always compile into the same instructions.
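For anyone unfamiliar with the distinction: in Rust, `compare_exchange_weak` is allowed to fail spuriously, which is what lets it map onto a single LR/SC attempt inside a retry loop, while `compare_exchange` may only fail when the value actually differs. A minimal sketch of the typical weak-CAS loop, with a hypothetical `try_acquire` helper that bumps a counter only while it is below a limit:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: increments `slots` if it is below `limit`.
/// Returns true if a slot was taken, false if the counter is already full.
fn try_acquire(slots: &AtomicU32, limit: u32) -> bool {
    let mut current = slots.load(Ordering::Relaxed);
    loop {
        if current >= limit {
            return false;
        }
        // The weak variant may fail spuriously; the surrounding loop
        // absorbs that, so no inner retry loop is needed on LL/SC targets.
        match slots.compare_exchange_weak(
            current,
            current + 1,
            Ordering::Acquire,
            Ordering::Relaxed,
        ) {
            Ok(_) => return true,
            // On failure we get the value actually observed; retry with it.
            Err(actual) => current = actual,
        }
    }
}
```

If both variants compile to the same AMO-based sequence, the loop above is still correct, it just no longer gains anything from using the weak form.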

And I guess you are fine with the page size part. I know there are huge-page-like proposals, but they do not resolve the fundamental issue.

I have other minor performance-related nits, such as the `seed` CSR being allowed to produce poor-quality entropy, which means that we have to bring in a whole CSPRNG if we want to generate a cryptographic key or nonce on a low-powered microcontroller.

By no means do I consider myself a RISC-V expert; if anything, my familiarity with the ISA as a systems language programmer is quite shallow. But the number of disappointments accumulated even from such shallow familiarity has cooled my enthusiasm for RISC-V quite significantly.

RISC-V truly is the RyanAir of processors: Oh, you want FP maths? That's an optional extra, did you check that when you booked? And was that single or double-precision, all optional extras at an extra charge. Atomic instructions, that's an extra too, have your credit card details handy. Multiply and divide? Yeah, extras. Now, let me tell you about our high-end customer options, packed SIMD and user-level interrupts, only for business class users. And then there's our first-class benefits, hypervisor extensions for big spenders, and even more, all optional extras.

  • So it's modular. This is normally considered a good thing. It means you don't have to pay for features you don't need.

    The ISA is open so there's no greedy corporation trying to upsell you. I mean there's an implementation and die area cost for each extension but it's not being set at an artificial level by a monopolist.

    • There's a good chance you're actually paying more for the features you don't need. Preparing an EUV mask set costs something like 30 million dollars (that figure may be out of date, i.e. it could be more now). So instead of a single mask set with everything on the device, whether you need it or not, you're paying $30 million for each special-snowflake variant. This is why vendors do a one-size-fits-all version of many of their products and then disable the extra functionality for the cheaper market segments, because it's much, much cheaper than making separate reduced-functionality devices.

    • It's a good thing in many cases but not if you're going to be running applications distributed as binaries. Maybe if we go the Gentoo route of everybody always recompiling everything for their own system?

    • But that means a port of Linux can’t be to RISC-V, it has to be to a specific implementation of RISC-V, or if sufficient (which seems still debatable) to a specific common RISC-V profile.

  • I don't agree with that comparison.

    RyanAir is about exploiting consumers, with bait-and-switch and shitty terms and conditions.

    RISC-V's modularity is about giving choice to hardware designers, so they can pick and choose just those features that their solution needs, and even allow for custom extensions.

    RISC-V's modularity is for academia. 1) for education, where students learn/use/work on simple processors, 2) for research in new types of hardware and extensions, where ease of implementation or ease of creating a custom extension is important.

    • Extensions are not just for academia. If I am building a microcontroller to control the storage media I am selling (e.g. hard drives), why do I need to implement a bunch of features I am not going to use? What about my flow rate monitor? Or my pacemaker?

      In some of these, less silicon means less power means more better. Like that last example.

  • Then x86_64 is the cable television service of processors. "Oh, you want channel 5? Then you have to buy this bundle with 40 other channels you will never watch, including 7 channels in languages you do not speak."

  • >Multiply and divide

    And where it actually mattered they did not introduce a separate extension. Integer division is significantly more complex than multiplication, so it may make sense for low-end microcontrollers to implement in hardware only the latter.

  • RyanAir is the least expensive right? And it still gets you there?

    I would be ok with that if it was a valid analogy.

    It is valid in microcontroller land. There, the chip and the software are provided by the same party. So you can select for exactly the RISC-V features you need and save yourself some silicon. That sounds like a win to me.

    At the application level, like a server or a desktop, that would be a disaster because I get my hardware and software from different people. How do the software guys know what hardware to target? Well, that is exactly why RVA23 exists.

    What does RVA23 mean? It is the RISC-V "Application" profile. It allows you to build software to a single hardware target and trust that hardware makers will target the same profile. RVA23 is like saying x86-64v4. Both are simple names for a long list of extensions (flags) and assumptions that you expect the hardware to honour. So, when Ubuntu 26.04 says it requires RVA23, it means that all the software built on it can assume those features. No a la carte.

    The reason RVA23 is getting so much attention is that it has essentially the same feature set as modern ARM64 or x86-64. Software will be able to target this profile for a long time. There may be a new profile in a few years' time, like RVA30, but hardware that implements that will still run RVA23 software (just as x86-64v4 hardware will run x86-64v1 software). Hardware built for profiles before RVA23 may be missing features modern applications expect.

    I guess you could say that RVA23 is British Airways Business Class.

    If you really want to support hardware designed before RVA23, almost everything you would want to run pre-built software on supports RVA20. And again, your RVA20 stuff will run fine on RVA23 hardware (but with fewer features--like no vectors). So maybe no in-flight meal, but it will get you there.

I think having separate unaligned load/store instructions would be a much worse design, not least because they use a lot of the opcode space. I don't understand why you don't just have an option to not generate misaligned loads for people that happen to be running on CPUs where it's really slow. You don't need to wait for a profile for that.

As for `seed`, if you're running on a microcontroller you can just look up the data sheet to see if its seed entropy is sufficient. By the time you get to CPUs where portable code is important, a CSPRNG is probably fine.

I agree about page size though. Svnapot seems overly complicated and gives only a fraction of the advantages of actually bigger pages.

  • >As for `seed`, if you're running on a microcontroller you can just look up the data sheet to see if its seed entropy is sufficient.

    It's a terrible attitude to have towards programmers, but looking at misaligned ops, I guess we can see a pattern from RISC-V authors here.

    Most programmers do not target a concrete microcontroller and develop every line of code from scratch. They either develop portable libraries (e.g. https://docs.rs/getrandom) or build their projects using those libraries.

    The whole raison d'être of an ISA is to provide a portable contract between hardware vendors and programmers. RISC-V authors shirk this responsibility with a "just look at your micro specs, lol" attitude.

  • The option to generate or not generate misaligned loads/stores does exist (-mno-strict-align / -mstrict-align). But of course that's a compile-time option, and of course the preferred state would be to have use of them on by default, but RVA23 doesn't sufficiently guarantee/encourage them not being unreasonably-slow, leaving native misaligned loads/stores still effectively-unusable (and off by default on clang/gcc on -march=rva23u64).

    aka, Zicclsm / RVA23 are entirely-useless as far as actually getting to make use of native misaligned loads/stores goes.

    • > RVA23 doesn't guarantee them not being unreasonably-slow

      Right, but it doesn't guarantee that anything isn't unreasonably slow, does it? I am free to make an RVA23-compliant CPU with a div instruction that takes 10k cycles. Does that mean LLVM won't output div? At some point you're left with either -mcpu=<specific cpu> or falling back to reasonable assumptions about the actual hardware landscape.

      Do ARM or x86 make any guarantees about the performance of misaligned loads/stores? I couldn't find anything.

  • RISC-V is not particularly good at using opcode space, unfortunately.

    • I don't think it's too bad. The compressed extension was arguably a mistake (and shouldn't be in RVA23 IMO), but apart from that there aren't any major blunders. You're probably thinking about how JAL(R) basically always uses x1/x5 (or whatever it is), but I don't think that's a huge deal.

      About 1/3 of the opcode space is used currently so there's a decent amount of space left.