Comment by rep_lodsb

6 days ago

You're right, didn't account for that. Though even when declared volatile, the counter variable would be on the stack, and thus already in the CPU cache (at least 32K according to the datasheet)?

Looking at the assembly code for both versions of this delay loop might clear it up.

The only thing volatile does is ensure that the value is actually read from memory on every access (which implicitly also forbids optimizing those accesses away). Whether that memory is in a CPU cache is purely a hardware issue and outside the C specification. If you read something like a hardware register, you yourself need to make sure that a hardware cache will not give you stale values (by mapping it into a non-cached memory area, or by forcing a cache update). If the body of your for-loop contains something that acts as a compiler barrier, all that 'volatile' on the counter variable will do is potentially make the loop slower.
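
To make the two variants of the delay loop concrete, here is a minimal sketch. The function names delay_volatile and delay_barrier are made up for illustration, and the barrier version assumes a GCC- or Clang-compatible compiler, since the empty asm statement with a "memory" clobber is a compiler extension and not part of standard C.

```c
#include <stdint.h>

/* Variant 1: the counter itself is volatile. Every increment and compare
   has to go through the counter's memory location, so the compiler cannot
   drop the loop, but each iteration pays for a load and a store. */
uint32_t delay_volatile(uint32_t n)
{
    volatile uint32_t i;
    for (i = 0; i < n; i++) {
        /* intentionally empty */
    }
    return i;
}

/* Variant 2: plain counter, with a compiler barrier in the loop body.
   The empty asm statement (GCC/Clang extension, an assumption here) may not
   be deleted, so the loop still runs n times, while the counter is free to
   live in a register. */
uint32_t delay_barrier(uint32_t n)
{
    uint32_t i;
    for (i = 0; i < n; i++) {
        __asm__ volatile("" ::: "memory");
    }
    return i;
}
```

Compiling both with optimization enabled and comparing the generated assembly (for example with the -S option) should show the extra memory traffic in the first variant. Neither version says anything about what the CPU cache does.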

There are really very few reasons to ever use 'volatile'. In fact, the Linux kernel even has its own documentation on why you should usually not use it:

https://www.kernel.org/doc/html/latest/process/volatile-cons...

  • Doesn't volatile also ensure that the compiler does not change the address used for the read (as it might otherwise optimise the data layout)? (So you can be sure that when using MMIO etc. it won't read from the wrong place?)

    • "volatile", according to the standard, simply is: "An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine."

      Or simpler: don't assume anything you think you know about this object, just do as you're told.

      And yes, that for instance prohibits keeping a value loaded from a memory address in a register for later reuse, which would be a simple case of data optimization. Instead, a fresh retrieval from memory must be done on each access.

      However, whether your system has caching or an MMU is outside of the spec. The compiler does not care. If you tell the compiler to give you the byte at address 0x1000, it will do so. 'volatile' just forbids the compiler from deducing the value from knowledge it already has. If a hardware cache or MMU messes with that, that's your problem, not the compiler's.
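
As a rough illustration of the "just do as you're told" rule for MMIO-style reads, here is a small sketch. The helpers reg_read32 and wait_for_bit are hypothetical names, not taken from any real driver or library, and the address is whatever the caller passes in.

```c
#include <stdint.h>

/* Hypothetical helper: perform one 32-bit load from exactly the address
   given. The volatile qualifier on the pointed-to type tells the compiler
   not to reuse a previously loaded value or to skip the access. */
uint32_t reg_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

/* Hypothetical polling loop: re-reads the location on every iteration
   until one of the bits in mask is set or max_polls reads have been made.
   Returns 1 if a bit was seen, 0 otherwise. Without volatile the compiler
   would be allowed to load once and hoist the test out of the loop. */
int wait_for_bit(uintptr_t addr, uint32_t mask, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (reg_read32(addr) & mask)
            return 1;
    }
    return 0;
}
```

On real hardware the address would be something like the 0x1000 mentioned above. Whether the load actually reaches the device, or is served from a cache or remapped by the MMU, is still decided by how that region is mapped, not by anything in this code.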