
Comment by kaba0

3 years ago

Obviously the 3 objects you allocate with shared_ptr in C++ won’t be a performance bottleneck, but then we are comparing apples to oranges.

Your single threaded RC will still have to write back to memory, no one thinks that incrementing an integer is the slow part — destroying cache is.

Even when I had this in the hot path 10 years ago, owning hundreds of objects in a particle filter, handing out ownership copies and creating new ones ended up taking ~5% of the runtime (i.e. that is what I got back by making it a contiguous vector without any shared_ptr). It can be expensive, but in those cases you probably shouldn’t be using shared_ptr.
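To illustrate the two layouts being compared, here is a minimal sketch. The `Particle` type, the `resample_*` helpers, and the index-based resampling step are all made up for illustration; this is not the original particle filter code, and it makes no claim about the actual speedup.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical particle type, for illustration only.
struct Particle {
    double state;
    double weight;
};

// Shared-ownership layout: each particle is its own heap allocation, and
// every copied handle updates that particle's reference count.
using SharedParticles = std::vector<std::shared_ptr<Particle>>;

// Contiguous layout: particles sit back to back in a single buffer.
using FlatParticles = std::vector<Particle>;

// Resample by index. Copying a shared_ptr shares ownership, so a particle
// picked twice ends up aliased by two handles.
SharedParticles resample_shared(const SharedParticles& in,
                                const std::vector<std::size_t>& picks) {
    SharedParticles out;
    out.reserve(picks.size());
    for (std::size_t i : picks) {
        assert(i < in.size());
        out.push_back(in[i]);  // reference count update, no Particle copy
    }
    return out;
}

// Resample by index. Each pick copies the Particle by value into the new
// buffer, so there are no reference counts to touch at all.
FlatParticles resample_flat(const FlatParticles& in,
                            const std::vector<std::size_t>& picks) {
    FlatParticles out;
    out.reserve(picks.size());
    for (std::size_t i : picks) {
        assert(i < in.size());
        out.push_back(in[i]);  // plain value copy
    }
    return out;
}
```

Note the semantics differ: the shared version aliases a particle that is picked twice, while the flat version gives each pick its own copy. Whether that matters depends on what the filter does with the particles afterwards.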

Oh, and the cost of incrementing an integer by itself (non atomically) is stupid fast. Like you can do a billion of them per second. The CPU doesn’t actually write that immediately to RAM and you’re not putting a huge amount of extra cache pressure vs all the other things your program is doing normally.
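For reference, the difference between the two kinds of count can be sketched like this. `LocalCount` and `AtomicCount` are hypothetical names, and the memory orderings are one common choice rather than the only valid one.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Non-atomic count: a plain increment and decrement. Only correct while a
// single thread ever touches the object.
struct LocalCount {
    std::size_t refs = 1;

    void retain() { ++refs; }

    // Returns true when the last reference was dropped.
    bool release() {
        assert(refs > 0);
        return --refs == 0;
    }
};

// Atomic count: every update is a read-modify-write that other cores must
// observe consistently, so it is safe to share across threads.
struct AtomicCount {
    std::atomic<std::size_t> refs{1};

    void retain() { refs.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when the last reference was dropped. fetch_sub returns
    // the value held before the subtraction.
    bool release() {
        return refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
};
```

Both expose the same interface; the only difference is whether the counter update is an ordinary store or an atomic read-modify-write.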

> Your single threaded RC will still have to write back to memory

I think you mean memory or cache, and there's a good chance it will stay in cache and not be flushed to RAM for short-lived objects.

> no one thinks that incrementing an integer is the slow part — destroying cache is.

Agreed.

  • If you write to cache, then depending on architecture that change has to be made visible to every other thread as well. Reading is not subject to such a constraint.

    • MESI cache coherency (and its derivatives) [1] means that you can have exclusive writes to cache if and only if no other core tries to access that data. I would think most if not all microarchitectures have moved to MESI (or equivalent) cache coherency protocols as they avoid unnecessary writes to memory.

      [1]: https://en.wikipedia.org/wiki/MESI


    • Only for atomics. Single-threaded reference counts are fine, and WebKit uses hybrid RC: you use an atomic shared ptr to share across threads, then downgrade it to a non-atomic version in-thread (and can hand out another atomic copy at any time).

      Atomics are rarely needed: you should really try to avoid sharing ownership across threads, and change your design to remove it if you can.

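The hybrid scheme described in the last reply could look roughly like the sketch below. This is not WebKit's implementation: `LocalRef` is a hypothetical class, and `std::shared_ptr` stands in for the thread-safe handle. The idea is that a group of in-thread copies holds exactly one thread-safe reference, while copies within the thread only bump a plain counter.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical in-thread handle. All copies made on one thread share a
// Block that owns a single std::shared_ptr (the thread-safe reference)
// plus a non-atomic count of local copies. Not safe to copy across threads.
template <class T>
class LocalRef {
    struct Block {
        explicit Block(std::shared_ptr<T> s) : strong(std::move(s)) {}
        std::shared_ptr<T> strong;  // the one thread-safe reference
        std::size_t local = 1;      // plain count of in-thread copies
    };
    Block* b_;

public:
    // "Downgrade": take one thread-safe reference for this thread.
    explicit LocalRef(std::shared_ptr<T> s) : b_(new Block(std::move(s))) {}

    // In-thread copy: a plain increment, no atomic operation.
    LocalRef(const LocalRef& other) : b_(other.b_) { ++b_->local; }

    LocalRef& operator=(const LocalRef&) = delete;  // omitted for brevity

    ~LocalRef() {
        assert(b_->local > 0);
        if (--b_->local == 0) delete b_;  // drops the thread-safe reference
    }

    T& operator*() const { return *b_->strong; }

    // "Upgrade": hand out another thread-safe copy at any time.
    std::shared_ptr<T> share() const { return b_->strong; }

    std::size_t local_count() const { return b_->local; }
};
```

Here `share()` is the only operation that touches the shared control block; everything else stays on the local counter. The cost is one extra small allocation per thread that holds the object.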