Comment by dragontamer

4 hours ago

NUMA aware threading is somewhat rare but it does exist.

It's just reaching into the high arts of high-performance computing that fewer and fewer programmers know about. I myself am not an HPC expert; I just like to study this stuff on the side as a hobby.

So NUMA-awareness is when your code knows that &variable1 is located in one physical location, while &variable2 is somewhere else.

This is possible because NUMA-aware allocators (the numa_alloc_* family in libnuma on Linux, e.g. numa_alloc_onnode; VirtualAllocExNuma on Windows) take a parameter that targets a particular NUMA zone for the allocation.
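To make the allocator point concrete, here is a rough sketch that calls libnuma's numa_alloc_onnode through ctypes. The helper names load_libnuma and alloc_on_node are made up for illustration, and it assumes a Linux box with libnuma installed; on anything else the loader just returns None. Treat it as a sketch, not as the canonical way to do this.

```python
import ctypes
import ctypes.util


def load_libnuma():
    """Return a ctypes handle to libnuma, or None if NUMA is unavailable."""
    name = ctypes.util.find_library("numa")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    lib.numa_available.restype = ctypes.c_int
    if lib.numa_available() < 0:  # libnuma reports -1 when NUMA is unusable
        return None
    # void *numa_alloc_onnode(size_t size, int node);
    lib.numa_alloc_onnode.argtypes = [ctypes.c_size_t, ctypes.c_int]
    lib.numa_alloc_onnode.restype = ctypes.c_void_p
    # void numa_free(void *start, size_t size);
    lib.numa_free.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    lib.numa_free.restype = None
    return lib


def alloc_on_node(lib, size, node):
    """Ask libnuma for `size` bytes on NUMA node `node`; return the address."""
    ptr = lib.numa_alloc_onnode(size, node)
    if not ptr:
        raise MemoryError(f"numa_alloc_onnode({size}, {node}) failed")
    return ptr


if __name__ == "__main__":
    lib = load_libnuma()
    if lib is None:
        print("libnuma not available on this machine")
    else:
        size = 1 << 20
        buf = alloc_on_node(lib, size, 0)
        ctypes.memset(buf, 0, size)  # touch the pages so they get backed
        print(f"1 MiB requested on node 0 at {buf:#x}")
        lib.numa_free(buf, size)
```

Whether the pages are strictly bound to the node or merely preferred there depends on libnuma's policy settings, so it is worth checking the man page before relying on it.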

Now that you know certain variables are tied together in physical locations, you can also tie threads together with affinity to those same NUMA locations. And with a bit of effort, you can ensure that threads that are in one workpool share the same NUMA zones.
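A minimal sketch of the affinity half, assuming Linux: it reads a node's CPU list from sysfs and pins the caller to those CPUs with os.sched_setaffinity. The function names (parse_cpu_list, cpus_of_node, pin_to_node) are my own for illustration. A real work pool would call something like this once per worker thread at startup.

```python
import os


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0-3,8-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_of_node(node):
    """Return the CPUs of a NUMA node from sysfs, or an empty set if unknown."""
    path = f"/sys/devices/system/node/node{node}/cpulist"
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return set()


def pin_to_node(node):
    """Pin the caller to the CPUs of `node`; return the CPU set actually used."""
    if not hasattr(os, "sched_setaffinity"):  # not Linux
        return set()
    # Only keep CPUs we are allowed to run on (cgroups may restrict us).
    cpus = cpus_of_node(node) & os.sched_getaffinity(0)
    if cpus:
        os.sched_setaffinity(0, cpus)  # 0 means "the caller"
    return cpus


if __name__ == "__main__":
    print("node 0 CPUs:", sorted(cpus_of_node(0)))
    print("pinned to:", sorted(pin_to_node(0)))
```

Combined with node-local allocation, every worker in the pool then touches memory that lives in the same zone as the cores it runs on.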

---------

Now, code that is aware of shared caches is less common. But following the same model of "abstracted work pools of thread affinity + NUMA awareness of data", programmers have been able to keep cooperating threads on Zen1 cores that share the same L3 cache.

A shared L2 cache among E-cores is new, but not a new concept in general (i.e. the same mechanisms and abstractions we used for thread affinity on Zen cores sharing an L3 cache, or for NUMA-aware code on multi-socket CPUs, would all still work for L2 cache).
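For what it's worth, on Linux the cache-sharing topology is visible in sysfs, so the same grouping works for L2 or L3. The sketch below groups CPUs by which cache of a given level they share; read_shared_lists and group_cpus are illustrative names, and it assumes the usual /sys/devices/system/cpu/cpuN/cache/indexM layout.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0-3,8-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_shared_lists(level, root="/sys/devices/system/cpu"):
    """Collect shared_cpu_list strings for every data/unified cache of `level`."""
    lists = []
    pattern = os.path.join(root, "cpu[0-9]*", "cache", "index[0-9]*")
    for index_dir in glob.glob(pattern):
        try:
            with open(os.path.join(index_dir, "level")) as f:
                if int(f.read()) != level:
                    continue
            with open(os.path.join(index_dir, "type")) as f:
                if f.read().strip() == "Instruction":
                    continue
            with open(os.path.join(index_dir, "shared_cpu_list")) as f:
                lists.append(f.read())
        except (OSError, ValueError):
            continue
    return lists


def group_cpus(shared_lists):
    """Deduplicate CPU lists into groups, ordered by lowest CPU id."""
    groups = {frozenset(parse_cpu_list(s)) for s in shared_lists}
    groups.discard(frozenset())
    return sorted(groups, key=min)


if __name__ == "__main__":
    for level in (2, 3):
        groups = group_cpus(read_shared_lists(level))
        print(f"L{level} groups:", [sorted(g) for g in groups])
```

Each group is then a natural affinity mask for one work pool: one pool per L3 on Zen, or one per L2 cluster on E-cores.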

I don't know if the libraries support that. But I bet Intel's library (TBB) and their programmers are working on keeping their abstractions clean and efficient.

> I don't know if the libraries support that. But I bet Intel's library (TBB) and their programmers are working on keeping their abstractions clean and efficient.

Intel can declare in ACPI a set of nodes and the distances between them, and then Linux/libnuma/etc. pick it up.

So, e.g., in AMD's SLIT tables, the distance to the local node is 10; to nodes within the same partition, 11; to nodes within the same socket, 12; and to distant sockets, >=20.
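Linux exposes those SLIT distances per node in sysfs, so a program can rank nodes by closeness without libnuma. A small sketch, with made-up helper names, assuming online nodes are numbered contiguously from 0 (so column i of the distance file is node i). The labels in classify simply mirror the AMD example values above and are not a general rule.

```python
def parse_distance_row(text):
    """Parse a sysfs distance line such as "10 11 12 20" into ints."""
    return [int(x) for x in text.split()]


def read_distances(node, root="/sys/devices/system/node"):
    """Return the distance row for `node`, or an empty list if unavailable."""
    try:
        with open(f"{root}/node{node}/distance") as f:
            return parse_distance_row(f.read())
    except (OSError, ValueError):
        return []


def nodes_by_distance(row):
    """Return node indices sorted nearest first (ties keep index order)."""
    return sorted(range(len(row)), key=lambda n: row[n])


def classify(distance):
    """Label a distance using the example AMD values (an assumption)."""
    if distance == 10:
        return "local"
    if distance == 11:
        return "same partition"
    if distance == 12:
        return "same socket"
    if distance >= 20:
        return "distant socket"
    return "other"


if __name__ == "__main__":
    row = read_distances(0)
    for n in nodes_by_distance(row):
        print(f"node {n}: distance {row[n]} ({classify(row[n])})")
```

A scheduler could use this ordering to decide where to spill work or memory once the local node is full.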

There are fancier, more detailed tables (e.g. HMAT) and some code out there that uses them, but that's kind of beyond the scope of libnuma.