Code Yarns 👨‍💻

NUMA

📅 2021-Feb-26 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ numa ⬩ 📚 Archive

NUMA (Non-Uniform Memory Access) is a multiprocessor memory architecture in which the time taken to access a memory location depends on which CPU is doing the access. The system memory (RAM) is split contiguously between the CPUs (first half to CPU0 and second half to CPU1 on a 2-CPU NUMA system), so accessing memory attached to a different CPU might take longer than accessing memory attached to one's own CPU. It takes longer because the data needs to travel through the interconnect between the CPUs. Most of the NUMA magic is handled by the CPUs themselves, so normal load-store machine instructions work unchanged and all of this happens behind the scenes.
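
To see how a machine is actually carved up, the Linux kernel exposes one directory per NUMA node in sysfs. The snippet below is a minimal sketch that prints the CPUs belonging to each node. The function name `list_numa_nodes` and its optional path argument are my own additions for illustration, not something from a standard tool.

```shell
# List each NUMA node with the CPUs attached to it, read from sysfs.
# The optional first argument is a hypothetical override of the sysfs
# directory, useful for trying the function on a fake directory tree.
list_numa_nodes() {
    local root="${1:-/sys/devices/system/node}"
    local dir node cpus
    for dir in "$root"/node[0-9]*; do
        # Skip if the glob matched nothing or the file is not readable.
        [ -r "$dir/cpulist" ] || continue
        node="${dir##*/}"
        cpus="$(cat "$dir/cpulist")"
        echo "$node: cpus $cpus"
    done
}

list_numa_nodes
```

On a machine that does not expose NUMA information, this prints nothing.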

$ numastat
                           node0           node1           node2           node3
numa_hit                 2098759          453124          626116          663329
numa_miss                      0               0               0               0
numa_foreign                   0               0               0               0
interleave_hit             62341           62613           62331           62595
local_node               2067691          369541          543901          566566
other_node                 31068           83583           82215           96763
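
The counters are easier to read as ratios. In this output, `local_node` plus `other_node` adds up to `numa_hit` plus `numa_miss` for each node, so the share of local allocations can be computed from the last two rows. Here is a rough sketch that does this with awk, assuming the default `numastat` layout shown above. The function name `numa_local_share` is hypothetical.

```shell
# Print, per node, the percentage of allocations on that node that were
# made by processes running on the same node.
# Reads default-format `numastat` output on stdin.
numa_local_share() {
    awk '
        # Header row: node names only, so column i maps to data column i+1.
        $1 ~ /^node[0-9]+$/ { for (i = 1; i <= NF; i++) name[i + 1] = $i }
        $1 == "local_node"  { for (i = 2; i <= NF; i++) loc[i] = $i; n = NF }
        $1 == "other_node"  { for (i = 2; i <= NF; i++) oth[i] = $i }
        END {
            for (i = 2; i <= n; i++) {
                total = loc[i] + oth[i]
                if (total > 0)
                    printf "%s %.1f%%\n", name[i], 100 * loc[i] / total
            }
        }
    '
}

# Only run against the real tool if it is installed.
if command -v numastat >/dev/null 2>&1; then
    numastat | numa_local_share
fi
```

A node whose share is well below 100% is serving many allocations for processes running on other nodes.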

To check whether the kernel's automatic NUMA balancing is enabled:

$ sudo cat /proc/sys/kernel/numa_balancing
1

Write a 0 or 1 to that file (as root) to turn automatic NUMA balancing off or on.
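
A small wrapper makes the toggle harder to get wrong. This is only a sketch: the function name `set_numa_balancing` and its optional file argument are hypothetical, and it accepts only the two values mentioned above.

```shell
# Enable (1) or disable (0) automatic NUMA balancing.
# The optional second argument is a hypothetical override of the file
# path, so the function can be tried on a scratch file without root.
set_numa_balancing() {
    local value="$1"
    local file="${2:-/proc/sys/kernel/numa_balancing}"
    case "$value" in
        0|1) ;;
        *) echo "usage: set_numa_balancing 0|1 [file]" >&2; return 2 ;;
    esac
    # Writing to the real file needs root privileges.
    echo "$value" > "$file" && cat "$file"
}

# Example (run the whole script as root):
#   set_numa_balancing 0
```

Note that `sudo echo 0 > /proc/sys/kernel/numa_balancing` does not work from a normal shell, because the redirection is performed by the unprivileged shell. Piping into `sudo tee` is the usual workaround: `echo 0 | sudo tee /proc/sys/kernel/numa_balancing`.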

Alternatively, find the PCI bus IDs of the GPUs and read the NUMA node of each PCI bus ID from sysfs like this:

$ cat /sys/bus/pci/devices/0000:01:00.0/numa_node
3

$ cat /sys/bus/pci/devices/0000:c2:00.0/numa_node
0
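
With several GPUs it is handy to do this in a loop. The sketch below walks the PCI devices in sysfs and prints the NUMA node of every device whose PCI class starts with `0x03` (display controllers, which is where GPUs normally show up). The function name `gpu_numa_nodes` and its optional directory argument are hypothetical. Filtering by class is my assumption about what counts as a GPU on your system.

```shell
# Print "<PCI bus ID> <NUMA node>" for every PCI display controller.
# The optional first argument is a hypothetical override of the sysfs
# directory, useful for trying the function on a fake directory tree.
gpu_numa_nodes() {
    local root="${1:-/sys/bus/pci/devices}"
    local dev class
    for dev in "$root"/*; do
        [ -r "$dev/class" ] && [ -r "$dev/numa_node" ] || continue
        class="$(cat "$dev/class")"
        case "$class" in
            # PCI base class 0x03 is "display controller".
            0x03*) echo "${dev##*/} $(cat "$dev/numa_node")" ;;
        esac
    done
}

gpu_numa_nodes
```

A value of -1 in `numa_node` means the platform did not report any NUMA affinity for that device.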

© 2023 Ashwin Nanjappa • All writing under CC BY-SA license • 🐘 Mastodon • 📧 Email