Thread Affinity and Pinning

Choosing where a thread runs

Thread affinity, also called pinning, tells the scheduler to run a given thread only on a specific core or set of cores. By default the OS is free to move a thread between cores to balance load; pinning overrides that and keeps the thread on the cores you chose.
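As a minimal sketch of what pinning looks like in practice, the Python below uses os.sched_setaffinity, which is available on Linux only. The helper name pin_current_thread is hypothetical, introduced here for illustration. Passing 0 refers to the caller; on Linux the underlying system call applies the mask to the calling thread.

```python
import os


def pin_current_thread(cores):
    """Restrict the caller to `cores` and return the previously allowed set.

    Hypothetical helper for illustration; relies on os.sched_setaffinity,
    which exists only on platforms that support it (notably Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("os.sched_setaffinity is not available on this platform")
    previous = os.sched_getaffinity(0)   # cores the scheduler may use right now
    os.sched_setaffinity(0, set(cores))  # from here on, only these cores
    return previous


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    allowed = os.sched_getaffinity(0)
    target = min(allowed)                # pick one core we are allowed to use
    previous = pin_current_thread({target})
    print("pinned to:", sorted(os.sched_getaffinity(0)))
    pin_current_thread(previous)         # undo the pin, restoring the old mask
    print("restored to:", sorted(os.sched_getaffinity(0)))
```

Returning the previous mask makes the pin easy to undo, which matters when the pinned section is only one part of a longer-running program.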

Why pin a thread

Warm caches: the thread keeps reusing the same core's caches instead of arriving cold on a new core.
Steady latency: migration stalls and cache reloads are avoided, which tightens tail latency.
NUMA locality: the thread stays near the memory bank attached to its core, avoiding slower remote access.

The tradeoffs

Pinning is not free of risk. A pinned thread cannot move to an idle core when its own core is busy, so a poor pinning plan can leave some cores overloaded while others sit idle. It also fights the OS load balancer, which assumes it can spread work across cores. Affinity shines in latency-sensitive systems, such as trading or packet processing, where predictable cache behavior beats flexible balancing.
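The imbalance risk can be made concrete with a toy model. The sketch below is not a scheduler; it only adds up work per core. It assumes four equal tasks of 4 time units each, four cores, and a deliberately poor pinning plan that places three tasks on core 0. All names and numbers are illustrative, and the model ignores migration and cache costs, which are exactly what pinning is meant to save.

```python
def makespan_pinned(tasks, assignment, cores):
    """Finish time when every task must run on its assigned core."""
    load = [0] * cores
    for cost, core in zip(tasks, assignment):
        load[core] += cost
    return max(load), load


def makespan_balanced(tasks, cores):
    """Finish time when each task goes to the least loaded core (greedy)."""
    load = [0] * cores
    for cost in sorted(tasks, reverse=True):
        idlest = load.index(min(load))   # first core with the smallest load
        load[idlest] += cost
    return max(load), load


if __name__ == "__main__":
    tasks = [4, 4, 4, 4]                 # illustrative task costs
    poor_plan = [0, 0, 0, 1]             # three tasks pinned to core 0
    print(makespan_pinned(tasks, poor_plan, cores=4))   # (12, [12, 4, 0, 0])
    print(makespan_balanced(tasks, cores=4))            # (4, [4, 4, 4, 4])
```

Under these assumptions the poor plan finishes in 12 time units with two cores idle, while free placement finishes in 4. A real comparison would also have to charge the balanced case for cold caches after each move.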

Key idea

Thread affinity pins a thread to chosen cores to keep caches warm and latency steady, at the cost of flexibility when a pinned core is busy while others are idle.
