Cache Line Padding

The remedy for false sharing

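False sharing occurs when threads on different cores write to distinct variables that happen to share a cache line: every write invalidates that line in the other cores' caches, even though no data is actually shared. Cache line padding is the deliberate technique of placing each hot variable on its own line so that writes never collide.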

How padding works

Find the cache line size, commonly 64 bytes (some processors use 128).
Add filler bytes after a variable so that the next hot variable starts on a fresh line.
Or align the variable to a line boundary using the language's alignment features.
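
The two approaches above can be sketched in C++ as follows. This is a minimal illustration, not a definitive implementation: the 64-byte line size is an assumption, and the names kCacheLine, ManuallyPadded and Aligned are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLine = 64;

// Approach 1: filler bytes after the hot variable, so the next
// element of an array starts a full line later.
struct ManuallyPadded {
    std::uint64_t value;
    char pad[kCacheLine - sizeof(std::uint64_t)];  // unused filler
};

// Approach 2: ask the compiler to align the type to a line boundary.
// The size is rounded up to the alignment automatically.
struct alignas(kCacheLine) Aligned {
    std::uint64_t value;
};

static_assert(sizeof(ManuallyPadded) == kCacheLine, "one element per line-sized slot");
static_assert(sizeof(Aligned) == kCacheLine, "size rounded up to a full line");
static_assert(alignof(Aligned) == kCacheLine, "starts on a line boundary");
```

Manual filler only guarantees the spacing between elements; it does not force the first element onto a line boundary, so an unrelated neighbour placed just before it may still share its line. The alignas form handles both. C++17 also defines std::hardware_destructive_interference_size in <new> as a portable hint for the line size, though compiler support varies.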

A frequent pattern is an array of per-thread counters in which each entry is padded out to a full line. Each core then writes only its own line, so no invalidations bounce between cores.
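
A hedged sketch of the per-thread counter pattern in C++ is shown here. The 64-byte line size, the thread count of four, and the names PaddedCounter and count_in_parallel are assumptions made for illustration.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

constexpr std::size_t kCacheLine = 64;  // assumed line size
constexpr std::size_t kThreads = 4;     // hypothetical worker count

// One counter per cache line: alignas pads the size up to 64 bytes.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Each thread increments only its own slot; the slots are summed
// after all threads have finished.
std::uint64_t count_in_parallel(std::uint64_t increments_per_thread) {
    std::array<PaddedCounter, kThreads> counters;
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < kThreads; ++t) {
        workers.emplace_back([&counters, t, increments_per_thread] {
            for (std::uint64_t i = 0; i < increments_per_thread; ++i) {
                counters[t].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    std::uint64_t total = 0;
    for (const auto& c : counters) {
        total += c.value.load(std::memory_order_relaxed);
    }
    return total;
}
```

Removing alignas from PaddedCounter would leave the result unchanged but pack all four counters into a single line, which is exactly the layout that causes false sharing.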

The trade-off

Padding costs memory. A counter that needs 4 bytes may now occupy 64, a sixteenfold increase. This is usually worth it for a few very hot variables but wasteful if applied everywhere, so padding is best reserved for contended fields identified by profiling.
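
The memory cost can be checked with a little compile-time arithmetic. The sketch below assumes a 64-byte line and a 4-byte counter; PlainCounter, PaddedCounter and footprint are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumed line size

struct PlainCounter {
    std::uint32_t value;  // 4 bytes
};

struct alignas(kCacheLine) PaddedCounter {
    std::uint32_t value;  // 4 bytes of data, 60 bytes of padding
};

// Bytes occupied by an array of n counters of type T.
template <typename T>
constexpr std::size_t footprint(std::size_t n) {
    return n * sizeof(T);
}

static_assert(footprint<PlainCounter>(1000) == 4000, "4 bytes each");
static_assert(footprint<PaddedCounter>(1000) == 64000, "64 bytes each");
```

Under these assumptions, a thousand padded counters take 64,000 bytes instead of 4,000, which is why padding every field is rarely justified.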

Key idea

Cache line padding spaces hot variables onto separate lines to eliminate false sharing, trading some memory for fewer coherence invalidations.
