The hidden cost
Cache coherence operates on whole cache lines, typically 64 bytes, not on individual variables. If two unrelated variables sit in the same line, the coherence protocol treats two cores writing them as if they were sharing data, even though they are not.
What goes wrong
- Core one writes variable A and gains Modified ownership of the line.
- Core two writes variable B in the same line, which first requires invalidating core one's copy of the line.
- The line bounces back and forth on every write.
This is false sharing. The program is logically correct, but the line ping-pongs across the interconnect, and a write may have to wait for a coherence transfer before it can complete, which can severely degrade performance. No data is actually shared, only the line.
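As a rough illustration of the layout problem described above, the C++ sketch below checks whether two independent counters fall in the same cache line. The names Counters, same_line, and check_counters_share_line are hypothetical, and the 64-byte line size is an assumption that holds on many but not all processors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed line size: 64 bytes is typical but not universal.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical pair of counters, each meant for a different thread.
struct Counters {
    std::uint64_t a;  // written only by the thread on core one
    std::uint64_t b;  // written only by the thread on core two
};

// True when two addresses fall inside the same cache line.
inline bool same_line(const void* p, const void* q) {
    const auto x = reinterpret_cast<std::uintptr_t>(p) / kCacheLineSize;
    const auto y = reinterpret_cast<std::uintptr_t>(q) / kCacheLineSize;
    return x == y;
}

// Places one instance on a line boundary and checks where its fields land.
inline void check_counters_share_line() {
    alignas(kCacheLineSize) Counters c{};
    // a is at offset 0 and b at offset 8, so both sit in the first line.
    assert(same_line(&c.a, &c.b));
}
```

The two fields are only 8 bytes apart, so a write to either one invalidates the line that also holds the other, which is exactly the situation in the list above.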
Spotting and fixing it
False sharing often shows up as a parallel loop that scales poorly even though it uses no locks. The fix is to make sure independently written variables land in separate cache lines, commonly by padding them apart or by aligning per-thread counters to the cache line size.
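One common way to apply this fix in C++ is to align each per-thread counter to the line size, which also rounds its size up to a full line. This is a minimal sketch, not a definitive implementation: PaddedCounter, PerThreadCounters, the thread count of 4, and the 64-byte line size are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed line size: 64 bytes is typical but not universal.
constexpr std::size_t kCacheLineSize = 64;

// alignas makes every instance start on a line boundary and rounds
// sizeof up to a multiple of the alignment, so no two instances share a line.
struct alignas(kCacheLineSize) PaddedCounter {
    std::uint64_t value = 0;
};

static_assert(sizeof(PaddedCounter) == kCacheLineSize, "one counter per line");
static_assert(alignof(PaddedCounter) == kCacheLineSize, "line-aligned");

// Hypothetical fixed thread count: thread i only ever writes slot[i].
constexpr std::size_t kThreads = 4;
struct PerThreadCounters {
    PaddedCounter slot[kThreads];
};

// True when two addresses fall inside the same cache line.
inline bool same_line(const void* p, const void* q) {
    const auto x = reinterpret_cast<std::uintptr_t>(p) / kCacheLineSize;
    const auto y = reinterpret_cast<std::uintptr_t>(q) / kCacheLineSize;
    return x == y;
}

// Neighbouring slots must not share a line.
inline void check_padded_layout() {
    PerThreadCounters c;
    assert(!same_line(&c.slot[0], &c.slot[1]));
}
```

The cost is memory: each 8-byte counter now occupies a full 64-byte line. C++17 also defines std::hardware_destructive_interference_size in <new> as a portable stand-in for the hard-coded 64, though compiler support for it varies.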
Key idea
False sharing happens when independent variables share one cache line, so the line bounces between cores even though no real data is shared.