The Reproducibility Seeds

Randomness everywhere

Training touches many random sources: weight init, data shuffling, dropout, and augmentation. To reproduce a run you must fix the seeds that drive them all.

Set seeds for the language, the array library, and the framework.
Control data ordering and split seeds.
Record library versions and hardware where relevant.
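
The first two steps above can be sketched in a few lines. This is a minimal illustration, not a definitive recipe: seed_everything and split_indices are hypothetical helper names, NumPy stands in for "the array library", and PyTorch is assumed as the framework purely for the example. Both are seeded only if they are installed.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed every random source this process can find (hypothetical helper)."""
    # Python's built-in generator (random.shuffle, random.random, ...).
    random.seed(seed)

    # Hash randomization is fixed at interpreter start, so this only affects
    # subprocesses launched from here, not the current process.
    os.environ["PYTHONHASHSEED"] = str(seed)

    # Array library: seeded only if NumPy is installed.
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

    # Framework: PyTorch is an assumption made for illustration.
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without a GPU
    except ImportError:
        pass


def split_indices(n: int, test_fraction: float, split_seed: int):
    """Shuffle indices 0..n-1 with a dedicated generator, then split them."""
    rng = random.Random(split_seed)  # independent of the global seed
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = int(n * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


seed_everything(42)
train_idx, test_idx = split_indices(n=10, test_fraction=0.2, split_seed=7)
```

Giving the split its own generator means the train/test partition stays the same even when the training seed changes, which keeps results across seeds comparable.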

Seeds are not the whole story

A fixed seed makes one run repeatable, but a single seed can also mislead. A good score on seed 42 may be luck rather than an effect of the change you made.

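Report results across several seeds with mean and spread (for example, standard deviation).
Distinguish a real gain from seed noise: a difference smaller than the seed-to-seed spread is not evidence of improvement.
Note that some GPU operations remain nondeterministic even with fixed seeds.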
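
As a rough sketch of multi-seed reporting, the snippet below summarizes two configurations over five seeds each. run_experiment is a hypothetical stand-in that simulates a score with seed-dependent noise; in practice it would be a real training run. The final comparison is an informal rule of thumb chosen for this example, not a statistical test.

```python
import random
import statistics


def run_experiment(seed: int, true_score: float) -> float:
    """Stand-in for a training run: a true score plus seed-dependent noise."""
    rng = random.Random(seed)
    return true_score + rng.gauss(0.0, 0.01)


def summarize(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    return statistics.mean(scores), statistics.stdev(scores)


seeds = [0, 1, 2, 3, 4]
baseline = [run_experiment(s, true_score=0.80) for s in seeds]
candidate = [run_experiment(s + 100, true_score=0.81) for s in seeds]

base_mean, base_std = summarize(baseline)
cand_mean, cand_std = summarize(candidate)
gain = cand_mean - base_mean

print(f"baseline:  {base_mean:.3f} +/- {base_std:.3f}")
print(f"candidate: {cand_mean:.3f} +/- {cand_std:.3f}")

# Informal check (an assumption for illustration): treat the gain as
# suspect unless it exceeds the larger of the two seed-to-seed spreads.
looks_real = gain > max(base_std, cand_std)
print(f"gain: {gain:+.3f}, exceeds seed noise: {looks_real}")
```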

What to pin

Pin the seeds together with the library versions, and record the hardware where it matters, so the run can be rerun later under the same conditions.
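
One way to capture this is a small manifest saved next to the results. The sketch below is an assumption about what such a record might contain: build_manifest is a hypothetical helper, and the package names passed in are examples. It uses only the standard library and records None for any package that is not installed. It does not capture the GPU model, which would need a framework-specific query.

```python
import json
import platform
import sys
from importlib import metadata


def build_manifest(seed: int, packages):
    """Collect the seed, interpreter, platform, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "seed": seed,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


manifest = build_manifest(seed=42, packages=["numpy", "torch"])
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
print(manifest_json)  # write this string to a file alongside the run's outputs
```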

Key idea

Fixing seeds across the language, libraries, and framework makes a run repeatable, but report results over several seeds so you can tell a real improvement from random variation.
