Randomness everywhere
Training draws on many sources of randomness: weight initialization, data shuffling, dropout, and augmentation. To reproduce a run you must fix the seeds that drive all of them.
- Set seeds for the language's built-in random generator, the array library, and the training framework.
- Control data ordering and split seeds.
- Record library versions and hardware where relevant.
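The checklist above can be sketched as one helper that is called at the start of a run. This is a minimal sketch, not a definitive recipe: the names `seed_everything` and `shuffled_indices` are hypothetical, and NumPy and PyTorch are assumed here only as examples of "the array library" and "the framework"; they are seeded only if they happen to be installed.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed the random sources available in this environment (hypothetical helper)."""
    # Python's built-in global generator.
    random.seed(seed)

    # Hash randomization is fixed when the interpreter starts, so this
    # setting only affects child processes launched after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np  # assumed array library
        np.random.seed(seed)  # legacy global NumPy generator
    except ImportError:
        pass

    try:
        import torch  # assumed framework
        torch.manual_seed(seed)
    except ImportError:
        pass


def shuffled_indices(n: int, seed: int) -> list:
    """Return a data ordering that depends only on its own seed."""
    indices = list(range(n))
    # A local generator keeps the ordering independent of the global state.
    random.Random(seed).shuffle(indices)
    return indices


seed_everything(0)
first = [random.random() for _ in range(3)]
seed_everything(0)
second = [random.random() for _ in range(3)]
print(first == second)
print(shuffled_indices(10, seed=1) == shuffled_indices(10, seed=1))
```

Giving the data ordering its own seed and its own generator means the split stays the same even if other code consumes random numbers from the global generator in a different order.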
Seeds are not the whole story
A fixed seed makes one run repeatable, but a single seed can also mislead. A good score on seed 42 may be luck.
- Report results across several seeds with mean and spread.
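- Distinguish a real gain from seed noise by comparing the gain against the spread across seeds.
- Note that some GPU operations remain nondeterministic even with fixed seeds, so reruns can still differ slightly.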
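Reporting over several seeds can be sketched as follows. The function `train_and_evaluate` is a hypothetical stand-in that simulates a score with seed-dependent noise; in practice it would be replaced by a real training run. The score levels and noise size are made-up illustration values, not results.

```python
import random
import statistics


def train_and_evaluate(name: str, seed: int, true_score: float) -> float:
    """Hypothetical stand-in for a training run: a true score plus seed noise."""
    rng = random.Random(f"{name}-{seed}")  # local generator, one per run
    return true_score + rng.gauss(0.0, 0.01)


def summarize(scores):
    """Return the mean and the sample standard deviation of the scores."""
    return statistics.mean(scores), statistics.stdev(scores)


seeds = [0, 1, 2, 3, 4]
baseline = [train_and_evaluate("baseline", s, 0.800) for s in seeds]
variant = [train_and_evaluate("variant", s, 0.805) for s in seeds]

for name, scores in (("baseline", baseline), ("variant", variant)):
    mean, spread = summarize(scores)
    print(f"{name}: {mean:.3f} +/- {spread:.3f} over {len(scores)} seeds")

gap = statistics.mean(variant) - statistics.mean(baseline)
noise = max(statistics.stdev(baseline), statistics.stdev(variant))
# Rough heuristic only, not a statistical test: a gap smaller than the
# seed-to-seed spread should not be reported as a real improvement.
print(f"gap = {gap:.3f}, seed spread = {noise:.3f}")
```

Comparing the gap with the spread is only a first check; with more seeds a proper significance test becomes possible.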
What to pin
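Pin the seeds together with the library versions and, where relevant, the hardware, and store them alongside the results so the run can be rerun later.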
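One way to keep this information together is a small manifest saved next to the results. This is a sketch under assumptions: the function name `run_manifest` and the package list are hypothetical, and packages that are not installed are recorded as `None` rather than guessed.

```python
import json
import platform
import sys
from importlib import metadata


def run_manifest(seed: int, packages: list) -> dict:
    """Collect the seed, interpreter, platform, and package versions for one run."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "seed": seed,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


# The package names here are examples; list whatever the run actually imports.
manifest = run_manifest(seed=0, packages=["numpy", "torch"])
print(json.dumps(manifest, indent=2))
```

Writing the manifest as JSON makes it easy to compare two runs and spot a version or hardware difference before blaming the seed.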
Key idea
Fixing seeds across the language, libraries, and framework makes a run repeatable, but report results over several seeds so you can tell a real improvement from random variation.