The problem it solves
Models are trained in one framework but often need to run somewhere else, perhaps a mobile device, a server runtime, or specialized hardware. Rewriting a model for each target is brittle. ONNX, the Open Neural Network Exchange, is a shared format that lets a model move between tools.
What ONNX is
An ONNX file stores a computation graph: nodes, each an operator such as a matrix multiply or a convolution; edges, the tensors that flow between nodes; and the trained weights, stored as initializers. Operators come from a versioned set called the opset, and each model records the opset version it was exported with.
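To make the pieces concrete, here is a minimal sketch of such a graph in plain Python. It is a toy stand-in, not the real `onnx` package: the class names `Node` and `Graph`, the field names, and the opset value 17 are all assumptions chosen for illustration. It encodes y = x @ W + b and shows that an edge is simply a tensor name that one node produces and another consumes.

```python
from dataclasses import dataclass, field

# Toy stand-ins for an ONNX graph. These are hypothetical classes,
# NOT the real onnx package API.

@dataclass
class Node:
    op_type: str   # operator name, e.g. "MatMul"
    inputs: list   # names of the tensors this node consumes
    outputs: list  # names of the tensors this node produces

@dataclass
class Graph:
    opset: int     # operator-set version the nodes assume
    inputs: list   # tensors supplied by the caller at run time
    outputs: list  # tensors returned to the caller
    initializers: dict = field(default_factory=dict)  # trained weights by name
    nodes: list = field(default_factory=list)

    def edges(self):
        """Yield (producer_op, tensor_name, consumer_op) for tensors passed between nodes."""
        producer = {out: n.op_type for n in self.nodes for out in n.outputs}
        for n in self.nodes:
            for name in n.inputs:
                if name in producer:  # graph inputs and weights have no producer node
                    yield (producer[name], name, n.op_type)

# y = x @ W + b, with W and b stored as initializers.
linear = Graph(
    opset=17,  # illustrative value
    inputs=["x"],
    outputs=["y"],
    initializers={"W": [[1.0, 2.0], [3.0, 4.0]], "b": [0.5, -0.5]},
    nodes=[
        Node("MatMul", ["x", "W"], ["xw"]),
        Node("Add", ["xw", "b"], ["y"]),
    ],
)

print(list(linear.edges()))  # the single internal edge: MatMul -> "xw" -> Add
```

Note that the weights travel with the graph rather than with any framework code, which is what lets a different tool pick the file up later.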
- A framework exports its model to the ONNX graph.
- A separate runtime loads that graph and executes it, choosing optimized kernels for the local hardware.
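The export-then-execute split can be sketched with a toy runtime. This is not ONNX Runtime or any real library: the dictionary layout, the `run` function, and the `REFERENCE_KERNELS` table are assumptions made for illustration. The point is only that the graph names operators, while the runtime decides which kernel implements each one.

```python
# Toy runtime illustrating load-then-dispatch. All names here are
# hypothetical; this is NOT the ONNX Runtime API.

def matmul(a, b):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(a[i] * b[i][j] for i in range(len(a))) for j in range(len(b[0]))]

def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

# A backend is a table from operator name to kernel. A real runtime would
# hold hardware-specific kernels here; these are plain Python references.
REFERENCE_KERNELS = {"MatMul": matmul, "Add": add}

# The "exported" model: y = x @ W + b.
GRAPH = {
    "initializers": {"W": [[1.0, 2.0], [3.0, 4.0]], "b": [0.5, -0.5]},
    "nodes": [  # (op_type, input names, output name), in topological order
        ("MatMul", ["x", "W"], "xw"),
        ("Add", ["xw", "b"], "y"),
    ],
    "outputs": ["y"],
}

def run(graph, feeds, kernels):
    """Execute each node in order, looking up its kernel by operator name."""
    tensors = dict(graph["initializers"])  # start with the stored weights
    tensors.update(feeds)                  # add the caller's inputs
    for op_type, inputs, output in graph["nodes"]:
        if op_type not in kernels:
            raise NotImplementedError(f"no kernel for operator {op_type!r}")
        tensors[output] = kernels[op_type](*(tensors[name] for name in inputs))
    return [tensors[name] for name in graph["outputs"]]

result = run(GRAPH, {"x": [1.0, 1.0]}, REFERENCE_KERNELS)
print(result)  # [[4.5, 5.5]]
```

Swapping `REFERENCE_KERNELS` for another table with the same operator names would change how the model runs without touching `GRAPH`, which is the decoupling the two steps above describe.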
What to watch for
- Opset mismatch: if the model was exported with a newer opset than the runtime supports, the runtime may reject the model or lack some of its operators.
- Unsupported operators: a custom layer may have no ONNX equivalent, so export fails unless you provide a fallback, such as rewriting the layer with standard operators or registering a custom operator.
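- Backend differences: ONNX separates the model definition from execution, so the same file can run on many backends, but each runtime applies its own graph optimizations and kernels. Outputs may differ slightly, so compare them against the original model after export.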
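The opset and operator checks in the list above can be expressed as a small pre-flight function. This is a sketch under stated assumptions: `check_compatibility`, its parameters, the opset numbers, and the operator name `MyCustomGate` are all hypothetical, and a real runtime reports these problems in its own way when it loads a model.

```python
def check_compatibility(model_opset, model_ops, runtime_max_opset, runtime_ops):
    """Return a list of human-readable problems; an empty list means no issue was found.

    Hypothetical helper for illustration, not part of any ONNX tool.
    """
    problems = []
    # Opset mismatch: the model assumes a newer operator set than the runtime knows.
    if model_opset > runtime_max_opset:
        problems.append(
            f"opset {model_opset} is newer than the runtime's maximum {runtime_max_opset}"
        )
    # Unsupported operators: anything the model uses that the runtime has no kernel for.
    for op in sorted(set(model_ops) - set(runtime_ops)):
        problems.append(f"operator {op!r} is not supported")
    return problems

problems = check_compatibility(
    model_opset=17,                                # illustrative values
    model_ops=["MatMul", "Add", "MyCustomGate"],   # MyCustomGate is a made-up custom layer
    runtime_max_opset=13,
    runtime_ops={"MatMul", "Add", "Relu"},
)
for p in problems:
    print(p)
```

Running a check like this before deployment turns a load-time failure on the target device into an error you can see at export time.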
Key idea
ONNX is a portable computation graph that decouples the training framework from execution, so a model exported once can run on many runtimes, as long as each runtime supports the model's opset and operators.