The ONNX Interchange Format

The problem it solves

Models are trained in one framework but often need to run somewhere else, perhaps a mobile device, a server runtime, or specialized hardware. Rewriting a model for each target is brittle. ONNX, the Open Neural Network Exchange, is a shared format that lets a model move between tools.

What ONNX is

An ONNX file is a computation graph. It stores nodes, each an operator such as a matrix multiply or a convolution; the tensors that flow along the edges between those nodes; and the trained weights, kept as initializers. Operators come from a versioned set called the opset, and the file records which opset version it was exported with.
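To make the pieces concrete, here is a minimal sketch of such a graph in plain Python. It is a toy stand-in, not the real ONNX file format or the onnx library API: the Node and Graph classes, the undefined_tensors helper, and the tensor names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op_type: str   # operator name, e.g. "MatMul"
    inputs: list   # names of the tensors this node consumes
    outputs: list  # names of the tensors this node produces

@dataclass
class Graph:
    opset_version: int  # which operator-set version the nodes refer to
    inputs: list        # names of tensors supplied at run time
    outputs: list       # names of tensors returned to the caller
    initializers: dict = field(default_factory=dict)  # name -> trained weights
    nodes: list = field(default_factory=list)         # assumed topologically sorted

def undefined_tensors(graph):
    """Return tensor names that are used before anything defines them."""
    known = set(graph.inputs) | set(graph.initializers)
    missing = []
    for node in graph.nodes:
        missing += [name for name in node.inputs if name not in known]
        known.update(node.outputs)
    missing += [name for name in graph.outputs if name not in known]
    return missing

# A single linear layer, y = x @ W + b, written as two nodes.
linear = Graph(
    opset_version=17,
    inputs=["x"],
    outputs=["y"],
    initializers={"W": [[1.0, 2.0], [3.0, 4.0]], "b": [0.5, -0.5]},
    nodes=[
        Node("MatMul", ["x", "W"], ["xW"]),
        Node("Add", ["xW", "b"], ["y"]),
    ],
)

print(undefined_tensors(linear))  # prints []
```

Edges are implicit here: a node is connected to another whenever an output name of one appears among the inputs of the other, which is why a simple name check is enough to spot a dangling reference.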

A framework exports its model to the ONNX graph.
A separate runtime loads that graph and executes it, choosing optimized kernels for the local hardware.
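The export-then-execute split can be sketched with a toy interpreter. This is an illustration under stated assumptions, not how any real ONNX runtime is implemented: the dict layout of exported, the KERNELS registry, and the run function are hypothetical, and the kernels work on plain Python lists rather than optimized hardware code.

```python
# Toy kernels on nested lists. A real runtime would pick an implementation
# tuned for the local hardware instead.
def matmul(a, b):
    # a is rows x k, b is k x cols
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, bias):
    # add a bias vector to every row
    return [[x + y for x, y in zip(row, bias)] for row in a]

def relu(a):
    return [[max(0.0, x) for x in row] for row in a]

# The runtime's side: which operators it can execute.
KERNELS = {"MatMul": matmul, "Add": add, "Relu": relu}

# The exporter's side: a framework-independent description of the model.
exported = {
    "inputs": ["x"],
    "outputs": ["y"],
    "initializers": {"W": [[1.0, -1.0], [2.0, 0.5]], "b": [0.0, 1.0]},
    "nodes": [
        {"op": "MatMul", "inputs": ["x", "W"], "outputs": ["h"]},
        {"op": "Add", "inputs": ["h", "b"], "outputs": ["z"]},
        {"op": "Relu", "inputs": ["z"], "outputs": ["y"]},
    ],
}

def run(graph, feeds):
    """Execute nodes in order, looking up a kernel for each operator."""
    tensors = dict(graph["initializers"])
    tensors.update(feeds)
    for node in graph["nodes"]:
        kernel = KERNELS[node["op"]]  # KeyError means an unsupported operator
        args = [tensors[name] for name in node["inputs"]]
        tensors[node["outputs"][0]] = kernel(*args)
    return [tensors[name] for name in graph["outputs"]]

print(run(exported, {"x": [[1.0, 2.0]]}))  # prints [[[5.0, 1.0]]]
```

Nothing in run depends on the framework that produced exported, which is the point of the split: swapping KERNELS for a different set of implementations changes how the model executes without touching the model itself.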

What to watch for

Opset mismatch: if the runtime only supports opset versions older than the one the model was exported with, it may reject the model or lack some of its operators.
Unsupported operators: a custom layer may have no ONNX equivalent, so export fails or needs a fallback.
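Backend differences: ONNX separates the model definition from execution, so the same file can run on many backends, but each runtime applies its own graph optimizations and kernels, so speed and low-order numerical results can differ between them.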
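The first two checks can be done before a model is ever loaded. The sketch below is a hypothetical pre-flight check in plain Python, not a function from any ONNX library: compatibility_problems, its parameters, and the example operator names are assumptions made for illustration.

```python
def compatibility_problems(model_opset, model_ops, runtime_max_opset, runtime_ops):
    """List the reasons a runtime could not execute a model, if any."""
    problems = []
    if model_opset > runtime_max_opset:
        problems.append(
            f"model uses opset {model_opset}, "
            f"runtime supports up to {runtime_max_opset}"
        )
    # Operators the model needs that the runtime has no kernel for.
    for op in sorted(set(model_ops) - set(runtime_ops)):
        problems.append(f"operator {op!r} is not supported")
    return problems

# Hypothetical example: a newer export with one custom layer,
# checked against an older runtime.
issues = compatibility_problems(
    model_opset=17,
    model_ops=["MatMul", "Add", "MyCustomLayer"],
    runtime_max_opset=13,
    runtime_ops=["MatMul", "Add", "Relu"],
)
for issue in issues:
    print(issue)
```

An empty list means neither problem was found. It says nothing about the third point, since two runtimes that both pass this check may still differ in speed or in the last digits of their outputs.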

Key idea

ONNX is a portable computation graph that decouples the training framework from execution, so a model exported once can run on many runtimes, as long as each runtime supports the model's opset and operators.
