← Lessons

quiz vs the machine

Gold1380

Machine Learning

The ONNX Runtime

A portable model format and engine that runs across many backends.

4 min read · core · beat Gold to climb

A shared model format

Different frameworks save models in their own formats, which complicates deployment. ONNX is an open standard that represents a model as a graph of standard operators, so a model trained in one framework can run elsewhere.

The runtime

ONNX Runtime is an engine that loads an ONNX graph and executes it efficiently. Its key feature is execution providers: pluggable backends for different hardware such as CPUs, GPUs, or specialized accelerators.

  • The runtime applies graph optimizations like fusion and constant folding.
  • It then partitions the graph across available execution providers.
  • Each provider runs the parts it supports best.

From training to deployment

Why teams use it

ONNX decouples the training framework from the serving stack. You can train freely and then deploy the same exported graph on many targets without rewriting the model, while the runtime handles backend specific optimization.

Key idea

ONNX is a portable model graph format and ONNX Runtime executes it across pluggable execution providers, separating training framework from deployment hardware.

Check yourself

Answer to earn rating on the learn ladder.

1. What is the role of execution providers in ONNX Runtime?

2. What problem does the ONNX format solve?