Upload is only the start
A raw upload cannot play on every device. YouTube runs an asynchronous pipeline that transcodes each video into many resolutions and codecs, then publishes the video once the core renditions are ready.
Chunked transcoding
The video is split into segments that are transcoded in parallel across a fleet of workers. Splitting the work this way turns one long serial encode into many short parallel jobs.
- Upload lands in object storage
- A queue dispatches transcode jobs per segment and resolution
- Workers write renditions back to storage
- The video flips to published once core renditions are ready
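The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not YouTube's actual implementation: the resolution names, the choice of core set, and the functions `transcode`, `is_publishable`, and `run_pipeline` are all hypothetical, and a thread pool stands in for the worker fleet.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical values: the notes do not name the resolutions or the core set.
RESOLUTIONS = ["360p", "720p", "1080p"]
CORE_RESOLUTIONS = {"360p", "720p"}


def transcode(segment: int, resolution: str) -> tuple[int, str]:
    """Stand-in for a real encode; returns the (segment, resolution) it produced."""
    return segment, resolution


def is_publishable(done: set[tuple[int, str]], segment_count: int) -> bool:
    """True once every segment exists in every core resolution."""
    return all(
        (seg, res) in done
        for seg in range(segment_count)
        for res in CORE_RESOLUTIONS
    )


def run_pipeline(segment_count: int, workers: int = 4) -> set[tuple[int, str]]:
    """Fan out one job per (segment, resolution) and collect finished renditions."""
    jobs = list(product(range(segment_count), RESOLUTIONS))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = set(pool.map(lambda job: transcode(*job), jobs))
    return done
```

Because the publish check looks only at the core set, a video in this sketch could go live while the remaining renditions are still encoding.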
Decoupling with a queue
A message queue sits between upload and transcoding so spikes in uploads do not overwhelm workers. The uploader gets a fast response while encoding proceeds in the background.
The pipeline is asynchronous because encoding is slow and bursty, and the user should not wait for it.
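A minimal sketch of that decoupling, using Python's standard `queue` and `threading` modules in place of a real message broker. The names `handle_upload`, `worker`, `jobs`, and `processed` are hypothetical, and appending to a list stands in for the slow encode.

```python
import queue
import threading

jobs = queue.Queue()  # buffer between the upload path and the workers
processed = []        # records which videos the worker has handled


def handle_upload(video_id: str) -> str:
    """Enqueue the work and return immediately; no encoding happens here."""
    jobs.put(video_id)
    return "accepted"


def worker() -> None:
    """Drain the queue at the worker's own pace until a None sentinel arrives."""
    while True:
        video_id = jobs.get()
        if video_id is None:
            break
        processed.append(video_id)  # stand-in for the slow encode
```

A burst of calls to `handle_upload` only grows the queue. The worker's throughput is unaffected, and each uploader gets a response without waiting for an encode.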
Key idea
Decouple upload from transcoding with a queue, split each video into segments encoded in parallel, and publish once the core renditions are ready.