Machine Learning

Image Representation and Channels

How a picture becomes a grid of numbers a network can read.

To a computer, an image is just numbers. Each picture is a grid of pixels, and each pixel holds one or more intensity values. Understanding this layout is the first step in computer vision.

Width, height, and channels

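A color image is usually stored as a three-dimensional array with shape (height, width, channels). The first two dimensions locate a pixel by row and column, and the third holds its color values. Some libraries put the channel dimension first instead, giving (channels, height, width).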

  • A grayscale image has one channel, a single brightness value per pixel.
  • A standard color image has three channels for red, green, and blue.
  • Some images add a fourth alpha channel for transparency.
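The three cases above can be sketched with plain Python nested lists, so the example runs without any library. This is only an illustration: real code would normally use an array library, and the `shape` helper and the 2 by 3 image size are assumptions made for the example.

```python
# Build tiny images as nested lists: image[row][col] is one pixel.
HEIGHT, WIDTH = 2, 3  # hypothetical size chosen for the example


def shape(image):
    """Return (height, width, channels) for a nested-list image."""
    height = len(image)
    width = len(image[0])
    pixel = image[0][0]
    # A grayscale pixel is a bare number; a color pixel is a list of values.
    channels = len(pixel) if isinstance(pixel, list) else 1
    return (height, width, channels)


# Grayscale: one brightness value per pixel.
gray = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]

# RGB: three values per pixel (red, green, blue).
rgb = [[[0, 0, 0] for _ in range(WIDTH)] for _ in range(HEIGHT)]

# RGBA: a fourth alpha value for transparency.
rgba = [[[0, 0, 0, 255] for _ in range(WIDTH)] for _ in range(HEIGHT)]

# The first two indices locate the pixel (row 1, column 2);
# the value stored there is its color, here pure red.
rgb[1][2] = [255, 0, 0]

print(shape(gray))   # (2, 3, 1)
print(shape(rgb))    # (2, 3, 3)
print(shape(rgba))   # (2, 3, 4)
print(rgb[1][2][0])  # 255, the red channel of that pixel
```

Note that the row index comes first, which is why the shape is written height before width.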

Pixel values

Each value typically ranges from 0 to 255 when stored as an unsigned byte. A value of 0 means no intensity (dark) and 255 means full intensity for that channel. Before training, these values are often rescaled to the 0 to 1 range or normalized, for example to zero mean and unit variance, so training is more stable.
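As a rough sketch of the two preprocessing options, the functions below rescale byte values to the 0 to 1 range and standardize them to zero mean and unit standard deviation. The function names and the sample values are hypothetical, and `standardize` assumes the values are not all identical (otherwise the standard deviation would be zero).

```python
def rescale(values):
    """Map byte values in 0..255 to floats in 0.0..1.0."""
    return [v / 255 for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit standard deviation.

    Assumes the values are not all equal, so the deviation is nonzero.
    """
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / std for v in values]


red_channel = [0, 51, 102, 255]  # hypothetical byte values for one channel

scaled = rescale(red_channel)
print(scaled)  # [0.0, 0.2, 0.4, 1.0]

standardized = standardize(red_channel)
print(standardized)  # values centered on 0
```

In practice the mean and standard deviation are often computed per channel over a whole training set rather than over a single image.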

Why channels matter

A convolutional network treats channels as separate feature planes. For an RGB input, the first layer sees three color planes. Deeper layers produce many more channels, each representing a learned feature rather than a raw color.
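To make the idea of input and output channels concrete, here is a minimal sketch of a single convolution step in plain Python. It assumes no padding and a stride of 1, and it computes the sliding dot product that deep learning libraries call convolution. The `conv2d` function and both filters are hypothetical and hand-picked; in a real network the filter weights are learned.

```python
def conv2d(image, filters):
    """Valid (no padding), stride-1 convolution over a nested-list image.

    image:   shape [H][W][C_in]
    filters: shape [C_out][K][K][C_in]
    returns: shape [H-K+1][W-K+1][C_out]
    """
    height, width = len(image), len(image[0])
    k = len(filters[0])
    out = []
    for r in range(height - k + 1):
        row = []
        for c in range(width - k + 1):
            pixel = []
            # Each filter produces one output channel.
            for f in filters:
                total = 0.0
                for dr in range(k):
                    for dc in range(k):
                        # Sum over every input channel at this position.
                        for ch, weight in enumerate(f[dr][dc]):
                            total += weight * image[r + dr][c + dc][ch]
                pixel.append(total)
            row.append(pixel)
        out.append(row)
    return out


# A 3x3 RGB image where every pixel is (1.0, 2.0, 3.0).
image = [[[1.0, 2.0, 3.0] for _ in range(3)] for _ in range(3)]

# Two hypothetical 2x2 filters: one responds to red only, one to all channels.
red_only = [[[1.0, 0.0, 0.0] for _ in range(2)] for _ in range(2)]
all_channels = [[[1.0, 1.0, 1.0] for _ in range(2)] for _ in range(2)]

features = conv2d(image, [red_only, all_channels])
print(len(features), len(features[0]), len(features[0][0]))  # 2 2 2
print(features[0][0])  # [4.0, 24.0]
```

The output has two channels because two filters were applied. Neither output channel is a color any more: each one is the response of a filter that mixes the input planes.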

Key idea

An image is a height-by-width-by-channels tensor of pixel intensities, and color is encoded as separate channel planes.

Check yourself

1. What is the typical shape of a color image tensor?

2. How many channels does a standard RGB image have?