Machine Learning

The Pooling and Stride Recap

Downsampling shrinks feature maps for efficiency and invariance.

Downsampling

Convolutional networks typically reduce spatial resolution as they go deeper. Two common tools for this are pooling and strided convolution.

  • Max pooling takes the largest value in each small window.
  • Average pooling takes the mean over the window.
  • A stride greater than one moves the filter several positions per step, so some positions are skipped.
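
To make the two pooling rules concrete, here is a minimal sketch in plain Python. The function name `pool2d`, its parameters, and the 4×4 `feature_map` values are illustrative assumptions, not part of the lesson. It assumes a square window and no padding.

```python
def pool2d(x, size=2, stride=2, mode="max"):
    """Pool a 2D list of numbers with a square window and no padding."""
    rows, cols = len(x), len(x[0])
    out = []
    # Slide the window's top-left corner in steps of `stride`.
    for i in range(0, rows - size + 1, stride):
        out_row = []
        for j in range(0, cols - size + 1, stride):
            window = [x[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            if mode == "max":
                out_row.append(max(window))                 # max pooling
            else:
                out_row.append(sum(window) / len(window))   # average pooling
        out.append(out_row)
    return out


# Hypothetical 4x4 feature map, chosen only for illustration.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 1, 3, 8],
]

max_pooled = pool2d(feature_map, mode="max")  # [[6, 2], [2, 8]]
avg_pooled = pool2d(feature_map, mode="avg")  # [[3.5, 1.0], [1.0, 5.75]]
```

With a 2×2 window and stride two, the 4×4 input becomes 2×2 in both cases. Only the rule for summarizing each window differs.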

Why downsample

  • It reduces the number of activations, saving compute and memory.
  • It grows the effective receptive field quickly.
  • It adds a little translation invariance, since small shifts often leave the pooled value unchanged.
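
The invariance point can be illustrated with a small one-dimensional sketch. The helper `max_pool1d` and the example signals are assumptions made up for this illustration. It shows that a shift inside one pooling window leaves the output unchanged, while a shift across a window boundary does not.

```python
def max_pool1d(x, size=2, stride=2):
    """Max-pool a 1D list with the given window size and stride (no padding)."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]


signal = [0, 9, 0, 0, 0, 0]           # peak at index 1
shifted_within = [9, 0, 0, 0, 0, 0]   # peak moved left, still in the first window
shifted_across = [0, 0, 9, 0, 0, 0]   # peak moved right, into the second window

same = max_pool1d(signal) == max_pool1d(shifted_within)    # True
moved = max_pool1d(signal) == max_pool1d(shifted_across)   # False
```

This is why the invariance is only modest: it holds for shifts that stay inside a window, not for arbitrary shifts.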

A stride of two roughly halves each spatial dimension, much like a 2×2 pooling window applied with stride two.
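
The "roughly halves" claim follows from the usual output-size arithmetic for a sliding window. The sketch below assumes floor rounding and no dilation. The function name `output_size` and the example sizes are illustrative choices.

```python
def output_size(n, kernel, stride, padding=0):
    """Output length along one dimension for a sliding window.

    Assumes floor rounding and no dilation:
    floor((n + 2 * padding - kernel) / stride) + 1
    """
    return (n + 2 * padding - kernel) // stride + 1


pool_even = output_size(32, kernel=2, stride=2)             # 16: exact halving
conv_padded = output_size(32, kernel=3, stride=2, padding=1)  # 16
conv_unpadded = output_size(32, kernel=3, stride=2)         # 15: only roughly half
pool_odd = output_size(7, kernel=2, stride=2)               # 3: last column dropped
```

The result is exactly half only when the input size, kernel, and padding line up. Otherwise the floor rounds it down.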

Key idea

Pooling and strided convolution both downsample feature maps, trading spatial detail for efficiency, larger receptive fields, and modest shift invariance.

Check yourself

1. What does max pooling output for a window?

2. What effect does a stride of two have on spatial size?