Pooling Layers

What it is

A pooling layer reduces the size of a feature map by summarizing small regions into a single value. It usually follows convolution layers in a vision network.

Common kinds

The two common forms summarize a small window such as two by two.

Max pooling keeps the largest value in each window, which keeps the strongest activation
Average pooling keeps the mean of the window, which smooths the signal

A pooling window with a stride of two halves the height and width, so the feature map gets smaller while keeping the channel count.

Why it helps

Pooling has several effects.

It lowers the spatial size, so later layers do less work
It gives some invariance to small shifts, since a pattern that moves a little still lands in the same pooled cell
It widens the area that deeper neurons can see, called the receptive field

Many modern networks also use global average pooling at the end, which collapses each channel to one number before the classifier.

Key idea

Pooling shrinks feature maps by summarizing regions, cutting compute and adding tolerance to small shifts.

What it is

Common kinds

Why it helps

Key idea

Check yourself