The Receptive Field
The receptive field of a neuron is the region of the input image that can influence its value. It tells you how much context a feature has access to and is a key design concern in vision networks.
Growing with depth
A neuron in the first layer sees only a small patch the size of the kernel. As layers stack, each neuron combines the outputs of several neighboring neurons in the layer below, so the region of the input it depends on grows.
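- A single three by three convolutional layer has a receptive field of three by three pixels.
- Each further three by three layer with stride one widens the field by two pixels in each dimension: five by five after two layers, seven by seven after three.
- Stride and pooling make every later layer add more, because its inputs are spaced further apart in the original image.
- Neurons deep in the network can have receptive fields that cover most or all of the image.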
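The growth described above follows a simple recurrence, sketched here for one spatial dimension. The function name `receptive_field` and the `(kernel, stride)` tuple format are illustrative choices, not part of any library. The sketch tracks `jump`, the distance in input pixels between neighboring neurons at the current layer, and assumes no dilation.

```python
def receptive_field(layers):
    """Receptive field size along one dimension.

    layers: list of (kernel, stride) tuples, ordered from input to output.
    This is an illustrative helper, not a library function.
    """
    rf = 1    # an input pixel "sees" only itself
    jump = 1  # spacing, in input pixels, between adjacent neurons at this layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra tap reaches one more jump outward
        jump *= stride             # striding spreads later neurons further apart
    return rf


one_conv = receptive_field([(3, 1)])                     # one 3-wide conv
three_convs = receptive_field([(3, 1)] * 3)              # three stacked convs
with_pool = receptive_field([(3, 1), (2, 2), (3, 1)])    # conv, 2-wide pool, conv

print(one_conv, three_convs, with_pool)  # 3 7 8
```

In the last case the pooling layer itself adds only one pixel, but it doubles the jump, so the convolution after it adds four pixels instead of two.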
Why it matters
To recognize a large object, the network needs neurons whose receptive field covers it. If the field is too small, each neuron sees only a fragment of the object and cannot reason about its global shape.
Ways to grow it
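- Deeper networks accumulate field size layer by layer.
- Strided convolution and pooling expand it quickly by downsampling: every later layer works on a coarser grid, so each of its kernel taps spans more input pixels.
- Dilated convolution spreads kernel taps apart to cover more area without more weights. A three-tap kernel with dilation d spans 2d + 1 input positions.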
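As a rough illustration of the dilation point, a kernel with `k` taps and dilation `d` spans `d * (k - 1) + 1` positions. The sketch below compares a stack of ordinary three-tap layers with a stack whose dilation doubles at each layer. The helper names and the doubling schedule are assumptions made for the example, and every layer is assumed to have stride one.

```python
def effective_kernel(kernel, dilation):
    """Span of a dilated kernel along one dimension (illustrative helper)."""
    return dilation * (kernel - 1) + 1


def stack_field(kernel, dilations):
    """Receptive field of stacked stride-1 layers, one dilation per layer."""
    rf = 1
    for d in dilations:
        rf += effective_kernel(kernel, d) - 1
    return rf


single = effective_kernel(3, 2)         # one 3-tap kernel, dilation 2
plain = stack_field(3, [1, 1, 1, 1])    # four ordinary layers
dilated = stack_field(3, [1, 2, 4, 8])  # four layers, doubling dilation

print(single, plain, dilated)  # 5 9 31
```

Both four-layer stacks use the same number of weights per layer, yet the dilated stack covers 31 positions instead of 9.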
Designers balance receptive field against resolution: large fields capture context, but coarse downsampling can lose the fine detail needed for small objects.
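One way to see this trade-off is to track the cumulative stride alongside the receptive field. The following sketch uses a hypothetical helper, `field_and_stride`, and two made-up four-layer configurations. It shows two stacks that reach the same field size while leaving very different output resolutions.

```python
def field_and_stride(layers):
    """Return (receptive field, cumulative stride) along one dimension.

    layers: list of (kernel, stride, dilation) tuples, input to output.
    Illustrative helper, not a library function.
    """
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += dilation * (kernel - 1) * jump
        jump *= stride
    return rf, jump


# Four 3-tap layers, each with stride 2 and no dilation.
strided = field_and_stride([(3, 2, 1)] * 4)
# Four 3-tap layers with stride 1 and dilations 1, 2, 4, 8.
dilated = field_and_stride([(3, 1, d) for d in (1, 2, 4, 8)])

print(strided)  # (31, 16)
print(dilated)  # (31, 1)
```

Both stacks see 31 input pixels, but the strided one produces one output value per 16 input pixels, so an object only a few pixels wide lands in a single output cell. The dilated stack keeps full resolution at the cost of computing every layer on the full-size grid.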
Key idea
The receptive field is the input region a neuron can see, and it grows with depth, stride, pooling, and dilation to capture larger context.