Filters and Feature Maps
A single kernel detects one pattern, but a real layer uses many. Each kernel, also called a filter, produces its own output grid called a feature map. Stacking these maps gives the layer many channels of learned features.
From filters to maps
When a convolutional layer has thirty two filters, it outputs thirty two feature maps. Each map highlights where its filter found a match, such as a vertical edge, a color blob, or a corner.
- One filter spans all input channels and outputs one feature map.
- The number of filters sets the output channel count.
- Deeper layers combine simple maps into more abstract ones.
A hierarchy of features
Early layers learn simple parts like edges and textures. Middle layers combine those into shapes such as eyes or wheels. Late layers respond to whole objects. This feature hierarchy is why deep networks see so well.
Because each filter has its own weights, the network discovers a diverse set of detectors during training rather than relying on hand designed features.
Key idea
Each filter produces one feature map, and stacking many maps builds a hierarchy of learned features from edges up to objects.