Classic CNN Architectures
A handful of landmark architectures defined how convolutional networks are built. Studying them shows the patterns that still appear in modern designs.
The early designs
- LeNet was an early network for digit recognition, alternating convolution and pooling before dense layers.
- AlexNet scaled this up, won the 2012 ImageNet challenge (ILSVRC), and popularized ReLU activations and dropout.
Together they showed that depth, large datasets, and GPU training could beat hand-crafted vision features.
The deeper era
- VGG stacked many small 3x3 convolutions, showing that depth built from simple, uniform blocks works well.
- GoogLeNet introduced the Inception module, which runs several kernel sizes in parallel and concatenates their outputs along the channel dimension.
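The parallel-branch idea can be sketched with shape arithmetic alone. The snippet below is a minimal illustration, not GoogLeNet's actual definition: the helper names (`same_padding`, `conv_out`, `inception_output_shape`) and the branch widths are assumptions chosen for the example. It shows that when every branch pads its input to preserve spatial size, concatenation simply adds the channel counts.

```python
def same_padding(kernel: int) -> int:
    """Padding that keeps spatial size unchanged for an odd kernel at stride 1."""
    return (kernel - 1) // 2


def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def inception_output_shape(height, width, branches):
    """Return (channels, height, width) after concatenating all branches.

    branches: list of (kernel_size, out_channels) pairs, each applied
    at stride 1 with padding that preserves the spatial size.
    """
    total_channels = 0
    for kernel, out_channels in branches:
        pad = same_padding(kernel)
        h = conv_out(height, kernel, 1, pad)
        w = conv_out(width, kernel, 1, pad)
        if (h, w) != (height, width):
            raise ValueError("branches must preserve spatial size to be concatenated")
        total_channels += out_channels
    return (total_channels, height, width)


# Illustrative branch widths (assumed for this example):
# 1x1 conv, 3x3 conv, 5x5 conv, and a 3x3 pooling branch with a projection.
branches = [(1, 64), (3, 128), (5, 32), (3, 32)]
shape = inception_output_shape(28, 28, branches)
print(shape)  # (256, 28, 28)
```

The spatial size stays at 28x28 while the channel count becomes the sum of the branch widths, which is why branch widths need to be budgeted carefully as modules are stacked.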
Common pattern
Most classic networks follow the same flow: convolution and pooling layers gradually shrink spatial size while growing channel count, then a classifier head turns the final features into class scores.
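This flow can be traced with the standard output-size formula, floor((size + 2 * padding - kernel) / stride) + 1. The sketch below assumes a small LeNet-style stack on a 32x32 single-channel input; the layer list and the helper names (`conv_out`, `trace_shapes`) are illustrative choices, not a definitive specification of any published network.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# Each entry: (kind, kernel, stride, out_channels). Pooling keeps the channel count.
# Assumed LeNet-style layout, used here only for illustration.
layers = [
    ("conv", 5, 1, 6),
    ("pool", 2, 2, None),
    ("conv", 5, 1, 16),
    ("pool", 2, 2, None),
]


def trace_shapes(channels, size, layers):
    """Return the (channels, height, width) shape after the input and each layer."""
    shapes = [(channels, size, size)]
    for kind, kernel, stride, out_channels in layers:
        size = conv_out(size, kernel, stride)
        if kind == "conv":
            channels = out_channels
        shapes.append((channels, size, size))
    return shapes


shapes = trace_shapes(channels=1, size=32, layers=layers)
for shape in shapes:
    print(shape)
# (1, 32, 32)
# (6, 28, 28)
# (6, 14, 14)
# (16, 10, 10)
# (16, 5, 5)

# The classifier head sees the final feature map flattened into a vector.
c, h, w = shapes[-1]
flat_features = c * h * w
print(flat_features)  # 400
```

Spatial size falls from 32 to 5 while channels grow from 1 to 16, and the 400 flattened features are what the dense layers turn into class scores.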
The lasting lessons are that small stacked kernels are efficient, that depth helps until training gets hard, and that parameter count must be managed as networks grow.
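The efficiency claim for small stacked kernels can be checked with simple arithmetic. The sketch below assumes stride-1 convolutions with the same channel count C in and out and ignores biases; the helper names and the choice C = 64 are assumptions for illustration. It compares stacks of 3x3 layers against a single larger kernel covering the same receptive field.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weight count of one square convolution layer (biases ignored)."""
    return kernel * kernel * in_ch * out_ch


def stacked_receptive_field(kernel, n_layers):
    """Receptive field of n stacked stride-1 convolutions with the same kernel size."""
    return 1 + n_layers * (kernel - 1)


C = 64  # assumed channel width, kept constant across layers

# Two 3x3 layers see a 5x5 region; three 3x3 layers see a 7x7 region.
print(stacked_receptive_field(3, 2))  # 5
print(stacked_receptive_field(3, 3))  # 7

two_3x3 = 2 * conv_params(3, C, C)
one_5x5 = conv_params(5, C, C)
three_3x3 = 3 * conv_params(3, C, C)
one_7x7 = conv_params(7, C, C)

print(two_3x3, one_5x5)    # 73728 102400
print(three_3x3, one_7x7)  # 110592 200704
```

Under these assumptions the stacked version needs 18C^2 weights instead of 25C^2 for a 5x5 field, and 27C^2 instead of 49C^2 for a 7x7 field, while also inserting an extra nonlinearity between layers.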
Key idea
Classic CNNs such as LeNet, AlexNet, VGG, and GoogLeNet (Inception) shrink spatial size while growing channel count, and they taught that stacked small kernels and depth work well.