The goal
Neural architecture search (NAS) automates the design of the network itself: it chooses layer types, layer widths, and the connections between layers. The aim is to find architectures that outperform hand-designed ones for a given task and budget.
Three ingredients
- Search space: the set of allowed architectures, such as a stack of cells whose operations are chosen from a fixed menu.
- Search strategy: how candidates are proposed, using reinforcement learning, evolution, or gradient-based methods.
- Performance estimation: how each candidate is scored, ideally without fully training it.
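To make the three ingredients concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the operation menu `OPS`, the three-slot cell, and the toy scoring table are hypothetical, and plain random search stands in for the reinforcement learning, evolutionary, or gradient-based strategies named above. A real performance estimator would train and validate each candidate rather than look up fixed values.

```python
import random

# Search space (hypothetical): a cell with 3 slots, each filled from a fixed menu.
OPS = ["conv3x3", "conv5x5", "maxpool", "skip"]
NUM_SLOTS = 3


def sample_architecture(rng):
    """Search strategy: plain random search, one operation drawn per slot."""
    return tuple(rng.choice(OPS) for _ in range(NUM_SLOTS))


def estimate_performance(arch):
    """Performance estimation: a toy score standing in for validation accuracy.

    The per-operation values are made up; a real estimator would train the
    candidate (fully or partially) and measure it on held-out data.
    """
    op_value = {"conv3x3": 0.30, "conv5x5": 0.25, "maxpool": 0.10, "skip": 0.05}
    return sum(op_value[op] for op in arch)


def random_search(num_trials, seed=0):
    """Propose candidates, score each one, and keep the best seen so far."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        score = estimate_performance(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


if __name__ == "__main__":
    arch, score = random_search(num_trials=50)
    print(arch, round(score, 2))
```

Swapping in a different strategy only changes how `sample_architecture` proposes candidates; the search space and the estimator stay the same, which is why the three ingredients are usually discussed separately.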
Why naive NAS is brutal
Early NAS methods trained thousands of candidate models from scratch, which demanded enormous compute. Modern methods cut this cost sharply:
- Weight sharing trains one supernet that contains all candidate paths, so sub-architectures inherit weights instead of training from scratch.
- Differentiable search relaxes the discrete choice into continuous weights over operations, so gradient descent can pick the structure.
- Proxy tasks estimate quality from a smaller dataset or fewer epochs.
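The differentiable relaxation in the list above can be sketched on a toy problem. The code below is an illustration under stated assumptions, not any particular method's implementation: the three scalar operations in `CANDIDATE_OPS`, the target function `2 * x`, and the learning rate are all hypothetical. One continuous weight (`alpha`) per operation is passed through a softmax, the mixed output is the weighted sum of all operations, and gradient descent on a squared error adjusts the weights. The final structure is read off by taking the operation with the largest weight.

```python
import math

# Hypothetical menu of candidate operations on a scalar input.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def softmax(alphas):
    """Turn unconstrained architecture weights into a probability vector."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of every candidate operation."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def search(steps=200, lr=0.5, x=1.0):
    """Gradient descent on the architecture weights only (toy target: 2 * x)."""
    alphas = [0.0] * len(CANDIDATE_OPS)
    target = 2.0 * x
    for _ in range(steps):
        weights = softmax(alphas)
        outputs = [op(x) for op in CANDIDATE_OPS.values()]
        out = sum(w * o for w, o in zip(weights, outputs))
        err = out - target
        # d(out)/d(alpha_j) = w_j * (o_j - out) for a softmax-weighted sum.
        grads = [2.0 * err * w * (o - out) for w, o in zip(weights, outputs)]
        alphas = [a - lr * g for a, g in zip(alphas, grads)]
    return alphas


def discretize(alphas):
    """Pick the discrete structure: the operation with the largest weight."""
    names = list(CANDIDATE_OPS)
    return names[max(range(len(alphas)), key=lambda i: alphas[i])]


if __name__ == "__main__":
    print(discretize(search()))
```

In this sketch the operations have no trainable parameters of their own, so only the architecture weights are learned. In a real supernet the operations' weights are trained as well, which is where weight sharing and differentiable search meet.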
The risk is that a cheap proxy may rank architectures differently from full training, so the design the search selects can underperform when scaled up.
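One way to check for this risk is to measure how well the proxy ranking agrees with the full-training ranking on a small set of architectures. The sketch below computes Kendall's tau (the simple tau-a variant, which does not correct for ties) in plain Python. The score lists are made-up numbers chosen only to illustrate a disagreement; they are not measurements from any experiment.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs.

    +1 means the two score lists rank every pair the same way,
    -1 means they rank every pair in opposite order.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for four architectures (illustrative values only).
proxy_scores = [0.70, 0.72, 0.68, 0.75]  # e.g. a few epochs on a small dataset
full_scores = [0.90, 0.88, 0.91, 0.89]   # e.g. full training

if __name__ == "__main__":
    tau = kendall_tau(proxy_scores, full_scores)
    best_by_proxy = max(range(4), key=lambda i: proxy_scores[i])
    best_by_full = max(range(4), key=lambda i: full_scores[i])
    print(tau, best_by_proxy, best_by_full)
```

With these illustrative numbers the correlation is negative and the proxy's top pick differs from the top pick under full training, which is exactly the failure described above. A tau close to 1 on a held-out set of fully trained architectures would suggest the proxy is safe to use for ranking.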
Key idea
NAS automates network design through a search space, a search strategy, and a performance estimator; weight sharing and differentiable methods make it affordable, but cheap proxies can mislead the ranking.