The core trick
The bootstrap estimates how much a metric would vary if you could collect new datasets, using only the one sample you have. It treats your sample as a stand-in for the population and resamples from it.
How it runs
- Resample with replacement to build a new dataset of the same size.
- Recompute the metric, such as accuracy or AUC, on that resample.
- Repeat thousands of times to build a distribution of the metric.
The spread of that distribution reflects the uncertainty in your estimate.
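The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names (`accuracy`, `bootstrap_distribution`), the toy labels and predictions, the seed, and the choice of 2000 resamples are all assumptions made for the example.

```python
import random
import statistics

def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)

def bootstrap_distribution(labels, preds, metric, n_resamples=2000, seed=0):
    """Recompute `metric` on resamples drawn with replacement from the data."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    for _ in range(n_resamples):
        # Draw n row indices with replacement: a new dataset of the same size.
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(metric([labels[i] for i in idx],
                             [preds[i] for i in idx]))
    return values

# Toy data (made up for illustration): 100 examples, 80 predicted correctly.
labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1] * 10
preds = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1] * 10

boot = bootstrap_distribution(labels, preds, accuracy)
print("point estimate:", accuracy(labels, preds))
print("bootstrap mean:", statistics.mean(boot))
print("bootstrap std: ", statistics.stdev(boot))
```

Note that labels and predictions are resampled together by index, so each resampled row keeps its original label and prediction paired.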
Building the interval
A common choice, the percentile interval, takes the middle 95 percent of the bootstrap values: the lower bound is the 2.5th percentile and the upper bound is the 97.5th percentile. This gives a range you can report alongside the point estimate.
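As a rough sketch of the percentile step, the helper below sorts the bootstrap values and reads off the two tail percentiles. The function names and the linear-interpolation rule are assumptions for the example; other percentile conventions give slightly different bounds. The input here is a stand-in list of evenly spaced values, not real bootstrap output.

```python
def percentile(sorted_values, q):
    """Percentile q (0 to 100) of a sorted list, with linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def percentile_interval(boot_values, level=95):
    """Middle `level` percent of the bootstrap values."""
    ordered = sorted(boot_values)
    tail = (100 - level) / 2  # 2.5 for a 95 percent interval
    return percentile(ordered, tail), percentile(ordered, 100 - tail)

# Stand-in for bootstrap output: 1001 evenly spaced values from 0 to 1.
stand_in_values = [i / 1000 for i in range(1001)]
lower, upper = percentile_interval(stand_in_values)
print(lower, upper)
```

In practice `boot_values` would be the list of metric values produced by the resampling loop.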
Why it helps
The bootstrap needs no formula for the metric's sampling distribution, so it works for awkward statistics like AUC, where closed-form intervals are hard to derive. Its main assumption is that your sample represents the population, so it cannot rescue data that is biased or too small.
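To illustrate the AUC case, here is a hedged sketch that plugs a simple pairwise AUC into the same resampling loop. Everything in it is an assumption for illustration: the synthetic scores, the helper names, the rough nearest-rank percentiles, and in particular the choice to skip resamples that contain only one class (where AUC is undefined), which is one common workaround rather than the only option.

```python
import random

def auc(labels, scores):
    """Pairwise AUC: chance that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc(labels, scores, n_resamples=1000, seed=0):
    """Bootstrap AUC values, skipping single-class resamples (an assumption)."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        value = auc([labels[i] for i in idx], [scores[i] for i in idx])
        if value is not None:
            values.append(value)
    return values

# Synthetic data: positives score higher on average than negatives.
data_rng = random.Random(1)
labels = [i % 2 for i in range(60)]
scores = [data_rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
boot = sorted(bootstrap_auc(labels, scores))
# Rough nearest-rank percentiles for a 95 percent interval.
lower = boot[int(0.025 * (len(boot) - 1))]
upper = boot[int(0.975 * (len(boot) - 1))]
print(f"AUC {point:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

With only 60 examples the interval is wide, which matches the caveat above: the bootstrap reports the uncertainty of a small sample but cannot remove it.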
Key idea
The bootstrap resamples your data with replacement to build a distribution of a metric, and percentiles of that distribution give a confidence interval without a closed-form formula.