Layers of refinement
The medallion architecture structures a lakehouse into three named layers, each raising data quality and usefulness as data flows forward.
- Bronze holds raw ingested data, kept as close to the source as possible. It is append only and acts as a replayable history of everything that arrived.
- Silver holds cleaned, deduplicated, and conformed data. Types are fixed, bad rows are filtered, and tables are joined into validated entities.
- Gold holds business level aggregates and features shaped for specific dashboards, reports, or models.
Why the layering helps
Each layer has a clear contract, so failures are isolated and ownership is clear. Because bronze keeps raw history, you can rebuild silver and gold whenever transformation logic changes, without re ingesting from sources. Consumers read gold and never touch messy raw data.
Key idea
The medallion architecture flows data through bronze raw, silver cleaned, and gold aggregated layers, isolating responsibility and letting you rebuild downstream tables from raw history.