Row Versus Column Layout
A traditional row store keeps all values of one row together. A column store instead keeps all values of one column together, storing each column as its own contiguous sequence.
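As a minimal sketch of the two layouts, the Python below pivots a small table between a list of rows and a dictionary of columns. The table contents and the helper names `to_columns` and `to_rows` are made up for illustration; real engines store typed, on-disk arrays rather than Python lists.

```python
# Hypothetical three-row table, used only for illustration.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10},
    {"id": 2, "city": "Oslo", "amount": 25},
    {"id": 3, "city": "Lima", "amount": 7},
]

def to_columns(rows):
    """Pivot row dicts into one list per column (column layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

def to_rows(columns):
    """Inverse pivot: rebuild row dicts from equally long column lists."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]

columns = to_columns(rows)
# columns["amount"] is now one contiguous sequence: [10, 25, 7]
```

Both forms hold the same data; only the grouping differs, which is what decides which values sit next to each other on disk.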
Why Columns Help Analytics
Analytic queries often touch a few columns across millions of rows, such as summing one amount column. A column store reads only those columns from disk, skipping the rest entirely. A row store would have to read whole rows, wasting I/O on unneeded fields.
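The difference can be made concrete by counting how many field values each layout has to touch to sum one column. This is a rough sketch under simplifying assumptions: the table is synthetic, the function names are hypothetical, and "fields read" is only a stand-in for bytes of I/O.

```python
# Synthetic table: 1000 rows, 4 fields each (illustrative data only).
rows = [{"id": i, "city": "Oslo", "note": "x", "amount": i} for i in range(1000)]
columns = {name: [row[name] for row in rows] for name in rows[0]}

def sum_from_rows(rows, column):
    """Row layout: each whole row is read to reach one field."""
    total, fields_read = 0, 0
    for row in rows:
        fields_read += len(row)   # every field in the row comes along
        total += row[column]
    return total, fields_read

def sum_from_columns(columns, column):
    """Column layout: only the requested column is read."""
    values = columns[column]
    return sum(values), len(values)

row_total, row_fields = sum_from_rows(rows, "amount")
col_total, col_fields = sum_from_columns(columns, "amount")
```

With four fields per row, the row layout touches four times as many values as the column layout for the same answer; the gap grows with the number of unneeded columns.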
Compression Wins
Storing a column together places many similar values side by side, which compresses extremely well.
- Run-length encoding collapses long runs of repeated values.
- Dictionary encoding replaces repeated strings with small integer codes.
- Better compression means fewer bytes read, so I/O drops further.
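The two encodings in the list above can be sketched in a few lines of Python. The function names are illustrative and this is not any particular engine's on-disk format; production systems add bit-packing and typed buffers on top of the same ideas.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encoding: collapse each run into a (value, count) pair."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]

def dict_encode(values):
    """Dictionary encoding: map each distinct value to a small integer code."""
    codes_by_value = {}
    codes = []
    for value in values:
        # A value's code is its first-seen position in the dictionary.
        codes.append(codes_by_value.setdefault(value, len(codes_by_value)))
    return list(codes_by_value), codes

def dict_decode(dictionary, codes):
    """Look each code up in the dictionary to recover the values."""
    return [dictionary[code] for code in codes]

city = ["Oslo", "Oslo", "Oslo", "Lima", "Lima", "Oslo"]
runs = rle_encode(city)                 # [("Oslo", 3), ("Lima", 2), ("Oslo", 1)]
dictionary, codes = dict_encode(city)   # ["Oslo", "Lima"], [0, 0, 0, 1, 1, 0]
```

Run-length encoding pays off most when the column is sorted or naturally clustered, since that produces long runs; dictionary encoding helps whenever the number of distinct values is small relative to the row count.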
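Many engines can also operate on compressed data directly, for example aggregating over run-length-encoded runs without expanding them, and use vectorized execution that processes batches of column values at once instead of one row at a time.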
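Both ideas can be illustrated with a small sketch. It assumes a column already stored as (value, count) runs, and it uses plain Python lists and the built-in `sum` as a stand-in for the typed arrays and SIMD loops a real engine would use. The names `sum_rle`, `batches`, and `sum_batched` are hypothetical.

```python
def sum_rle(runs):
    """Sum a run-length-encoded column without expanding the runs."""
    return sum(value * count for value, count in runs)

def batches(values, size):
    """Yield consecutive slices of at most `size` values."""
    for start in range(0, len(values), size):
        yield values[start:start + size]

def sum_batched(values, size=1024):
    """Aggregate one batch at a time rather than one value at a time."""
    return sum(sum(batch) for batch in batches(values, size))

runs = [(5, 1000), (7, 500)]           # 1000 fives followed by 500 sevens
expanded = [5] * 1000 + [7] * 500      # the same column, uncompressed

compressed_total = sum_rle(runs)       # two multiplications instead of 1500 additions
batched_total = sum_batched(expanded)
```

The compressed sum does work proportional to the number of runs, not the number of rows. Batching matters in real engines because it amortizes per-call overhead and keeps a tight loop running over contiguous memory.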
The Cost
Column stores pay on writes. Inserting one row means touching every column file, so they favor bulk loads and append-only ingestion over single-row updates. This is why they power data warehouses rather than transactional systems.
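A toy model makes the write cost visible. The `ColumnTable` class below is a hypothetical illustration in which each in-memory list stands in for a column file and `column_writes` counts how many times a column is opened for writing; it does not model any real engine's storage layer.

```python
class ColumnTable:
    """Toy column store: one list per column, plus a write counter."""

    def __init__(self, names):
        self.columns = {name: [] for name in names}
        self.column_writes = 0  # number of separate per-column write operations

    def insert_row(self, row):
        """Single-row insert: every column is touched once per row."""
        for name, column in self.columns.items():
            column.append(row[name])
            self.column_writes += 1

    def bulk_load(self, rows):
        """Bulk load: every column is touched once per batch."""
        for name, column in self.columns.items():
            column.extend(row[name] for row in rows)
            self.column_writes += 1

# Illustrative data: 100 rows with 3 columns.
rows = [{"id": i, "city": "Oslo", "amount": i} for i in range(100)]

one_by_one = ColumnTable(["id", "city", "amount"])
for row in rows:
    one_by_one.insert_row(row)

bulk = ColumnTable(["id", "city", "amount"])
bulk.bulk_load(rows)
```

Both tables end up with identical contents, but the row-at-a-time path performs 300 column writes against 3 for the bulk path, which is the reason column stores prefer batched, append-only ingestion.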
Key idea
Column-oriented storage groups each column's values together so analytic queries read only the columns they need and compress them heavily, at the cost of slow single-row writes.