Rows versus columns
A row store keeps all fields of a record together, which suits transactional reads and writes of whole rows. A column store keeps each field's values together across all rows. Analytic queries usually scan a few columns over many rows, so a columnar layout reads far less data from storage.
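The two layouts can be sketched with plain Python containers. This is only an illustration: the three-record table, its field names, and the `to_columns` helper are hypothetical, and a real engine stores columns in on-disk blocks rather than in lists.

```python
# Hypothetical three-record table, used only for illustration.
rows = [
    {"id": 1, "city": "Oslo", "amount": 20},
    {"id": 2, "city": "Oslo", "amount": 35},
    {"id": 3, "city": "Lima", "amount": 10},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    return {name: [record[name] for record in records] for name in records[0]}


columns = to_columns(rows)

# Row layout: summing one field still walks every whole record.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the same sum touches only the "amount" list.
total_from_columns = sum(columns["amount"])
```

Both totals agree, but the column version never looks at `id` or `city`, which is the saving that grows as the table gets wider.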
Why columnar wins for analytics
- Selective reads: a query touching three of fifty columns reads only those three, skipping the rest entirely.
- Better compression: a column holds values of one type, often with few distinct values, so encodings like dictionary and run-length encoding shrink it dramatically.
- Vectorized execution: processing a column as a batch lets the engine run tight loops and SIMD instructions that operate on many values at once.
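As a rough sketch of the compression point, the two encodings named above can be written in a few lines of Python. The function names and the sample `city` column are assumptions made for illustration; production formats implement these encodings at the byte level with many refinements.

```python
from itertools import groupby


def dictionary_encode(values):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = []  # code -> original value
    index = {}       # original value -> code
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical column with few distinct values and long runs.
city = ["Oslo", "Oslo", "Oslo", "Lima", "Lima", "Oslo"]
dictionary, codes = dictionary_encode(city)
runs = run_length_encode(codes)
```

Here six strings become a two-entry dictionary plus three `(code, count)` pairs. The same values interleaved with other fields in a row layout would rarely form such runs.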
The trade-off
Columnar formats are slow at writing or updating a single full row, since one record's fields are scattered across many column blocks and each block must be touched. They favor append and scan workloads over point updates.
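A toy model of that cost, assuming each column list stands in for a separate storage block. The helper names and the sample data are hypothetical, and real systems usually soften the penalty by batching writes.

```python
def update_row_store(rows, position, new_record):
    """Overwrite one record in place; returns the number of locations touched."""
    rows[position] = new_record
    return 1  # the whole record lives in one place


def update_column_store(columns, position, new_record):
    """Overwrite one record field by field; returns the number of columns touched."""
    touched = 0
    for name, value in new_record.items():
        columns[name][position] = value  # each field lives in a different column
        touched += 1
    return touched


# Hypothetical two-record table in both layouts.
row_table = [
    {"id": 1, "city": "Oslo", "amount": 20},
    {"id": 2, "city": "Lima", "amount": 10},
]
column_table = {"id": [1, 2], "city": ["Oslo", "Lima"], "amount": [20, 10]}

replacement = {"id": 2, "city": "Rome", "amount": 15}
row_writes = update_row_store(row_table, 1, replacement)
column_writes = update_column_store(column_table, 1, replacement)
```

The row layout performs one write while the column layout performs one per field, so the gap widens with every column the table has.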
Key idea
Columnar storage keeps each field's values together so analytic queries read only the columns they need and compress them tightly, trading single-row update speed for fast scans over many rows.