Rows vs Columns
A traditional row-oriented database stores all of a row's fields together on disk. A columnar store instead keeps all values of one column together. For analytical queries that read a few columns across millions of rows, the engine can ignore the columns it does not need, so it reads far less data.
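The difference between the two layouts can be sketched in a few lines of Python. This is only an in-memory illustration, not how any particular database lays out bytes on disk; the table, its column names, and the helper functions are hypothetical.

```python
# Hypothetical three-row table, used only for illustration.
rows = [
    {"id": 1, "city": "Oslo", "amount": 20.0},
    {"id": 2, "city": "Lima", "amount": 35.5},
    {"id": 3, "city": "Oslo", "amount": 12.5},
]

# Row layout: each record's fields sit next to each other.
row_layout = [(r["id"], r["city"], r["amount"]) for r in rows]

# Column layout: all values of one column sit next to each other.
column_layout = {
    "id": [r["id"] for r in rows],
    "city": [r["city"] for r in rows],
    "amount": [r["amount"] for r in rows],
}


def sum_amount_row_layout(table):
    # Every full record is visited even though only one field is needed.
    return sum(record[2] for record in table)


def sum_amount_column_layout(table):
    # Only the "amount" column is read; "id" and "city" are never touched.
    return sum(table["amount"])


total_from_rows = sum_amount_row_layout(row_layout)
total_from_columns = sum_amount_column_layout(column_layout)
```

Both functions return the same total, but the column version touches one list of three numbers while the row version walks every field of every record. On disk, with wide tables and millions of rows, that gap is what makes the analytical scan cheaper.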
Why It Is Fast For Analytics
- A query that sums one column reads only that column, not whole rows.
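- Values in a column share a type and often a similar range, so they compress much better than rows of mixed fields.
- Repeated values enable tricks like run-length encoding and dictionary encoding.
- The engine can often work on compressed blocks directly and skip chunks that cannot match, for example by checking each block's stored minimum and maximum.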
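The compression and skipping ideas in the list above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the sample columns, and the per-block (min, max) statistics are all hypothetical, and real engines use far more compact binary encodings.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def dictionary_encode(values):
    """Replace each value with a small integer code into a lookup table."""
    dictionary = []
    code_of = {}
    codes = []
    for value in values:
        if value not in code_of:
            code_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(code_of[value])
    return dictionary, codes


def sum_runs(runs):
    """Sum a run-length-encoded numeric column without expanding it."""
    return sum(value * count for value, count in runs)


def blocks_to_scan(block_stats, lower_bound):
    """Indexes of blocks that might hold a value >= lower_bound."""
    return [
        index
        for index, (_block_min, block_max) in enumerate(block_stats)
        if block_max >= lower_bound
    ]


# Hypothetical sample columns.
status = ["ok", "ok", "ok", "fail", "ok", "ok"]
quantity = [5, 5, 5, 5, 2, 2]

status_dictionary, status_codes = dictionary_encode(status)
quantity_runs = run_length_encode(quantity)
quantity_total = sum_runs(quantity_runs)

# Hypothetical (min, max) statistics for three stored blocks.
block_stats = [(1, 9), (10, 19), (20, 29)]
candidate_blocks = blocks_to_scan(block_stats, lower_bound=15)
```

Here `sum_runs` computes the total from two runs instead of six values, which is the sense in which an engine can scan compressed data. `blocks_to_scan` drops the first block without reading it, because its maximum is already below the filter bound.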
The Trade-Off
- Reading a single full row means gathering values from many column files.
- Inserting one row touches every column, so single-row writes are slower than in a row store.
- This is why columnar stores power OLAP while row stores power OLTP.
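The cost of single-row access listed above can be seen in a toy column table. This is a minimal sketch, assuming an in-memory store with one Python list per column; the `ColumnTable` class and its methods are hypothetical and stand in for per-column files on disk.

```python
class ColumnTable:
    """Toy columnar table: one list per column."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}

    def insert_row(self, row):
        # One logical insert becomes one append per column.
        for name, values in self.columns.items():
            values.append(row[name])
        return len(self.columns)  # number of columns touched

    def read_row(self, index):
        # Reassembling one record reads one value from every column.
        return {name: values[index] for name, values in self.columns.items()}

    def sum_column(self, name):
        # An analytical scan reads exactly one column.
        return sum(self.columns[name])


table = ColumnTable(["id", "city", "amount"])
touched = table.insert_row({"id": 1, "city": "Oslo", "amount": 20.0})
table.insert_row({"id": 2, "city": "Lima", "amount": 35.5})
first_row = table.read_row(0)
amount_total = table.sum_column("amount")
```

`insert_row` and `read_row` each loop over every column, while `sum_column` touches only one. With hundreds of columns stored in separate files, that loop is the extra work a columnar store pays for transactional access.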
Key Idea
Columnar storage keeps each column's values together, so analytical scans read and decompress only the columns they need, at the cost of slower single-row reads and writes.