The Wide Column Store

Wide column stores group columns into families and let each row hold a different sparse set of columns.

A Different Shape of Table

A wide column store, such as Cassandra or HBase, looks like a relational table but behaves very differently. Each row is identified by a key and can hold a large, sparse set of columns. Two rows in the same table need not share the same columns.
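One way to picture sparse rows is a map from row key to a map of only the columns that row actually stores. The sketch below is a toy in-memory model, not how Cassandra or HBase lay out data; the names `table`, `get_column`, and `columns_of` are made up for illustration.

```python
# Hypothetical in-memory model: row key -> {column name -> value}.
# Each row stores only the columns it actually has.
table = {
    "user:1": {"name": "Ada", "email": "ada@example.com"},
    "user:2": {"name": "Lin", "last_login": "2024-01-05", "plan": "pro"},
}


def get_column(table, row_key, column, default=None):
    """Return a column value, or default when the row does not store it."""
    return table.get(row_key, {}).get(column, default)


def columns_of(table, row_key):
    """Return the sorted names of the columns present in one row."""
    return sorted(table.get(row_key, {}))
```

Here `user:1` has no `plan` column at all, so `get_column(table, "user:1", "plan")` falls back to the default instead of reading a stored null.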

Column Families and Partitions

Columns are grouped into column families that are stored together on disk.
A partition key decides which node owns a row, spreading data across the cluster.
Within a partition, rows are sorted by a clustering key, which makes range scans inside a partition efficient.
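The two keys can be sketched in a few lines. This is a simplified illustration under stated assumptions: the node list, the hash-modulo placement in `owner_node`, and the `Partition` class are all hypothetical, and real systems use more elaborate placement schemes than a plain modulo.

```python
import bisect
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster


def owner_node(partition_key, nodes=NODES):
    """Map a partition key to a node by hashing (simple modulo placement)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class Partition:
    """Rows of one partition, kept sorted by clustering key."""

    def __init__(self):
        self._keys = []  # clustering keys in ascending order
        self._rows = []  # row data, parallel to self._keys

    def insert(self, clustering_key, row):
        i = bisect.bisect_left(self._keys, clustering_key)
        if i < len(self._keys) and self._keys[i] == clustering_key:
            self._rows[i] = row  # same clustering key: overwrite the row
        else:
            self._keys.insert(i, clustering_key)
            self._rows.insert(i, row)

    def range_scan(self, start, end):
        """Return (key, row) pairs with start <= clustering key < end."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._rows[lo:hi]))
```

Because the keys stay sorted, a range scan is two binary searches plus a contiguous slice, which is the reason scans inside a partition are efficient.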

Why It Scales

Writes are appended rather than applied in place, which keeps them cheap, so the model handles very high write volume.
Data is distributed by partition key, so the cluster scales horizontally by adding nodes.
Queries that hit a single partition are fast because the relevant columns sit together.
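The append-only write path can be sketched as follows. This is a minimal toy, assuming a log, an in-memory table, and immutable sorted segments; the class `AppendOnlyStore` and its `flush_threshold` parameter are hypothetical and leave out durability, compaction, and deletes.

```python
class AppendOnlyStore:
    """Toy write path: append to a log, buffer in memory, flush sorted segments."""

    def __init__(self, flush_threshold=2):
        self.log = []        # append-only record of every write
        self.memtable = {}   # latest values not yet flushed
        self.segments = []   # immutable sorted snapshots, oldest first
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # A write never seeks to or rewrites old data; it only appends.
        self.log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the buffered writes as one sorted, immutable segment.
        self.segments.append(sorted(self.memtable.items()))
        self.memtable = {}

    def read(self, key):
        # Newest data wins: check the memtable, then segments newest first.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return v
        return None
```

The cost moves from writes to reads: a write is an append plus a dictionary update, while a read may have to consult several segments.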

The Constraint

You must design tables around the queries you will run. There are no flexible joins and only limited ad hoc filtering, so a query that does not align with the partition and clustering keys can be slow or impossible.
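The difference between an aligned and an unaligned query shows up even in a toy model. The sketch below assumes a hypothetical sensor table whose partition key is the sensor id and whose clustering key is a timestamp; the data and function names are invented for illustration.

```python
# Hypothetical table: partition key = sensor id, clustering key = timestamp.
readings = {
    "sensor-1": [(100, {"temp": 20.5}), (200, {"temp": 21.0})],
    "sensor-2": [(100, {"temp": 35.2}), (150, {"temp": 19.8})],
}


def by_partition(table, sensor_id):
    """Aligned query: a single lookup touches exactly one partition."""
    return table.get(sensor_id, [])


def by_temp_above(table, threshold):
    """Unaligned query: filters on a non-key column, so it scans everything."""
    hits = []
    partitions_visited = 0
    for sensor_id, rows in table.items():
        partitions_visited += 1
        for timestamp, columns in rows:
            if columns["temp"] > threshold:
                hits.append((sensor_id, timestamp))
    return hits, partitions_visited
```

In a real cluster each visited partition may live on a different node, so the second query grows with the total data size. If filtering by temperature were a required query, the usual answer is a second table keyed for it.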

Key idea

A wide column store spreads sparse rows across partitions by partition key and sorts the rows within each partition by a clustering key, scaling writes while forcing query-driven table design.
