A Different Shape of Table
A wide column store, such as Cassandra or HBase, looks like a table but behaves very differently. Each row is identified by a key and can hold a large, sparse set of columns: a column that a row does not use is simply not stored for that row. Two rows in the same table need not share the same columns.
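As a rough illustration of what "sparse" means here, a table can be modelled as a map from row key to that row's own column map. This is a toy in-memory sketch, not how either system stores data; the row keys, column names, and the helpers `get_cell` and `columns_of` are made up for the example.

```python
# Toy model (assumption): a table maps each row key to that row's own columns.
# A column a row does not use takes no space at all; there are no NULL slots.
table = {
    "user:1": {"name": "Ada", "email": "ada@example.com"},
    "user:2": {"name": "Lin", "last_login": "2024-01-05", "plan": "pro"},
}


def get_cell(table, row_key, column):
    """Return the cell value, or None when the row has no such column."""
    return table.get(row_key, {}).get(column)


def columns_of(table, row_key):
    """Return the column names actually present on one row, sorted by name."""
    return sorted(table.get(row_key, {}))
```

Here `columns_of(table, "user:1")` and `columns_of(table, "user:2")` return different column sets even though both rows live in the same table.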
Column Families and Partitions
- Columns are grouped into column families that are stored together on disk.
- A partition key decides which node owns a row, spreading data across the cluster.
- Within a partition, rows are sorted by a clustering key, which makes range scans inside a partition efficient.
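The two roles of the keys can be sketched in a few lines. This is a minimal sketch under stated assumptions: `ToyCluster` and its methods are hypothetical names, and placing a partition by hash modulo the node count is a simplification chosen for brevity (real systems use more elaborate placement schemes).

```python
import bisect
import hashlib


class ToyCluster:
    """Toy model: the partition key picks a node, the clustering key orders rows."""

    def __init__(self, node_count):
        # Each node maps partition key -> (sorted clustering keys, matching values).
        self.nodes = [dict() for _ in range(node_count)]

    def node_for(self, partition_key):
        # Assumption: hash modulo node count. A stable hash is used because
        # Python's built-in hash() of a str varies between processes.
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def write(self, partition_key, clustering_key, value):
        node = self.nodes[self.node_for(partition_key)]
        keys, values = node.setdefault(partition_key, ([], []))
        i = bisect.bisect_left(keys, clustering_key)
        if i < len(keys) and keys[i] == clustering_key:
            values[i] = value  # same key: overwrite the existing row
        else:
            keys.insert(i, clustering_key)  # keep the partition sorted
            values.insert(i, value)

    def range_scan(self, partition_key, start, end):
        """Return rows with start <= clustering key <= end from one partition."""
        node = self.nodes[self.node_for(partition_key)]
        keys, values = node.get(partition_key, ([], []))
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_right(keys, end)
        return list(zip(keys[lo:hi], values[lo:hi]))
```

Because each partition is kept sorted, a range scan is two binary searches plus a contiguous slice, and it touches only the one node that owns the partition.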
Why It Scales
- Writes are appended sequentially rather than updated in place, which keeps them cheap, so the model handles very high write volume.
- Data is distributed by partition key, so the cluster scales horizontally by adding nodes.
- Queries that hit a single partition are fast because they go to one node, where the relevant rows and columns sit together.
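The append-only write path can be illustrated with a simplified log-structured sketch: each write is appended to a log and placed in an in-memory table, which is periodically flushed as an immutable sorted segment. `ToyWritePath`, `flush_threshold`, and the other names are hypothetical, and details such as compaction and log truncation are left out.

```python
class ToyWritePath:
    """Simplified log-structured write path (illustrative assumption only)."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []  # append-only record of every write
        self.memtable = {}    # recent writes held in memory
        self.segments = []    # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # A write never modifies an existing segment: it appends to the log
        # and updates the in-memory table.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the in-memory table out as one sorted, immutable segment.
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def read(self, key):
        # The newest data wins: check memory first, then segments newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return v
        return None
```

The sketch also shows the trade-off: writes stay cheap because nothing is rewritten in place, while a read may have to consult several segments before it finds the newest value.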
The Constraint
You must design tables around the queries you will run. There are no flexible joins and only limited ad hoc filtering, so a query that does not align with the partition and clustering keys can be slow or impossible.
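One way to reason about the constraint is to check, for each planned query, whether its filters line up with the table's keys. The helper below is a hypothetical rule of thumb, assuming a single-column partition key and a single-column clustering key; `classify_query` and its labels are invented for this sketch and are not any database's API.

```python
def classify_query(filter_columns, partition_key, clustering_key):
    """Classify a query by the columns it filters on (toy rule set)."""
    filters = set(filter_columns)
    if partition_key not in filters:
        # Without the partition key, no single node can be targeted.
        return "full scan"
    extra = filters - {partition_key, clustering_key}
    if extra:
        # The partition is located, but rows must be filtered one by one.
        return "partition scan with filtering"
    return "keyed lookup"
```

For a table keyed by `sensor_id` and `ts`, `classify_query(["sensor_id", "ts"], "sensor_id", "ts")` gives `"keyed lookup"`, while filtering on `ts` alone gives `"full scan"`. A query in the second category usually calls for a second table keyed for that access pattern.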
Key idea
A wide column store spreads sparse rows across partitions by partition key and sorts them within each partition by a clustering key, scaling writes while forcing query-driven table design.