Time Series in Cassandra

Modeling high-volume, timestamped data with time buckets and TTL.

A natural fit

Cassandra suits time series because writes are fast and append-friendly, and a timestamp clustering column keeps the rows of each partition sorted by time on disk, so range scans within a partition are efficient.

Key design

A typical table uses a source identifier plus a time bucket as the partition key and the event timestamp as the clustering column.

  • Partition key example: sensor id plus day, which bounds each partition.
  • Clustering column: the event timestamp, often descending so the newest rows read first.
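The key design above can be sketched as follows. This is a minimal illustration, not a definitive schema: the table name `sensor_readings`, its columns, and the helper `partition_key` are hypothetical, and the sketch assumes day buckets computed in UTC.

```python
from datetime import datetime, timezone

# Hypothetical schema: table and column names are illustrative only.
# The partition key is (sensor_id, day); rows cluster by event_time, newest first.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS sensor_readings (
    sensor_id  text,
    day        date,
    event_time timestamp,
    value      double,
    PRIMARY KEY ((sensor_id, day), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
"""


def partition_key(sensor_id: str, event_time: datetime) -> tuple:
    """Return the (sensor_id, day) pair a reading would be stored under.

    Assumes the day bucket is the UTC calendar date of the event.
    """
    day = event_time.astimezone(timezone.utc).date().isoformat()
    return (sensor_id, day)


reading_time = datetime(2024, 3, 5, 23, 59, 30, tzinfo=timezone.utc)
print(partition_key("sensor-42", reading_time))  # ('sensor-42', '2024-03-05')
```

Every reading from the same sensor on the same UTC day lands in one partition, and a reading taken one minute later, after midnight, starts a new one.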

Time bucketing

Without bucketing, all rows from a busy source would land in a single partition that grows without bound. Adding an hour, day, or month bucket to the partition key caps how large any one partition can become.

  • Pick a bucket size so each partition stays under the recommended partition size (a commonly cited rule of thumb is roughly 100 MB or less).
  • Reads target one or a few buckets, keeping queries fast.
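As a rough sketch of how a bucket size might be chosen, the snippet below estimates partition size from a write rate and lists the day buckets a time-range read would touch. The row size, the 100 MB ceiling, the 30-day month, and all function names are assumptions for illustration; measure your own workload before relying on such numbers.

```python
from datetime import date, datetime, timedelta
from typing import List, Optional

# Assumed numbers for illustration only.
BYTES_PER_ROW = 50                         # rough on-disk size of one reading
TARGET_PARTITION_BYTES = 100 * 1024 ** 2   # assumed ceiling of about 100 MB

# A month is approximated as 30 days.
BUCKET_SECONDS = {"hour": 3600, "day": 86400, "month": 30 * 86400}


def estimated_partition_bytes(rows_per_second: float, bucket: str) -> float:
    """Estimate how large one partition grows over a full bucket."""
    return rows_per_second * BUCKET_SECONDS[bucket] * BYTES_PER_ROW


def largest_bucket_that_fits(rows_per_second: float) -> Optional[str]:
    """Pick the coarsest bucket whose estimated partition stays under the target."""
    for bucket in ("month", "day", "hour"):
        if estimated_partition_bytes(rows_per_second, bucket) <= TARGET_PARTITION_BYTES:
            return bucket
    return None  # even hourly buckets are too large; a finer split is needed


def day_buckets(start: datetime, end: datetime) -> List[date]:
    """List the day buckets a query over [start, end] must read."""
    first, last = start.date(), end.date()
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


print(largest_bucket_that_fits(10))   # day
print(largest_bucket_that_fits(100))  # hour
print(day_buckets(datetime(2024, 3, 5, 22, 0), datetime(2024, 3, 7, 1, 0)))
```

Under these assumptions, a source writing 10 rows per second fits in day buckets, while one writing 100 rows per second needs hour buckets. The last call shows that a read spanning about 27 hours still touches three day partitions.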

TTL and expiry

Time series data is often transient, so rows are written with a time to live (TTL) that expires them automatically.

  • Pair TTL with TimeWindowCompactionStrategy (TWCS) so that data files containing only expired rows from an old time window can be dropped whole, which is cheap.
  • This avoids tombstone buildup from manual deletes.
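A minimal sketch of pairing TTL with TimeWindowCompactionStrategy follows. The table name `sensor_readings`, its columns, the seven-day retention, and the one-day compaction window are assumptions chosen for illustration; the CQL is held in Python strings so the TTL value is derived in one place.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)                 # assumed retention period
TTL_SECONDS = int(RETENTION.total_seconds())  # 604800

# Hypothetical table name. Sets a table-wide default TTL and a one-day
# compaction window (window size is an assumption, not a recommendation).
ALTER_TABLE_CQL = f"""
ALTER TABLE sensor_readings
WITH compaction = {{
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
}}
AND default_time_to_live = {TTL_SECONDS};
"""

# A per-write TTL can also be given explicitly on the insert.
INSERT_CQL = (
    "INSERT INTO sensor_readings (sensor_id, day, event_time, value) "
    f"VALUES (?, ?, ?, ?) USING TTL {TTL_SECONDS};"
)


def expires_at(write_time: datetime, ttl_seconds: int = TTL_SECONDS) -> datetime:
    """Return the moment a row written at write_time stops being readable."""
    return write_time + timedelta(seconds=ttl_seconds)


written = datetime(2024, 3, 5, 12, 0, tzinfo=timezone.utc)
print(expires_at(written))  # 2024-03-12 12:00:00+00:00
```

Because every row in a given window expires at about the same time, no explicit DELETE statements are needed to enforce retention.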

Key idea

Model time series with a partition key of source plus time bucket, cluster by timestamp, and pair TTL with TWCS so old data expires cheaply without tombstone pain.

Check yourself

1. Why add a time bucket to the partition key for time series?

2. Which compaction strategy pairs best with TTL-based time series?