Moving changes, not snapshots
Copying a whole table every night is slow and misses intermediate changes: a row updated twice between copies shows only its final state. Change data capture (CDC) streams each row change out of a source database as it happens, so downstream systems stay nearly in sync.
Log-based capture
The robust approach reads the database transaction log, such as a write-ahead log or binlog, which already records every committed change in order. A connector tails this log and emits one event per change, tagged with its operation type: insert, update, or delete.
- It avoids polling and adds little load to the source.
- It captures every change, including deletes, which query-based polling can miss because a deleted row no longer appears in query results.
- It preserves commit order, which matters for correctness.
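As a rough illustration of the event stream described above, the sketch below models the transaction log as an in-memory list and tails it from a saved position. It is not a real connector: the names ChangeEvent, position, and tail_log are hypothetical, and a real log would be read through the database's own replication interface.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    position: int          # hypothetical log offset; increases in commit order
    op: str                # "insert", "update", or "delete"
    key: int               # primary key of the changed row
    row: Optional[dict]    # new row image; None for a delete


def tail_log(log: List[ChangeEvent], after: int = 0) -> Iterator[ChangeEvent]:
    """Yield every committed change past a saved position, in commit order."""
    for event in log:
        if event.position > after:
            yield event


# Assumed sample log: one row is inserted, updated, then deleted.
log = [
    ChangeEvent(1, "insert", 7, {"id": 7, "status": "new"}),
    ChangeEvent(2, "update", 7, {"id": 7, "status": "paid"}),
    ChangeEvent(3, "delete", 7, None),
]

ops = [event.op for event in tail_log(log)]
resumed = [event.position for event in tail_log(log, after=1)]
```

A nightly copy taken after these three changes would show no trace of row 7, while the tailed log reports all three operations in order. Passing a saved position lets a restarted reader resume where it left off.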
Applying changes downstream
Consumers must apply changes in order and idempotently, typically by upserting on the primary key and recording deletes as tombstones. Idempotence matters because the same change may be delivered more than once, so applying it a second time must leave the result unchanged.
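One way to make the apply step safe to repeat is to remember the last log position applied for each key and skip anything at or before it. The sketch below assumes each event carries a monotonically increasing position (the field names pos, op, key, and row are hypothetical) and uses plain dictionaries in place of a real target table.

```python
from typing import Dict, Optional

# Hypothetical in-memory target: primary key -> row, or None as a tombstone.
Table = Dict[int, Optional[dict]]


def apply_change(table: Table, applied: Dict[int, int], event: dict) -> bool:
    """Apply one change event; return False if it was a duplicate or stale."""
    key, pos = event["key"], event["pos"]
    if pos <= applied.get(key, 0):
        return False  # already applied: repeating it changes nothing
    # Upsert by primary key; a delete leaves a tombstone instead of a row.
    table[key] = None if event["op"] == "delete" else event["row"]
    applied[key] = pos
    return True


insert_7 = {"pos": 1, "op": "insert", "key": 7, "row": {"id": 7, "status": "new"}}
update_7 = {"pos": 2, "op": "update", "key": 7, "row": {"id": 7, "status": "paid"}}
insert_8 = {"pos": 3, "op": "insert", "key": 8, "row": {"id": 8, "status": "new"}}
delete_7 = {"pos": 4, "op": "delete", "key": 7, "row": None}

# Assumed delivery sequence with one immediate and one late redelivery.
events = [insert_7, update_7, update_7, insert_8, delete_7, insert_7]

table: Table = {}
applied: Dict[int, int] = {}
results = [apply_change(table, applied, event) for event in events]
```

The late redelivery of the first insert is ignored, so the deleted row stays deleted. Keeping the tombstone and its position is what prevents an old event from resurrecting a row.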
Key idea
Log-based change data capture tails the transaction log to stream ordered inserts, updates, and deletes with low source load, and it requires idempotent, in-order application downstream.