Requirements
- Edit a file on one device and have changes appear on others.
- Avoid re uploading unchanged data.
- Handle conflicts when two devices edit the same file offline.
High level design
Files are split into content addressed chunks so only changed parts move.
- Chunking: split each file into blocks, hash each block, and store blocks keyed by hash in object storage.
- Metadata service: tracks the ordered list of chunk hashes per file version and the per device sync state.
- Sync flow: a client computes new chunk hashes, uploads only blocks the server lacks, then commits a new metadata version.
- Notification: a long polling or push channel tells other devices a new version exists.
Bottlenecks
- Bandwidth: deduplicating chunks by hash avoids resending unchanged data.
- Conflict resolution: detect divergent versions and keep both as conflicted copies rather than silently losing edits.
- Metadata scale: shard metadata by user or namespace.
Tradeoffs
- Smaller chunks dedupe better but increase metadata and hashing overhead.
- Push notifications cut latency but cost persistent connections versus cheaper polling.
Key idea
File sync is content addressed chunking where only changed blocks move, a metadata version log tracks state, and notifications trigger pulls on other devices.