Case Study: Design a Unique ID Generator in Distributed Systems

“Every citizen has a unique national ID number. Every event in a distributed system also needs a unique ID --- no duplicates, no conflicts, even when millions of servers generate IDs at the same time. That is the Unique ID Generator problem.”

Tags: system-design case-study unique-id snowflake distributed-systems alex-xu interview
Student: Hieu
Prerequisites: Tuan-01-Scale-From-Zero-To-Millions · Tuan-02-Back-of-the-envelope · Tuan-07-Database-Sharding-Replication
Related: Tuan-10-Consistent-Hashing · Tuan-16-Design-URL-Shortener · Tuan-20-Design-Key-Value-Store


1. Context & Why --- Why do we need a Unique ID Generator?

1.1 Analogy: a national ID number for the system

Hieu, think of it this way: every Vietnamese citizen has a unique national ID (CMND/CCCD) number. That number lets the state tell 100 million people apart --- no two are the same. In a distributed system, every event, every record, every transaction also needs its own “ID card” --- a globally unique identifier.

| Real world | Distributed system |
|---|---|
| National ID number (CMND/CCCD) | Unique ID |
| Issued by the police authority | ID Generator service |
| No duplicates across provinces | No duplicates across datacenters |
| Can be looked up by time | Sortable by time |
| Fixed format (12 digits) | Fixed format (64-bit integer) |

1.2 Why not just use the database's AUTO_INCREMENT?

When the system has a single database, AUTO_INCREMENT is enough. But once you scale out to multiple DB servers and multiple datacenters:

| Problem | Explanation |
|---|---|
| Single point of failure | One DB goes down = no IDs can be generated |
| Bottleneck | Every request must call the DB for an ID --- high latency |
| No horizontal scaling | 10K+ IDs/sec will overload the DB |
| Does not work across datacenters | Two DBs in two DCs can generate duplicate IDs |
| Not sortable by time | AUTO_INCREMENT only counts upward; it carries no time information |

Aha Moment: AUTO_INCREMENT is an anti-pattern in distributed systems. You need a mechanism that generates IDs without coordination between nodes.


2. Deep Dive --- Alex Xu 4-Step Framework

Overview --- Alex Xu's four steps

| Step | Name | Time (45-minute interview) |
|---|---|---|
| 1 | Understand the Problem & Establish Design Scope | ~5 min |
| 2 | Propose High-Level Design | ~15 min |
| 3 | Design Deep Dive | ~20 min |
| 4 | Wrap Up | ~5 min |

Step 1 --- Understand the Problem & Establish Design Scope

2.1 Functional Requirements

| Question for the interviewer | Answer / Assumption |
|---|---|
| Must IDs be globally unique? | Yes, never duplicated |
| Digits only, or can IDs contain letters? | Numerical values only |
| Must IDs be sortable by time? | Yes, an ID created later must be larger than one created earlier (roughly) |
| ID length? | 64-bit integer (fits a SQL BIGINT) |
| How many IDs per second must the system generate? | 10,000+ IDs/sec |
| Multiple datacenters? | Yes, multiple datacenters and multiple machines |

Functional requirements summary:

  1. Globally unique --- never duplicated, even across datacenters
  2. 64-bit integer --- fits a BIGINT, usable as a primary key
  3. Sortable by time --- IDs created later > IDs created earlier (roughly ordered)
  4. High throughput --- 10,000+ IDs/sec per system

2.2 Non-Functional Requirements

| Requirement | Target | Explanation |
|---|---|---|
| High availability | 99.99% uptime | ID generation must never go down --- every service depends on it |
| Low latency | < 1 ms per ID | ID generation sits on the critical path of every write operation |
| Scalability | Linear scaling with machine count | More machines = more throughput, with no coordination needed |
| Ordering | Time-sortable | IDs generated within the same millisecond on the same machine must be ordered |
| No coordination | No central authority | Each node generates IDs independently |

Step 2 --- Propose High-Level Design: 4 Approaches

Hieu, there are 4 main approaches to generating unique IDs in distributed systems. Each comes with its own trade-offs.

Approach 1: Multi-Master Replication (DB auto-increment with a step)

Idea: use the DB's AUTO_INCREMENT, but give each server a different increment step.

Server 1: 1, 3, 5, 7, 9, ...    (start=1, increment=2)
Server 2: 2, 4, 6, 8, 10, ...   (start=2, increment=2)

In general, with N servers:
Server k: k, k+N, k+2N, k+3N, ...

| Pros | Cons |
|---|---|
| Simple; uses a built-in DB feature | Hard to scale: adding/removing a server means changing the increment on ALL servers |
| No extra service needed | IDs are not sortable by time (Server 1 may generate ID 7 AFTER Server 2 generates ID 8) |
| Reasonably reliable | Does not work well across datacenters (high latency) |
|  | Throughput limited by DB write speed |

When to use: small systems with few servers that do not need time-sorted IDs.
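
A minimal sketch (not from the book) of how the stepped counters interleave; `start` and `step` here mirror MySQL's `auto_increment_offset` / `auto_increment_increment` settings:

```python
def stepped_ids(start: int, step: int, count: int = 5) -> list[int]:
    """Simulate AUTO_INCREMENT with a per-server offset and step."""
    return [start + i * step for i in range(count)]

# Two servers, step = 2: odd IDs on server 1, even IDs on server 2.
print(stepped_ids(start=1, step=2))  # [1, 3, 5, 7, 9]
print(stepped_ids(start=2, step=2))  # [2, 4, 6, 8, 10]
```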

Approach 2: UUID (Universally Unique Identifier)

Idea: each machine independently generates a 128-bit UUID; no coordination is required.

UUID v4 Example: 550e8400-e29b-41d4-a716-446655440000
                               ^    ^
                         version    variant   (the remaining 122 bits are random)
| Pros | Cons |
|---|---|
| Simplest option; every language supports it | 128 bits (does not fit the 64-bit requirement) |
| No coordination between nodes | Not sortable by time (v4) |
| No single point of failure | Poor DB performance: random UUIDs fragment the B+ tree index |
| Unlimited throughput (every node generates its own) | Long (36-char string) --- wastes storage |

Collision probability of UUID v4:

For n random UUIDs, the birthday approximation gives p ≈ 1 − e^(−n² / 2^123). You would need on the order of a quintillion UUIDs (roughly 2.71 × 10^18, e.g. one billion UUIDs per second for ~86 years) before the probability of a single collision reaches 50%.

In practice the collision probability is vanishingly small, but it is not zero, and it is not a hard uniqueness guarantee the way Snowflake's construction is.

When to use: when you do not need sortability, are not limited to 64 bits, and want something quick to ship (e.g. session IDs, correlation IDs, idempotency keys).
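
A quick illustration (not from the book) using Python's standard library:

```python
import uuid

u = uuid.uuid4()              # 122 random bits + fixed version/variant bits
print(u)                      # e.g. 550e8400-e29b-41d4-a716-446655440000
print(u.int.bit_length())     # up to 128 bits -- does not fit a 64-bit BIGINT
print(len(str(u)))            # 36-character string representation
```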

Approach 3: Ticket Server (Flickr Approach)

Idea: use one (or a few) central servers with AUTO_INCREMENT as a “ticket dispenser”. Every service calls the Ticket Server to get the next ID.

          +------------------+
          |  Ticket Server   |
          |  AUTO_INCREMENT  |
          |  current_id: 1000|
          +--------+---------+
                   |
      +-----+-----+-----+
      |     |     |     |
   Svc A  Svc B  Svc C  Svc D
   id=1001 id=1002 id=1003 id=1004
| Pros | Cons |
|---|---|
| Simple, easy to implement | Single point of failure (SPOF) |
| IDs are numeric, 64-bit, monotonically increasing | Bottleneck: every request must call the Ticket Server |
| Sortable by time (roughly) | Latency grows when calls cross datacenters |
| Used successfully by Flickr for years | The Ticket Server must be replicated for HA --- extra complexity |

Flickr's solution: run 2 Ticket Servers with increment = 2, like Multi-Master but dedicated purely to ID generation.

When to use: medium-sized systems that can accept a single point of coordination and only need simple, increasing IDs.
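
A minimal sketch of the ticket-table trick, using SQLite in place of Flickr's MySQL (the REPLACE-into-a-one-row-table idea is the same; table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets ("
    "  id INTEGER PRIMARY KEY AUTOINCREMENT,"
    "  stub CHAR(1) UNIQUE)"
)

def next_ticket(conn: sqlite3.Connection) -> int:
    """REPLACE keeps a single 'a' row but assigns a fresh, larger id every time."""
    cur = conn.execute("REPLACE INTO tickets (stub) VALUES ('a')")
    conn.commit()
    return cur.lastrowid

print(next_ticket(conn), next_ticket(conn), next_ticket(conn))  # 1 2 3
```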

Approach 4: Twitter Snowflake Approach

Idea: each ID is a 64-bit integer composed of several parts: a timestamp, a datacenter ID, a machine ID, and a sequence number. Each machine generates IDs independently.

+---+----------------------------------+------------+----------+------------+
| 0 |        41-bit Timestamp          |   5-bit    |  5-bit   |   12-bit   |
|   |   (milliseconds since epoch)     | Datacenter |  Machine |  Sequence  |
+---+----------------------------------+------------+----------+------------+
 1 bit          41 bits                    5 bits      5 bits     12 bits    = 64 bits
| Pros | Cons |
|---|---|
| 64-bit integer --- meets the requirement | Depends on the machine clock |
| Sortable by time (timestamp embedded in the ID) | Clock skew between machines causes out-of-order IDs |
| No coordination needed | Datacenter IDs and machine IDs must be managed |
| Very high throughput: 4,096 IDs/ms/machine | Only “roughly” time-sorted --- not strictly |
| Easy to extract the timestamp from an ID | Epoch overflow after 69 years |

Aha Moment: Snowflake is the most widely used approach in practice. Twitter, Discord, Instagram, and Baidu all run variants of Snowflake.


Comparison of all approaches

| Criterion | Multi-Master | UUID | Ticket Server | Snowflake | ULID | KSUID | MongoDB ObjectId |
|---|---|---|---|---|---|---|---|
| Bit size | 64 | 128 | 64 | 64 | 128 | 160 | 96 |
| Globally unique | Yes (with correct config) | Yes (extremely high probability) | Yes | Yes | Yes | Yes | Yes |
| Sortable by time | No | No (v4) | Yes | Yes | Yes | Yes | Yes |
| No coordination | No | Yes | No | Yes | Yes | Yes | Yes |
| Throughput | Low (DB bound) | Unlimited | Low (bottleneck) | 4,096/ms/machine | Unlimited | Unlimited | 16M/sec/process |
| Complexity | Low | Very low | Medium | Medium | Low | Low | Low |
| DB index friendly | Yes | No (random) | Yes | Yes | Yes | Yes | Yes |
| Real-world usage | MySQL replication | Session IDs, correlation IDs | Flickr | Twitter, Discord, Instagram | Shopify, CockroachDB | Segment | MongoDB |
| Fits 64-bit? | Yes | No | Yes | Yes | No | No | No |

Other variants (beyond Alex Xu)

ULID (Universally Unique Lexicographically Sortable Identifier):

  • 128-bit: 48-bit timestamp (milliseconds) + 80 bits of randomness
  • Encoded as a 26-char Crockford Base32 string
  • Sortable by time, no coordination needed
  • Used by: Shopify, CockroachDB

KSUID (K-Sortable Unique Identifier):

  • 160-bit: 32-bit timestamp (seconds) + 128-bit random payload
  • Encoded as a 27-char Base62 string
  • Epoch: 2014-05-13 --- overflows after ~136 years
  • Used by: Segment

MongoDB ObjectId:

  • 96-bit (12 bytes): 32-bit timestamp (seconds) + 40-bit random + 24-bit counter
  • Sortable by time (second resolution)
  • The counter allows 16,777,216 unique IDs per second per process

Step 3 --- Design Deep Dive: Snowflake ID Generator

3.1 Detailed Bit Layout

 0 | 00000000 00000000 00000000 00000000 00000000 0 | 00000 | 00000 | 000000000000
 ^ |                    41 bits                      | 5 bits| 5 bits|   12 bits
 |                       |                              |       |         |
Sign bit             Timestamp                     Datacenter Machine  Sequence
(always = 0)   (ms since custom epoch)                 ID       ID      Number

| Component | Bits | Range | Explanation |
|---|---|---|---|
| Sign bit | 1 | Always 0 | Keeps the ID positive |
| Timestamp | 41 | 0 --- 2^41 − 1 ms (~69.7 years) | Milliseconds since a custom epoch |
| Datacenter ID | 5 | 0 --- 31 | Up to 32 datacenters |
| Machine ID | 5 | 0 --- 31 | Up to 32 machines per datacenter |
| Sequence | 12 | 0 --- 4095 | Counter, reset every millisecond |

3.2 How an ID is generated (the algorithm)

Each time a service requests an ID:

1. Read current_timestamp (ms since epoch)
2. If current_timestamp == last_timestamp:
     sequence = (sequence + 1) AND 0xFFF   // 12-bit mask
     If sequence == 0:  // overflow! 4096 IDs already generated in this millisecond
       Wait for the next millisecond
3. If current_timestamp > last_timestamp:
     sequence = 0   // reset counter
4. If current_timestamp < last_timestamp:
     CLOCK SKEW DETECTED! -> Handle error
5. last_timestamp = current_timestamp
6. Compose ID:
     id = (timestamp << 22) | (datacenter_id << 17) | (machine_id << 12) | sequence
7. Return id
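
A runnable sketch of the algorithm above in Python. The custom epoch (2025-01-01) and the wait-on-skew strategy are assumptions made for illustration, not something fixed by the book:

```python
import threading
import time

class SnowflakeGenerator:
    EPOCH_MS = 1735689600000  # assumed custom epoch: 2025-01-01T00:00:00Z

    def __init__(self, datacenter_id: int, machine_id: int):
        assert 0 <= datacenter_id < 32 and 0 <= machine_id < 32
        self.datacenter_id = datacenter_id
        self.machine_id = machine_id
        self.sequence = 0
        self.last_timestamp = -1
        self.lock = threading.Lock()  # one generator instance per process

    @staticmethod
    def _now_ms() -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        with self.lock:
            ts = self._now_ms()
            # Clock skew: simplest strategy is to wait it out (could also raise).
            while ts < self.last_timestamp:
                ts = self._now_ms()
            if ts == self.last_timestamp:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit mask
                if self.sequence == 0:
                    # Sequence overflow: spin until the next millisecond.
                    while ts <= self.last_timestamp:
                        ts = self._now_ms()
            else:
                self.sequence = 0
            self.last_timestamp = ts
            return (((ts - self.EPOCH_MS) << 22)
                    | (self.datacenter_id << 17)
                    | (self.machine_id << 12)
                    | self.sequence)

gen = SnowflakeGenerator(datacenter_id=1, machine_id=1)
print(gen.next_id())
```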

3.3 Epoch Selection

Snowflake does NOT use the Unix epoch (1970-01-01). Why?

If we used the Unix epoch (1970-01-01): the 41-bit timestamp overflows after 2^41 ms ≈ 69.7 years, i.e. around the year 2039.

That is far too close! So choose a custom epoch = the day the system goes live:

| Custom Epoch | Overflow Date | Remaining (from 2026) |
|---|---|---|
| 2010-01-01 (Twitter) | ~2079 | ~53 years |
| 2015-01-01 (Discord) | ~2084 | ~58 years |
| 2020-01-01 | ~2089 | ~63 years |
| 2025-01-01 (recommended for new systems) | ~2094 | ~68 years |
| Unix epoch 1970-01-01 | ~2039 | ~13 years |

Aha Moment: pick a custom epoch close to the deployment date to maximize the system's lifetime. Discord chose 2015-01-01 --- Discord's first day.
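
A tiny helper (illustrative only) for checking when the 41-bit timestamp overflows for a given custom epoch:

```python
from datetime import datetime, timedelta, timezone

def overflow_date(custom_epoch: datetime) -> datetime:
    """41 timestamp bits hold 2**41 milliseconds, roughly 69.7 years past the epoch."""
    return custom_epoch + timedelta(milliseconds=2**41 - 1)

print(overflow_date(datetime(1970, 1, 1, tzinfo=timezone.utc)))  # ~2039-09
print(overflow_date(datetime(2025, 1, 1, tzinfo=timezone.utc)))  # ~2094-09
```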

3.4 Clock Sync and NTP (Network Time Protocol)

The core problem: Snowflake depends on timestamps --- but the clocks of different machines are NEVER perfectly synchronized.

NTP (Network Time Protocol) synchronizes clocks between machines:

  • Stratum 0: atomic clocks, GPS receivers (~nanosecond accuracy)
  • Stratum 1: servers connected directly to Stratum 0 (~microsecond)
  • Stratum 2: servers synced to Stratum 1 (~millisecond)
  • Stratum 3+: each extra tier adds ~1-10 ms of drift

  GPS Satellite / Atomic Clock
         |
    [Stratum 0]
         |
    [Stratum 1]  <-- Google, Amazon NTP servers
         |
    [Stratum 2]  <-- Datacenter NTP servers
         |
    [Stratum 3]  <-- Application servers (your machines)

Real-world clock drift:

  • Typical server: drifts ~100 ms/day without NTP
  • With NTP syncing every 30 seconds: drift < 1 ms
  • Google TrueTime (Spanner): uncertainty < 7 ms (using GPS + atomic clocks)

3.5 Clock Skew Handling

Scenario: Machine A's clock runs 5 ms behind Machine B's.

  • Machine A generates an ID with timestamp = 1000
  • Machine B generates an ID with timestamp = 1005
  • In reality both IDs were created at the same moment --- but B's ID “appears” to have been created later

Ways to handle it:

| Strategy | Description | Trade-off |
|---|---|---|
| Wait | If current_time < last_time, wait until current_time >= last_time | Simple, but can block if the clock jumps back a long way |
| Reject | If clock skew > threshold (e.g. 5 ms), reject the request and raise an alarm | Safe, but IDs are unavailable for a short time |
| Logical clock | Use Lamport timestamps or a Hybrid Logical Clock (HLC) to preserve causality | More complex, but guarantees ordering |
| Tolerance window | Accept out-of-order IDs within a small window (e.g. 10 ms) | Most practical; what most systems do |

How Twitter Snowflake handles it: if current_timestamp < last_timestamp, throw an exception and log a warning. The operations team then investigates the NTP issue.

Aha Moment: clock skew is unavoidable in distributed systems. Snowflake does not guarantee strict global ordering --- only “roughly sorted”. If you need strict ordering, use a single-machine approach (Ticket Server) or a Hybrid Logical Clock.

3.6 Sequence Overflow Scenarios

When one machine generates more than 4,096 IDs within a single millisecond:

Millisecond T=1000:
  ID #0:    ...1000...0000 00000000
  ID #1:    ...1000...0000 00000001
  ...
  ID #4095: ...1000...1111 11111111  <- MAX sequence
  ID #4096: OVERFLOW! sequence wraps back to 0
            -> Must WAIT for T=1001

Handling strategies:

| Strategy | Description | When to use |
|---|---|---|
| Spin-wait | Busy-loop until the next millisecond | Snowflake's default; fine for short bursts |
| Sleep | Thread.sleep(1) | Lower CPU, but less precise |
| Pre-generate pool | Generate a pool of IDs in advance and serve from it | High-throughput services; smooths latency spikes |
| Borrow from the future | Generate the ID with timestamp = T+1 and sequence = 0 | Dangerous --- can cause duplicates if a clock correction occurs |

3.7 Ordering Guarantees

| Scope | Guaranteed? | Explanation |
|---|---|---|
| Same machine, same millisecond | Yes, strict ordering | The sequence number increases monotonically |
| Same machine, different milliseconds | Yes, strict ordering | Larger timestamp = larger ID |
| Different machines, same datacenter | No guarantee | Clock skew can push IDs out of order |
| Different datacenters | No guarantee | Clock skew + network latency |

Note: this means two IDs from two different machines can carry the same timestamp, and their comparison order is then decided by datacenter_id and machine_id, not by the actual creation time.
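
To make the layout concrete, here is a small decoder (illustrative; it assumes the same 2025-01-01 custom epoch as the generator sketch above):

```python
from datetime import datetime, timezone

EPOCH_MS = 1735689600000  # assumed custom epoch: 2025-01-01T00:00:00Z

def decode_snowflake(snowflake_id: int) -> dict:
    """Split a 64-bit Snowflake ID back into its four components."""
    return {
        "timestamp": datetime.fromtimestamp(
            ((snowflake_id >> 22) + EPOCH_MS) / 1000, tz=timezone.utc),
        "datacenter_id": (snowflake_id >> 17) & 0x1F,  # 5 bits
        "machine_id": (snowflake_id >> 12) & 0x1F,     # 5 bits
        "sequence": snowflake_id & 0xFFF,              # 12 bits
    }

# Example: an ID composed 123,456,789 ms after the epoch on DC 1 / machine 1.
sample = (123_456_789 << 22) | (1 << 17) | (1 << 12) | 7
print(decode_snowflake(sample))
```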


3. Back-of-the-Envelope Estimation

3.1 IDs per Millisecond per Machine

The 12-bit sequence gives 2^12 = 4,096 IDs per millisecond per machine.

3.2 IDs per Second per Machine

4,096 IDs/ms × 1,000 ms/s = 4,096,000 ≈ 4.1 million IDs/sec per machine.

3.3 Total Capacity with N Machines

With the maximum configuration (32 datacenters × 32 machines = 1,024 machines):

1,024 machines × ~4.1M IDs/sec ≈ 4.2 billion IDs/sec for the whole cluster.

3.4 How many machines do we need for 10K IDs/sec?

10,000 / 4,096,000 ≈ 0.25% of a single machine's capacity. One machine is more than enough for 10K IDs/sec. In practice we deploy multiple machines for high availability, not for throughput.

3.5 Timestamp Overflow

With a custom epoch of 2025-01-01: 2^41 ms ≈ 69.7 years, so the timestamp overflows around the year 2094.

3.6 Total IDs that can be generated in 69 years

~4.2 billion IDs/sec × ~69.7 years ≈ 9.2 × 10^18 IDs ≈ 2^63 --- essentially the whole positive 64-bit space.

3.7 Storage Estimation

Each ID is 8 bytes (64 bits), so 1 billion IDs take about 8 GB.

Compared with binary UUIDs (16 bytes) or UUID strings (36 bytes):

| Format | Size per ID | 1 Billion IDs | DB Index Impact |
|---|---|---|---|
| Snowflake (64-bit) | 8 bytes | 8 GB | Sequential --- B+ tree insert O(1) amortized |
| UUID binary (128-bit) | 16 bytes | 16 GB | Random --- B+ tree splits O(log n) |
| UUID string (36 chars) | 36 bytes | 36 GB | Random + larger index pages |
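
The same numbers as a quick script (assuming the standard 1+41+5+5+12 layout):

```python
TS_BITS, DC_BITS, MACHINE_BITS, SEQ_BITS = 41, 5, 5, 12

ids_per_ms_per_machine = 2 ** SEQ_BITS                         # 4,096
ids_per_sec_per_machine = ids_per_ms_per_machine * 1_000       # ~4.1M
max_machines = 2 ** (DC_BITS + MACHINE_BITS)                   # 1,024
cluster_ids_per_sec = ids_per_sec_per_machine * max_machines   # ~4.2B
lifetime_years = 2 ** TS_BITS / (1_000 * 3600 * 24 * 365.25)   # ~69.7
storage_1b_ids_gb = 1_000_000_000 * 8 / 1e9                    # 8 GB

print(ids_per_ms_per_machine, ids_per_sec_per_machine, max_machines,
      cluster_ids_per_sec, round(lifetime_years, 1), storage_1b_ids_gb)
```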

4. Security First --- Securing the ID Generator

4.1 ID Predictability Attacks

Threat: if an attacker understands the ID structure, they can:

| Attack | Description | Impact |
|---|---|---|
| Enumeration attack | Guess other records' IDs by incrementing/decrementing a known ID | Unauthorized data access |
| Timing attack | Extract the timestamp from an ID to learn when the record was created | Information leakage |
| Volume estimation | Compare two IDs to estimate how many records were created in between | Business intelligence leak |
| Infrastructure mapping | Extract datacenter_id and machine_id | Reveals server counts and datacenter topology |

Enumeration example:

Attacker sees order_id = 7100424390656 00001 00001 000000000001
                                       ^DC    ^Machine
They learn: datacenter 1, machine 1
They increment the sequence: 000000000002, 000000000003, ...
-> They can walk through every order generated by that machine

4.2 Information Leakage from a Snowflake ID

| Component | Information leaked | Severity |
|---|---|---|
| Timestamp (41 bits) | The exact time (ms) the record was created | Medium --- attacker learns when a user signed up or placed an order |
| Datacenter ID (5 bits) | System topology | Low --- but useful for targeted attacks |
| Machine ID (5 bits) | Number and placement of servers | Low --- can be used for DDoS targeting |
| Sequence (12 bits) | Traffic volume at that moment | Medium --- business intelligence |

4.3 Mitigation Strategies

| Strategy | Description | Trade-off |
|---|---|---|
| Do not expose internal IDs | Use a separate public-facing ID (UUID or encrypted ID) for the API and keep the Snowflake ID internal | One more mapping layer, more storage |
| Encrypt the ID before exposing it | Use Format-Preserving Encryption (FPE) to turn the 64-bit ID into a 64-bit ciphertext | Keeps the size, but loses sortability |
| Shuffle bits | Permute the bit positions so the timestamp is no longer in a fixed place | Loses sortability; needs a lookup table |
| Add random bits | Replace datacenter+machine (10 bits) with random bits | Loses traceability to the source, but is safer |
| Hash-based public ID | public_id = HMAC-SHA256(secret_key, snowflake_id)[:8] | One-way, cannot be reversed, but needs a DB lookup |
| Rate limit the API | Limit requests per user/IP to block enumeration | Does not fix the root cause, but reduces the impact |

Recommended approach for most systems:

Internal: Snowflake ID (used for the DB and inter-service communication)
External: encrypted Snowflake ID or a UUID mapping (used for the public API)

Database:
  +--------------------+--------------------+
  | snowflake_id (PK)  | public_id (UNIQUE) |
  | 71004243906560001  | a8f3k2m9x1         |
  +--------------------+--------------------+
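
A sketch of the hash-based public ID row from the mitigation table above (the secret key and the truncation length are placeholders, not recommendations):

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # placeholder; load from a secret store

def public_id(snowflake_id: int) -> str:
    """One-way, non-reversible public identifier derived from the internal ID."""
    digest = hmac.new(SECRET_KEY, str(snowflake_id).encode(), hashlib.sha256)
    # Truncated hex digest; store next to snowflake_id with a UNIQUE index.
    return digest.hexdigest()[:16]

print(public_id(71004243906560001))  # stable for the same input, reveals nothing about it
```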

4.4 OWASP Considerations

| OWASP Risk | How it applies to the ID generator | Mitigation |
|---|---|---|
| A01 Broken Access Control | Attacker uses a predicted ID to access someone else's resource | Authorization check on EVERY request; never rely on the ID alone |
| A04 Insecure Design | Exposing Snowflake IDs on a public API | Use a public ID mapping or FPE encryption |
| A09 Security Logging | Not logging who requested which ID | Log every ID generation request together with the caller's identity |

Aha Moment: the ID generator itself has no security flaw --- the flaw is in how you expose the IDs. Always keep a layer of abstraction between internal IDs and public-facing IDs.


5. DevOps --- Operating the ID Generator

5.1 Monitoring Metrics

| Metric | Description | Alert Threshold |
|---|---|---|
| id_generation_rate | IDs generated per second per machine | Unusual spike > 2x baseline |
| id_generation_latency_p99 | Latency to generate one ID | > 1 ms (normally < 0.01 ms) |
| sequence_overflow_count | Number of sequence overflows (4096/ms hit) | > 0 per minute = add machines or review traffic |
| clock_skew_detected | Times current_time < last_time was observed | > 0 = NTP issue, investigate immediately |
| ntp_offset_ms | Offset between the local clock and the NTP server | > 10 ms = critical alert |
| ntp_sync_failures | Number of failed NTP syncs | > 3 consecutive failures = critical |
| machine_id_conflicts | Two machines sharing the same datacenter_id + machine_id | > 0 = IMMEDIATE action (duplicate IDs!) |
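
One way (illustrative, using the Python prometheus_client library) to expose a few of these metrics from the generator process; the metric names match the alert rules and dashboard queries below, everything else is an assumption:

```python
from prometheus_client import Counter, Gauge, Histogram, start_http_server

IDS_GENERATED = Counter("ids_generated_total", "IDs generated", ["machine_id"])
ID_GEN_LATENCY = Histogram("id_generation_duration_seconds", "Time to generate one ID")
SEQ_OVERFLOWS = Counter("sequence_overflow_total", "Sequence overflows", ["machine_id"])
CLOCK_SKEW = Counter("clock_skew_detected_total", "Backward clock movements", ["machine_id"])
NTP_OFFSET = Gauge("ntp_offset_milliseconds", "Local clock offset vs NTP")

start_http_server(8000)  # scrape target for Prometheus

@ID_GEN_LATENCY.time()
def generate_and_record(generator, machine_id: str) -> int:
    """Generate one ID while recording latency and a per-machine counter."""
    new_id = generator.next_id()
    IDS_GENERATED.labels(machine_id=machine_id).inc()
    return new_id
```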

5.2 Clock Drift Detection

# prometheus-alerts.yml
groups:
  - name: id_generator_alerts
    rules:
      # Clock skew detected
      - alert: ClockSkewDetected
        expr: increase(clock_skew_detected_total[5m]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Clock skew detected on {{ $labels.instance }}"
          description: "Machine {{ $labels.machine_id }} detected backward clock movement. IDs may be out of order."
 
      # NTP offset too high
      - alert: NTPOffsetHigh
        expr: abs(ntp_offset_milliseconds) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "NTP offset > 10ms on {{ $labels.instance }}: {{ $value }}ms"
 
      # NTP sync failure
      - alert: NTPSyncFailure
        expr: increase(ntp_sync_failures_total[15m]) > 3
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "NTP sync failing on {{ $labels.instance }}"
          description: "NTP has failed to sync 3+ times in 15 minutes. Clock drift will increase."
 
      # Sequence overflow
      - alert: SequenceOverflow
        expr: rate(sequence_overflow_total[1m]) > 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Sequence overflow on machine {{ $labels.machine_id }}"
          description: "Machine is generating > 4096 IDs/ms. Consider adding more machines."
 
      # ID generation latency spike
      - alert: IDGenerationLatencyHigh
        expr: histogram_quantile(0.99, rate(id_generation_duration_seconds_bucket[5m])) > 0.001
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "P99 ID generation latency > 1ms on {{ $labels.instance }}"

5.3 NTP Monitoring Best Practices

| Practice | Description |
|---|---|
| Use multiple NTP sources | Configure at least 3 NTP servers (e.g. time.google.com, time.aws.com, pool.ntp.org) |
| Monitor the NTP offset continuously | Export an ntp_offset_ms metric via node_exporter or a custom exporter |
| Tune the NTP sync interval | Default is 64-1024 seconds; consider lowering it to 16-32 seconds for ID generator machines |
| Use chrony instead of ntpd | chrony handles drift better, starts faster, and is more accurate |
| Avoid NTP steps (clock jumps) | Configure NTP to slew (adjust gradually) instead of stepping (jumping) the clock |
| Hardware timestamping | Use NICs that support hardware timestamping to reduce jitter |

5.4 Machine ID Management

| Strategy | Description | When to use |
|---|---|---|
| Static config | Set datacenter_id and machine_id in a config file or env var | Small systems that rarely change |
| ZooKeeper/etcd | Each machine registers with ZooKeeper to receive a unique machine_id | Large systems that need auto-assignment |
| Cloud metadata | Derive the machine_id from cloud instance metadata (AWS instance ID, GCP zone) | Cloud-native systems |
| MAC address hash | Hash the MAC address into 10 bits (datacenter + machine) | Simple, but can collide |

Pitfall: when a machine restarts or is replaced, its machine_id MUST be preserved or de-registered before it is assigned to a new machine. If two machines run concurrently with the same ID --- duplicate IDs will be generated.
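
A minimal sketch of the ZooKeeper strategy using the kazoo client (hosts and paths are placeholders). The ephemeral node disappears automatically when the process dies, which releases its registration:

```python
from kazoo.client import KazooClient

def claim_machine_id(zk_hosts: str = "zk1:2181,zk2:2181,zk3:2181") -> int:
    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    zk.ensure_path("/id-generator/machines")
    # Ephemeral + sequential node: released automatically if this process dies.
    path = zk.create("/id-generator/machines/worker-",
                     ephemeral=True, sequence=True)
    seq = int(path.rsplit("-", 1)[1])
    # 5-bit machine_id; a real deployment must still verify no live node holds it.
    return seq % 32

print(claim_machine_id())  # e.g. 0, 1, 2, ... per registered machine
```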

5.5 Grafana Dashboard Panels

| Panel | Query | Visualization |
|---|---|---|
| ID Generation Rate | sum(rate(ids_generated_total[1m])) by (machine_id) | Time series, stacked |
| Latency P50/P95/P99 | histogram_quantile(0.99, rate(id_gen_duration_bucket[5m])) | Time series, multi-line |
| Sequence Overflow Events | increase(sequence_overflow_total[1h]) | Stat panel (should be 0) |
| NTP Offset per Machine | ntp_offset_milliseconds | Time series, threshold line at 10 ms |
| Active Machines | count(up{job="id-generator"} == 1) | Gauge |
| Clock Skew Events (24h) | increase(clock_skew_detected_total[24h]) | Stat panel (should be 0) |

6. System Design Diagrams

6.1 Snowflake Bit Layout

block-beta
    columns 4
    block:header:4
        A["Snowflake ID: 64-bit Layout"]
    end
    block:bits:4
        B["0"]
        C["Timestamp (41 bits)"]
        D["DC(5) + Machine(5)"]
        E["Sequence (12 bits)"]
    end
    block:detail:4
        F["Sign<br/>(always 0)"]
        G["Milliseconds since<br/>custom epoch<br/>(~69.7 years)"]
        H["Datacenter: 0-31<br/>Machine: 0-31<br/>(1,024 nodes max)"]
        I["0-4095 per ms<br/>(4.096M IDs/sec<br/>per machine)"]
    end

    style B fill:#6c757d,color:#fff
    style C fill:#0d6efd,color:#fff
    style D fill:#198754,color:#fff
    style E fill:#dc3545,color:#fff
    style F fill:#6c757d,color:#fff
    style G fill:#0d6efd,color:#fff
    style H fill:#198754,color:#fff
    style I fill:#dc3545,color:#fff

6.2 Multi-Datacenter ID Generation Architecture

graph TB
    subgraph "Client Services"
        SvcA["Order Service"]
        SvcB["User Service"]
        SvcC["Payment Service"]
        SvcD["Inventory Service"]
    end

    subgraph "Load Balancer Layer"
        LB["Load Balancer<br/>(Route to nearest DC)"]
    end

    subgraph "Datacenter 1 (US-East) — dc_id=00001"
        direction TB
        subgraph "ID Generator Cluster DC1"
            M1_1["Machine 1<br/>machine_id=00001<br/>Snowflake Generator"]
            M1_2["Machine 2<br/>machine_id=00010<br/>Snowflake Generator"]
            M1_3["Machine 3<br/>machine_id=00011<br/>Snowflake Generator"]
        end
        NTP1["NTP Server DC1<br/>(Stratum 2)<br/>Sync: time.google.com"]
        ZK1["ZooKeeper Node 1<br/>(Machine ID Registry)"]
    end

    subgraph "Datacenter 2 (EU-West) — dc_id=00010"
        direction TB
        subgraph "ID Generator Cluster DC2"
            M2_1["Machine 1<br/>machine_id=00001<br/>Snowflake Generator"]
            M2_2["Machine 2<br/>machine_id=00010<br/>Snowflake Generator"]
        end
        NTP2["NTP Server DC2<br/>(Stratum 2)<br/>Sync: time.aws.com"]
        ZK2["ZooKeeper Node 2<br/>(Machine ID Registry)"]
    end

    subgraph "Datacenter 3 (AP-Southeast) — dc_id=00011"
        direction TB
        subgraph "ID Generator Cluster DC3"
            M3_1["Machine 1<br/>machine_id=00001<br/>Snowflake Generator"]
            M3_2["Machine 2<br/>machine_id=00010<br/>Snowflake Generator"]
        end
        NTP3["NTP Server DC3<br/>(Stratum 2)<br/>Sync: pool.ntp.org"]
        ZK3["ZooKeeper Node 3<br/>(Machine ID Registry)"]
    end

    subgraph "Monitoring"
        Prometheus["Prometheus<br/>(Metrics Collection)"]
        Grafana["Grafana<br/>(Dashboard)"]
        AlertManager["AlertManager<br/>→ Slack / PagerDuty"]
    end

    SvcA & SvcB & SvcC & SvcD --> LB
    LB --> M1_1 & M1_2 & M1_3
    LB --> M2_1 & M2_2
    LB --> M3_1 & M3_2

    NTP1 -.->|sync| M1_1 & M1_2 & M1_3
    NTP2 -.->|sync| M2_1 & M2_2
    NTP3 -.->|sync| M3_1 & M3_2

    ZK1 -.->|assign machine_id| M1_1 & M1_2 & M1_3
    ZK2 -.->|assign machine_id| M2_1 & M2_2
    ZK3 -.->|assign machine_id| M3_1 & M3_2

    ZK1 <-->|replicate| ZK2 <-->|replicate| ZK3

    M1_1 & M1_2 & M1_3 & M2_1 & M2_2 & M3_1 & M3_2 --> Prometheus
    Prometheus --> Grafana
    Prometheus --> AlertManager

    style M1_1 fill:#0d6efd,color:#fff
    style M1_2 fill:#0d6efd,color:#fff
    style M1_3 fill:#0d6efd,color:#fff
    style M2_1 fill:#198754,color:#fff
    style M2_2 fill:#198754,color:#fff
    style M3_1 fill:#dc3545,color:#fff
    style M3_2 fill:#dc3545,color:#fff
    style NTP1 fill:#f9a825,stroke:#333
    style NTP2 fill:#f9a825,stroke:#333
    style NTP3 fill:#f9a825,stroke:#333
    style ZK1 fill:#6f42c1,color:#fff
    style ZK2 fill:#6f42c1,color:#fff
    style ZK3 fill:#6f42c1,color:#fff
    style Prometheus fill:#e65100,color:#fff
    style Grafana fill:#388e3c,color:#fff

6.3 Clock Sync Flow

sequenceDiagram
    participant AC as Atomic Clock<br/>(Stratum 0)
    participant S1 as NTP Stratum 1<br/>(time.google.com)
    participant S2 as NTP Stratum 2<br/>(DC NTP Server)
    participant IDGen as ID Generator<br/>(Machine)
    participant Mon as Monitoring<br/>(Prometheus)

    Note over AC,S1: Stratum 0 → 1: ~microsecond accuracy
    AC->>S1: Time reference signal

    Note over S1,S2: Stratum 1 → 2: ~millisecond accuracy
    S2->>S1: NTP Request (sent at T1)
    S1-->>S2: NTP Response (T2 = receive, T3 = transmit)
    S2->>S2: Record arrival time T4, then calculate offset<br/>offset = ((T2-T1) + (T3-T4)) / 2

    Note over S2,IDGen: Stratum 2 → 3: ~1-10ms accuracy
    loop Every 16-32 seconds
        IDGen->>S2: NTP Request
        S2-->>IDGen: NTP Response
        IDGen->>IDGen: Adjust clock (slew, not step)
    end

    IDGen->>Mon: Export ntp_offset_ms metric

    alt NTP offset > 10ms
        Mon->>Mon: Fire NTPOffsetHigh alert
        Mon-->>IDGen: Alert → Ops team investigates
    end

    alt Clock goes backward
        IDGen->>IDGen: DETECT: current_time < last_time
        IDGen->>Mon: Increment clock_skew_detected counter
        IDGen->>IDGen: WAIT until current_time >= last_time
        Mon->>Mon: Fire ClockSkewDetected alert
    end

6.4 ID Generation Request Flow

sequenceDiagram
    participant Svc as Client Service
    participant LB as Load Balancer
    participant IDGen as ID Generator<br/>(Machine #5, DC #2)
    participant Clock as System Clock

    Svc->>LB: Request new ID
    LB->>IDGen: Forward to nearest generator

    IDGen->>Clock: Get current_timestamp_ms
    Clock-->>IDGen: timestamp = 1742300000123

    alt timestamp == last_timestamp
        IDGen->>IDGen: sequence++
        alt sequence > 4095 (overflow)
            IDGen->>IDGen: WAIT for next millisecond
            IDGen->>Clock: Get current_timestamp_ms
            Clock-->>IDGen: timestamp = 1742300000124
            IDGen->>IDGen: sequence = 0
        end
    else timestamp > last_timestamp
        IDGen->>IDGen: sequence = 0
    else timestamp < last_timestamp (CLOCK SKEW!)
        IDGen->>IDGen: Log warning + wait
    end

    IDGen->>IDGen: Compose ID:<br/>= (timestamp << 22)<br/>| (dc_id << 17)<br/>| (machine_id << 12)<br/>| sequence

    IDGen-->>LB: Return ID = 7300811265438109697
    LB-->>Svc: ID = 7300811265438109697

    Note over Svc: Extract info from ID:<br/>timestamp: 2025-03-18T12:00:00.123Z<br/>datacenter: 2<br/>machine: 5<br/>sequence: 1

7. Aha Moments & Common Pitfalls

Aha Moments --- Key takeaways

#1 --- Clock skew is unavoidable in distributed systems: there is no way to keep machine clocks perfectly synchronized. Snowflake accepts “roughly sorted” instead of “strictly sorted”. If you need strict global ordering --- use a Lamport Clock, a Hybrid Logical Clock, or Google TrueTime.

#2 --- UUID vs Snowflake is not an either/or choice: UUIDs fit idempotency keys, session IDs, and correlation IDs (no sorting or compactness needed). Snowflake fits primary keys, event IDs, and message IDs (must sort, must fit in 64 bits). Many systems use both, for different purposes.

#3 --- DB index performance with random vs sequential IDs: this is the biggest reason not to use UUIDs as primary keys in a relational DB. B+ tree indexes favor sequential inserts --- each new record is appended to the last leaf node. Random UUIDs cause constant page splits, cutting write throughput 2-5x and increasing disk I/O.

#4 --- A single Snowflake machine already produces 4.1M IDs/sec: throughput is not the reason to deploy multiple machines. The real reason is high availability --- if one machine dies, the others keep generating IDs.

#5 --- The custom epoch is a one-time decision that cannot be changed: pick an epoch too far in the past and you waste timestamp bits; pick one in the future and you get negative IDs. Epoch = go-live date is the best choice.

#6 --- Snowflake IDs can be reverse-engineered: anyone holding an ID can extract its timestamp, datacenter, machine, and sequence. Never expose raw Snowflake IDs on a public API if you care about security.

Common Pitfalls

Pitfall 1: Not handling clock skew

Wrong: “Machine clocks are always accurate, nothing to worry about.” Right: clock skew ALWAYS happens. NTP steps, leap seconds, VM migrations, hardware issues --- any of them can make the clock jump backward. You need logic to detect and handle it.

Pitfall 2: Using UUIDs as primary keys and then wondering why the DB is slow

Wrong: “UUIDs are unique, just use them as the PK.” Right: random UUIDs fragment the B+ tree. If you must use UUIDs, use UUID v7 (time-ordered) or ULID. Or use a Snowflake ID as the PK and a UUID as the public-facing ID.

Pitfall 3: Hard-coding machine_id

Wrong: machine_id = 1 in the config file, deployed to 5 servers. Right: 5 servers with the same machine_id = 1 means duplicate IDs. You need a mechanism that assigns unique machine_ids --- ZooKeeper, etcd, or cloud metadata.

Pitfall 4: Not monitoring sequence overflow

Wrong: “4,096 IDs/ms is plenty, nothing to worry about.” Right: a traffic spike, a retry storm, or a bug can push a burst above 4,096/ms. Sequence overflow means the generator must wait, which means a latency spike. Monitor it and alert on it.

Pitfall 5: Forgetting to compute the epoch overflow date

Wrong: using the Unix epoch (1970) for a system deployed in 2025. Right: 1970 + ~69.7 years ≈ 2039, so only about 13 years are left! Choose a custom epoch of 2025 to get the full 69-year runway.

Pitfall 6: Assuming Snowflake guarantees global ordering

Wrong: “A larger ID was always created later.” Right: that only holds on a single machine. Between two machines, clock skew can make a smaller ID be created after a larger one. Snowflake is only roughly time-sorted, not strictly time-sorted.

Pitfall 7: Having no plan for what happens after the epoch overflows

Wrong: “It won't overflow for 69 years, not my problem.” Right: document the overflow date. Set a calendar reminder. Plan the migration: grow the timestamp to 42+ bits (shrinking the machine/sequence bits), or pick a new epoch and migrate.


8. Wrap Up --- Summary for the Interviewer

| Step | Key points |
|---|---|
| Step 1: Scope | 64-bit, globally unique, time-sortable, 10K+ IDs/sec, multi-datacenter |
| Step 2: High-Level | 4 approaches: Multi-Master, UUID, Ticket Server, Snowflake (chosen) |
| Step 3: Deep Dive | Bit layout (1+41+5+5+12), clock sync (NTP), clock skew handling, sequence overflow |
| Step 4: Wrap Up | Monitoring (clock drift, overflow), security (ID enumeration), alternatives (ULID, KSUID) |

Interview tip: when the interviewer asks “Design a Unique ID Generator”, they want to hear that:

  1. You ask questions to clarify the requirements (instead of jumping straight to Snowflake)
  2. You present several options and justify your choice
  3. You understand the trade-offs (clock skew, ordering guarantees, security)
  4. You know how to operate the system (monitoring, alerting, NTP)

| Topic | Link | Relevance |
|---|---|---|
| Scaling | Tuan-01-Scale-From-Zero-To-Millions | Why scaling requires distributed IDs |
| Capacity Estimation | Tuan-02-Back-of-the-envelope | IDs/sec formulas, overflow calculation |
| Database Sharding | Tuan-07-Database-Sharding-Replication | IDs as shard keys; sequential vs random impact |
| Consistent Hashing | Tuan-10-Consistent-Hashing | Distributing ID generation across nodes |
| Monitoring | Tuan-13-Monitoring-Observability | Clock drift monitoring, alerting |
| Security | Tuan-14-AuthN-AuthZ-Security · Tuan-15-Data-Security-Encryption | ID enumeration, information leakage |
| URL Shortener | Tuan-16-Design-URL-Shortener | Snowflake IDs as the basis for short codes |
| Key-Value Store | Tuan-20-Design-Key-Value-Store | IDs as keys in a distributed KV store |

References


Related case studies: Tuan-16-Design-URL-Shortener (Snowflake IDs used to generate short codes) · Tuan-20-Design-Key-Value-Store (IDs as keys in a distributed store)