Case Study: Design Search Autocomplete System — Full System Design Interview Walkthrough

“Every time you type a single character on Google, the system has to scan billions of queries in under 100ms to surface the 10 most relevant suggestions. That's not magic — that's engineering.”

Tags: system-design case-study search-autocomplete trie alex-xu interview Student: Hieu Source: Alex Xu — System Design Interview, Volume 1, Chapter 13 Prerequisite: Tuan-01-Scale-From-Zero-To-Millions · Tuan-02-Back-of-the-envelope Related: Tuan-06-Cache-Strategy · Tuan-08-Message-Queue · Tuan-10-Consistent-Hashing


1. Context & Why — Why does Autocomplete matter?

1.1 The Everyday Analogy

Hieu, open Google Search. Type “system d” — a dropdown instantly appears with 10 suggestions: “system design interview”, “system design primer”, “system design alex xu”… All of it happens in under 100ms, faster than the blink of an eye (300-400ms).

It's not just Google. Every search bar you touch daily has autocomplete:

  • YouTube — suggests videos as you type a search
  • Amazon — suggests products as you type a name
  • Spotify — suggests songs and artists
  • IDEs (VS Code, IntelliJ) — code completion is a form of autocomplete too
  • LinkedIn — suggests people and companies as you search

1.2 Why is this a hard problem?

| Challenge | Explanation |
|---|---|
| Ultra-low latency | Users expect responses < 100ms. Anything slower feels “laggy” and drives users away |
| Enormous scale | Google handles ~8.5 billion searches/day. Each search averages ~20 keystrokes → 170 billion autocomplete requests/day |
| Relevance | Suggestions must be meaningful — not random prefix matches, but sorted by popularity, freshness, personalization |
| Real-time feel | Even if the backend lags, the frontend must feel “instant” via debounce + caching |
| Multi-language | Vietnamese, Japanese, Chinese — each language tokenizes differently |
| Safety | Must never suggest offensive, violent, or sexually explicit content |

1.3 Business Impact

Autocomplete is not just a convenience — it is a business weapon:

  • Shorter search time — Users type 2-3 characters instead of a full sentence. Google has estimated that autocomplete saves 200 years of typing time per day globally
  • Higher conversion — Amazon found that good autocomplete lifts click-through rate by 24%
  • Guided discovery — Steers users toward popular queries, indirectly shaping traffic
  • Fewer typos — Users pick a suggestion instead of typing, reducing spelling errors

Aha Moment: Autocomplete doesn't just “help users type faster”. It is a tool for steering user behavior — whoever controls autocomplete influences what millions of people search for.


2. Deep Dive — Alex Xu 4-Step Framework

Overview — Alex Xu's four steps

| Step | Name | Time (45-minute interview) |
|---|---|---|
| 1 | Understand the Problem & Establish Design Scope | ~5 minutes |
| 2 | Propose High-Level Design | ~15 minutes |
| 3 | Design Deep Dive | ~20 minutes |
| 4 | Wrap Up | ~5 minutes |

Step 1 — Understand the Problem & Establish Design Scope

1.1 Functional Requirements

Hieu, before designing anything, ask the interviewer to clarify scope. Here are the questions and assumptions:

| Question | Answer / Assumption |
|---|---|
| Does autocomplete match only from the start (prefix) or at any position? | Prefix match only. “sys” → “system design”, but “stem” does NOT match |
| How many suggestions are returned? | 10 suggestions per request |
| What are suggestions based on? | Popularity (query frequency) is the primary signal |
| Is multi-language supported? | Yes — English, Vietnamese, Japanese, etc. |
| Is there a capitalize/lowercase distinction? | No — all queries are normalized to lowercase |
| Must users be logged in for autocomplete? | No — it works for anonymous users too |
| Is there personalization? | Phase 1: no. Phase 2: possibly, based on personal search history |

Functional Requirements summary:

  1. Prefix matching — When the user types “din”, return queries starting with “din” (e.g. “dinner recipes”, “dinosaur”, “dining table”)
  2. Top 10 suggestions — Sorted by popularity (query frequency), return the top 10 results
  3. Multi-language — Unicode support, handling Vietnamese (diacritics), Japanese (Kanji + Hiragana), Chinese
  4. Near real-time update — New (trending) queries should appear in suggestions within a reasonable window (instant is not required)

1.2 Non-Functional Requirements

| Requirement | Target | Explanation |
|---|---|---|
| Low Latency | Response < 100ms (P99) | Autocomplete must outpace the user's typing. If the user types 5 chars/second (200ms/keystroke), the response must arrive before the next keystroke |
| High Availability | 99.99% uptime | Autocomplete down = the search bar feels “dead”, even though search still works |
| Scalability | 10M+ DAU, 20B+ requests/day | Scale horizontally as traffic grows |
| Fault Tolerance | Graceful degradation | If autocomplete dies, search must keep working — only suggestions are lost |
| Consistency | Eventual consistency OK | No need for strong consistency — suggestions a few minutes stale are acceptable |
| Relevance | Top suggestions must be meaningful | No random suggestions; sort correctly by popularity |

Pitfall: Many candidates forget to ask about graceful degradation. Autocomplete is a non-critical feature — if it dies, search MUST keep working. This is what the interviewer wants to hear.

1.3 System scale

| Parameter | Value | Explanation |
|---|---|---|
| DAU (Daily Active Users) | 10 million | Mid-sized (smaller than Google, larger than a startup) |
| Searches per user per day | 10 | Average |
| Average words per query | 4 words | “best italian restaurant nyc” |
| Average chars per word | 5 characters | English average |
| Characters per query | 20 characters | 4 words x 5 chars |
| Autocomplete requests per query | 20 | One request per keystroke (before optimization) |

Step 2 — Propose High-Level Design

2.1 The two main components

Hieu, the autocomplete system has 2 big parts that operate independently but complement each other:

| Component | Function | Characteristics |
|---|---|---|
| Data Gathering Service | Collects user queries, aggregates frequencies, builds the Trie | Offline / near-realtime, write-heavy |
| Query Service | Receives a prefix from the user, looks up the Trie, returns top K suggestions | Online, read-heavy, latency-critical |

Aha Moment: This is the CQRS pattern (Command Query Responsibility Segregation) — separating the write path (data gathering) from the read path (query service). You met this pattern in Tuan-11-Microservices-Pattern.

2.2 Data Gathering Service — Overview

Flow:

  1. User types a query and presses Enter (completes the search)
  2. The query is appended to the Analytics Log (raw log)
  3. Periodically (every hour/day/week), the Aggregation Pipeline reads the raw log and counts the frequency of each query
  4. The Trie Builder takes the aggregated data and builds/updates the Trie
  5. The new Trie is pushed to Trie Storage (persistent) and the Trie Cache (in-memory)

2.3 Query Service — Overview

Flow:

  1. User types characters into the search box
  2. Frontend sends an AJAX request with the current prefix (e.g. “sys”)
  3. The request goes through the API Gateway to the Query Service
  4. Query Service looks up the prefix in the Trie Cache (in-memory)
  5. Returns the top 10 suggestions
  6. Frontend renders the dropdown

2.4 High-Level Architecture Diagram

graph TB
    subgraph "User Layer"
        U[User Browser/App]
    end

    subgraph "Online Path — Query Service"
        AG[API Gateway<br/>Rate Limiter + Auth]
        QS[Query Service<br/>Stateless]
        TC[Trie Cache<br/>In-Memory<br/>Redis / Local Memory]
    end

    subgraph "Offline Path — Data Gathering"
        AL[Analytics Log<br/>Raw Search Queries]
        AP[Aggregation Pipeline<br/>MapReduce / Spark]
        AF[Aggregated Frequency<br/>DB / File Storage]
        TB[Trie Builder<br/>Builds Trie from frequency data]
        TS[Trie Storage<br/>Persistent — S3 / HDFS]
    end

    subgraph "Filter & Safety"
        FL[Filter Layer<br/>Remove offensive content]
    end

    U -->|"AJAX: prefix='sys'"| AG
    AG --> QS
    QS --> TC
    TC -->|"Cache miss"| TS
    QS --> FL
    FL -->|"Filtered top 10"| AG
    AG -->|"JSON response"| U

    U -->|"Completed search query"| AL
    AL --> AP
    AP --> AF
    AF --> TB
    TB --> TS
    TB -->|"Push new Trie"| TC

    style TC fill:#e1f5fe
    style FL fill:#fff3e0
    style AP fill:#f3e5f5

Note for Hieu: Notice that the two paths (online and offline) share only the Trie — there is no other direct dependency. This lets you scale each part independently.


Step 3 — Design Deep Dive

This is the most important part of the interview — where you show your technical depth.

3.1 Trie Data Structure — Basics

What is a Trie?

A Trie (pronounced “try”, from “reTRIEval”) is a tree data structure built specifically for prefix search. Each node represents a character, and the path from the root to a node forms a prefix.

Key properties of a Trie:

| Property | Explanation |
|---|---|
| Root | The root node, containing no character |
| Node | Each node holds 1 character and pointers to its child nodes |
| Path | The path from root to a node = one prefix |
| Terminal node | A node marking the end of a complete word/query |
| Branching | Each node can have up to 26 children (English) or more (Unicode) |

A visual example

Suppose you have 3 queries: “tree”, “try”, “true”, with their respective frequencies.

graph TD
    ROOT["Root"]
    T["t"]
    R["r"]
    E1["e"]
    Y["y (try: 20)"]
    U["u"]
    E2["e (true: 15)"]
    EE["e (tree: 35)"]

    ROOT --> T
    T --> R
    R --> E1
    R --> Y
    R --> U
    E1 --> EE
    U --> E2

    style EE fill:#c8e6c9
    style Y fill:#c8e6c9
    style E2 fill:#c8e6c9
Trie time complexity

| Operation | Complexity | Explanation |
|---|---|---|
| Insert a query | O(L) | L = query length. Walk each character, creating nodes as needed |
| Search exact query | O(L) | Walk each character from the root |
| Find all queries with prefix | O(L + N) | L = prefix length, N = nodes in the subtree. This is the problem! |

Pitfall: Finding all queries with prefix “a” means traversing the entire subtree rooted at node “a”. If that subtree has millions of nodes, it is brutally slow. This is why you need to optimize the Trie.
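A minimal sketch of the basic Trie makes the O(L + N) cost concrete: the descent to the prefix node is cheap, but collecting suggestions walks the entire subtree. All names here are illustrative, not from the chapter.

```python
# A minimal Trie with prefix search, illustrating the O(L + N) cost:
# finding all queries under a prefix requires walking the whole subtree.
class TrieNode:
    def __init__(self):
        self.children = {}          # char -> TrieNode
        self.frequency = 0          # > 0 marks a terminal node (completed query)

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, query, frequency):
        node = self.root
        for ch in query:            # O(L), L = query length
            node = node.children.setdefault(ch, TrieNode())
        node.frequency = frequency

    def search_prefix(self, prefix, k=10):
        node = self.root
        for ch in prefix:           # O(L) descent to the prefix node
            if ch not in node.children:
                return []
            node = node.children[ch]
        results = []
        stack = [(prefix, node)]    # O(N) walk of the entire subtree
        while stack:
            text, cur = stack.pop()
            if cur.frequency > 0:
                results.append((text, cur.frequency))
            for ch, child in cur.children.items():
                stack.append((text + ch, child))
        results.sort(key=lambda x: -x[1])
        return results[:k]
```

With the tree/try/true example above, `search_prefix("tr")` walks the whole “tr” subtree before sorting — exactly the step the next section optimizes away.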

3.2 Optimized Trie — The secret of fast autocomplete

Optimization 1: Store top K at every node

Problem: With a basic Trie, finding the top 10 suggestions for prefix “be” requires:

  1. Finding node “b” → “e” (O(prefix length) = fast)
  2. Traversing the entire subtree under “be” → collecting every terminal node
  3. Sorting by frequency → taking the top 10

Step 2 is the bottleneck — the subtree can hold millions of nodes.

Solution: Precompute and store the top K queries at every node in the Trie.

| Approach | Finding top 10 for prefix “be” | Time |
|---|---|---|
| Basic Trie | Traverse the whole subtree + sort | O(subtree size) — possibly millions |
| Optimized Trie | Read the top 10 directly at node “be” | O(1) — constant time! |

Trade-off:

  • More space: Each node stores K extra entries (e.g. 10 strings + frequencies). A 1-million-node Trie stores 10 million extra entries
  • More complex updates: When a frequency changes, the top K must be updated at many nodes (from leaf up to root)
  • But latency drops dramatically: From O(millions) down to O(1)

Aha Moment: This is the classic space-time trade-off. In autocomplete, latency is king — you gladly trade space for speed. The interviewer will be impressed if you state this trade-off explicitly.
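A build-time sketch of this optimization, under the simplifying assumption that the whole query set is available at once (as in the offline rebuild described later). Function and field names are illustrative.

```python
# Sketch: precompute top-K at every node at build time, so a lookup is
# an O(prefix-length) descent plus an O(1) read (no subtree walk).
class Node:
    def __init__(self):
        self.children = {}
        self.top_k = []             # [(query, frequency)] cached at this node

def build(queries, k=10):
    """queries: dict of query -> frequency."""
    root = Node()
    for query, freq in queries.items():
        node = root
        for ch in query:
            node = node.children.setdefault(ch, Node())
            node.top_k.append((query, freq))   # every prefix node sees this query
    # keep only the K most frequent entries at each node
    stack = [root]
    while stack:
        node = stack.pop()
        node.top_k.sort(key=lambda x: -x[1])
        node.top_k = node.top_k[:k]
        stack.extend(node.children.values())
    return root

def suggest(root, prefix):
    node = root
    for ch in prefix:
        if ch not in node.children:
            return []
        node = node.children[ch]
    return node.top_k               # O(1): already sorted and truncated
```

The space cost is visible directly: every node along a query's path holds a copy of that query in its `top_k` list.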

Optimization 2: Limit prefix depth

Problem: A query can be 50+ characters long. A 50-level-deep Trie is wasteful because:

  • Few users type more than 10-15 characters before picking a suggestion
  • The more characters typed, the fewer the matches → the less useful autocomplete becomes

Solution: Cap the Trie depth, e.g. only store prefixes of at most 50 characters.

| Prefix length | Number of matches | Is autocomplete useful? |
|---|---|---|
| 1-3 chars | Millions | Very useful — the user is exploring |
| 4-7 chars | Thousands | Useful — the user is narrowing down |
| 8-15 chars | Hundreds | Still useful |
| 15+ chars | A few dozen or fewer | Barely useful — the user already knows what they want |
| 50+ chars | Nearly 0 | Pointless |

Aha Moment for Hieu: Google only shows autocomplete for roughly the first 30 characters. After that, the dropdown disappears. That's not a bug — it's a deliberate optimization.

Optimization 3: Node compression (compressed Trie)

If a chain of nodes has only 1 child each (no branching), merge them into a single node:

Before compression: r → o → m → a → n → c → e (7 nodes). After compression: “romance” (1 node).

Benefits:

  • Fewer nodes (typically a 50-70% reduction)
  • Smaller memory footprint
  • Fewer pointer dereferences during traversal

3.3 Trie Update Strategy — Offline Rebuild vs Incremental

Why NOT update the Trie in real time?

| Reason | Explanation |
|---|---|
| Write amplification | Each new query → frequency + top K updates at many nodes (leaf up to root). At 100K QPS, that is far too many writes |
| Complex consistency | Many servers read the Trie concurrently. Updating mid-read → inconsistent state |
| Latency spikes | Write locks can block reads, causing latency spikes for autocomplete |
| Real time isn't needed | Suggestions don't need second-level accuracy. A 15-minute to 1-hour delay is acceptable |

Strategy 1: Periodic Full Rebuild

How it works:

  1. Periodically (hourly / daily), the Aggregation Pipeline counts the frequency of every query
  2. The Trie Builder builds a completely new Trie from the aggregated data
  3. The new Trie is serialized and stored in Trie Storage
  4. Query Service servers load the new Trie (hot swap — no downtime)

Pros:

  • Simple to implement, debug, and reason about
  • No race conditions
  • The Trie is always in a consistent state
  • Easy rollback — just revert to the previous Trie version

Cons:

  • Trending queries take time to surface (delay = rebuild interval)
  • Rebuilding a large Trie costs significant compute
Strategy 2: Incremental Update (for systems needing faster freshness)

How it works:

  1. When a new query arrives, publish an event to a Message Queue (Tuan-08-Message-Queue)
  2. A Trie Updater consumer reads the event and updates the frequency at the corresponding node
  3. Recalculate the top K at every affected node (from leaf up to root)

Pros:

  • Trending queries surface faster (delay = processing time, usually minutes)
  • No full Trie rebuild needed

Cons:

  • Much more complex: concurrency, locking, and versioning all need handling
  • Risk: a bad update can corrupt the Trie
  • Requires careful engineering to avoid hurting read performance
Hybrid Approach (what's used in practice)

Most large systems use a hybrid:

  • Baseline Trie: Rebuilt offline daily from historical data
  • Delta Trie: Incrementally updated every 15 minutes from trending data
  • Merge: At query time, merge results from both Tries

Aha Moment: YouTube uses a hybrid approach — base suggestions come from the offline Trie, but trending videos/topics are injected from a realtime pipeline. That's why you see a trending video appear in autocomplete just minutes after it goes viral.
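The merge step of the hybrid can be sketched as a simple score combination. The `delta_weight` boost for trending counts is an assumption for illustration, not a value from the chapter.

```python
# Sketch of the hybrid merge: combine suggestions from the daily baseline
# Trie with a small delta (trending) set, letting recent counts boost rank.
def merge_suggestions(baseline, delta, k=10, delta_weight=2.0):
    """baseline, delta: lists of (query, count).
    delta_weight: assumed boost factor for trending counts."""
    scores = {}
    for query, count in baseline:
        scores[query] = scores.get(query, 0.0) + count
    for query, count in delta:
        scores[query] = scores.get(query, 0.0) + delta_weight * count
    ranked = sorted(scores.items(), key=lambda x: -x[1])
    return [q for q, _ in ranked[:k]]
```

A trending query with a modest recent count can then outrank an older query with a larger historical count, which is exactly the behavior the delta Trie exists for.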

3.4 Data Gathering Pipeline — In Detail

The full data flow from raw logs to the Trie:
graph LR
    subgraph "Step 1: Collection"
        SB[Search Box<br/>User types query]
        QL[Query Logger<br/>Append to log file]
        KF[Message Queue<br/>Kafka / Kinesis]
    end

    subgraph "Step 2: Storage"
        S3[Raw Log Storage<br/>S3 / HDFS]
    end

    subgraph "Step 3: Aggregation"
        SP[Aggregation Engine<br/>Spark / Flink / MapReduce]
        FT[Frequency Table<br/>query → count per time window]
    end

    subgraph "Step 4: Trie Build"
        TBU[Trie Builder<br/>Builds Trie from frequency]
        SER[Trie Serializer<br/>Convert Trie → binary format]
    end

    subgraph "Step 5: Distribution"
        TST[Trie Storage<br/>S3 / GCS]
        TCA[Trie Cache<br/>Redis / In-memory]
        QSE[Query Servers<br/>Load Trie into RAM]
    end

    SB --> QL
    QL --> KF
    KF --> S3
    S3 --> SP
    SP --> FT
    FT --> TBU
    TBU --> SER
    SER --> TST
    TST --> TCA
    TST --> QSE

    style KF fill:#e8eaf6
    style SP fill:#f3e5f5
    style TCA fill:#e1f5fe
Step-by-step explanation

Step 1 — Collection:

  • Whenever a user completes a search (presses Enter or clicks a suggestion), the query is logged
  • Important: Only log completed queries, NOT every keystroke. Reasons:
    • Keystroke logs are far too voluminous (20x more than completed queries)
    • A keystroke is a “partial query” — it doesn't represent the user's true intent
    • Example: the user types “sys” → “syst” → “system” → “system design” → Enter. Only “system design” is logged
  • Log format: timestamp | user_id_hash | query | country | language
  • Use a Message Queue (Kafka) as a buffer to avoid losing logs during traffic spikes

Step 2 — Storage:

  • Raw logs are stored in distributed storage (S3, HDFS)
  • Partitioned by day and hour for easy querying
  • Retention policy: keep raw logs for 30 days, then archive or delete
  • This data also feeds other analytics (search quality, trending topics)

Step 3 — Aggregation:

  • This is the most compute-heavy step
  • Use a framework like Apache Spark, Flink, or MapReduce
  • Input: raw log files
  • Output: a frequency table — each row is (query, time_window, count)
  • The time window is typically 1 hour or 1 day
  • Weighted aggregation: more recent queries get more weight, via exponential decay:

score(query) = Σ over windows w of count_w(query) × d^(t_now − t_w)

where d, the decay factor, is typically 0.9-0.99, and t_w is the timestamp of the time window.

Explanation for Hieu: The decay factor ensures a query like “World Cup 2026” scores high during the World Cup but fades afterwards. Without decay, old-but-popular queries (e.g. “Facebook login”) would always drown out trending ones.
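The decay scoring above is a one-liner in practice. A sketch, measuring window age in whole windows and assuming d = 0.95 as a default:

```python
# Sketch of decay-weighted aggregation: each window's count is discounted
# by d^(age_in_windows), with d typically in the 0.9-0.99 range.
def decayed_score(window_counts, now, d=0.95):
    """window_counts: list of (window_index, count); now: current window index."""
    return sum(count * (d ** (now - w)) for w, count in window_counts)
```

With d = 0.9 and hourly windows, a query that got 1000 hits 30 windows ago scores about 42, while a query with 300 hits one window ago scores 270 — the trending query wins, as intended.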

Step 4 — Trie Build:

  • Read the frequency table and build the Trie from scratch
  • At each node, compute and store the top K queries (typically K = 10-25)
  • Apply compression (merge chains of single-child nodes)
  • Serialize the Trie into a binary format for efficient storage and transfer

Step 5 — Distribution:

  • The serialized Trie is uploaded to Trie Storage (S3)
  • Query servers download the new Trie from S3 and load it into RAM
  • A Trie Cache (Redis) can serve as an intermediate layer for servers that haven't loaded the new Trie yet
  • Blue-green deployment: Keep 2 Trie versions. Servers run version A. Load version B into RAM. Switch traffic to B. If anything breaks, switch back to A

3.5 Query Service — In Detail

Processing one autocomplete request:
sequenceDiagram
    participant User as User Browser
    participant FE as Frontend JS
    participant CDN as CDN/Edge Cache
    participant AG as API Gateway
    participant QS as Query Service
    participant TC as Trie Cache (Memory)
    participant FL as Filter Layer

    User->>FE: Types "sys"
    Note over FE: Debounce 200ms<br/>Wait for more typing
    User->>FE: Types "t" → "syst"
    Note over FE: Debounce 200ms
    Note over FE: 200ms elapsed — send request
    FE->>CDN: GET /autocomplete?q=syst

    alt Cache HIT at CDN
        CDN-->>FE: Return cached suggestions
    else Cache MISS
        CDN->>AG: Forward request
        AG->>QS: Lookup prefix "syst"
        QS->>TC: Trie.search("syst")
        TC-->>QS: Top 10 raw suggestions
        QS->>FL: Filter offensive content
        FL-->>QS: Filtered top 10
        QS-->>AG: JSON response
        AG-->>CDN: Response + Cache-Control header
        CDN-->>FE: Return suggestions
    end

    FE-->>User: Render dropdown with 10 suggestions
Frontend Optimizations

Debounce — the technique that saves 80% of traffic:

Without debounce, every keystroke sends a request:

  • The user types “system design” (13 characters) → 13 requests
  • But the user types continuously — the requests for “s”, “sy”, “sys” are wasted because the user hasn't paused yet

With a 200ms debounce:

  • A request is sent only when the user stops typing for 200ms
  • Typing “system design” continuously → only 2-3 requests instead of 13
  • ~80% traffic saved

| Without debounce | With 200ms debounce |
|---|---|
| “s” → request | “s” → wait… |
| “sy” → request | “sy” → wait… |
| “sys” → request | “sys” → wait… |
| “syst” → request | “syst” → wait… |
| “syste” → request | “syste” → send request (user paused 200ms) |
| “system” → request | “system” → wait… |
| “system ” → request | “system ” → wait… |
| “system d” → request | “system d” → wait… |
| “system de” → request | “system de” → send request |
| … (13 requests) | … (3-4 requests) |
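The mechanic is language-agnostic: each keystroke cancels the pending request and schedules a new one. A sketch using asyncio (the real thing would live in frontend JavaScript; names here are illustrative):

```python
# Sketch of debouncing: each new keystroke cancels the pending request;
# only a pause of `delay` seconds lets one request actually fire.
import asyncio

class Debouncer:
    def __init__(self, delay, send):
        self.delay = delay          # e.g. 0.2 for 200ms
        self.send = send            # callback that actually fires the request
        self._pending = None

    def keystroke(self, prefix):
        if self._pending and not self._pending.done():
            self._pending.cancel()  # user typed again: drop the old request
        self._pending = asyncio.ensure_future(self._fire(prefix))

    async def _fire(self, prefix):
        await asyncio.sleep(self.delay)
        self.send(prefix)

async def demo():
    sent = []
    d = Debouncer(0.05, sent.append)      # 50ms delay to keep the demo fast
    for prefix in ["s", "sy", "sys"]:     # rapid typing: first two are cancelled
        d.keystroke(prefix)
        await asyncio.sleep(0.01)
    await asyncio.sleep(0.1)              # pause: only the last request fires
    return sent
```

Running `asyncio.run(demo())` sends a single request, for "sys" — the requests for "s" and "sy" never leave the client.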

Browser Caching:

  • The response for prefix “sys” rarely changes within an hour
  • Set Cache-Control: max-age=3600 (cache for 1 hour in the browser)
  • Next time the user types “sys”, the browser answers from cache — no request is sent
  • The effect is largest for short prefixes (they change less often than long ones)

AJAX (Asynchronous):

  • Requests are sent asynchronously — the UI never blocks
  • If a response is slow (> 300ms), the user has already typed more → cancel the old request and send a new one
  • This avoids showing “stale” results for an outdated prefix
Backend Optimizations

In-Memory Trie:

  • The entire Trie is loaded into the Query Server's RAM
  • Lookup time = O(prefix length) — typically < 1 microsecond
  • No disk access or external database needed
  • Each Query Server holds its own copy of the Trie — no shared state → easy horizontal scaling

CDN / Edge Caching:

  • Autocomplete responses are read-only and change slowly (hourly/daily)
  • Cache at CDN edge servers — a user in Vietnam is served from a Singapore CDN node, with no round trip to the US
  • Cache key: the prefix string (e.g. “sys”, “syst”, “syste”)
  • High hit rate for short prefixes (fewer variations)

Two-tier Cache Strategy (see: Tuan-06-Cache-Strategy):

| Tier | Implementation | Latency | Hit Rate |
|---|---|---|---|
| L1: CDN/Browser cache | HTTP cache | ~0ms (browser) / ~5ms (CDN) | 60-80% for short prefixes |
| L2: Application cache | In-memory Trie on the Query Server | < 1ms | 100% (the Trie is always in RAM) |

Aha Moment: With debounce + browser caching + CDN caching, only 10-20% of autocomplete requests actually reach the backend. This is how the system can handle billions of requests/day with few servers.

3.6 Filter Layer — Content safety

Why filter?

Autocomplete suggestions are public — every user sees them. If a suggestion contains offensive, violent, or sexually explicit content:

  • PR disaster — press coverage, social media virality (this has happened to Google several times)
  • Legal risk — violating child-protection and anti-discrimination laws
  • User trust — users lose confidence

Content categories to filter

| Category | Example | Severity |
|---|---|---|
| Hate speech | Racist, sexist, or religiously discriminatory content | Block entirely |
| Violence | Instructions for violence, weapons | Block entirely |
| Adult content | Sexually explicit content (unless SafeSearch is OFF) | Block by default, allow via setting |
| Defamation | “[person's name] + slur” | Block via rules |
| Self-harm | Self-injury, suicide content | Block + show a support hotline |
| Illegal activity | Drug trade, hacking | Block entirely |
Filter implementation

Layer 1 — Blocklist (Offline):

  • A list of banned queries/keywords, maintained by the content moderation team
  • Applied at Trie build time — blocklisted queries are dropped before insertion into the Trie
  • Pros: fast, no runtime impact
  • Cons: misses new variations (e.g. “h4te” instead of “hate”)

Layer 2 — ML Classification (Near-realtime):

  • An ML model classifies queries as safe/unsafe
  • Runs at Trie build time (offline) or when filtering responses (online)
  • Catches variations, slang, encoded text
  • Cons: false positives (harmless queries get blocked)

Layer 3 — Runtime Filter (Online):

  • Before suggestions are returned to the user, run one final filter
  • Check against the latest blocklist (updatable in real time via a config service)
  • Guarantees no suggestion that slipped past the offline filters gets through
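A sketch of the runtime filter (Layer 3). The leetspeak folding table is an assumed, deliberately crude set of rules just to show why a plain blocklist (Layer 1) misses variants like “h4te”:

```python
# Sketch of the runtime filter: a blocklist check applied to suggestions
# just before they are returned, after folding common leetspeak variants.
LEET = str.maketrans("431!0@5$", "aeiioass")  # assumed variant-folding rules

def is_blocked(suggestion, blocklist):
    normalized = suggestion.lower().translate(LEET)
    return any(term in normalized for term in blocklist)

def runtime_filter(suggestions, blocklist):
    """Drop any suggestion matching a blocklist term."""
    return [s for s in suggestions if not is_blocked(s, blocklist)]
```

Because the blocklist is just a set of strings loaded from a config service, it can be hot-swapped within minutes, with no Trie rebuild — which is the incident-response property Section 4 asks for.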

3.7 Trie Serialization — Storing and shipping the Trie

Why serialize?

The in-memory Trie is a pointer-based structure — each node holds pointers to its children. Without serialization you cannot:

  • Write it to disk directly (pointers are meaningless after a reload)
  • Ship it over the network from the Trie Builder to the Query Servers
  • Version it (compare two Tries)

Serialization options

Option 1 — Level-order traversal (BFS):

  • Traverse the Trie level by level (root → level 1 → level 2 → …)
  • Store each node as: (character, is_terminal, num_children, top_k_list)
  • Easy to reconstruct — read sequentially and rebuild the Trie

Option 2 — Key-value pairs:

  • Store each prefix with its top K as a key-value pair:
    • Key: “sys” → Value: [“system design:1000”, “system:800”, …]
    • Key: “syst” → Value: [“system design:1000”, “system:800”, …]
  • Can live in Redis, RocksDB, or any KV store
  • Benefit: no Trie reconstruction needed — just KV lookups

Option 3 — Compact binary format (Recommended):

  • A custom binary format optimized for size and speed
  • Techniques: varint encoding, dictionary compression, delta encoding
  • A 1GB in-memory Trie can shrink to 200-300MB serialized
  • Fastest to deserialize thanks to low overhead
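Option 2 is the easiest to sketch: flatten the query set into prefix → top-K pairs that any KV store can hold. Names and the depth cap are illustrative (the cap mirrors the prefix-depth limit from section 3.2):

```python
# Sketch of option 2: flatten queries into prefix -> top-K key-value pairs.
# Serving then needs no Trie at all -- just one KV lookup per request.
def flatten_to_kv(queries, k=10, max_prefix_len=50):
    """queries: dict of query -> frequency. Returns {prefix: [(query, freq)]}."""
    kv = {}
    for query, freq in queries.items():
        for end in range(1, min(len(query), max_prefix_len) + 1):
            kv.setdefault(query[:end], []).append((query, freq))
    for prefix, entries in kv.items():
        entries.sort(key=lambda x: -x[1])
        kv[prefix] = entries[:k]
    return kv
```

The trade-off is visible: every query is duplicated once per prefix, so storage grows with average query length — the same space-for-speed exchange as storing top K at every Trie node.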

3.8 Multi-Language Support

Challenges

| Language | Problem | Example |
|---|---|---|
| Vietnamese | Tone marks, special letters (ă, â, ơ, ô, ư, đ) | “pho” and “phở” are different words |
| Japanese | 3 writing systems: Hiragana, Katakana, Kanji | A user can type “tokyo” in Romaji, Hiragana, or Kanji |
| Chinese | No spaces between words, Pinyin input | “bei jing” (Pinyin) → “Beijing” |
| Korean | Jamo (consonants + vowels) combine into syllables | The user types Jamo one at a time; autocomplete must handle each step |
| Arabic | Right-to-left, characters change shape by position | Prefix matching is more complex |

Solution: Unicode + separate Tries per language

Approach 1 — Unicode Trie:

  • Use Unicode code points as node keys (instead of just 26 ASCII letters)
  • Each node can have thousands of children (Unicode has ~150,000 characters)
  • Use a hash map instead of an array for children → saves memory
  • Pros: 1 Trie for all languages
  • Cons: a huge Trie that is hard to shard

Approach 2 — Separate Trie per Language (Recommended):

  • Each language gets its own Trie
  • Detect the language from: (1) user settings, (2) keyboard input, (3) geo-location
  • Route the request to the matching Trie
  • Pros: each Trie is smaller and easier to optimize per language
  • Cons: bilingual users (e.g. typing both English and Vietnamese) need to query multiple Tries

Normalization pipeline (applied to every language):

  1. Convert to lowercase
  2. Remove diacritics for search (but keep them for display). Example: “Phở” → normalized to “pho” for search, displayed as “Phở”
  3. Handle synonyms (e.g. “tp hcm” = “thanh pho ho chi minh” = “saigon”)
  4. Handle transliteration (e.g. “tokyo” → Romaji match for Japanese)
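Steps 1-2 of the pipeline can be sketched with Unicode decomposition: lowercase, decompose to NFD, drop combining marks. One Vietnamese quirk handled explicitly: đ/Đ is a base letter, not a diacritic, so it never decomposes.

```python
# Sketch of normalization steps 1-2: lowercase, then strip diacritics via
# Unicode decomposition ("Phở" -> "pho"). đ is mapped by hand because it
# is a standalone letter with no combining-mark decomposition.
import unicodedata

def normalize(query):
    query = query.lower().replace("đ", "d")
    decomposed = unicodedata.normalize("NFD", query)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)
```

The original string is kept alongside the normalized one: the Trie is keyed on `normalize(query)` for matching, while the display layer shows the diacritic-preserving original.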

3.9 Personalization Layer

Overview

Personalization = suggestions based on the user's personal history, not just global popularity.

Example: when Hieu types “py”:

  • Global autocomplete: “python download”, “python tutorial”, “pyridoxine”
  • Personalized for Hieu (a backend dev): “python asyncio”, “python system design”, “pydantic”

Architecture

| Component | Function |
|---|---|
| User Profile Store | Stores per-user search history (recent queries, click-through) |
| Personal Trie / Score Adjustment | Adjusts suggestion scores based on the user profile |
| Merge Layer | Combines global + personal suggestions, re-ranks |

A simple implementation

  1. Store each user's 50 most recent queries in Redis (key: user_id, value: list of recent queries)
  2. When an autocomplete request arrives:
    • Get the top 10 from the global Trie (frequency-based)
    • Get queries from the user's recent history that match the prefix
    • Merge: personal matches first, fill the rest from global
  3. If the user isn't logged in → global Trie only
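The merge step of this simple scheme, as a sketch (names are illustrative):

```python
# Sketch of the merge step: personal matches from recent history come
# first, then global suggestions fill the remaining slots, deduplicated.
def personalized_suggest(prefix, recent_queries, global_top, k=10):
    personal = [q for q in recent_queries if q.startswith(prefix)]
    merged = list(dict.fromkeys(personal))        # dedupe, preserve order
    for q in global_top:
        if len(merged) >= k:
            break
        if q not in merged:
            merged.append(q)
    return merged[:k]
```

For an anonymous user, `recent_queries` is empty and the function degrades to the plain global top-K list, matching step 3 above.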

Privacy note: Personal data must be stored separately and be deletable on request (GDPR right to erasure). Details in Section 4 (Security).

3.10 Scaling — Trie Sharding & Replication

Why shard the Trie?

At scale:

  • The English Trie can reach billions of nodes → it won't fit in one server's RAM
  • Traffic for prefix “a” can be 10x that of prefix “x” → hot spots
  • The Trie must be spread across many servers

Sharding Strategy 1: By First Character

| Shard | Prefixes | Servers |
|---|---|---|
| Shard 0 | a-c | Server 1, 2, 3 |
| Shard 1 | d-f | Server 4, 5, 6 |
| Shard 2 | g-i | Server 7, 8, 9 |
| Shard 8 | y-z + special chars | Server 25, 26, 27 |

Pros: Simple, easy to implement. Cons: Wildly uneven load! Prefix “s” carries 100x more queries than prefix “x” → the “s” shard gets overloaded.

Sharding Strategy 2: By traffic-balanced prefix range

Instead of splitting on the first character alone, split on prefix ranges so that every shard carries roughly equal traffic:

| Shard | Prefix Range | Estimated Traffic |
|---|---|---|
| Shard 0 | “a” — “be” | ~12% traffic |
| Shard 1 | “bf” — “ce” | ~11% traffic |
| Shard 2 | “cf” — “di” | ~10% traffic |
| Shard 9 | “t” — “z” | ~13% traffic |

How to pick the ranges: From the historical query distribution. Re-adjust periodically as traffic patterns shift.
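Routing against such a range map is a binary search over sorted range-start prefixes. The map below is illustrative (only a few of the ten shards are filled in):

```python
# Sketch of shard routing: the shard map is a sorted list of range-start
# prefixes; bisect finds the shard whose range contains the query prefix.
import bisect

# (range_start, shard_id) pairs, sorted by range_start -- illustrative map
SHARD_MAP = [("a", 0), ("bf", 1), ("cf", 2), ("dj", 3), ("t", 9)]

def route(prefix):
    starts = [s for s, _ in SHARD_MAP]
    i = bisect.bisect_right(starts, prefix) - 1   # last range starting <= prefix
    return SHARD_MAP[max(i, 0)][1]
```

With this map, “be” lands on shard 0 (the “a”-“be” range) and anything from “t” onward lands on shard 9, matching the table above.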

Shard Map & Zookeeper

How does the Query Service know which shard holds prefix “sys”?

Solution: A Shard Map — a mapping from prefix range → shard address. Store the shard map in Zookeeper (or etcd):

graph TD
    subgraph "Shard Map (Zookeeper)"
        ZK[Zookeeper Cluster]
        SM["Shard Map:<br/>a-be → Shard 0<br/>bf-ce → Shard 1<br/>cf-di → Shard 2<br/>...<br/>t-z → Shard 9"]
    end

    subgraph "Query Service"
        QS1[Query Server 1]
        QS2[Query Server 2]
        QS3[Query Server 3]
    end

    subgraph "Trie Shards"
        S0[Shard 0<br/>Prefix: a-be]
        S1[Shard 1<br/>Prefix: bf-ce]
        S2[Shard 2<br/>Prefix: cf-di]
        S9[Shard 9<br/>Prefix: t-z]
    end

    QS1 --> ZK
    QS2 --> ZK
    QS3 --> ZK
    ZK --> SM

    QS1 --> S0
    QS1 --> S1
    QS2 --> S2
    QS3 --> S9

    style ZK fill:#fff9c4

Flow:

  1. A Query Server starts up → reads the shard map from Zookeeper and caches it locally
  2. A request arrives with prefix “sys” → look it up in the shard map → “sys” falls in whichever shard's range covers “s” (one of the middle shards above — not Shard 9, whose range starts at “t”)
  3. Route the request to that shard
  4. When the shard map changes (rebalancing), Zookeeper notifies every Query Server → caches are updated
Replication

Each shard has 3 replicas (see: Tuan-07-Database-Sharding-Replication):

  • 1 Primary — receives Trie updates from the Trie Builder
  • 2 Replicas — serve read traffic
  • If the Primary dies → a Replica is promoted to Primary
  • Read traffic is load-balanced across the replicas

Aha Moment for Hieu: Sharding a Trie is harder than sharding an ordinary database. With a DB, you hash a key → shard. With a Trie, the entire subtree of a prefix range must live on the same shard — it cannot be split any finer. This is a constraint peculiar to Tries.


3. Capacity Estimation (in detail)

3.1 QPS Calculation

Given:

  • DAU = 10 million
  • Searches per user per day = 10
  • Keystrokes per query = 20 (average)
  • Seconds in a day = 86,400

Before optimization (no debounce, no cache):

  • Requests/day = 10M users × 10 searches × 20 keystrokes = 2 billion requests/day
  • Average QPS = 2 × 10⁹ / 86,400 ≈ 23,000 QPS
  • Peak QPS ≈ 3 × average ≈ 69,000 QPS

After optimization (debounce removes 80% of requests; browser/CDN caching absorbs 60% of the remainder, so ~8% reach the backend):

  • Average QPS ≈ 23,000 × 0.08 ≈ 1,850 QPS
  • Peak QPS ≈ 69,000 × 0.08 ≈ 5,500 QPS

Takeaway: From 69K peak QPS down to 5.5K peak QPS — a 92% reduction thanks to debounce + caching. This is why frontend optimization matters enormously for autocomplete.

3.2 Storage for Trie

Assumptions:

  • So luong unique queries: 100 million (100M)
  • Average query length: 20 characters = 20 bytes (ASCII) hoac 40 bytes (UTF-8 multi-language)
  • Moi node trong Trie: 50 bytes (character + metadata + pointers)
  • Top K list per node: 10 entries x 30 bytes = 300 bytes per node
  • So luong node trong Trie (sau compression): ~500 million (500M)

Voi compression (binary serialization):

Nhan xet: 52.5 GB co the fit vao RAM cua 1 server manh (64GB+ RAM). Nhung de an toan va scale, em nen shard thanh 5-10 shard, moi shard ~5-10 GB. Rat thoai mai.

3.3 Network Bandwidth

Request size: ~50 bytes (prefix + headers) Response size: ~500 bytes (10 suggestions x 50 bytes moi cai)

Nhan xet: Bandwidth cuc thap. Autocomplete la tinh toan nhieu, truyen tai it (compute-heavy, IO-light). Bottleneck la CPU/RAM cho Trie lookup, khong phai network.

3.4 Storage for Logs & Aggregated Data

  • Completed searches/day = 10M DAU × 10 = 100M log entries. At ~100 bytes per log line (timestamp | user_id_hash | query | country | language), that is ~10 GB/day of raw logs
  • With the 30-day retention policy: ~300 GB of raw logs
  • Aggregated frequency table: 100M unique queries × ~30 bytes ≈ 3 GB per snapshot — small compared to the raw logs


4. Security Considerations

4.1 Filter Offensive/Harmful Suggestions

| Threat | Impact | Mitigation |
|---|---|---|
| Autocomplete suggests racist content | PR crisis, lost user trust, lawsuits | Multi-layer filter (blocklist + ML) |
| Suggestions tied to violence, weapons | Legal liability | Strict blocklist, zero tolerance |
| Adult content suggestions | Inappropriate for general audiences | SafeSearch toggle, default ON |
| Suicide / self-harm suggestions | Social responsibility | Block + show “If you need support, call 1800-…” |

Best Practices:

  • Maintain a curated blocklist — updated weekly by the content moderation team (real humans, not just ML)
  • An ML model classifying queries safe/unsafe — trained on labeled data
  • A/B test filters — try a new filter on a small % of users before full rollout
  • Incident response — when a bad suggestion slips through, push a blocklist update within minutes, with no Trie rebuild needed
  • Human-in-the-loop — ambiguous queries (offensive in one context but not another) go to human review

4.2 User Privacy

Van deChi tietGiai phap
Search log chua PIILog query chua ten, dia chi, thong tin ca nhanAnonymize user_id truoc khi log (hash + salt)
Location trackingIP address trong log → xac dinh vi tri userAggregate theo region (quoc gia/thanh pho), khong luu exact IP
Sensitive searchesUser search thong tin y te, luat su, chinh triKHONG log cac query thuoc sensitive categories
Data retentionLuu log qua lau → tang risk data breachDelete raw log sau 30 ngay, aggregated data sau 2 nam

Privacy-by-design principles:

  1. Data minimization — collect only the data autocomplete needs, nothing more
  2. Anonymization — decouple search logs from user identity. Use differential privacy where possible
  3. Encryption at rest — encrypt logs and aggregated data on disk (AES-256)
  4. Encryption in transit — HTTPS/TLS for every request (standard by now)
  5. Access control — only the data-engineering team may access raw logs, via role-based access control (RBAC)
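The "hash + salt" anonymization from the table might look like this sketch. The salt value and field names are illustrative; a real deployment would pull the salt from a secrets manager and rotate it:

```python
# Sketch of user_id anonymization before logging. SALT shown inline
# only for illustration — keep it in a secrets manager in practice.
import hashlib
import hmac

SALT = b"rotate-me-from-a-secrets-manager"   # hypothetical value

def anonymize_user_id(user_id: str) -> str:
    # HMAC rather than a plain hash: without the salt, an attacker cannot
    # reverse the mapping by brute-forcing the (small) user_id space.
    return hmac.new(SALT, user_id.encode(), hashlib.sha256).hexdigest()

log_entry = {
    "user": anonymize_user_id("user_12345"),  # no raw PII in the log
    "query": "system design",
    "region": "VN",                           # aggregated region, not raw IP
}
```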

4.3 GDPR Compliance (Right to Erasure)

The EU's GDPR (General Data Protection Regulation) requires:

  • Users have the right to request deletion of all data related to them
  • This applies to: search logs, aggregated data, and even autocomplete suggestions containing personal information

Challenge: the Trie is already built — how do you delete one specific query?

Solutions:

| Approach | Description | Pros | Cons |
|---|---|---|---|
| Rebuild Trie (exclude deleted queries) | On a deletion request, mark the query for exclusion and apply it at the next Trie rebuild | Simple, clean | Delayed (must wait for the rebuild cycle) |
| Runtime filter | Maintain a "deleted queries" list; filter at query time | Fast (takes effect immediately) | Per-request overhead |
| Hybrid | Runtime filter immediately + Trie rebuild for permanent removal | Fast and clean | More complex |

Handling a GDPR deletion request:

  1. User submits a deletion request → Identity Service verifies the user
  2. Find all log entries related to the user (via user_id_hash)
  3. Delete/anonymize those log entries
  4. Add the affected queries to the "deletion list"
  5. Apply the runtime filter immediately → the user no longer sees related suggestions
  6. The next Trie rebuild excludes those queries permanently
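Steps 4-6 above (the hybrid approach) can be sketched as a small runtime filter; the class and method names are illustrative:

```python
# Hybrid GDPR sketch: an in-memory deletion list filters suggestions
# immediately, while the next Trie rebuild removes them permanently.
class DeletionFilter:
    def __init__(self) -> None:
        self.deleted: set[str] = set()

    def add(self, query: str) -> None:
        # Steps 4-5 above: effective on the very next request,
        # no Trie rebuild needed for the user-visible effect.
        self.deleted.add(query.lower())

    def apply(self, suggestions: list[str]) -> list[str]:
        return [s for s in suggestions if s.lower() not in self.deleted]

f = DeletionFilter()
f.add("john doe address")
print(f.apply(["system design", "john doe address", "json parser"]))
# -> ['system design', 'json parser']
```

The per-request overhead is one set lookup per suggestion, which is why the hybrid variant is usually acceptable until the rebuild catches up.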

4.4 Preventing Manipulation (SEO Spam in Suggestions)

Threat: bad actors try to "bomb" autocomplete by:

  • Using bots to type the same query repeatedly to inflate its frequency
  • Coordinated attacks — many people searching the same query to push it to the top
  • Goals: free advertising, smearing competitors, spreading misinformation

Mitigations:

| Technique | Description |
|---|---|
| Rate limiting per user/IP | Cap queries/minute from the same user or IP. See Tuan-09-Rate-Limiter |
| Bot detection | Detect abnormal patterns: queries typed too fast, no mouse movement, headless browsers |
| Minimum frequency threshold | A query must reach a minimum count (e.g. 1,000 unique users) before appearing in autocomplete |
| Unique user counting | Count unique users per query, not total searches. 1 user searching 1,000 times = 1 count |
| Anomaly detection | Flag queries with abnormally sudden spikes for review before they enter the Trie |
| Verified source weighting | Queries from logged-in users weigh more than anonymous ones; known-good IP ranges weigh more than suspicious IPs |
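The unique-user-counting and threshold ideas can be shown with a plain set per query. At real scale each set would be a HyperLogLog-style sketch, but the semantics are the same; the threshold here is deliberately tiny for illustration:

```python
# Unique-user counting with a minimum-frequency threshold (sketch).
# A plain set stands in for a HyperLogLog: duplicates collapse, so one
# user searching 1,000 times still counts exactly once.
from collections import defaultdict

MIN_UNIQUE_USERS = 3   # illustrative; the table suggests e.g. 1,000

users_per_query: dict[str, set[str]] = defaultdict(set)

def record_search(user_id: str, query: str) -> None:
    users_per_query[query].add(user_id)   # duplicates collapse automatically

def eligible_queries() -> list[str]:
    return [q for q, u in users_per_query.items() if len(u) >= MIN_UNIQUE_USERS]

# A bot hammering one query from a single account gains nothing:
for _ in range(1000):
    record_search("bot_1", "spam my brand")
for uid in ("u1", "u2", "u3"):
    record_search(uid, "system design")

print(eligible_queries())  # only "system design" passes the threshold
```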

5. DevOps & Monitoring

5.1 Trie Rebuild Pipeline Monitoring

| Metric | Description | Alert Threshold |
|---|---|---|
| Pipeline completion time | Time from raw logs → finished Trie | > 2× normal → alert |
| Pipeline success rate | % of pipeline runs that succeed | < 99% → alert |
| Trie size delta | Size change between two Trie versions | Growth/shrink > 20% → investigate |
| Query count delta | Query count in new Trie vs old | Drop > 10% → possible data loss |
| Aggregation lag | Time from a query occurring → appearing in the Trie | > 4 hours → check the pipeline |

The pipeline health dashboard should show:

  • Completion time of the most recent pipeline run
  • Number of raw log records processed
  • Number of unique queries after aggregation
  • Trie size (bytes) and node count
  • Deployment status (which Trie version is being served)

5.2 Query Latency Monitoring

| Metric | Target | Explanation |
|---|---|---|
| P50 latency | < 10ms | Half of all requests must finish under 10ms |
| P95 latency | < 50ms | 95% of requests under 50ms |
| P99 latency | < 100ms | 99% of requests under 100ms — this is the hard requirement |
| P99.9 latency | < 200ms | Edge cases — acceptable |

Latency-spike causes and remedies:

| Cause | Detection | Remedy |
|---|---|---|
| GC pause (Java/Go) | JVM GC logs, periodic latency spikes | Tune GC; use a low-latency collector (ZGC, Shenandoah) |
| Trie reload (loading a new Trie into RAM) | Latency spike at new-Trie deploy time | Blue-green deployment: fully load the new Trie before switching traffic |
| Cold start (freshly booted server) | High latency on the first few thousand requests | Warm up: load the Trie and run dummy requests before taking real traffic |
| Network issue | Latency rises uniformly across many servers | Check the network dashboard, reroute traffic |
| Hot prefix | Latency rises on one specific shard | Scale up that shard, or rebalance the prefix range |
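The P50/P95/P99 targets above are just percentiles over latency samples. A nearest-rank sketch (monitoring systems normally compute these from histograms rather than raw samples; the sample values are invented):

```python
# Nearest-rank percentile over raw latency samples (sketch).
def percentile(samples: list[float], p: float) -> float:
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))  # nearest-rank method
    return ordered[rank - 1]

latencies_ms = [4, 5, 5, 6, 7, 8, 9, 12, 40, 45]  # illustrative samples

for p, target in ((50, 10), (95, 50), (99, 100)):
    value = percentile(latencies_ms, p)
    status = "OK" if value < target else "ALERT"
    print(f"P{p} = {value}ms (target < {target}ms) -> {status}")
```

Note how a handful of slow outliers barely moves P50 but dominates P99 — which is exactly why the hard requirement is stated on P99, not the average.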

5.3 Cache Hit Ratio

| Cache layer | Target hit ratio | Explanation |
|---|---|---|
| Browser cache | 30-50% | Users often retype the same prefixes |
| CDN cache | 60-80% | Short prefixes (1-3 chars) cache very effectively |
| Application cache (Trie in memory) | 100% | The Trie is always in RAM — every prefix has an answer |

If a cache hit ratio is lower than expected:

  • Low browser cache → check that the Cache-Control header is set correctly
  • Low CDN cache → check TTLs, the cache-key strategy, and for unnecessary Vary headers
  • Many cache misses overall → consider a user-behavior change (trending event → new queries)

5.4 A/B Testing for Ranking Algorithms

Autocomplete ranking is not just frequency. Other signals:

| Signal | Description | Example |
|---|---|---|
| Frequency | How often the query is searched | "python tutorial" > "python asyncio" |
| Freshness | New/trending queries get boosted | "world cup 2026" boosted during the tournament |
| Personalization | Based on user history | A dev searching "python" → boost "python docs" |
| Click-through rate | % of users who pick that suggestion | Frequently chosen suggestions get boosted |
| Geography | Based on user location | User in VN: "pho" → "pho ha noi"; user in US: "pho" → "phone number" |

A/B testing process:

  1. Hypothesis: "Adding a freshness signal will raise click-through rate by 5%"
  2. Experiment: 5% of users (chosen at random) get the new ranking (treatment); 95% keep the old one (control)
  3. Metrics: compare click-through rate, search completion time, user satisfaction (bounce rate)
  4. Duration: run for at least 2 weeks to reach statistical significance
  5. Decision: if treatment beats control with p-value < 0.05 → roll out to 100%

Pitfall: A/B testing autocomplete is hard because of the feedback loop — if you suggest "X" to the treatment group, they click "X" more → "X"'s frequency rises → "X" gets suggested even more. Careful experimental design is needed to avoid this bias.
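The step-5 decision can be made with a standard two-proportion z-test on click-through rate. The click and impression counts below are invented purely for illustration:

```python
# Two-proportion z-test for treatment-vs-control CTR (sketch).
from math import erf, sqrt

def two_proportion_z(clicks_a: int, n_a: int, clicks_b: int, n_b: int):
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)      # pooled CTR under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the normal CDF: Phi(x) = 0.5*(1 + erf(x/sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# control: 30,000 clicks / 100,000 impressions; treatment: 6,360 / 20,000
z, p = two_proportion_z(30_000, 100_000, 6_360, 20_000)
print(f"z = {z:.2f}, p = {p:.4g}")
# roll out only if p < 0.05 AND the lift (z > 0) is in the right direction
```

This is the textbook test; a production experimentation platform would also handle the feedback-loop bias above, e.g. by holding out the treatment ranking from the frequency aggregation.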

5.5 Alerting Strategy

| Alert | Severity | Action |
|---|---|---|
| P99 latency > 100ms for 5 minutes | Critical | Page the on-call engineer. Check: Trie reload? GC? Hot shard? |
| Pipeline fails twice in a row | High | Investigate the pipeline. The old Trie is still serving — no outage |
| Trie size drops > 20% | High | Possible data loss in aggregation. Do NOT deploy this Trie — roll back |
| Cache hit ratio drops > 30% | Medium | Check cache config, TTLs, trending events |
| Filter miss (offensive suggestion reported) | High | Update the blocklist immediately. Run a root-cause analysis |
| Error rate > 1% | Critical | Check server health, dependency failures |

6. Architecture Diagrams (Mermaid)

6.1 Overall System Architecture

graph TB
    subgraph "Client Layer"
        WEB[Web Browser<br/>Debounce + Local Cache]
        MOB[Mobile App<br/>Debounce + Local Cache]
    end

    subgraph "Edge Layer"
        CDN[CDN / Edge Cache<br/>Cloudflare / CloudFront]
    end

    subgraph "API Layer"
        LB[Load Balancer]
        AG1[API Gateway 1]
        AG2[API Gateway 2]
    end

    subgraph "Query Service Layer"
        QS1[Query Server 1<br/>Trie Shard 0-2 in RAM]
        QS2[Query Server 2<br/>Trie Shard 3-5 in RAM]
        QS3[Query Server 3<br/>Trie Shard 6-9 in RAM]
        FL[Filter Service<br/>Blocklist + ML]
    end

    subgraph "Coordination"
        ZK[Zookeeper<br/>Shard Map + Config]
    end

    subgraph "Storage Layer"
        TS[Trie Storage<br/>S3 / GCS]
        RC[Redis Cache<br/>Hot Prefixes]
    end

    subgraph "Data Pipeline (Offline)"
        KF[Kafka<br/>Query Event Stream]
        SP[Spark Cluster<br/>Aggregation]
        DB[Aggregated Data<br/>PostgreSQL / Hive]
        TBU[Trie Builder<br/>Build + Serialize]
    end

    subgraph "Monitoring"
        MON[Prometheus + Grafana<br/>Metrics & Dashboards]
        ALR[Alert Manager<br/>PagerDuty Integration]
    end

    WEB --> CDN
    MOB --> CDN
    CDN --> LB
    LB --> AG1
    LB --> AG2
    AG1 --> QS1
    AG1 --> QS2
    AG2 --> QS2
    AG2 --> QS3
    QS1 --> FL
    QS2 --> FL
    QS3 --> FL
    QS1 --> ZK
    QS2 --> ZK
    QS3 --> ZK
    QS1 --> RC
    QS2 --> RC
    QS3 --> RC

    WEB -->|"Completed queries"| KF
    MOB -->|"Completed queries"| KF
    KF --> SP
    SP --> DB
    DB --> TBU
    TBU --> TS
    TS -->|"Pull new Trie"| QS1
    TS -->|"Pull new Trie"| QS2
    TS -->|"Pull new Trie"| QS3

    QS1 --> MON
    QS2 --> MON
    QS3 --> MON
    SP --> MON
    MON --> ALR

    style CDN fill:#e1f5fe
    style ZK fill:#fff9c4
    style KF fill:#e8eaf6
    style SP fill:#f3e5f5
    style FL fill:#fff3e0
    style MON fill:#e8f5e9

6.2 Trie Structure Visualization

graph TD
    ROOT["ROOT<br/>Top 10: system design, software engineer, ..."]

    S["s<br/>Top 10: system design, software engineer,<br/>stack overflow, ..."]
    T["t<br/>Top 10: tutorial, twitter, ..."]

    SY["sy<br/>Top 10: system design, sync, ..."]
    SO["so<br/>Top 10: software engineer, sorting, ..."]
    ST["st<br/>Top 10: stack overflow, streaming, ..."]

    SYS["sys<br/>Top 10: system design, system design interview,<br/>system design primer, ..."]

    SYST["syst<br/>Top 10: system design, system design interview, ..."]

    SYSTE["syste<br/>Top 10: system design, system design interview,<br/>system design primer, ..."]

    SYSTEM["system<br/>Top 10: system design, system design interview,<br/>system 32, system restore, ..."]

    ROOT --> S
    ROOT --> T
    S --> SY
    S --> SO
    S --> ST
    SY --> SYS
    SYS --> SYST
    SYST --> SYSTE
    SYSTE --> SYSTEM

    style ROOT fill:#ffcdd2
    style S fill:#f8bbd0
    style SYS fill:#e1bee7
    style SYSTEM fill:#c5cae9

Key insight: every node stores its top 10 up front. When the user types "sys", the Query Service reads a single node — no subtree traversal. O(1) lookup!
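The node layout above can be sketched in a few lines — each node caches its own top-k list at build time, so serving reads one node with no subtree traversal. The build strategy and names are a simplified illustration of the idea, not a production implementation:

```python
# Trie with precomputed top-k per node (sketch of the diagram above).
class TrieNode:
    def __init__(self) -> None:
        self.children: dict[str, "TrieNode"] = {}
        self.top_k: list[str] = []   # precomputed at build time

class Trie:
    def __init__(self, k: int = 10) -> None:
        self.root, self.k = TrieNode(), k

    def build(self, queries: dict[str, int]) -> None:
        # Offline build: visit queries in descending frequency, pushing
        # each onto every prefix node it passes until that node holds k.
        by_freq = sorted(queries, key=queries.get, reverse=True)
        for query in by_freq:
            node = self.root
            for ch in query:
                node = node.children.setdefault(ch, TrieNode())
                if len(node.top_k) < self.k:
                    node.top_k.append(query)  # already in frequency order

    def suggest(self, prefix: str) -> list[str]:
        node = self.root
        for ch in prefix:              # walk is len(prefix) pointer hops
            node = node.children.get(ch)
            if node is None:
                return []              # unknown prefix: no suggestions
        return node.top_k              # single read, no subtree traversal

t = Trie(k=2)
t.build({"system design": 900, "system32": 500, "sync": 100})
print(t.suggest("sys"))   # ['system design', 'system32']
```

The trade-off is visible in the code: each query is duplicated onto up to len(query) nodes (the ~10× space cost mentioned elsewhere in this lesson), in exchange for a lookup that never visits a subtree.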

6.3 Data Gathering Pipeline (Detailed)

graph LR
    subgraph "Step 1: Ingest"
        U1[User Search Event]
        U2[User Search Event]
        U3[User Search Event]
        KP[Kafka Producer<br/>Batch + Compress]
        KT[Kafka Topic<br/>search-queries<br/>Partitioned by region]
    end

    subgraph "Step 2: Raw Storage"
        KC[Kafka Consumer<br/>S3 Sink Connector]
        S3R[S3 Raw Logs<br/>Partitioned by date/hour<br/>Parquet format]
    end

    subgraph "Step 3: Aggregate"
        SCH[Scheduler<br/>Airflow / Cron]
        SPK[Spark Job<br/>GroupBy query<br/>Count distinct users<br/>Apply decay]
        S3A[S3 Aggregated<br/>query, frequency, timestamp]
    end

    subgraph "Step 4: Build"
        TB[Trie Builder<br/>Read aggregated data<br/>Build optimized Trie<br/>Compress + Serialize]
        VAL[Validator<br/>Check Trie size<br/>Spot-check queries<br/>Compare with previous]
    end

    subgraph "Step 5: Deploy"
        S3T[S3 Trie Storage<br/>Versioned: v2026-03-18-12h]
        DEP[Deployer<br/>Notify Query Servers<br/>Blue-Green switch]
        QSR[Query Servers<br/>Pull + Load Trie]
    end

    U1 --> KP
    U2 --> KP
    U3 --> KP
    KP --> KT
    KT --> KC
    KC --> S3R
    SCH -->|"Trigger hourly"| SPK
    S3R --> SPK
    SPK --> S3A
    S3A --> TB
    TB --> VAL
    VAL -->|"Pass"| S3T
    VAL -->|"Fail"| SCH
    S3T --> DEP
    DEP --> QSR

    style KT fill:#e8eaf6
    style SPK fill:#f3e5f5
    style VAL fill:#fff3e0
    style DEP fill:#e8f5e9

6.4 Query Service Flow (Detailed)

flowchart TD
    A[User types character] --> B{Debounce timer active?}
    B -->|Yes| C[Reset timer to 200ms]
    B -->|No| D[Start 200ms timer]
    C --> E[Wait for timer]
    D --> E
    E --> F{Timer expired?}
    F -->|No, user typed again| B
    F -->|Yes| G[Get current prefix]
    G --> H{Prefix in browser cache?}
    H -->|Yes| I[Return cached result]
    H -->|No| J[Send AJAX to CDN]
    J --> K{CDN cache hit?}
    K -->|Yes| L[Return CDN cached result]
    K -->|No| M[Forward to API Gateway]
    M --> N[Rate limit check]
    N -->|Rejected| O[Return 429 Too Many Requests]
    N -->|Passed| P[Lookup Shard Map]
    P --> Q[Route to correct Trie Shard]
    Q --> R[Trie prefix lookup — O of 1]
    R --> S[Get top 10 raw suggestions]
    S --> T[Apply Filter Layer]
    T --> U{All 10 suggestions safe?}
    U -->|Yes| V[Return JSON response]
    U -->|No| W[Remove unsafe, backfill from next candidates]
    W --> V
    V --> X[Set Cache-Control headers]
    X --> Y[Response cached at CDN]
    Y --> Z[Response cached at Browser]
    Z --> AA[Display dropdown to user]

    style I fill:#e1f5fe
    style L fill:#e1f5fe
    style R fill:#c8e6c9
    style T fill:#fff3e0

7. Aha Moments & Pitfalls

7.1 Aha Moments — What the interviewer wants to hear

| # | Aha Moment | Explanation |
|---|---|---|
| 1 | Trie vs database lookup | Many people think "just SELECT * FROM queries WHERE query LIKE 'sys%' ORDER BY frequency LIMIT 10". But with 100M+ queries, a LIKE query plus sort is O(n log n) — far too slow. A Trie gives O(1) lookup because top K is precomputed |
| 2 | Offline updates, not realtime | Autocomplete doesn't need to be real-time. A 15-minute to 1-hour delay is acceptable. This simplifies the design enormously — no distributed locks, no consensus |
| 3 | Debounce saves 80% of traffic | One frontend optimization cuts backend load by 80%. Interviewers are impressed when you think full-stack, not backend-only |
| 4 | Space-time trade-off at every Trie node | Storing top K at each node = 10× more space, but lookup time drops from O(millions) to O(1). The quintessential engineering trade-off |
| 5 | Graceful degradation | Autocomplete is nice-to-have, not critical. If it dies, search still works. Saying this shows you understand system resilience |
| 6 | Two separate systems | Data Gathering and Query Service are fully independent; they share only the Trie artifact. Clean architecture — each part scales/fails independently |
| 7 | Cache at every layer | Browser → CDN → application RAM. Each layer filters traffic; only 10-20% of requests reach the backend. A multi-tier caching strategy |

7.2 Pitfalls — Common mistakes

| # | Pitfall | Why it's wrong | What to do instead |
|---|---|---|---|
| 1 | Using an RDBMS for autocomplete | LIKE + ORDER BY over 100M rows takes seconds, not milliseconds. Indexes help but are still slower than a Trie | Use a Trie — a data structure purpose-built for prefix search |
| 2 | Updating the Trie on every keystroke | 23K QPS of writes into the Trie = race conditions, lock contention, write amplification | Separate the read path (online) from the write path (offline). Update the Trie periodically |
| 3 | Ignoring the frontend | Backend-only designs skip debounce and caching → 5-10× more traffic | Autocomplete is a full-stack problem. Frontend optimization matters as much as backend |
| 4 | Sharding the Trie by hash | Hash("sys") may land on a different shard than hash("syst") → you can't traverse from one prefix to the next | Shard by prefix range, not by hash. Keep each subtree on one shard |
| 5 | Forgetting the filter layer | Shipping autocomplete without a content filter → offensive suggestions → PR disaster | Always have a filter layer: blocklist + ML + human review |
| 6 | Skipping capacity estimation | No QPS/storage estimates → a design that doesn't fit the scale | Do back-of-envelope estimation before diving into the design. See Tuan-02-Back-of-the-envelope |
| 7 | Never discussing failure modes | Presenting only the happy path → the interviewer asks "what if the Trie Builder fails?" → you freeze | Prepare: Trie Builder fails → keep serving the old Trie. Query Server crashes → LB routes to another server. Filter Service down → serve unfiltered (or disable autocomplete temporarily) |

7.3 Interview Tips — How to present

| Phase | Do | Don't |
|---|---|---|
| Opening (2 min) | Ask about requirements, clarify scope, state assumptions | Jump straight into drawing diagrams |
| High-level (5 min) | Draw the 2 main components (Data Gathering + Query Service), explain the flow | Drown in detail, name-drop specific technologies |
| Deep dive (15 min) | Explain the Trie, optimizations, trade-offs. Ask which part the interviewer wants to explore | Pick only the easiest part to present |
| Wrap up (3 min) | Cover scaling (sharding, replication), monitoring, security | Skip it entirely, or say "I think that's enough" |

8. Related Lessons

Directly related lessons

| Topic | Link | Use in Autocomplete |
|---|---|---|
| Cache Strategy | Tuan-06-Cache-Strategy | Multi-tier caching: Browser → CDN → in-memory Trie. Cache invalidation when a new Trie deploys |
| Message Queue | Tuan-08-Message-Queue | Kafka buffers the search-query log and decouples the Data Gathering pipeline |
| Consistent Hashing | Tuan-10-Consistent-Hashing | Can be used for Trie sharding (though prefix-range sharding is more common) |

Supporting lessons

| Topic | Link | Relation |
|---|---|---|
| Scale from Zero | Tuan-01-Scale-From-Zero-To-Millions | Autocomplete starts as one server with an in-memory Trie and scales up gradually |
| Back-of-envelope | Tuan-02-Back-of-the-envelope | The QPS/storage/bandwidth estimation method used in Section 3 |
| Database Sharding | Tuan-07-Database-Sharding-Replication | Trie sharding resembles DB sharding but has its own constraint (keep subtrees together) |
| Rate Limiter | Tuan-09-Rate-Limiter | Protects the autocomplete API from bots/spam |
| Monitoring | Tuan-13-Monitoring-Observability | How to monitor the pipeline, latency, cache hit ratio |
| Security | Tuan-14-AuthN-AuthZ-Security | API Gateway authentication, rate limiting |

Comparison with other case studies

| Case Study | Similarities | Differences |
|---|---|---|
| Tuan-16-Design-URL-Shortener | Read-heavy, caching matters, same estimation approach | URL Shortener uses hashing, Autocomplete uses a Trie. URL Shortener needs strong consistency; Autocomplete accepts eventual consistency |
| Tuan-18-Design-News-Feed | Both need a ranking algorithm, personalization, an offline pipeline | News Feed has a more complex social graph. Autocomplete has a simpler data model but a harder latency requirement |
| Tuan-19-Design-Notification-System | Both have an offline pipeline (data gathering vs notification scheduling) | Notifications are push, Autocomplete is pull. Notifications tolerate delay; Autocomplete must stay < 100ms |

9. Summary — One-page recap

The Autocomplete system has 2 main parts:

1. Data Gathering Service (offline):

  • Collect search queries from users → Kafka → S3
  • Aggregate frequencies with Spark/MapReduce
  • Build an optimized Trie (top K stored at every node)
  • Serialize and deploy the Trie periodically (hourly/daily)

2. Query Service (online):

  • Receive the prefix from the user (after debounce)
  • Look up the in-memory Trie → O(1) because top K is precomputed
  • Filter offensive content
  • Return 10 suggestions in < 100ms

The 5 most important takeaways:

  1. The Trie is the core data structure — purpose-built for prefix search, far better than a database LIKE query
  2. Precompute top K at every node — trade space for speed, O(1) lookup
  3. Separate the read path and the write path — the CQRS pattern, each scales independently
  4. Frontend optimization (debounce + cache) cuts 90%+ of traffic — this is a full-stack problem
  5. Shard the Trie by prefix range, NOT by hash — subtrees must stay on the same shard

"Autocomplete looks simple but touches every facet of system design: data structures, distributed systems, caching, pipelines, security, and full-stack thinking. Master this one and you're ready for any system design interview."


Next Step: practice redrawing the entire architecture diagram from scratch, without looking at the material. If you can reproduce 80% of it correctly, you've mastered this lesson.