Case Study: Design Search Autocomplete System — Full System Design Interview Walkthrough

“Mỗi lần em gõ một chữ trên Google, hệ thống phải quét hàng tỷ query trong vòng dưới 100ms để đưa ra 10 gợi ý chính xác nhất. Đó không phải magic — đó là engineering.”

Tags: system-design case-study search-autocomplete trie alex-xu interview Student: Hieu Source: Alex Xu — System Design Interview, Volume 1, Chapter 13 Prerequisite: Tuan-01-Scale-From-Zero-To-Millions · Tuan-02-Back-of-the-envelope Lien quan: Tuan-06-Cache-Strategy · Tuan-08-Message-Queue · Tuan-10-Consistent-Hashing

1. Context & Why — Tai sao Autocomplete quan trong?

1.1 The Everyday Analogy

Hieu, em hay mo Google Search ra. Gox “system d” — lap tuc xuat hien mot dropdown voi 10 goi y: “system design interview”, “system design primer”, “system design alex xu”… Tat ca dien ra trong duoi 100ms — nhanh hon ca mot cai chop mat (300-400ms).

Khong chi Google. Moi search bar em gap hang ngay deu co autocomplete:

YouTube — goi y video khi go tim kiem
Amazon — goi y san pham khi go ten
Spotify — goi y bai hat, nghe si
IDE (VS Code, IntelliJ) — code completion cung la mot dang autocomplete
LinkedIn — goi y nguoi, cong ty khi search

1.2 Tai sao day la bai toan kho?

Thach thuc	Giai thich
Latency cuc thap	User ky vong response < 100ms. Neu cham hon, cam giac “lag” se khien user bo di
Scale khong lo	Google xu ly ~8.5 billion searches/day. Moi search co ~20 keystroke → 170 billion autocomplete requests/day
Relevance	Goi y phai co nghia — khong phai random prefix match, ma phai sort theo popularity, freshness, personalization
Real-time feeling	Du backend co the delay, frontend phai tao cam giac “instant” qua debounce + caching
Multi-language	Tieng Viet, tieng Nhat, tieng Trung — moi ngon ngu co cach tokenize khac nhau
Safety	Khong duoc goi y noi dung offensive, bao luc, khieu dam

1.3 Business Impact

Autocomplete khong chi la tien ich — no la vu khi kinh doanh:

Giam thoi gian search — User gox 2-3 ky tu thay vi ca cau. Google uoc tinh autocomplete tiet kiem 200 nam thoi gian go moi ngay (toan cau)
Tang conversion — Amazon cho thay autocomplete tot tang click-through rate len 24%
Guided discovery — Huong user den cac query pho bien, gian tiep dieu khien traffic
Giam loi chinh ta — User chon goi y thay vi tu go, giam typo

Aha Moment: Autocomplete khong chi “giup user go nhanh hon”. No la mot cong cu dinh huong hanh vi nguoi dung — ai kiem soat autocomplete, nguoi do anh huong den nhung gi hang trieu nguoi tim kiem.

2. Deep Dive — Alex Xu 4-Step Framework

Overview — Bon buoc cua Alex Xu

Buoc	Ten goi	Thoi gian (45 phut)
1	Understand the Problem & Establish Design Scope	~5 phut
2	Propose High-Level Design	~15 phut
3	Design Deep Dive	~20 phut
4	Wrap Up	~5 phut

Step 1 — Understand the Problem & Establish Design Scope

1.1 Functional Requirements (Yeu cau chuc nang)

Hieu, truoc khi thiet ke, em phai hoi interviewer de lam ro scope. Day la cac cau hoi va gia dinh:

Cau hoi	Tra loi / Gia dinh
Autocomplete chi match tu dau (prefix) hay bat ky vi tri?	Chi prefix match. “sys” → “system design” nhung “stem” KHONG match
Tra ve bao nhieu goi y?	10 suggestions moi request
Goi y dua tren tieu chi gi?	Popularity (tan suat query) la primary signal
Co ho tro multi-language khong?	Co — tieng Anh, tieng Viet, tieng Nhat, v.v.
Co capitalize/lowercase distinction khong?	Khong — tat ca query duoc normalize ve lowercase
User phai dang nhap moi co autocomplete?	Khong — hoat dong cho ca anonymous user
Co personalization khong?	Phase 1: khong. Phase 2: co the dua vao search history ca nhan

Tom tat Functional Requirements:

Prefix matching — Khi user go “din”, tra ve cac query bat dau bang “din” (vi du: “dinner recipes”, “dinosaur”, “dining table”)
Top 10 suggestions — Sort theo popularity (query frequency), tra ve 10 ket qua hang dau
Multi-language — Ho tro Unicode, xu ly duoc tieng Viet (co dau), tieng Nhat (Kanji + Hiragana), tieng Trung
Near real-time update — Query moi (trending) can xuat hien trong goi y sau mot khoang thoi gian hop ly (khong can instant)

1.2 Non-Functional Requirements (Yeu cau phi chuc nang)

Requirement	Target	Giai thich
Low Latency	Response < 100ms (P99)	Autocomplete phai nhanh hon toc do go cua user. Neu user go 5 ky tu/giay (200ms/keystroke), response phai den truoc keystroke tiep theo
High Availability	99.99% uptime	Autocomplete down = search bar cam thay “chet”, du search van hoat dong
Scalability	10M+ DAU, 20B+ requests/day	Scale ngang khi traffic tang
Fault Tolerance	Graceful degradation	Neu autocomplete chet, search van hoat dong binh thuong — chi mat goi y
Consistency	Eventual consistency OK	Khong can strong consistency — viec goi y cu hon vai phut la chap nhan duoc
Relevance	Top suggestions phai co y nghia	Khong goi y random, phai sort dung theo popularity

Pitfall: Nhieu ung vien quen hoi ve graceful degradation. Autocomplete la non-critical feature — neu no chet, search PHAI van hoat dong. Day la dieu interviewer muon nghe.

1.3 Quy mo he thong (Scale)

Thong so	Gia tri	Giai thich
DAU (Daily Active Users)	10 million	Quy mo trung binh (nho hon Google, lon hon startup)
Searches per user per day	10	Trung binh
Average words per query	4 words	”best italian restaurant nyc”
Average chars per word	5 characters	Tieng Anh trung binh
Characters per query	20 characters	4 words x 5 chars
Autocomplete requests per query	20	Moi keystroke gui 1 request (truoc khi optimize)

Step 2 — Propose High-Level Design

2.1 Hai thanh phan chinh

Hieu, he thong autocomplete co 2 phan lon, hoat dong doc lap nhung bo sung cho nhau:

Thanh phan	Chuc nang	Tinh chat
Data Gathering Service	Thu thap query tu user, aggregate frequency, xay dung Trie	Offline / Near-realtime, write-heavy
Query Service	Nhan prefix tu user, lookup Trie, tra ve top K suggestions	Online, read-heavy, latency-critical

Aha Moment: Day la pattern CQRS (Command Query Responsibility Segregation) — tach rieng write path (data gathering) va read path (query service). Em da gap pattern nay o Tuan-11-Microservices-Pattern.

2.2 Data Gathering Service — Tong quan

Luong hoat dong:

User go query va nhan Enter (hoan thanh search)
Query duoc log vao Analytics Log (raw log)
Dinh ky (moi gio/ngay/tuan), Aggregation Pipeline doc raw log, dem tan suat moi query
Trie Builder lay data aggregated, xay dung/cap nhat Trie
Trie moi duoc push len Trie Storage (persistent) va Trie Cache (in-memory)

2.3 Query Service — Tong quan

Luong hoat dong:

User go ky tu trong search box
Frontend gui AJAX request voi prefix hien tai (vd: “sys”)
Request den API Gateway → Query Service
Query Service lookup prefix trong Trie Cache (in-memory)
Tra ve top 10 suggestions
Frontend hien thi dropdown

2.4 High-Level Architecture Diagram

graph TB
    subgraph "User Layer"
        U[User Browser/App]
    end

    subgraph "Online Path — Query Service"
        AG[API Gateway<br/>Rate Limiter + Auth]
        QS[Query Service<br/>Stateless]
        TC[Trie Cache<br/>In-Memory<br/>Redis / Local Memory]
    end

    subgraph "Offline Path — Data Gathering"
        AL[Analytics Log<br/>Raw Search Queries]
        AP[Aggregation Pipeline<br/>MapReduce / Spark]
        AF[Aggregated Frequency<br/>DB / File Storage]
        TB[Trie Builder<br/>Xay dung Trie tu frequency data]
        TS[Trie Storage<br/>Persistent — S3 / HDFS]
    end

    subgraph "Filter & Safety"
        FL[Filter Layer<br/>Remove offensive content]
    end

    U -->|"AJAX: prefix='sys'"| AG
    AG --> QS
    QS --> TC
    TC -->|"Cache miss"| TS
    QS --> FL
    FL -->|"Filtered top 10"| AG
    AG -->|"JSON response"| U

    U -->|"Completed search query"| AL
    AL --> AP
    AP --> AF
    AF --> TB
    TB --> TS
    TB -->|"Push new Trie"| TC

    style TC fill:#e1f5fe
    style FL fill:#fff3e0
    style AP fill:#f3e5f5

Note cho Hieu: Chu y la 2 path (online va offline) chi chia se Trie — khong co dependency truc tiep nao khac. Dieu nay giup em scale tung phan doc lap.

Step 3 — Design Deep Dive

Day la phan quan trong nhat cua buoi phong van — noi em the hien chieu sau ky thuat.

3.1 Trie Data Structure — Co ban

Trie la gi?

Trie (doc la “try”, tu chu “reTRIEval”) la mot cau truc du lieu dang cay, duoc thiet ke dac biet cho prefix search. Moi node dai dien cho mot ky tu, va duong di tu root den mot node tao thanh mot prefix.

Dac diem chinh cua Trie:

Dac diem	Giai thich
Root	Node goc, khong chua ky tu nao
Node	Moi node chua 1 ky tu va pointer den cac child node
Path	Duong tu root den mot node = mot prefix
Terminal node	Node danh dau ket thuc mot tu/query hoan chinh
Branching	Moi node co the co toi da 26 child (tieng Anh) hoac nhieu hon (Unicode)

Vi du truc quan

Gia su em co 3 query: “tree”, “try”, “true” voi frequency tuong ung.

graph TD
    ROOT["Root"]
    T["t"]
    R["r"]
    E1["e (tree: 35)"]
    Y["y (try: 20)"]
    U["u"]
    E2["e (true: 15)"]
    RE["r"]
    EE["e"]

    ROOT --> T
    T --> R
    R --> E1
    R --> Y
    R --> U
    E1 --> EE["e (tree: 35)"]
    U --> E2

    style E1 fill:#c8e6c9
    style Y fill:#c8e6c9
    style E2 fill:#c8e6c9

Time Complexity cua Trie

Operation	Complexity	Giai thich
Insert a query	O(L)	L = do dai query. Duyet tung ky tu, tao node neu chua co
Search exact query	O(L)	Duyet tung ky tu tu root
Find all queries with prefix	O(L + N)	L = do dai prefix, N = so node trong subtree. Day la van de!

Pitfall: Tim tat ca query co prefix “a” nghia la duyet toan bo subtree bat dau tu node “a”. Neu subtree co hang trieu node → cuc cham. Day la ly do em can optimize Trie.

3.2 Optimized Trie — Bi quyet cua autocomplete nhanh

Optimization 1: Luu top K tai moi node

Van de: Voi Trie co ban, de tim top 10 suggestions cho prefix “be”:

Tim node “b” → “e” (O(prefix length) = nhanh)
Duyet toan bo subtree tu “be” → thu thap tat ca terminal node
Sort theo frequency → lay top 10

Buoc 2 la bottleneck — subtree co the co hang trieu node.

Giai phap: Luu san top K queries tai moi node trong Trie.

Approach	Tim top 10 cho prefix “be”	Time
Basic Trie	Duyet toan bo subtree + sort	O(subtree size) — co the hang trieu
Optimized Trie	Doc truc tiep top 10 tai node “be”	O(1) — constant time!

Trade-off:

Space tang: Moi node luu them K entries (vd: 10 strings + frequency). Neu Trie co 1 trieu node → luu them 10 trieu entries
Update phuc tap hon: Khi frequency thay doi, phai cap nhat top K tai nhieu node (tu leaf len root)
Nhung latency giam cuc manh: Tu O(millions) xuong O(1)

Aha Moment: Day la classic space-time trade-off. Trong autocomplete, latency la vua — em san sang doi space de co speed. Interviewer se an tuong neu em noi ro trade-off nay.

Optimization 2: Gioi han do sau prefix (Prefix depth limit)

Van de: Mot query co the dai toi 50+ ky tu. Trie sau 50 level la lang phi vi:

It user nao go qua 10-15 ky tu truoc khi chon goi y
Cang go nhieu ky tu, so luong match cang it → autocomplete cang it huu ich

Giai phap: Gioi han do sau cua Trie, vi du chi luu prefix toi da 50 ky tu.

Prefix length	So luong matches	Autocomplete huu ich?
1-3 ky tu	Hang trieu	Rat huu ich — user dang explore
4-7 ky tu	Hang nghin	Huu ich — user dang narrow down
8-15 ky tu	Hang tram	Van huu ich
15+ ky tu	Vai chuc hoac it hon	It huu ich — user da biet muon gi
50+ ky tu	Gan nhu 0	Vo nghia

Aha Moment cho Hieu: Google chi hien autocomplete cho khoang 30 ky tu dau tien. Sau do, dropdown bien mat. Day khong phai bug — day la optimization co chu y.

Optimization 3: Node compression (Trie nen)

Neu mot chuoi node chi co 1 child (khong branch), gop chung thanh 1 node:

Truoc khi nen: r → o → m → a → n → c → e (7 nodes) Sau khi nen: “romance” (1 node)

Loi ich:

Giam so luong node (thuong giam 50-70%)
Giam memory footprint
Giam so lan pointer dereference khi traverse

3.3 Trie Update Strategy — Offline Rebuild vs Incremental

Tai sao KHONG update Trie real-time?

Ly do	Giai thich
Write amplification	Moi query moi → cap nhat frequency + top K tai nhieu node (tu leaf len root). Voi 100K QPS, qua nhieu writes
Consistency phuc tap	Nhieu server doc Trie cung luc. Neu update giua chung → inconsistent state
Latency spike	Write lock co the block read, gay latency spike cho autocomplete
Khong can real-time	Autocomplete suggestions khong can chinh xac tung giay. Delay 15 phut - 1 gio la chap nhan duoc

Strategy 1: Offline Rebuild (Recommended cho hau het truong hop)

Cach hoat dong:

Dinh ky (moi 1 gio / moi ngay), Aggregation Pipeline dem frequency cua tat ca query
Trie Builder xay Trie moi hoan toan tu aggregated data
Trie moi duoc serialize va luu vao Trie Storage
Query Service servers tai Trie moi (hot swap — khong downtime)

Uu diem:

Don gian, de implement, de debug
Khong co race condition
Trie luon o trang thai consistent
De rollback — chi can quay ve Trie version cu

Nhuoc diem:

Trending query mat thoi gian de xuat hien (delay = rebuild interval)
Rebuild Trie lon ton nhieu compute resource

Strategy 2: Incremental Update (Cho he thong can nhanh hon)

Cach hoat dong:

Khi co query moi, gui event vao Message Queue (Tuan-08-Message-Queue)
Trie Updater consumer doc event, cap nhat frequency tai node tuong ung
Recalculate top K tai cac node bi anh huong (tu leaf len root)

Uu diem:

Trending query xuat hien nhanh hon (delay = processing time, thuong vai phut)
Khong can rebuild toan bo Trie

Nhuoc diem:

Phuc tap hon nhieu: phai xu ly concurrency, locking, versioning
Risk: cap nhat sai co the corrupt Trie
Can careful engineering de khong anh huong read performance

Hybrid Approach (Thuc te)

Hau het cac he thong lon dung hybrid:

Baseline Trie: Rebuild offline moi ngay tu historical data
Delta Trie: Incremental update moi 15 phut tu trending data
Merge: Khi query, merge ket qua tu ca 2 Trie

Aha Moment: YouTube dung hybrid approach — goi y co ban tu offline Trie, nhung trending video/topic duoc inject tu realtime pipeline. Do la ly do em thay trending video xuat hien trong autocomplete chi vai phut sau khi viral.

3.4 Data Gathering Pipeline — Chi tiet

Toan bo luong du lieu tu raw log den Trie

graph LR
    subgraph "Step 1: Collection"
        SB[Search Box<br/>User go query]
        QL[Query Logger<br/>Append to log file]
        KF[Message Queue<br/>Kafka / Kinesis]
    end

    subgraph "Step 2: Storage"
        S3[Raw Log Storage<br/>S3 / HDFS]
    end

    subgraph "Step 3: Aggregation"
        SP[Aggregation Engine<br/>Spark / Flink / MapReduce]
        FT[Frequency Table<br/>query → count per time window]
    end

    subgraph "Step 4: Trie Build"
        TBU[Trie Builder<br/>Xay dung Trie tu frequency]
        SER[Trie Serializer<br/>Convert Trie → binary format]
    end

    subgraph "Step 5: Distribution"
        TST[Trie Storage<br/>S3 / GCS]
        TCA[Trie Cache<br/>Redis / In-memory]
        QSE[Query Servers<br/>Load Trie vao RAM]
    end

    SB --> QL
    QL --> KF
    KF --> S3
    S3 --> SP
    SP --> FT
    FT --> TBU
    TBU --> SER
    SER --> TST
    TST --> TCA
    TST --> QSE

    style KF fill:#e8eaf6
    style SP fill:#f3e5f5
    style TCA fill:#e1f5fe

Step-by-step giai thich

Step 1 — Collection (Thu thap):

Moi khi user hoan thanh mot search (nhan Enter hoac click goi y), query duoc log
Quan trong: Chi log query da hoan thanh, KHONG log moi keystroke. Ly do:
- Keystroke log qua nhieu (20x nhieu hon completed query)
- Keystroke la “partial query” — khong dai dien cho y dinh thuc su cua user
- Vi du: user go “sys” → “syst” → “system” → “system design” → Enter. Chi log “system design”
Log format: timestamp | user_id_hash | query | country | language
Dung Message Queue (Kafka) lam buffer de tranh mat log khi traffic spike

Step 2 — Storage (Luu tru):

Raw log duoc luu vao distributed storage (S3, HDFS)
Partition theo ngay va gio de de truy van
Retention policy: giu raw log 30 ngay, sau do archive hoac xoa
Du lieu nay cung duoc dung cho analytics khac (search quality, trending topics)

Step 3 — Aggregation (Tong hop):

Day la buoc nang nhat ve compute
Dung framework nhu Apache Spark, Flink, hoac MapReduce
Input: raw log files
Output: bang tan suat — moi dong la (query, time_window, count)
Time window thuong la 1 gio hoac 1 ngay
Weighted aggregation: Query ganh day hon duoc trong so cao hon

w e i g h t e d_f re q u e n cy (q) = i = 1 \sum N co u n t_{i} \times d ec a y_f a c t o r^{(n o w - t_{i})}

Voi $d ec a y_f a c t or$ thuong la 0.9-0.99, va $t_{i}$ la thoi diem cua time window $i$ .

Giai thich cho Hieu: Decay factor dam bao query “World Cup 2026” co weight cao trong mua World Cup, nhung giam dan sau do. Khong co decay, nhung query cu nhung pho bien (vi du “Facebook login”) se luon ap dao trending query.

Step 4 — Trie Build (Xay dung Trie):

Doc frequency table, xay dung Trie tu scratch
Tai moi node, tinh va luu top K queries (thuong K = 10-25)
Apply compression (gop cac node chi co 1 child)
Serialize Trie thanh binary format de luu tru va truyen tai hieu qua

Step 5 — Distribution (Phan phoi):

Trie da serialize duoc upload len Trie Storage (S3)
Query servers tai Trie moi tu S3 → load vao RAM
Co the dung Trie Cache (Redis) lam intermediate layer cho cac server chua kip load Trie moi
Blue-green deployment: Giu 2 version Trie. Server dang dung version A. Load version B vao RAM. Switch traffic sang B. Neu co loi, switch lai A

3.5 Query Service — Chi tiet

Luong xu ly mot autocomplete request

sequenceDiagram
    participant User as User Browser
    participant FE as Frontend JS
    participant CDN as CDN/Edge Cache
    participant AG as API Gateway
    participant QS as Query Service
    participant TC as Trie Cache (Memory)
    participant FL as Filter Layer

    User->>FE: Go ky tu "sys"
    Note over FE: Debounce 200ms<br/>Cho user go them
    User->>FE: Go them "t" → "syst"
    Note over FE: Debounce 200ms
    Note over FE: 200ms het — gui request
    FE->>CDN: GET /autocomplete?q=syst

    alt Cache HIT tai CDN
        CDN-->>FE: Return cached suggestions
    else Cache MISS
        CDN->>AG: Forward request
        AG->>QS: Lookup prefix "syst"
        QS->>TC: Trie.search("syst")
        TC-->>QS: Top 10 raw suggestions
        QS->>FL: Filter offensive content
        FL-->>QS: Filtered top 10
        QS-->>AG: JSON response
        AG-->>CDN: Response + Cache-Control header
        CDN-->>FE: Return suggestions
    end

    FE-->>User: Hien thi dropdown 10 goi y

Frontend Optimizations

Debounce — Ky thuat tiet kiem 80% traffic:

Khong co debounce, moi keystroke gui 1 request:

User go “system design” (13 ky tu) → 13 requests
Nhung user go lien tuc — cac request cho “s”, “sy”, “sys” deu vo nghia vi user chua dung lai

Voi debounce 200ms:

Chi gui request khi user ngung go 200ms
User go “system design” lien tuc → chi gui 2-3 requests thay vi 13
Tiet kiem ~80% traffic

Khong co debounce	Co debounce 200ms
”s” → request	”s” → cho…
“sy” → request	”sy” → cho…
“sys” → request	”sys” → cho…
“syst” → request	”syst” → cho…
“syste” → request	”syste” → gui request (user ngung 200ms)
“system” → request	”system” → cho…
“system ” → request	”system ” → cho…
“system d” → request	”system d” → cho…
“system de” → request	”system de” → gui request
… (13 requests)	… (3-4 requests)

Browser Caching:

Response cho prefix “sys” it thay doi trong 1 gio
Set Cache-Control: max-age=3600 (cache 1 gio tai browser)
Lan sau user go “sys” → browser tra ket qua tu cache, khong can gui request
Hieu qua cang lon voi cac prefix ngan (it thay doi hon prefix dai)

AJAX (Asynchronous):

Request duoc gui asynchronous — khong block UI
Neu response den cham (> 300ms), user da go them ky tu → huy request cu, gui request moi
Tranh hien thi ket qua “stale” cho prefix cu

Backend Optimizations

In-Memory Trie:

Toan bo Trie duoc load vao RAM cua Query Server
Lookup time = O(prefix length) — thuong < 1 microsecond
Khong can truy cap disk hay external database
Moi Query Server giu ban sao rieng cua Trie — khong co shared state → de scale horizontal

CDN / Edge Caching:

Autocomplete responses la read-only va thay doi cham (moi gio/ngay)
Cache tai CDN edge server — user o Viet Nam duoc serve tu CDN Singapore, khong can request ve US
Cache key: prefix string (vd: “sys”, “syst”, “syste”)
Hit rate cao cho prefix ngan (it variation)

Two-tier Cache Strategy (Tham khao: Tuan-06-Cache-Strategy):

Tier	Implementation	Latency	Hit Rate
L1: CDN/Browser cache	HTTP cache	~0ms (browser) / ~5ms (CDN)	60-80% cho short prefix
L2: Application cache	In-memory Trie tai Query Server	< 1ms	100% (Trie luon co trong RAM)

Aha Moment: Voi debounce + browser caching + CDN caching, chi 10-20% autocomplete requests thuc su den backend. Day la ly do he thong co the handle hang ty requests/day voi it server.

3.6 Filter Layer — An toan noi dung

Tai sao can filter?

Autocomplete goi y cong khai — tat ca user deu thay. Neu goi y chua noi dung offensive, violent, hay sexually explicit:

PR disaster — bao chi dua tin, social media viral (da xay ra voi Google nhieu lan)
Legal risk — vi pham luat bao ve tre em, luat chong ky thi
User trust — mat long tin nguoi dung

Cac loai noi dung can filter

Loai	Vi du	Muc do
Hate speech	Noi dung phan biet chung toc, gioi tinh, ton giao	Block hoan toan
Violence	Huong dan bao luc, vu khi	Block hoan toan
Adult content	Noi dung khieu dam (tru khi SafeSearch OFF)	Block default, cho phep voi setting
Defamation	”[ten nguoi] + tu xuc pham”	Block dua tren rule
Self-harm	Noi dung tu gay thuong, tu tu	Block + hien thi hotline ho tro
Illegal activity	Mua ban chat cam, hack	Block hoan toan

Filter implementation

Layer 1 — Blocklist (Offline):

Danh sach cac query/keyword bi cam, duoc maintain boi content moderation team
Apply khi build Trie — loai bo cac query trong blocklist truoc khi insert vao Trie
Uu diem: nhanh, khong anh huong runtime
Nhuoc diem: khong bat duoc variation moi (vd: “h4te” thay vi “hate”)

Layer 2 — ML Classification (Near-realtime):

Model ML phan loai query thanh safe/unsafe
Chay khi build Trie (offline) hoac khi filter response (online)
Bat duoc variation, tieng long, encoded text
Nhuoc diem: co false positive (block query vo hai)

Layer 3 — Runtime Filter (Online):

Truoc khi tra suggestions ve user, chay qua filter cuoi cung
Check against latest blocklist (co the cap nhat real-time qua config service)
Dam bao khong co suggestion nao “lot luoi” offline filter

3.7 Trie Serialization — Luu tru va truyen tai Trie

Tai sao can serialize?

Trie trong memory la pointer-based structure — moi node chua pointer den children. Khong the:

Luu truc tiep xuong disk (pointer khong co nghia khi reload)
Gui qua network tu Trie Builder den Query Server
Version control (so sanh 2 Trie)

Cac cach serialize

Cach 1 — Level-order traversal (BFS):

Duyet Trie theo tung level (root → level 1 → level 2 → …)
Luu moi node duoi dang: (character, is_terminal, num_children, top_k_list)
De reconstruct — doc tuan tu, xay lai Trie

Cach 2 — Key-value pairs:

Luu moi prefix va top K cua no duoi dang key-value:
- Key: “sys” → Value: [“system design:1000”, “system:800”, …]
- Key: “syst” → Value: [“system design:1000”, “system:800”, …]
Co the luu vao Redis, RocksDB, hay bat ky KV store nao
Loi ich: khong can reconstruct Trie — chi can KV lookup

Cach 3 — Compact binary format (Recommended):

Custom binary format, optimized cho size va speed
Dung techniques: varint encoding, dictionary compression, delta encoding
Trie 1GB trong memory co the giam xuong 200-300MB khi serialize
Nhanh nhat khi deserialize vi it overhead

3.8 Multi-Language Support

Thach thuc

Ngon ngu	Van de	Vi du
Tieng Viet	Dau thanh, chu cai dac biet (a, a, o, o, u, d)	“pho” vs “pho” la 2 tu khac nhau
Tieng Nhat	3 he thong chu: Hiragana, Katakana, Kanji	User co the go “tokyo” bang Romaji, Hiragana, hoac Kanji
Tieng Trung	Khong co space giua cac tu, Pinyin input	”bei jing” (Pinyin) → “Beijing”
Tieng Han	Jamo (phu am + nguyen am) ket hop thanh syllable	User go tung Jamo, autocomplete phai xu ly tung buoc
Tieng A Rap	Right-to-left, ky tu thay doi hinh dang theo vi tri	Prefix match phuc tap hon

Giai phap: Unicode + Separate Tries per Language

Approach 1 — Unicode Trie:

Dung Unicode code point lam key cho moi node (thay vi chi 26 chu cai ASCII)
Moi node co the co hang ngan children (Unicode co ~150,000 ky tu)
Dung hash map thay vi array cho children → tiet kiem memory
Uu diem: 1 Trie cho tat ca ngon ngu
Nhuoc diem: Trie rat lon, kho shard

Approach 2 — Separate Trie per Language (Recommended):

Moi ngon ngu co Trie rieng
Detect ngon ngu tu: (1) user setting, (2) keyboard input, (3) geo-location
Route request den Trie tuong ung
Uu diem: moi Trie nho hon, de optimize cho tung ngon ngu
Nhuoc diem: user bilingual (vd: go ca tieng Anh va tieng Viet) can query nhieu Trie

Normalization pipeline (ap dung cho moi ngon ngu):

Convert ve lowercase
Remove diacritics cho search (nhung giu lai khi hien thi). Vi du: “Pho” → normalize thanh “pho” de search, nhung hien thi “Pho”
Handle synonyms (vd: “tp hcm” = “thanh pho ho chi minh” = “saigon”)
Handle transliteration (vd: “tokyo” → Romaji match cho tieng Nhat)

3.9 Personalization Layer

Overview

Personalization = goi y dua tren lich su ca nhan cua user, khong chi global popularity.

Vi du: Khi Hieu go “py”:

Global autocomplete: “python download”, “python tutorial”, “pyridoxine”
Personalized cho Hieu (backend dev): “python asyncio”, “python system design”, “pydantic”

Architecture

Component	Chuc nang
User Profile Store	Luu search history per user (recent queries, click-through)
Personal Trie / Score Adjustment	Adjust score cua suggestions dua tren user profile
Merge Layer	Ket hop global suggestions + personal suggestions, re-rank

Implementation don gian

Luu 50 query gan nhat cua moi user trong Redis (key: user_id, value: list of recent queries)
Khi autocomplete request den:
- Lay top 10 tu global Trie (frequency-based)
- Lay queries tu user’s recent history match voi prefix
- Merge: uu tien personal match, fill con lai tu global
Neu user chua dang nhap → chi dung global Trie

Privacy note: Personal data phai duoc luu cach biet, co the xoa theo yeu cau (GDPR right to erasure). Chi tiet o Section 4 (Security).

3.10 Scaling — Trie Sharding & Replication

Tai sao can shard Trie?

Khi he thong lon:

Trie cho tieng Anh co the co hang ty node → khong fit vao RAM cua 1 server
Traffic cho prefix “a” co the gap 10 lan prefix “x” → hot spot
Can phan tan Trie ra nhieu server

Sharding Strategy 1: By First Character

Shard	Prefixes	Nhung server
Shard 0	a-c	Server 1, 2, 3
Shard 1	d-f	Server 4, 5, 6
Shard 2	g-i	Server 7, 8, 9
…	…	…
Shard 8	y-z + special chars	Server 25, 26, 27

Uu diem: Don gian, de implement Nhuoc diem: Khong deu tai! Prefix “s” co nhieu query hon prefix “x” gap 100 lan → shard “s” bi overload

Sharding Strategy 2: By Prefix Range (Recommended)

Thay vi chia theo 1 ky tu dau, chia theo range cua prefix, dam bao moi shard co luong traffic tuong duong:

Shard	Prefix Range	Estimated Traffic
Shard 0	”a” — “be”	~12% traffic
Shard 1	”bf” — “ce”	~11% traffic
Shard 2	”cf” — “di”	~10% traffic
…	…	…
Shard 9	”t” — “z”	~13% traffic

Cach xac dinh range: Dua tren historical query distribution. Dieu chinh dinh ky khi traffic pattern thay doi.

Shard Map & Zookeeper

Lam sao Query Service biet prefix “sys” nam o shard nao?

Giai phap: Dung Shard Map — mapping tu prefix range → shard address. Luu shard map trong Zookeeper (hoac etcd):

graph TD
    subgraph "Shard Map (Zookeeper)"
        ZK[Zookeeper Cluster]
        SM["Shard Map:<br/>a-be → Shard 0<br/>bf-ce → Shard 1<br/>cf-di → Shard 2<br/>...<br/>t-z → Shard 9"]
    end

    subgraph "Query Service"
        QS1[Query Server 1]
        QS2[Query Server 2]
        QS3[Query Server 3]
    end

    subgraph "Trie Shards"
        S0[Shard 0<br/>Prefix: a-be]
        S1[Shard 1<br/>Prefix: bf-ce]
        S2[Shard 2<br/>Prefix: cf-di]
        S9[Shard 9<br/>Prefix: t-z]
    end

    QS1 --> ZK
    QS2 --> ZK
    QS3 --> ZK
    ZK --> SM

    QS1 --> S0
    QS1 --> S1
    QS2 --> S2
    QS3 --> S9

    style ZK fill:#fff9c4

Luong hoat dong:

Query Server khoi dong → doc shard map tu Zookeeper, cache local
Request den voi prefix “sys” → lookup shard map → “sys” thuoc Shard 9 (range “t-z”) — uh oh, “sys” thuoc range “s” nen can chinh lai mapping
Route request den Shard tuong ung
Khi shard map thay doi (rebalance), Zookeeper notify tat ca Query Server → update cache

Replication

Moi shard co 3 replicas (tham khao: Tuan-07-Database-Sharding-Replication):

1 Primary — nhan Trie update tu Trie Builder
2 Replicas — serve read traffic
Neu Primary chet → mot Replica duoc promote len Primary
Read traffic duoc load balance giua cac replicas

Aha Moment cho Hieu: Sharding Trie kho hon sharding database thong thuong. Voi DB, em hash key → shard. Voi Trie, em phai giu toan bo subtree cua mot prefix range tren cung shard — khong the chia nho hon. Day la constraint dac biet cua Trie.

3. Capacity Estimation (Uoc luong chi tiet)

3.1 QPS Calculation

Given:

DAU = 10 million
Searches per user per day = 10
Keystrokes per query = 20 (trung binh)
Seconds in a day = 86,400

T o t a l se a rc h es / d a y = 10 M \times 10 = 100 M se a rc h es / d a y

T o t a l k eys t ro k es / d a y = 100 M \times 20 = 2 B k eys t ro k es / d a y

Truoc optimization (khong debounce, khong cache):

QP S_{r a w} = \frac{2 , 000 , 000 , 000}{86 , 400} \approx 23, 148 re q u es t s / s \approx 23 K QPS

QP S_{p e ak} = 23 K \times 3 \approx 69 K QPS

Sau optimization (debounce giam 80%, browser cache giam 60% con lai):

QP S_{a f t er_d e b o u n ce} = 23 K \times 0.2 = 4, 600 QPS

QP S_{a f t er_c a c h e} = 4, 600 \times 0.4 = 1, 840 QPS

QP S_{p e ak_o pt imi ze d} = 1, 840 \times 3 \approx 5, 520 QPS

Nhan xet: Tu 69K QPS peak xuong 5.5K QPS peak — giam 92% nho debounce + caching. Day la ly do frontend optimization cuc ky quan trong cho autocomplete.

3.2 Storage for Trie

Assumptions:

So luong unique queries: 100 million (100M)
Average query length: 20 characters = 20 bytes (ASCII) hoac 40 bytes (UTF-8 multi-language)
Moi node trong Trie: 50 bytes (character + metadata + pointers)
Top K list per node: 10 entries x 30 bytes = 300 bytes per node
So luong node trong Trie (sau compression): ~500 million (500M)

T r i e s i z e_{ba s i c} = 500 M \times 50 b y t es = 25 GB

T o p K o v er h e a d = 500 M \times 300 b y t es = 150 GB

T o t a l T r i e s i ze = 25 GB + 150 GB = 175 GB

Voi compression (binary serialization):

C o m p resse d T r i e \approx 175 GB \times 0.3 = 52.5 GB

Nhan xet: 52.5 GB co the fit vao RAM cua 1 server manh (64GB+ RAM). Nhung de an toan va scale, em nen shard thanh 5-10 shard, moi shard ~5-10 GB. Rat thoai mai.

3.3 Network Bandwidth

Request size: ~50 bytes (prefix + headers) Response size: ~500 bytes (10 suggestions x 50 bytes moi cai)

I nb o u n d ban d w i d t h = 5, 520 QPS \times 50 b y t es = 276 K B / s \approx 0.28 MB / s

O u t b o u n d ban d w i d t h = 5, 520 QPS \times 500 b y t es = 2, 760 K B / s \approx 2.76 MB / s

Nhan xet: Bandwidth cuc thap. Autocomplete la tinh toan nhieu, truyen tai it (compute-heavy, IO-light). Bottleneck la CPU/RAM cho Trie lookup, khong phai network.

3.4 Storage for Logs & Aggregated Data

R a w l o g p er se a rc h = 200 b y t es (t im es t am p + u ser_ha s h + q u ery + m e t a d a t a)

R a w l o g / d a y = 100 M \times 200 b y t es = 20 GB / d a y

R a w l o g / ye a r = 20 GB \times 365 = 7.3 TB / ye a r

A gg re g a t e d d a t a (co m p resse d) \approx 5% o f r a w = 365 GB / ye a r

4. Security Considerations

4.1 Filter Offensive/Harmful Suggestions

Moi de doa	Anh huong	Giai phap
Autocomplete goi y noi dung phan biet chung toc	PR crisis, mat user trust, kien tung	Multi-layer filter (blocklist + ML)
Goi y lien quan bao luc, vu khi	Trach nhiem phap ly	Strict blocklist, zero tolerance
Goi y noi dung nguoi lon	Khong phu hop moi doi tuong	SafeSearch toggle, default ON
Goi y tu tu / tu gay thuong	Trach nhiem xa hoi	Block + hien thi “Neu ban can ho tro, goi 1800-…”

Best Practices:

Maintain curated blocklist — duoc cap nhat hang tuan boi content moderation team (nguoi thuc, khong chi ML)
ML model phan loai query safe/unsafe — train tren data da label
A/B testing filter — test filter moi tren % nho user truoc khi rollout
Incident response — khi phat hien goi y xau da lot luoi, co the push blocklist update trong phut, khong can rebuild Trie
Human-in-the-loop — voi nhung query mo ho (co the offensive trong 1 context nhung khong phai o context khac), de con nguoi review

4.2 User Privacy

Van de	Chi tiet	Giai phap
Search log chua PII	Log query chua ten, dia chi, thong tin ca nhan	Anonymize user_id truoc khi log (hash + salt)
Location tracking	IP address trong log → xac dinh vi tri user	Aggregate theo region (quoc gia/thanh pho), khong luu exact IP
Sensitive searches	User search thong tin y te, luat su, chinh tri	KHONG log cac query thuoc sensitive categories
Data retention	Luu log qua lau → tang risk data breach	Delete raw log sau 30 ngay, aggregated data sau 2 nam

Privacy-by-design principles:

Data minimization — Chi thu thap du lieu can thiet cho autocomplete, khong hon
Anonymization — Tach search log khoi user identity. Dung differential privacy khi co the
Encryption at rest — Log va aggregated data duoc encrypt tren disk (AES-256)
Encryption in transit — HTTPS/TLS cho tat ca request (da la standard)
Access control — Chi data engineering team truy cap raw log. Dung role-based access (RBAC)

GDPR (General Data Protection Regulation) cua EU yeu cau:

User co quyen yeu cau xoa tat ca du lieu lien quan den ho
Dieu nay ap dung cho: search logs, aggregated data, va ca autocomplete suggestions chua thong tin ca nhan

Thach thuc: Trie da duoc build — lam sao xoa 1 query cu the?

Giai phap:

Approach	Mo ta	Pros	Cons
Rebuild Trie (exclude deleted queries)	Khi nhan deletion request, danh dau query de exclude, ap dung khi rebuild Trie ke tiep	Don gian, clean	Delay (phai cho rebuild cycle)
Runtime filter	Maintain “deleted queries” list. Filter tai query time	Nhanh (ap dung ngay lap tuc)	Overhead moi request
Hybrid	Runtime filter ngay lap tuc + rebuild Trie de xoa vinh vien	Nhanh va clean	Phuc tap hon

Process xu ly GDPR deletion request:

User gui request xoa du lieu → Identity Service xac nhan user
Tim tat ca log entries lien quan den user (bang user_id_hash)
Xoa/anonymize log entries
Them cac queries can xoa vao “deletion list”
Apply runtime filter ngay → user khong con thay goi y lien quan
Trie rebuild ke tiep se exclude cac queries nay

4.4 Preventing Manipulation (SEO Spam in Suggestions)

Moi de doa: Bad actors co gang “bom” autocomplete bang cach:

Dung bot go lap di lap lai mot query de tang frequency
Coordinated attack — nhieu nguoi cung search 1 query de day no len top
Muc dich: quang cao mien phi, boi nho doi thu, lan truyen thong tin sai

Giai phap:

Ky thuat	Mo ta
Rate limiting per user/IP	Gioi han so query/phut tu cung 1 user hoac IP. Tham khao Tuan-09-Rate-Limiter
Bot detection	Phat hien pattern bat thuong: query qua nhanh, khong co mouse movement, headless browser
Minimum frequency threshold	Query phai dat so luong toi thieu (vd: 1000 unique users) moi xuat hien trong autocomplete
Unique user counting	Dem so unique users search mot query, khong phai tong so searches. 1 user search 1000 lan = 1 count
Anomaly detection	Phat hien query tang dot ngot bat thuong → flag de review truoc khi dua vao Trie
Verified source weighting	Query tu logged-in user co weight cao hon anonymous. Query tu known-good IP range co weight cao hon suspicious IP

5. DevOps & Monitoring

5.1 Trie Rebuild Pipeline Monitoring

Metric	Mo ta	Alert Threshold
Pipeline completion time	Thoi gian tu raw log → Trie hoan chinh	> 2x binh thuong → alert
Pipeline success rate	% pipeline run thanh cong	< 99% → alert
Trie size delta	Thay doi size giua 2 version Trie	Tang/giam > 20% → investigate
Query count delta	So luong query trong Trie moi vs cu	Giam > 10% → co the mat data
Aggregation lag	Thoi gian tu query xay ra → xuat hien trong Trie	> 4 gio → check pipeline

Pipeline health dashboard nen hien thi:

Thoi gian hoan thanh pipeline lan gan nhat
So luong raw log records processed
So luong unique queries sau aggregation
Trie size (bytes) va so luong nodes
Trang thai deployment (dang serve Trie version nao)

5.2 Query Latency Monitoring

Metric	Target	Giai thich
P50 latency	< 10ms	Phan nua request phai duoi 10ms
P95 latency	< 50ms	95% request duoi 50ms
P99 latency	< 100ms	99% request duoi 100ms — day la hard requirement
P99.9 latency	< 200ms	Edge case — co the chap nhan

Nguyen nhan latency spike va cach xu ly:

Nguyen nhan	Phat hien	Xu ly
GC pause (Java/Go)	JVM GC log, latency spike dinh ky	Tune GC, dung low-latency GC (ZGC, Shenandoah)
Trie reload (load Trie moi vao RAM)	Latency spike tai thoi diem deploy Trie moi	Blue-green deployment: load Trie moi xong moi switch traffic
Cold start (server moi khoi dong)	Latency cao o vai nghin request dau	Warm-up: load Trie va xu ly dummy requests truoc khi nhan traffic that
Network issue	Latency tang dong deu tren nhieu server	Check network dashboard, route lai traffic
Hot prefix	Latency tang cho 1 shard cu the	Scale up shard do, hoac rebalance prefix range

5.3 Cache Hit Ratio

Cache layer	Target hit ratio	Giai thich
Browser cache	30-50%	User thuong go lai cac prefix giong nhau
CDN cache	60-80%	Prefix ngan (1-3 chars) duoc cache hieu qua
Application cache (Trie in memory)	100%	Trie luon co trong RAM — moi prefix deu co ket qua

Neu cache hit ratio thap hon mong doi:

Browser cache thap → kiem tra Cache-Control header co dung khong
CDN cache thap → kiem tra TTL, cache key strategy, xem co vary header khong can thiet
Nhieu cache miss → xem xet user behavior change (trending event → query moi)

5.4 A/B Testing for Ranking Algorithms

Autocomplete ranking khong chi la frequency. Cac signal khac:

Signal	Mo ta	Vi du
Frequency	So lan query duoc search	”python tutorial” > “python asyncio”
Freshness	Query moi/trending duoc boost	”world cup 2026” boost trong mua giai
Personalization	Dua tren user history	Dev search “python” → boost “python docs”
Click-through rate	% user chon goi y do	Goi y duoc chon nhieu → boost
Geography	Dua tren vi tri user	User o VN: “pho” → “pho ha noi”; User o US: “pho” → “phone number”

A/B testing process:

Hypothesis: “Them freshness signal se tang click-through rate 5%”
Experiment: 5% user (random) nhan ranking moi (treatment), 95% nhan ranking cu (control)
Metrics: So sanh click-through rate, search completion time, user satisfaction (bounce rate)
Duration: Chay it nhat 2 tuan de co du statistical significance
Decision: Neu treatment tot hon voi p-value < 0.05 → rollout 100%

Pitfall: A/B test autocomplete kho vi feedback loop — neu em goi y “X” cho treatment group, ho se click “X” nhieu hon → tang frequency cua “X” → goi y “X” nhieu hon. Can careful experimental design de tranh bias.

5.5 Alerting Strategy

Alert	Severity	Action
P99 latency > 100ms keo dai 5 phut	Critical	Page on-call engineer. Kiem tra: Trie reload? GC? Hot shard?
Pipeline fail 2 lan lien tiep	High	Investigate pipeline. Trie cu van dang serve — khong mat service
Trie size giam > 20%	High	Co the mat data trong aggregation. KHONG deploy Trie nay — rollback
Cache hit ratio giam > 30%	Medium	Kiem tra cache config, TTL, trending event
Filter miss (offensive suggestion reported)	High	Update blocklist ngay. Root cause analysis
Error rate > 1%	Critical	Kiem tra server health, dependency failure

6. Architecture Diagrams (Mermaid)

6.1 Overall System Architecture

graph TB
    subgraph "Client Layer"
        WEB[Web Browser<br/>Debounce + Local Cache]
        MOB[Mobile App<br/>Debounce + Local Cache]
    end

    subgraph "Edge Layer"
        CDN[CDN / Edge Cache<br/>Cloudflare / CloudFront]
    end

    subgraph "API Layer"
        LB[Load Balancer]
        AG1[API Gateway 1]
        AG2[API Gateway 2]
    end

    subgraph "Query Service Layer"
        QS1[Query Server 1<br/>Trie Shard 0-2 in RAM]
        QS2[Query Server 2<br/>Trie Shard 3-5 in RAM]
        QS3[Query Server 3<br/>Trie Shard 6-9 in RAM]
        FL[Filter Service<br/>Blocklist + ML]
    end

    subgraph "Coordination"
        ZK[Zookeeper<br/>Shard Map + Config]
    end

    subgraph "Storage Layer"
        TS[Trie Storage<br/>S3 / GCS]
        RC[Redis Cache<br/>Hot Prefixes]
    end

    subgraph "Data Pipeline (Offline)"
        KF[Kafka<br/>Query Event Stream]
        SP[Spark Cluster<br/>Aggregation]
        DB[Aggregated Data<br/>PostgreSQL / Hive]
        TBU[Trie Builder<br/>Build + Serialize]
    end

    subgraph "Monitoring"
        MON[Prometheus + Grafana<br/>Metrics & Dashboards]
        ALR[Alert Manager<br/>PagerDuty Integration]
    end

    WEB --> CDN
    MOB --> CDN
    CDN --> LB
    LB --> AG1
    LB --> AG2
    AG1 --> QS1
    AG1 --> QS2
    AG2 --> QS2
    AG2 --> QS3
    QS1 --> FL
    QS2 --> FL
    QS3 --> FL
    QS1 --> ZK
    QS2 --> ZK
    QS3 --> ZK
    QS1 --> RC
    QS2 --> RC
    QS3 --> RC

    WEB -->|"Completed queries"| KF
    MOB -->|"Completed queries"| KF
    KF --> SP
    SP --> DB
    DB --> TBU
    TBU --> TS
    TS -->|"Pull new Trie"| QS1
    TS -->|"Pull new Trie"| QS2
    TS -->|"Pull new Trie"| QS3

    QS1 --> MON
    QS2 --> MON
    QS3 --> MON
    SP --> MON
    MON --> ALR

    style CDN fill:#e1f5fe
    style ZK fill:#fff9c4
    style KF fill:#e8eaf6
    style SP fill:#f3e5f5
    style FL fill:#fff3e0
    style MON fill:#e8f5e9

6.2 Trie Structure Visualization

graph TD
    ROOT["ROOT<br/>Top 10: system design, software engineer, ..."]

    S["s<br/>Top 10: system design, software engineer,<br/>stack overflow, ..."]
    T["t<br/>Top 10: tutorial, twitter, ..."]

    SY["sy<br/>Top 10: system design, sync, ..."]
    SO["so<br/>Top 10: software engineer, sorting, ..."]
    ST["st<br/>Top 10: stack overflow, streaming, ..."]

    SYS["sys<br/>Top 10: system design, system design interview,<br/>system design primer, ..."]

    SYST["syst<br/>Top 10: system design, system design interview, ..."]

    SYSTE["syste<br/>Top 10: system design, system design interview,<br/>system design primer, ..."]

    SYSTEM["system<br/>Top 10: system design, system design interview,<br/>system 32, system restore, ..."]

    ROOT --> S
    ROOT --> T
    S --> SY
    S --> SO
    S --> ST
    SY --> SYS
    SYS --> SYST
    SYST --> SYSTE
    SYSTE --> SYSTEM

    style ROOT fill:#ffcdd2
    style S fill:#f8bbd0
    style SYS fill:#e1bee7
    style SYSTEM fill:#c5cae9

Key insight: Moi node luu san top 10. Khi user go “sys”, Query Service chi can doc 1 node — khong can traverse subtree. O(1) lookup!

6.3 Data Gathering Pipeline (Chi tiet)

graph LR
    subgraph "Step 1: Ingest"
        U1[User Search Event]
        U2[User Search Event]
        U3[User Search Event]
        KP[Kafka Producer<br/>Batch + Compress]
        KT[Kafka Topic<br/>search-queries<br/>Partitioned by region]
    end

    subgraph "Step 2: Raw Storage"
        KC[Kafka Consumer<br/>S3 Sink Connector]
        S3R[S3 Raw Logs<br/>Partitioned by date/hour<br/>Parquet format]
    end

    subgraph "Step 3: Aggregate"
        SCH[Scheduler<br/>Airflow / Cron]
        SPK[Spark Job<br/>GroupBy query<br/>Count distinct users<br/>Apply decay]
        S3A[S3 Aggregated<br/>query, frequency, timestamp]
    end

    subgraph "Step 4: Build"
        TB[Trie Builder<br/>Read aggregated data<br/>Build optimized Trie<br/>Compress + Serialize]
        VAL[Validator<br/>Check Trie size<br/>Spot-check queries<br/>Compare with previous]
    end

    subgraph "Step 5: Deploy"
        S3T[S3 Trie Storage<br/>Versioned: v2026-03-18-12h]
        DEP[Deployer<br/>Notify Query Servers<br/>Blue-Green switch]
        QSR[Query Servers<br/>Pull + Load Trie]
    end

    U1 --> KP
    U2 --> KP
    U3 --> KP
    KP --> KT
    KT --> KC
    KC --> S3R
    SCH -->|"Trigger hourly"| SPK
    S3R --> SPK
    SPK --> S3A
    S3A --> TB
    TB --> VAL
    VAL -->|"Pass"| S3T
    VAL -->|"Fail"| SCH
    S3T --> DEP
    DEP --> QSR

    style KT fill:#e8eaf6
    style SPK fill:#f3e5f5
    style VAL fill:#fff3e0
    style DEP fill:#e8f5e9

6.4 Query Service Flow (Chi tiet)

flowchart TD
    A[User types character] --> B{Debounce timer active?}
    B -->|Yes| C[Reset timer to 200ms]
    B -->|No| D[Start 200ms timer]
    C --> E[Wait for timer]
    D --> E
    E --> F{Timer expired?}
    F -->|No, user typed again| B
    F -->|Yes| G[Get current prefix]
    G --> H{Prefix in browser cache?}
    H -->|Yes| I[Return cached result]
    H -->|No| J[Send AJAX to CDN]
    J --> K{CDN cache hit?}
    K -->|Yes| L[Return CDN cached result]
    K -->|No| M[Forward to API Gateway]
    M --> N[Rate limit check]
    N -->|Rejected| O[Return 429 Too Many Requests]
    N -->|Passed| P[Lookup Shard Map]
    P --> Q[Route to correct Trie Shard]
    Q --> R[Trie prefix lookup — O of 1]
    R --> S[Get top 10 raw suggestions]
    S --> T[Apply Filter Layer]
    T --> U{All 10 suggestions safe?}
    U -->|Yes| V[Return JSON response]
    U -->|No| W[Remove unsafe, backfill from next candidates]
    W --> V
    V --> X[Set Cache-Control headers]
    X --> Y[Response cached at CDN]
    Y --> Z[Response cached at Browser]
    Z --> AA[Display dropdown to user]

    style I fill:#e1f5fe
    style L fill:#e1f5fe
    style R fill:#c8e6c9
    style T fill:#fff3e0

7. Aha Moments & Pitfalls

7.1 Aha Moments — Nhung dieu interviewer muon nghe

#	Aha Moment	Giai thich
1	Trie vs Database lookup	Nhieu nguoi nghi “chi can SELECT * FROM queries WHERE query LIKE ‘sys%’ ORDER BY frequency LIMIT 10”. Nhung voi 100M+ queries, LIKE query + sort la O(n log n) — cuc cham. Trie cho O(1) lookup vi top K da duoc precomputed
2	Offline update, khong phai realtime	Autocomplete khong can real-time. Delay 15 phut - 1 gio la chap nhan duoc. Dieu nay don gian hoa thiet ke rat nhieu — khong can distributed lock, khong can consensus
3	Debounce saves 80% traffic	Mot optimization o frontend giam 80% load cho backend. Interviewer an tuong khi em nghi toi full-stack optimization, khong chi backend
4	Space-time trade-off tai moi Trie node	Luu top K tai moi node = tang space 10x nhung giam lookup time tu O(millions) xuong O(1). Day la quintessential engineering trade-off
5	Graceful degradation	Autocomplete la nice-to-have, khong phai critical. Neu chet, search van hoat dong. Noi dieu nay cho thay em hieu system resilience
6	Two separate systems	Data Gathering va Query Service la hoan toan doc lap. Chi share Trie artifact. Day la clean architecture — moi phan co the scale/fail doc lap
7	Cache at every layer	Browser → CDN → Application RAM. Moi layer filter traffic, chi 10-20% requests den backend. Day la multi-tier caching strategy

7.2 Pitfalls — Nhung loi thuong gap

#	Pitfall	Tai sao sai	Nen lam gi
1	Dung RDBMS cho autocomplete	LIKE query + ORDER BY tren 100M rows = seconds, khong phai milliseconds. Index giup nhung van cham hon Trie	Dung Trie — data structure duoc thiet ke rieng cho prefix search
2	Update Trie moi keystroke	23K QPS writes vao Trie = race condition, lock contention, write amplification	Tach read path (online) va write path (offline). Update Trie dinh ky
3	Khong nghi ve frontend	Chi thiet ke backend → bo qua debounce, caching → traffic gap 5-10x	Autocomplete la bai toan full-stack. Frontend optimization quan trong nhu backend
4	Shard Trie theo hash	Hash(“sys”) co the map den shard khac voi hash(“syst”) → khong the traverse tu prefix nay sang prefix kia	Shard theo prefix range, khong phai hash. Giu subtree tren cung shard
5	Quen filter layer	Deploy autocomplete khong co content filter → goi y offensive → PR disaster	Luon co filter layer. Blocklist + ML + human review
6	Quen tinh capacity	Khong uoc luong QPS, storage → thiet ke khong phu hop scale	Back-of-envelope estimation truoc khi dive vao design. Tham khao Tuan-02-Back-of-the-envelope
7	Khong noi ve failure mode	Chi trinh bay happy path → interviewer hoi “neu Trie Builder fail thi sao?” → lam	Luon chuan bi: Trie Builder fail → keep serving Trie cu. Query Server crash → LB route sang server khac. Filter Service down → serve unfiltered (hoac block autocomplete tam thoi)

7.3 Interview Tips — Cach trinh bay

Giai doan	Em nen lam	Em KHONG nen lam
Mo dau (2 phut)	Hoi requirements, clarify scope, neu gia dinh	Nhay thang vao ve diagram
High-level (5 phut)	Ve 2 thanh phan chinh (Data Gathering + Query Service), giai thich flow	Ve qua nhieu detail, liet ke cong nghe cu the
Deep dive (15 phut)	Giai thich Trie, optimization, trade-offs. Hoi interviewer muon dive vao phan nao	Tu chon phan de nhat de trinh bay
Wrap up (3 phut)	Noi ve scaling (sharding, replication), monitoring, security	Bo qua hoan toan, hoac noi “em nghi vay la du”

8. Internal Links & Cross-References

Cac bai hoc lien quan truc tiep

Topic	Link	Ap dung trong Autocomplete
Cache Strategy	Tuan-06-Cache-Strategy	Multi-tier caching: Browser → CDN → In-memory Trie. Cache invalidation khi Trie moi deploy
Message Queue	Tuan-08-Message-Queue	Kafka lam buffer cho search query log. Decouple Data Gathering pipeline
Consistent Hashing	Tuan-10-Consistent-Hashing	Co the dung cho Trie sharding (du prefix-range sharding pho bien hon)

Cac bai hoc bo tro

Topic	Link	Lien he
Scale from Zero	Tuan-01-Scale-From-Zero-To-Millions	Autocomplete bat dau tu 1 server voi Trie in-memory, scale dan len
Back-of-envelope	Tuan-02-Back-of-the-envelope	Phuong phap tinh QPS, storage, bandwidth o Section 3
Database Sharding	Tuan-07-Database-Sharding-Replication	Trie sharding tuong tu DB sharding nhung co constraint rieng (phai giu subtree)
Rate Limiter	Tuan-09-Rate-Limiter	Bao ve autocomplete API khoi bot/spam
Monitoring	Tuan-13-Monitoring-Observability	Cach monitor pipeline, latency, cache hit ratio
Security	Tuan-14-AuthN-AuthZ-Security	API Gateway authentication, rate limiting

So sanh voi cac Case Study khac

Case Study	Diem tuong dong	Diem khac biet
Tuan-16-Design-URL-Shortener	Read-heavy, caching quan trong, estimation approach giong nhau	URL Shortener dung hash, Autocomplete dung Trie. URL Shortener can strong consistency, Autocomplete chap nhan eventual
Tuan-18-Design-News-Feed	Ca hai can ranking algorithm, personalization, offline pipeline	News Feed phuc tap hon ve social graph. Autocomplete don gian hon ve data model nhung kho hon ve latency requirement
Tuan-19-Design-Notification-System	Ca hai co offline pipeline (data gathering vs notification scheduling)	Notification la push, Autocomplete la pull. Notification chap nhan delay, Autocomplete phai < 100ms

9. Summary — Tom tat 1 trang

He thong Autocomplete gom 2 phan chinh:

1. Data Gathering Service (Offline):

Thu thap search queries tu user → Kafka → S3
Aggregate frequency bang Spark/MapReduce
Build optimized Trie (luu top K tai moi node)
Serialize va deploy Trie dinh ky (moi gio/ngay)

2. Query Service (Online):

Nhan prefix tu user (sau debounce)
Lookup Trie in-memory → O(1) vi top K da precomputed
Filter offensive content
Tra ve 10 suggestions trong < 100ms

5 dieu quan trong nhat can nho:

Trie la data structure cot loi — thiet ke rieng cho prefix search, vuot troi hon database LIKE query
Precompute top K tai moi node — trade space cho speed, O(1) lookup
Tach read path va write path — CQRS pattern, scale doc lap
Frontend optimization (debounce + cache) giam 90%+ traffic — day la full-stack problem
Shard Trie theo prefix range, KHONG phai hash — phai giu subtree tren cung shard

“Autocomplete la bai toan tuong don gian nhung an chua moi khia canh cua system design: data structure, distributed systems, caching, pipeline, security, va full-stack thinking. Master bai nay, em se san sang cho bat ky system design interview nao.”

Next Step: Thuc hanh ve lai toan bo architecture diagram tu dau, khong nhin tai lieu. Neu em ve duoc 80% chinh xac, em da nam vung bai nay.

lthieu's notes

Explorer

Case-Design-Search-Autocomplete