Case Study: Design YouTube — Video Streaming Platform

“YouTube is like a national TV station, except that instead of one channel broadcast to everyone, each viewer watches a different channel, millions of them at once. And that ‘station’ has to play smoothly on every device, at every network speed, in every country.”

Tags: system-design case-study video-streaming youtube cdn transcoding alex-xu
Student: Hieu
Prerequisite: Tuan-02-Back-of-the-envelope · Tuan-03-Networking-DNS-CDN · Tuan-08-Message-Queue
Related: Tuan-06-Cache-Strategy · Tuan-07-Database-Sharding-Replication · Tuan-13-Monitoring-Observability · Tuan-15-Data-Security-Encryption
Reference: Alex Xu, System Design Interview Vol 1, Chapter 14: Design YouTube


0. Context & Why: Why Is Video Streaming Hard?

0.1 Analogy: A national TV station in the digital age

Hieu, picture a national TV station. In the past it broadcast a single channel, and everyone watched the same program. Simple.

Now imagine every viewer watching a different channel at the same time. One person watches a 4K action movie, another watches a 360p comedy clip on a 3G phone, someone is uploading a video, someone else is streaming live. All of it happens simultaneously, for millions of people.

That is YouTube, and that is why it is one of the hardest system design problems.

0.2 Core challenges

Challenge | Explanation | Why it is hard
Data volume | Video is the heaviest data type: 1 minute of 1080p video is ~150 MB | Nothing else on the web is as large as video
Bandwidth | Streaming needs sustained bandwidth, not bursts like text/images | Each user consumes 2-5 Mbps continuously
Diversity | A huge range of devices: 4K TVs, old phones, tablets, laptops | Must transcode into many resolutions and codecs
Global | Users are everywhere in the world | The CDN must cover the globe and latency must stay low
Cost | CDN bandwidth is the #1 cost, up to billions of USD per year | Optimizing CDN cost is a matter of survival
Upload vs Watch | Uploads are rare but heavy (processing); watches are frequent but light (serving) | Two completely different pipelines

0.3 YouTube by the numbers (for reference)

Metric | Figure | Meaning
Users | ~2 billion (usually quoted as monthly logged-in users rather than DAU) | Roughly 1/4 of the world's population
Video uploaded per minute | 500 hours | Impossible to ever watch it all
Total videos uploaded | > 800 million | Petabyte-scale storage
Avg watch time/user/day | ~30 minutes | Extremely high engagement
Resolutions to transcode | 5-8 | 144p, 240p, 360p, 480p, 720p, 1080p, 1440p, 4K
CDN cost/year (estimated) | $1-5 billion | YouTube's largest cost

Aha Moment: Video streaming is not “sending a file to the user.” It means splitting the file into thousands of small pieces (segments), choosing the right resolution at each moment, and sending the pieces one by one, like spoon-feeding a baby: one spoonful at a time, at the pace the baby can swallow.


1. Step 1 — Understand the Problem & Establish Design Scope

1.1 Clarifying Questions

In an interview, always ask before you design. The important questions are:

Question | Answer | Note
Which features do we need to design? | Upload video + stream video | Core features
Which clients? | Mobile app, web browser, smart TV | Multi-platform
How many DAU? | 5 million | Medium-to-large scale
Average watch time per day? | 30 minutes | Engagement metric
Support multiple resolutions? | Yes, from 240p to 4K | Adaptive bitrate
Support multiple languages? | Yes: subtitles, audio tracks | i18n
Maximum video length? | 12 hours (e.g. a livestream rerun) | Affects storage
Is DRM (copyright protection) needed? | Yes | Especially for premium content
Maximum upload file size? | 256 GB | For long 4K videos
Distinguish free and paying users? | Yes: priority transcoding | Business logic

1.2 Functional Requirements

  • FR1: Upload video. The user uploads a video from a client; the system processes and stores it
  • FR2: Stream video. The user watches video smoothly, without stutter or lag
  • FR3: Multi-resolution. Automatically convert each video into several resolutions (240p → 4K)
  • FR4: Adaptive bitrate. Automatically adjust quality to the user's network speed
  • FR5: Search. Find videos by title, description, and tags
  • FR6: Thumbnail generation. Automatically create thumbnails for every video
  • FR7: Video metadata. Title, description, view count, like/dislike, comments
  • FR8: Resumable upload. If an upload is interrupted, continue from where it stopped

1.3 Non-functional Requirements

Requirement | Target | Explanation
Availability | 99.99% | A video platform is entertainment: downtime means lost users
Latency (playback start) | < 2 seconds | After the user presses play, video must start within 2 s
Buffering ratio | < 1% | Share of viewing time the user spends waiting for the buffer
Upload reliability | 99.9% | Uploads must not lose files
Scalability | 5M DAU, scale to 50M | Design for the future
Durability | 99.999999999% (11 nines) | Once a video is uploaded it must never be lost
Global reach | Serve from the nearest CDN | A user in Vietnam should get the same speed as a user in the US

1.4 Out of scope

To stay focused, we do not design:

  • Recommendation engine (ML-based): we only design the data pipeline that feeds it
  • Comment system: covered in another case study
  • Live streaming: we focus on Video-on-Demand (VOD) only
  • Monetization / Ads system
  • Detailed content moderation (overview only)

2. Step 2 — Propose High-Level Design

2.1 Two Main Flows

Hieu, all of YouTube can be split into two main flows that are completely different:

Flow | Description | Characteristics
Video Upload Pipeline | User uploads a video → the system processes it → stores it → distributes it | Write-heavy, CPU-intensive, async
Video Streaming | User presses play → the system returns video segments | Read-heavy, bandwidth-intensive, real-time

Key Insight: The two flows have completely different requirements. Upload needs processing power (transcoding). Streaming needs bandwidth (CDN). Design each flow separately.

2.2 High-Level Architecture Overview

graph TB
    subgraph Clients
        A[Web Browser]
        B[Mobile App]
        C[Smart TV]
    end

    subgraph "Upload Flow"
        D[API Gateway / Load Balancer]
        E[Upload Service]
        F[Original Video Storage<br/>Blob Store]
        G[Transcoding Service<br/>DAG Scheduler]
        H[Transcoded Video Storage<br/>Blob Store]
        I[Thumbnail Generator]
        J[Metadata Service]
        K[Message Queue<br/>Kafka/RabbitMQ]
    end

    subgraph "Streaming Flow"
        L[CDN Edge Server]
        M[CDN Regional Server]
        N[CDN Origin Server]
        O[Metadata Cache<br/>Redis]
    end

    subgraph "Data Stores"
        P[(Video Metadata DB<br/>MySQL/Vitess)]
        Q[(User Activity DB<br/>Cassandra)]
        R[Search Index<br/>Elasticsearch]
    end

    A & B & C -->|Upload| D
    D --> E
    E --> F
    E -->|Trigger| K
    K --> G
    G --> H
    G --> I
    H --> N
    E --> J
    J --> P
    J --> R

    A & B & C -->|Stream| L
    L -->|Cache miss| M
    M -->|Cache miss| N
    N -->|Fetch| H

    J --> O
    O --> L

2.3 Component Overview

Component | Function | Tech stack (example)
API Gateway | Authentication, rate limiting, routing | Kong, AWS API Gateway
Upload Service | Receives files, splits them into chunks, stores them in the blob store | Custom service + pre-signed URL
Original Storage | Stores the original video before transcoding | AWS S3, Google Cloud Storage
Transcoding Service | Converts video into multiple resolutions/codecs | FFmpeg workers, AWS Elastic Transcoder
Transcoded Storage | Stores the transcoded video at every resolution | S3, GCS (separate bucket)
CDN | Delivers video to users | CloudFront, Akamai, Google CDN
Metadata Service | Manages video information | REST API + MySQL/Vitess
Message Queue | Decouples upload from transcoding | Kafka, RabbitMQ → Tuan-08-Message-Queue
Cache | Caches metadata and CDN config | Redis → Tuan-06-Cache-Strategy

3. Step 3 — Design Deep Dive

3.1 Video Upload Pipeline in Detail

3.1.1 Upload flow in detail

When a user uploads a video, the full sequence is:

Step 1: Pre-signed URL

  • The client calls the API: “I want to upload a video, file size = 2 GB”
  • The API server checks authentication, quota, and file type
  • If valid, it creates a pre-signed upload URL that points directly at blob storage (S3)
  • The client uploads directly to S3, bypassing the API server, which reduces server load

Why a pre-signed URL? If 1,000 users upload at once at 2 GB each, the API servers would have to relay 2 TB of data. With a pre-signed URL the traffic goes straight to S3, which is designed for massive throughput, and the API server only handles lightweight metadata.

Step 2: Chunked upload (resumable)

  • The 2 GB file is split into chunks (typically 5-10 MB each)
  • The client uploads the chunks one at a time
  • If the connection drops, only the failed chunk is re-uploaded, not the whole file
  • The server tracks progress: chunk 1/400 ✓, chunk 2/400 ✓, … chunk 235/400 ✗ (retry)
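The bookkeeping behind a resumable upload can be sketched as follows. This is a minimal illustration, not the actual upload service: the `UploadSession` class, its method names, and the decimal 5 MB chunk size are assumptions chosen so that a 2 GB file yields the 400 chunks used in the example above.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 5_000_000  # assumed: 5 MB, the low end of the 5-10 MB range


@dataclass
class UploadSession:
    """Tracks which chunks of one file the server has accepted (hypothetical helper)."""
    file_size: int
    chunk_size: int = CHUNK_SIZE
    done: dict = field(default_factory=dict)  # chunk index -> SHA-256 hex digest

    @property
    def total_chunks(self) -> int:
        return -(-self.file_size // self.chunk_size)  # ceiling division

    def chunk_range(self, index: int) -> tuple:
        """Byte range [start, end) the client should send for this chunk."""
        start = index * self.chunk_size
        return start, min(start + self.chunk_size, self.file_size)

    def acknowledge(self, index: int, data: bytes, claimed_sha256: str) -> bool:
        """Accept a chunk only if its checksum matches; otherwise the client retries it."""
        if hashlib.sha256(data).hexdigest() != claimed_sha256:
            return False
        self.done[index] = claimed_sha256
        return True

    def missing(self) -> list:
        """Chunks still to upload; after a disconnect the client resumes from this list."""
        return [i for i in range(self.total_chunks) if i not in self.done]
```

After a network drop the client asks for `missing()` and re-sends only those indexes, which is what makes the upload resumable.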

Step 3: Original storage

  • Once all chunks have been uploaded, S3 assembles them into the complete file
  • The original file is kept permanently (or according to a retention policy)
  • An event is pushed to the message queue, which triggers the transcoding pipeline

Step 4: Transcoding (see section 3.1.2)

Step 5: CDN distribution

  • The transcoded video is pushed to the CDN origin
  • The CDN replicates it to edge servers on demand, as requests arrive

Step 6: Metadata update

  • After transcoding finishes, the Metadata DB is updated:
    • Video status: “processing” → “ready”
    • Available resolutions: [240p, 360p, 480p, 720p, 1080p]
    • Duration, thumbnail URL, manifest file URL
  • The user receives a notification: “Your video is ready!”

3.1.2 Video Transcoding: DAG Architecture

This is the most complex part of the whole system. Transcoding is not just “convert 1080p to 720p.” It is a DAG (Directed Acyclic Graph) made of many steps:

graph TD
    A[Original Video File] --> B[Video Splitting<br/>Split into segments]
    B --> C1[Segment 1]
    B --> C2[Segment 2]
    B --> C3[Segment 3]
    B --> C4[Segment N...]

    C1 --> D1[Encode H.264 1080p]
    C1 --> D2[Encode H.264 720p]
    C1 --> D3[Encode H.264 480p]
    C1 --> D4[Encode H.264 360p]
    C1 --> D5[Encode H.264 240p]
    C1 --> D6[Encode VP9 1080p]
    C1 --> D7[Encode VP9 720p]
    C1 --> D8[Encode AV1 1080p]

    C2 --> E1[Encode H.264 1080p]
    C2 --> E2[Encode H.264 720p]
    C2 --> E3[...]

    D1 & D2 & D3 & D4 & D5 & D6 & D7 & D8 --> F[Merge Segments<br/>Per resolution/codec]

    A --> G[Audio Extraction]
    G --> G1[AAC 128kbps]
    G --> G2[AAC 256kbps]
    G --> G3[Opus 128kbps]

    A --> H[Thumbnail Generation<br/>Extract key frames]
    H --> H1[Thumbnail 1]
    H --> H2[Thumbnail 2]
    H --> H3[Thumbnail 3]

    A --> I[Watermark Overlay<br/>Optional - for copyright]

    F --> J[Generate Manifest Files<br/>HLS .m3u8 / DASH .mpd]
    G1 & G2 & G3 --> J
    J --> K[Upload to Transcoded Storage]
    K --> L[Push to CDN Origin]
    L --> M[Update Metadata DB<br/>Status = Ready]

3.1.3 Why use a DAG?

Reason | Explanation
Parallelism | Each segment is encoded independently, so segments run in parallel across many workers
Flexibility | Easy to add or remove resolutions, codecs, or new processing steps
Fault tolerance | If encoding segment 5 fails, only segment 5 is retried, not the whole job
Priority | Encode 720p first (the most common), so users can start watching sooner
Resource efficiency | Each task has different resource requirements, which allows smart scheduling
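To make the parallelism argument concrete, here is a small sketch that builds a task graph for one video and groups it into "waves" of tasks that can run at the same time. It uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+); the task names and the `build_transcode_dag` / `run_in_waves` helpers are illustrative assumptions, not the scheduler a production system would use.

```python
from graphlib import TopologicalSorter


def build_transcode_dag(num_segments, variants):
    """Return {task: set of prerequisite tasks} for one video (hypothetical task names)."""
    dag = {"split": set(), "audio": set(), "thumbnails": set()}
    for variant in variants:
        encode_tasks = set()
        for seg in range(num_segments):
            task = f"encode:{variant}:seg{seg}"
            dag[task] = {"split"}  # every encode needs the split step first
            encode_tasks.add(task)
        dag[f"merge:{variant}"] = encode_tasks  # merge waits for all segments of a variant
    # The manifest needs every merged variant plus the audio tracks.
    dag["manifest"] = {f"merge:{v}" for v in variants} | {"audio"}
    return dag


def run_in_waves(dag):
    """Group tasks into waves; all tasks inside one wave are independent of each other."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # a real scheduler would mark tasks done as workers finish
    return waves
```

For example, `run_in_waves(build_transcode_dag(2, ["h264_720p", "h264_1080p"]))` yields four waves: split/audio/thumbnails, then the four segment encodes, then the two merges, then the manifest. A failed encode task can be retried on its own because nothing else in its wave depends on it.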

3.1.4 Video Codecs: H.264 vs VP9 vs AV1

Codec | Strengths | Weaknesses | When to use
H.264 (AVC) | Supported by ~99% of devices, fast to decode | Larger files | Default for every video
VP9 | ~30% smaller than H.264, royalty-free (Google) | Roughly 10x slower to encode | Chrome, Android, YouTube web
AV1 | ~20% smaller than VP9, royalty-free | Extremely slow to encode (50-100x H.264) | Popular content, to save bandwidth
HEVC (H.265) | ~40% smaller than H.264 | Expensive license fees, limited device support | Apple ecosystem, smart TVs

YouTube's strategy in practice: encode H.264 for every video (compatibility). For popular videos with many views, also encode VP9 and AV1 to save CDN bandwidth, because the encode cost is paid once while the bandwidth saving applies to millions of views.

3.1.5 Transcoding Queue and Priority

Not every video is transcoded the same way:

Priority | Who | Handling
P0 (Critical) | Paid creators (YouTube Partner) | Encode immediately, dedicated workers
P1 (High) | Videos from large channels (>1M subs) | Separate queue, short timeout
P2 (Normal) | Regular users | Shared queue, best-effort
P3 (Low) | Re-encoding old videos to a new codec | Runs only off-peak (2AM-6AM)

This is implemented with multiple message queues, one per priority level:

graph LR
    A[Upload Event] --> B{Priority<br/>Router}
    B -->|P0| C[Critical Queue<br/>Dedicated Workers]
    B -->|P1| D[High Queue<br/>Auto-scaling Workers]
    B -->|P2| E[Normal Queue<br/>Shared Workers]
    B -->|P3| F[Low Queue<br/>Spot Instances Only]

    C --> G[Transcoding Worker Pool]
    D --> G
    E --> G
    F --> G

    G --> H[Transcoded Storage]

Cost optimization: P3 jobs (re-encoding old videos) run only on spot instances, which are 70-90% cheaper than on-demand. If a spot instance is reclaimed, the job automatically goes back to the queue and waits for a new spot instance.
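The routing and requeue behaviour can be sketched in a few lines. Note the simplification: the text describes separate message queues per priority, while this sketch stands in for them with a single in-memory heap. `TranscodeQueue`, `route_priority`, and their parameters are hypothetical names for illustration.

```python
import heapq
import itertools


def route_priority(is_partner, subscribers, is_reencode):
    """Map an upload to P0-P3 following the priority table (lower number = more urgent)."""
    if is_reencode:
        return 3
    if is_partner:
        return 0
    if subscribers > 1_000_000:
        return 1
    return 2


class TranscodeQueue:
    """One heap standing in for the four queues; an assumption made for brevity."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def submit(self, video_id, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), video_id))

    def next_job(self):
        priority, _, video_id = heapq.heappop(self._heap)
        return video_id, priority

    def requeue(self, video_id, priority):
        """Called when a spot worker is reclaimed mid-job: the job simply waits again."""
        self.submit(video_id, priority)
```

A requeued job goes to the back of its own priority level, so a reclaimed spot instance delays that one job without letting it jump ahead of others.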

3.2 Video Streaming: Adaptive Bitrate

3.2.1 Streaming protocols: HLS and DASH

What happens when the user presses “Play”?

Step 1: The client requests the manifest file (playlist)

  • HLS: a .m3u8 file
  • DASH: a .mpd file

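Step 2: The master manifest lists every available variant (resolution and bitrate) and points to one media playlist per variant; each media playlist in turn lists the URLs of that variant's segments:

#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2400000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8

# 720p/playlist.m3u8 (media playlist for one variant)
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:4
#EXT-X-PLAYLIST-TYPE:VOD
#EXTINF:4.0,
segment_001.ts
#EXTINF:4.0,
segment_002.ts
...
#EXT-X-ENDLIST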

Step 3: The client measures its current bandwidth and picks a suitable resolution

Step 4: The client downloads segments one by one (typically 2-10 seconds of video each)

Step 5: If bandwidth changes (entering an elevator, switching from WiFi to 4G), the client automatically switches resolution for the next segment

This is Adaptive Bitrate Streaming (ABR), the key to a smooth viewing experience. Instead of stalling to buffer, the player lowers the resolution so the video keeps playing.
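A minimal sketch of the selection step a player might run before each segment request. The variant list reuses the bitrates from the manifest example; the 0.8 safety factor (headroom so the player does not pick a variant that consumes the entire measured bandwidth) and the `pick_variant` name are assumptions. Real players use more elaborate heuristics that also consider buffer level.

```python
# (name, required bandwidth in bits per second), taken from the manifest example
VARIANTS = [
    ("360p", 800_000),
    ("720p", 2_400_000),
    ("1080p", 5_000_000),
]


def pick_variant(measured_bps, variants=VARIANTS, safety=0.8):
    """Return the highest-bitrate variant that fits within a fraction of measured bandwidth."""
    budget = measured_bps * safety
    affordable = [v for v in variants if v[1] <= budget]
    if not affordable:
        # Nothing fits: fall back to the lowest bitrate rather than stall.
        return min(variants, key=lambda v: v[1])[0]
    return max(affordable, key=lambda v: v[1])[0]
```

With these numbers, `pick_variant(8_000_000)` selects 1080p and `pick_variant(1_500_000)` drops to 360p; the player calls it again for every segment, which is what produces the mid-stream resolution switches.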

3.2.2 HLS vs DASH

Criterion | HLS (Apple) | DASH (MPEG)
Developer | Apple | MPEG consortium (open standard)
Container | .ts (Transport Stream); fragmented MP4 is also supported | .mp4 fragments
Manifest | .m3u8 | .mpd (XML)
DRM support | FairPlay | Widevine, PlayReady
Browser support | Native in Safari, others via JS | Chrome, Firefox, Edge
Latency | 10-30 s (can be reduced with Low-Latency HLS) | 3-10 s
Used by YouTube? | Yes (for iOS/Safari) | Yes (for Android/Chrome)

In practice: YouTube uses both, HLS for the Apple ecosystem and DASH for everything else. The player automatically picks the appropriate protocol.

3.2.3 Adaptive Bitrate Flow

sequenceDiagram
    participant User as User/Player
    participant CDN as CDN Edge
    participant Origin as Origin/Storage

    User->>CDN: GET /video/abc123/manifest.m3u8
    CDN-->>User: Manifest (list of resolutions + segment URLs)

    Note over User: Bandwidth = 5 Mbps → choose 1080p

    User->>CDN: GET /video/abc123/1080p/seg_001.ts
    CDN-->>User: Segment 1 (1080p)

    User->>CDN: GET /video/abc123/1080p/seg_002.ts
    CDN-->>User: Segment 2 (1080p)

    Note over User: Bandwidth drops to 1.5 Mbps<br/>(entering an elevator)

    User->>CDN: GET /video/abc123/480p/seg_003.ts
    CDN-->>User: Segment 3 (480p) ← resolution lowered automatically

    User->>CDN: GET /video/abc123/480p/seg_004.ts
    CDN-->>User: Segment 4 (480p)

    Note over User: Bandwidth recovers to 8 Mbps<br/>(leaving the elevator)

    User->>CDN: GET /video/abc123/1080p/seg_005.ts
    CDN-->>User: Segment 5 (1080p) ← resolution raised automatically

    Note over User: Pre-fetch: the player downloads<br/>the next 2-3 segments ahead of time

3.2.4 Pre-fetching Strategy

A smart player does not only download the current segment; it also downloads the following segments ahead of time (pre-fetch):

Strategy | Description | When to use
Eager pre-fetch | Fetch 3-5 segments ahead | WiFi/unlimited data, user watching continuously
Conservative pre-fetch | Fetch 1-2 segments ahead | Mobile data, user skips around a lot
No pre-fetch | Fetch only the current segment | Very low bandwidth, or the user has not pressed Play yet
Quality pre-fetch | Fetch ahead at low resolution, replace with high resolution when idle | Fluctuating bandwidth

Aha Moment: Pre-fetching is why the buffer bar on YouTube runs ahead of the playback position. It lets the video keep playing even when the network suddenly slows down for 2-3 seconds.

3.3 CDN Architecture: Multi-tier

3.3.1 What is a CDN and why is it the #1 cost?

A CDN (Content Delivery Network) is a network of servers distributed around the globe that caches content as close to the user as possible → Tuan-03-Networking-DNS-CDN.

For video streaming, CDN bandwidth is the largest cost:

Cost item | % of total cost (estimated) | Note
CDN bandwidth | 50-70% | Cost #1, unavoidable
Storage (S3/GCS) | 10-15% | Original + transcoded video
Transcoding compute | 10-15% | CPU-intensive
Metadata infra (DB, cache) | 5-10% | Relatively cheap
Network (non-CDN) | 3-5% | Inter-region transfer
Other (monitoring, etc.) | 2-5% |

3.3.2 CDN Multi-tier Architecture

graph TD
    subgraph "Tier 1 — Edge (PoP)"
        E1[Edge HCM<br/>Cache: Hot content<br/>TTL: 24h]
        E2[Edge Ha Noi<br/>Cache: Hot content<br/>TTL: 24h]
        E3[Edge Bangkok<br/>Cache: Hot content<br/>TTL: 24h]
        E4[Edge Tokyo<br/>Cache: Hot content<br/>TTL: 24h]
        E5[Edge SF<br/>Cache: Hot content<br/>TTL: 24h]
    end

    subgraph "Tier 2 — Regional"
        R1[Regional Singapore<br/>Cache: Warm content<br/>TTL: 7 days]
        R2[Regional US-West<br/>Cache: Warm content<br/>TTL: 7 days]
        R3[Regional EU-West<br/>Cache: Warm content<br/>TTL: 7 days]
    end

    subgraph "Tier 3 — Origin"
        O1[Origin Storage<br/>US-East<br/>All content]
        O2[Origin Storage<br/>EU-Central<br/>All content]
    end

    E1 & E2 & E3 -->|Cache miss| R1
    E4 -->|Cache miss| R1
    E5 -->|Cache miss| R2

    R1 -->|Cache miss| O1
    R2 -->|Cache miss| O1
    R3 -->|Cache miss| O2

    O1 <-->|Cross-region<br/>replication| O2

3.3.3 Push vs Pull CDN

Type | How it works | Pros | Cons | Used for
Push | The server proactively pushes content to the CDN before any request arrives | Even the first view is fast (already cached) | Costs storage, pushes content nobody may watch | Trending videos, popular creators
Pull | The CDN fetches content only on the first request (cache miss) | Saves storage, caches only content that is actually needed | The first request is slow (must fetch from origin) | Long-tail content (videos with few views)

Combined strategy: new videos from popular creators (>100K subs) are pushed to the edge immediately. Videos from regular users are pulled when someone watches them.

3.3.4 Long-tail Content Strategy

In practice, the distribution of views on YouTube follows a power law:

  • The top 1% of videos account for ~80% of all views → push to the CDN edge, cache for a long time
  • The top 10% of videos account for ~95% of all views → cache at the regional CDN
  • The bottom 90% of videos (the “long tail”) account for ~5% of all views → serve from origin, do not cache

Tier | Content type | Cache policy | Cost per view
Edge (expensive) | Top 1%: trending, viral | Cache 24h, push proactively | Lowest (already cached)
Regional (moderate) | Top 10%: popular, recent | Cache 7 days, pull-based | Medium
Origin (cheapest storage) | Bottom 90%: long-tail | Not cached on the CDN | Highest (fetched from origin)

Aha Moment: If you cached every video at the CDN edge, CDN cost would be 10-20 times higher. The trick is to cache only what is actually watched a lot. That is why YouTube spends billions of USD per year on CDN and still has to optimize continuously.
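The tiering rule and the push/pull rule can be written down as two small functions. This is only a sketch under the thresholds stated in this section (top 1%, top 10%, >100K subscribers); the function names are hypothetical, and a real system would rank by recent view velocity rather than a static rank.

```python
def cache_tier(rank, total_videos):
    """Pick a CDN tier from a 1-based popularity rank (1 = most viewed)."""
    if rank * 100 <= total_videos:   # top 1%
        return "edge"
    if rank * 10 <= total_videos:    # top 10%
        return "regional"
    return "origin"                  # long tail: served from origin, not cached


def delivery_mode(subscribers, threshold=100_000):
    """Push new videos from popular creators to the edge; pull everything else on demand."""
    return "push" if subscribers > threshold else "pull"
```

Integer comparisons are used instead of dividing by `total_videos` so the boundaries (exactly 1% and exactly 10%) behave predictably.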

3.4 Metadata System

3.4.1 Video Metadata DB

Field | Type | Explanation
video_id | UUID / Snowflake ID | Primary key, globally unique
title | VARCHAR(500) | Video title
description | TEXT | Description, can be long
uploader_id | BIGINT (FK) | The uploader
status | ENUM | uploading, processing, ready, failed, deleted
duration_seconds | INT | Duration
file_size_bytes | BIGINT | Size of the original file
upload_time | TIMESTAMP | Upload time
available_resolutions | JSON | [“240p”, “360p”, “480p”, “720p”, “1080p”]
manifest_url | VARCHAR(1000) | URL of the HLS/DASH manifest
thumbnail_urls | JSON | Array of thumbnail URLs
view_count | BIGINT | Number of views (eventually consistent)
like_count | BIGINT | Number of likes
category | VARCHAR(100) | Category
tags | JSON | Tags for search
is_public | BOOLEAN | Whether the video is public or private
drm_protected | BOOLEAN | Whether the video is DRM-protected

Choice of DB: MySQL with Vitess (sharding layer) or CockroachDB, because metadata needs strong consistency (a video that has not finished transcoding must never be shown as ready).

Sharding key: video_id. It distributes evenly, and the main query pattern is lookup by video_id.
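As an illustration of why `video_id` spreads load evenly, here is a generic hash-mod routing sketch. It is not how Vitess routes queries (Vitess has its own vindex mechanism); the `shard_for` function and the shard count of 16 are assumptions for the example.

```python
import hashlib


def shard_for(video_id: str, num_shards: int = 16) -> int:
    """Map a video_id to a shard number using a stable hash."""
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; sha256 is stable across processes,
    # unlike Python's built-in hash() for strings.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo hashing makes resharding expensive because most keys move when `num_shards` changes, which is one reason production systems prefer range-based or consistent-hashing schemes.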

3.4.2 User Activity Data

Data type | Volume | Storage | Explanation
Watch history | Very large | Cassandra / BigTable | Each user watches ~20 videos/day
Search history | Large | Elasticsearch | Full-text search
Like/dislike | Large | MySQL (counter) + Kafka | Event-driven update
Watch progress | Very large | Redis (TTL) → Cassandra | “Resume watching from 15:32”
Recommendation data | Very large | BigQuery / Data warehouse | ML pipeline input

Note: view count is eventually consistent. YouTube does not guarantee that the view count is exact in real time; it is aggregated in batches (every 5-10 minutes) to reduce write load.
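The batching idea can be sketched as an in-memory buffer that is flushed periodically. `ViewCountBuffer` is a hypothetical helper, and a plain dict stands in for the metadata store; a real pipeline would more likely aggregate from an event stream such as Kafka.

```python
from collections import Counter


class ViewCountBuffer:
    """Accumulates view events in memory and applies them in one batch per flush."""

    def __init__(self):
        self._pending = Counter()

    def record_view(self, video_id):
        self._pending[video_id] += 1  # no database write on the hot path

    def flush(self, store):
        """Apply buffered increments: one write per video instead of one per view.

        Returns the number of writes issued.
        """
        batch, self._pending = self._pending, Counter()
        for video_id, delta in batch.items():
            store[video_id] = store.get(video_id, 0) + delta
        return len(batch)
```

A thousand views of one video between flushes become a single increment, at the price of the displayed count lagging by up to one flush interval.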

3.4.3 Search Index

Video metadata is indexed into Elasticsearch to support full-text search:

Indexed field | Boost weight | Explanation
title | 10x | Most important
description | 3x | Secondary, but rich in keywords
tags | 5x | Tags assigned by the creator
channel_name | 7x | Search by channel
captions/subtitles | 1x | Search inside subtitles (very powerful)
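One way to express these weights is an Elasticsearch `multi_match` query, where a field is boosted with the `field^weight` syntax. The sketch below only builds the request body as a Python dict; the field name `captions` and the `build_search_query` helper are assumptions, since the actual index mapping is not given here.

```python
def build_search_query(text):
    """Build a multi_match request body using the boost weights from the table."""
    boosts = {
        "title": 10,
        "channel_name": 7,
        "tags": 5,
        "description": 3,
        "captions": 1,  # assumed field name for captions/subtitles
    }
    fields = [f"{name}^{weight}" for name, weight in boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}
```

The resulting dict can be sent as the body of a search request; tuning the weights is then a matter of changing one mapping rather than the query structure.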

3.5 Error Handling & Reliability

3.5.1 Resumable Upload

Problem | Solution | Details
Connection lost mid-upload | Chunked upload + checksum | The client remembers the last successful chunk and retries from there
Corrupt file | MD5/SHA256 checksum per chunk | The server verifies every chunk and rejects it on mismatch
Upload timeout | Pre-signed URL has a TTL (typically 24h) | After expiry the client requests a new URL
Duplicate upload | Idempotency key (hash of the file) | If the file is identical, it is not uploaded again
Server crash | Upload state kept in durable storage | On restart, read the state and continue

3.5.2 Transcoding Error Handling

Error | Handling | Retry policy
Segment encode fails | Retry that segment only (not the whole video) | 3 times, exponential backoff (1s → 2s → 4s)
Worker crash | The task returns to the queue and another worker picks it up | Heartbeat timeout = 30s
Out of memory | Retry on a worker with more RAM | Scale up the worker class
Corrupt input | Mark the video as “failed” and notify the user | No retry: the user must upload again
Timeout (encode takes too long) | Split into smaller segments, retry | Dynamic segment sizing
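The segment retry policy (3 retries with 1s → 2s → 4s backoff) could look like the sketch below. `retry_with_backoff` is a hypothetical helper; the `sleep` parameter is injectable so the delays can be observed without actually waiting.

```python
import time


def retry_with_backoff(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run task(); on failure retry up to max_retries times, doubling the wait each time."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: let the caller mark the segment as failed
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s with the defaults
            attempt += 1
```

In production it is common to add random jitter to the delay so that many workers failing at once do not retry in lockstep; that is omitted here to match the table.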

3.5.3 Streaming Error Handling

Problem | Solution
CDN edge down | DNS failover to another edge (automatic)
Segment 404 (not cached yet) | The CDN pulls from regional → origin
Bandwidth drop | ABR automatically lowers the resolution
Video deleted during playback | Graceful error message, suggest other videos
Region-specific outage | Cross-region CDN failover

3.5.4 Pre-signed URL Security

Step | Details
1. Client requests upload | Sends authentication token + file metadata
2. Server validates | Checks token, quota, allowed file type
3. Generate pre-signed URL | The URL contains: bucket, key, expiration, signature
4. Client uploads directly | Uploads straight to S3 using the pre-signed URL
5. URL expires | After 1-24h the URL can no longer be used
6. Playback URL | Also uses a pre-signed URL, with a short TTL (e.g. 6h), to prevent unauthorized link sharing
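The idea behind a signed, expiring URL can be shown with a plain HMAC. This is a simplified illustration of the concept only: it is not the S3 signing algorithm (S3 uses AWS Signature Version 4), and the secret, the query parameter names, and the helper functions are all assumptions.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical server-side key, never sent to clients


def sign_url(path, ttl_seconds, now=None):
    """Return path?expires=...&sig=... where sig covers both the path and the expiry."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    payload = f"{path}:{expires}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'sig': sig})}"


def verify_url(url, now=None):
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if (now if now is not None else time.time()) > expires:
        return False
    payload = f"{parsed.path}:{expires}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)  # constant-time comparison
```

Because the expiry is part of the signed payload, a client cannot extend the TTL by editing the query string, and a shared link stops working once the TTL passes.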

3.6 Cost Optimization: A Matter of Survival

3.6.1 CDN Cost Optimization

This is the most important part from a business point of view, because CDN accounts for 50-70% of total cost:

Strategy | Savings | Details
Long-tail from origin | 30-40% of CDN cost | 90% of videos get few views → serve them from origin rather than paying for edge cache capacity they rarely use
Regional CDN selection | 10-20% | Use a cheaper CDN in low-traffic regions (e.g. an ISP CDN in Vietnam instead of Akamai)
Codec optimization | 20-30% of bandwidth | VP9/AV1 for popular videos reduces bandwidth per view by 20-30%
Off-peak transcoding | 40-60% of compute | Transcode non-urgent videos between 2AM and 6AM on spot instances
Intelligent caching | 15-25% of CDN cost | Cache TTL based on popularity: hot videos are cached longer, cold videos shorter
Peer-to-peer (P2P) | 10-30% of CDN cost | Let viewers share segments with each other (WebRTC)
Multi-CDN | 10-15% | Use several CDN providers and route traffic by price and performance

3.6.2 Storage Cost Optimization

Strategy | Details
Tiered storage | New video → S3 Standard. After 30 days → S3 Infrequent Access. After 1 year → S3 Glacier
Delete old resolutions | Video older than 2 years with no viewers → delete the low resolutions (240p, 360p), keep 720p + 1080p
Lazy transcoding | New video: encode only 720p + 1080p first. 240p, 360p, 4K → encode only when requested
Deduplication | Detect duplicate videos (fingerprinting) → do not store two copies

3.6.3 Transcoding Cost Optimization

Strategy | Details
Spot instances | Use AWS Spot / GCP Preemptible for P2 and P3 jobs → 70-90% cheaper
Reserved capacity | P0 and P1 jobs run on reserved instances → guaranteed availability
Off-peak scheduling | P3 jobs (re-encode) run only 2AM-6AM, when compute is cheapest
Right-sizing | Short video (<1 minute) → small worker. Long video (>1h) → large worker with many CPUs
Hardware acceleration | Use GPUs (NVENC) or a managed encoding service (AWS MediaConvert) to encode 5-10x faster

4. Capacity Estimation in Detail

4.1 Assumptions

Parameter | Value | Explanation
DAU | 5,000,000 | 5M daily active users
Avg watch time/day | 30 minutes | Average engagement
Avg video bitrate (streaming) | 2.5 Mbps | Average between 480p and 1080p
Upload users/day | 0.1% of DAU | 5,000 creators upload per day
Avg upload video duration | 10 minutes | Average
Avg upload video size (original) | 1.5 GB | 1080p, 10 minutes
Transcoding expansion factor | 3x | 1 original video → ~3x storage (multiple resolutions)
Video segment size | 4 seconds | For HLS/DASH

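4.2 Storage Estimation

Upload storage per day (original video):
5,000 videos × 1.5 GB = 7,500 GB = 7.5 TB/day

Transcoded storage per day (multiple resolutions):
7.5 TB × 3 = 22.5 TB/day

Total storage per day:
7.5 TB + 22.5 TB = 30 TB/day

Storage after 1 year:
30 TB × 365 = 10,950 TB ≈ 11 PB

Storage cost (AWS S3 Standard ~ $0.023/GB/month):
11,000,000 GB × $0.023 ≈ $253,000/month (once a full year of video has accumulated)

Note: this figure does not account for tiered storage (moving old videos to S3 IA/Glacier would cut it by 50-70%).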

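4.3 Bandwidth Estimation: Streaming

Concurrent viewers (peak):
Average concurrent viewers = 5,000,000 × 30 min / 1,440 min ≈ 104,167
Peak concurrent viewers ≈ 104,167 × 3 ≈ 312,500 (assuming peak is about 3x the average, which matches the 781 Gbps figure below)

Bandwidth peak:
312,500 × 2.5 Mbps = 781,250 Mbps ≈ 781 Gbps

Aha Moment: 781 Gbps is an enormous number. This is why YouTube needs a global CDN with thousands of edge servers: no single data center can serve that much bandwidth.

Average bandwidth:
104,167 × 2.5 Mbps ≈ 260 Gbps

Data transfer per day:
5,000,000 × 30 min × 60 s × 2.5 Mbps / 8 = 2,812,500 GB ≈ 2.8 PB/day

CDN bandwidth cost (AWS CloudFront ~ $0.02/GB at large scale):
2,812,500 GB × $0.02 = $56,250/day ≈ $1.7M/month

That is the CDN cost for 5M DAU. YouTube, with roughly 2 billion users, would be ~400x larger (with volume discounts). This is why CDN cost is the #1 expense.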
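The storage and streaming arithmetic can be checked with a short script. The inputs are the assumptions from section 4.1; the peak factor of 3 is inferred from the 781 Gbps figure rather than stated as an input, and the `estimate` function and its key names are illustrative.

```python
def estimate(dau=5_000_000, watch_min=30, bitrate_mbps=2.5, uploads_per_day=5_000,
             video_gb=1.5, expansion=3, peak_factor=3,
             s3_usd_per_gb_month=0.023, cdn_usd_per_gb=0.02):
    """Back-of-the-envelope numbers; peak_factor=3 is an inferred assumption."""
    original_tb = uploads_per_day * video_gb / 1000           # originals per day
    total_tb_day = original_tb * (1 + expansion)              # originals + transcoded
    year_pb = total_tb_day * 365 / 1000
    avg_concurrent = dau * watch_min / (24 * 60)              # viewers at an average moment
    peak_gbps = avg_concurrent * peak_factor * bitrate_mbps / 1000
    daily_gb = dau * watch_min * 60 * bitrate_mbps / 8 / 1000  # Mb -> MB -> GB
    return {
        "storage_tb_per_day": total_tb_day,
        "storage_pb_per_year": year_pb,
        "storage_usd_per_month_after_1y": year_pb * 1_000_000 * s3_usd_per_gb_month,
        "peak_gbps": peak_gbps,
        "cdn_gb_per_day": daily_gb,
        "cdn_usd_per_month": daily_gb * cdn_usd_per_gb * 30,
    }
```

Without rounding, the storage cost comes out near $251,850/month and the CDN cost at $1,687,500/month; the $253,000 and $1.7M figures in the cost summary result from rounding 10.95 PB up to 11 PB and rounding the CDN total.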

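4.4 Transcoding Compute Estimation

Videos to transcode per day:
5,000,000 DAU × 0.1% = 5,000 videos/day

On average each video is encoded into 6 resolutions × 2 codecs = 12 variants:
5,000 × 12 = 60,000 encode jobs/day

Average encode time for one variant (10-minute video):
The per-codec figures are not stated; roughly 15 vCPU-minutes for H.264 and 60 vCPU-minutes for VP9 are assumed values that reproduce the total in 4.5.

Total compute time (assuming 70% H.264, 30% VP9):
42,000 × 15 min + 18,000 × 60 min = 1,710,000 vCPU-minutes = 28,500 vCPU-hours/day

Workers needed (each worker 1 vCPU, running 20h/day):
28,500 / 20 = 1,425 workers

Cost (AWS c5.xlarge spot ~ $0.05/h):
28,500 h × $0.05 × 30 days = $42,750/month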

4.5 Cost Summary

Item | Cost/month | %
CDN bandwidth | $1,700,000 | 72%
Storage (S3) | $253,000 | 11%
Transcoding compute | $42,750 | 2%
Metadata infra (DB, cache, search) | $150,000 | 6%
Network (non-CDN) | $120,000 | 5%
Monitoring, ops, other | $100,000 | 4%
Total | ~$2,365,750 | 100%

Validation: CDN comes to 72%, slightly above the 50-70% industry benchmark but in the same range. The number confirms that CDN cost optimization is priority #1.


5. Security for a Video Platform

5.1 DRM (Digital Rights Management)

DRM protects copyrighted content (films, shows, music videos) from unauthorized copying.

DRM System | Platform | Used by
Widevine (Google) | Chrome, Android, Chromecast | YouTube, Netflix, Disney+
FairPlay (Apple) | Safari, iOS, Apple TV | YouTube (on iOS), Apple TV+
PlayReady (Microsoft) | Edge, Xbox, Windows | YouTube (on Edge), Netflix

How DRM works:

Step | Details
1. Encrypt video | Every segment is encrypted with an AES-128 key
2. License server | The key is stored on a license server (never shipped with the video)
3. Client requests license | The player sends device info + authentication to the license server
4. License server validates | Checks: is the user allowed to watch? Is the device valid? Is the region permitted?
5. Return license | The license contains the decryption key and has a TTL (e.g. 24h)
6. Client decrypts & plays | The player decrypts segments in memory and never stores the decrypted file

Note: DRM is not bulletproof; there is always a way around it (e.g. screen recording). But it raises the cost of piracy enough to protect most use cases.

5.2 Content ID: Copyright Detection

YouTube uses the Content ID system to detect copyright violations:

Step | Details
1. Fingerprinting | A “fingerprint” (audio + visual) is generated for every video at upload
2. Database matching | The fingerprint is compared against a database provided by copyright holders
3. Match found | On a match, one of three automatic actions: block the video, allow it but share the revenue, or only track it
4. Appeal process | The uploader can dispute the claim if the use is fair use

Technique | Explanation
Audio fingerprinting | Chromaprint / Shazam-like: extract melody and rhythm, produce a hash
Visual fingerprinting | Perceptual hashing: frame-by-frame comparison that tolerates crop/resize
Temporal matching | Matches not only whole videos but also excerpts (e.g. 30 seconds of background music)

5.3 Abuse Prevention

Abuse type | Solution
Illegal content | ML-based scanning (CSAM detection, violence detection) before publishing
Spam upload | Rate limiting per user, CAPTCHA on upload, file type validation
View botting | Anomaly detection: IP clustering, behavioral analysis; views from bots are not counted
DDoS | CDN-level protection (Cloudflare, AWS Shield), rate limiting → Tuan-09-Rate-Limiter
Account takeover | 2FA, suspicious login detection, session management

5.4 Pre-signed URLs for Upload and Playback

Use case | TTL | Explanation
Upload URL | 1-24h | Long enough to upload a large file, short enough to limit the damage if it leaks
Playback manifest URL | 6-12h | Enough for one viewing session; after expiry the client requests a new one
Playback segment URL | 1-6h | Shorter than the manifest, because segment URLs are refreshed continuously
Thumbnail URL | 30 days | Thumbnails are public and do not need strong protection

5.5 Watermarking for Leak Tracking

Watermark type | Details | Used when
Visible watermark | Logo/text displayed on the video | Deterring copy-and-re-upload
Invisible watermark | User information embedded in the pixel data (not visible) | Tracking the source of a leak
Forensic watermark | Each user receives a slightly different version of the video | Identifying exactly who leaked

Forensic watermarking: Netflix uses this technique. Each subscriber receives the video with a different (invisible) watermark. If the video leaks, analyzing the watermark reveals exactly which account leaked it.


6. DevOps & Monitoring: Operating a Video Platform

6.1 Transcoding Pipeline Monitoring

Metric | Description | Alert threshold
Queue depth | Number of jobs waiting in the transcoding queue | > 10,000 jobs → scale up workers
Processing time p99 | Time to encode one video (p99) | > 2h for a 10-minute video → investigate
Failure rate | % of jobs that fail | > 2% → alert, > 5% → page on-call
Worker utilization | CPU usage of the transcoding workers | < 30% → scale down, > 80% → scale up
Queue wait time | Time a job waits in the queue before being picked up | P0 > 1 minute → alert immediately
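The thresholds in this table translate directly into simple rules. The sketch below is one possible encoding; the function names and return labels are hypothetical, and a real autoscaler would add cooldown periods so it does not flap between scaling up and down.

```python
def scaling_action(cpu_utilization, queue_depth):
    """Decide what to do with the worker pool from CPU usage (0.0-1.0) and queue depth."""
    if queue_depth > 10_000 or cpu_utilization > 0.80:
        return "scale_up"
    if cpu_utilization < 0.30:
        return "scale_down"
    return "hold"


def failure_alert(failure_rate):
    """Map the job failure rate (0.0-1.0) to an alert level."""
    if failure_rate > 0.05:
        return "page_on_call"
    if failure_rate > 0.02:
        return "alert"
    return "ok"
```

Queue depth is checked before low CPU usage on purpose: a long queue with idle workers usually means jobs are stuck, and shrinking the pool would make that worse.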

6.2 CDN Performance Monitoring

Metric | Description | Target
Cache hit ratio | % of requests served from cache (no origin fetch) | > 95% at the edge, > 85% at regional
Origin offload | % of traffic that does NOT reach the origin | > 90%
Edge latency (TTFB) | Time to First Byte from the CDN edge | < 50ms
Cache fill rate | How fast content is cached for the first time | Monitor to detect cache stampedes
Bandwidth per PoP | Bandwidth at each CDN edge | Used for capacity planning
Error rate per PoP | % of failed requests at each edge | > 0.1% → investigate edge health

6.3 Streaming Quality Monitoring

Metric | Description | Target | UX impact
Buffering ratio | % of time the user spends waiting for the buffer | < 1% | Highest: users leave if there is too much buffering
Video Start Time (VST) | Time from pressing Play to the first frame | < 2s | Users lose patience after 3s
Rebuffer frequency | Number of buffering events per hour watched | < 0.5 per hour | Every buffering event is a bad experience
Resolution distribution | % of users watching at each resolution | Trend tracking | A drop in 1080p share signals a network problem
Bitrate switches | Number of resolution changes per minute | < 2 per minute | Frequent switching means an unstable network
Playback failure rate | % of videos that fail to play | < 0.1% | The user cannot watch at all: the worst case

6.4 Error Rate per Region

Region | Metric to watch | Action
Vietnam | Buffering ratio, CDN latency | If high → add edges in Vietnam or use ISP peering
SEA | Cache hit ratio | If low → increase cache capacity in Singapore
US/EU | Playback failure rate | If high → check CDN provider health
Emerging markets | Resolution distribution | If mostly 240p/360p → optimize for low bandwidth

6.5 Cost Dashboard

Dashboard | Metric | Update frequency
CDN Cost per View | Total CDN cost / total views | Hourly
CDN Cost per Region | CDN cost breakdown by region | Daily
Transcoding Cost per Video | Avg cost to transcode 1 video | Daily
Storage Growth Rate | TB added/day, projected cost in 30 days | Daily
Cost per DAU | Total infra cost / DAU | Weekly
CDN Provider Comparison | Cost + performance of each CDN provider | Weekly

6.6 Alerting Tiers

Tier | Severity | Response time | Example
P0 (Critical) | All streaming is down | 5 minutes | CDN origin unreachable, DB master down
P1 (Major) | One region affected | 15 minutes | Edge HCM down, transcoding queue stuck
P2 (Minor) | Performance degradation | 1 hour | Cache hit ratio < 90%, encode time up 2x
P3 (Warning) | Anomaly detected | Next business day | Cost spike, unusual traffic pattern

7. Mermaid Diagrams: Summary

7.1 Upload Pipeline — End to End

graph TD
    A[Client] -->|1. Request pre-signed URL| B[API Server]
    B -->|2. Validate auth + quota| B
    B -->|3. Return pre-signed URL| A
    A -->|4. Upload chunks directly| C[Blob Storage<br/>S3/GCS]
    C -->|5. All chunks received<br/>Trigger event| D[Message Queue<br/>Kafka]
    D -->|6. Dispatch| E{Priority Router}

    E -->|P0| F[Critical Queue]
    E -->|P1| G[High Queue]
    E -->|P2| H[Normal Queue]
    E -->|P3| I[Low Queue]

    F & G & H & I --> J[DAG Scheduler]

    J --> K[Video Splitting]
    K --> L[Parallel Encoding<br/>H.264 / VP9 / AV1<br/>Multiple resolutions]
    K --> M[Audio Extraction<br/>AAC / Opus]
    K --> N[Thumbnail Generation]
    K --> O[Watermark Overlay]

    L --> P[Segment Merging<br/>Per resolution/codec]
    P --> Q[Manifest Generation<br/>HLS .m3u8 / DASH .mpd]
    M --> Q
    Q --> R[Transcoded Storage<br/>S3/GCS]
    R --> S[CDN Origin Push]
    S --> T[CDN Edge Distribution]

    Q --> U[Metadata DB Update<br/>status = ready]
    U --> V[Notification Service<br/>Video is ready!]
    V --> A
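
The DAG Scheduler step in the diagram above can be sketched with Python's standard `graphlib` module. The task names are shortened from the diagram and are illustrative; each "wave" is a set of tasks whose dependencies are all finished, so they could run in parallel on separate workers.

```python
from graphlib import TopologicalSorter

# task -> set of tasks it depends on (names shortened from the diagram)
TRANSCODE_DAG = {
    "split": set(),
    "encode": {"split"},
    "audio": {"split"},
    "thumbnail": {"split"},
    "watermark": {"split"},
    "merge": {"encode"},
    "manifest": {"merge", "audio"},
    "store": {"manifest"},
    "metadata_update": {"manifest"},
    "cdn_push": {"store"},
    "notify": {"metadata_update"},
}


def execution_waves(dag: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave can run concurrently."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # a real scheduler would mark tasks done as workers finish
    return waves
```

A real scheduler would also retry a failed task without rerunning the tasks it depends on, which is the main benefit of modeling the pipeline as a DAG rather than one job per video.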

7.2 Streaming Architecture — Full Flow

sequenceDiagram
    participant Client as Client (Player)
    participant DNS as DNS / GSLB
    participant Edge as CDN Edge
    participant Regional as CDN Regional
    participant Origin as CDN Origin
    participant Storage as Blob Storage
    participant Meta as Metadata Service
    participant Cache as Redis Cache

    Client->>Meta: GET /api/video/abc123 (video info)
    Meta->>Cache: Lookup video metadata
    Cache-->>Meta: Cache hit ✓
    Meta-->>Client: Video info + manifest URL

    Client->>DNS: Resolve CDN hostname
    DNS-->>Client: Nearest edge IP (GSLB)

    Client->>Edge: GET manifest.m3u8
    Edge-->>Client: Manifest (resolution list)

    loop Each segment (2-10s of video)
        Client->>Edge: GET /1080p/seg_N.ts
        alt Cache HIT
            Edge-->>Client: Segment (from cache)
        else Cache MISS
            Edge->>Regional: Fetch segment
            alt Regional cache HIT
                Regional-->>Edge: Segment
            else Regional cache MISS
                Regional->>Origin: Fetch segment
                Origin->>Storage: Read from blob
                Storage-->>Origin: Raw segment
                Origin-->>Regional: Segment (cached)
            end
            Regional-->>Edge: Segment (cached)
            Edge-->>Client: Segment (cached for next user)
        end
    end

    Note over Client: ABR: if bandwidth drops<br/>→ switch to a lower resolution<br/>for the next segment
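
The manifest the client fetches in this flow is, for HLS, a master playlist that lists one variant per resolution. The sketch below builds one using the `#EXT-X-STREAM-INF` tag with its `BANDWIDTH` (bits per second) and `RESOLUTION` attributes. The rendition list and the `<name>/playlist.m3u8` paths are illustrative assumptions, with bitrates borrowed from the ABR ladder in section 7.3.

```python
# (name, width, height, peak bitrate in bits per second): illustrative values
RENDITIONS = [
    ("240p", 426, 240, 250_000),
    ("360p", 640, 360, 500_000),
    ("480p", 854, 480, 1_000_000),
    ("720p", 1280, 720, 2_500_000),
    ("1080p", 1920, 1080, 5_000_000),
]


def build_master_playlist(renditions=RENDITIONS) -> str:
    """Return the text of an HLS master playlist listing every rendition."""
    lines = ["#EXTM3U"]
    for name, width, height, bps in renditions:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bps},RESOLUTION={width}x{height}"
        )
        lines.append(f"{name}/playlist.m3u8")  # hypothetical per-rendition path
    return "\n".join(lines) + "\n"
```

The player reads this file once, then requests segments from whichever variant playlist its ABR logic currently prefers.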

7.3 Adaptive Bitrate Decision Flow

graph TD
    A[Player starts] --> B[Request manifest file]
    B --> C[Receive list of resolutions<br/>240p / 360p / 480p / 720p / 1080p / 4K]
    C --> D[Measure current bandwidth]
    D --> E{Bandwidth level?}

    E -->|"> 15 Mbps"| F[Select 4K<br/>~15 Mbps bitrate]
    E -->|"5-15 Mbps"| G[Select 1080p<br/>~5 Mbps bitrate]
    E -->|"2.5-5 Mbps"| H[Select 720p<br/>~2.5 Mbps bitrate]
    E -->|"1-2.5 Mbps"| I[Select 480p<br/>~1 Mbps bitrate]
    E -->|"0.5-1 Mbps"| J[Select 360p<br/>~0.5 Mbps bitrate]
    E -->|"< 0.5 Mbps"| K[Select 240p<br/>~0.25 Mbps bitrate]

    F & G & H & I & J & K --> L[Download segment N at the selected resolution]
    L --> M[Play segment N]
    M --> N{Download segment N+1?}
    N --> O[Re-measure bandwidth]
    O --> P{Bandwidth changed?}

    P -->|"Drops > 20%"| Q[Lower resolution<br/>for segment N+1]
    P -->|"Rises > 30%"| R[Raise resolution<br/>for segment N+1]
    P -->|"Stable"| S[Keep current resolution]

    Q & R & S --> T[Pre-fetch the next 2-3 segments]
    T --> L
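
The decision flow above can be approximated in a few lines. This is a simplified sketch, not how any particular player implements ABR: the initial pick uses a bitrate-fit rule (the highest rendition whose nominal bitrate fits the measured bandwidth, so 4K is only chosen above roughly 15 Mbps), and later switches use the 20% / 30% change thresholds from the diagram. The function names are hypothetical.

```python
# (label, nominal bitrate in Mbps), lowest first; bitrates from the diagram
LADDER = [
    ("240p", 0.25),
    ("360p", 0.5),
    ("480p", 1.0),
    ("720p", 2.5),
    ("1080p", 5.0),
    ("4K", 15.0),
]


def initial_rendition(bandwidth_mbps: float) -> int:
    """Index of the highest rendition whose bitrate fits the bandwidth."""
    best = 0  # fall back to the lowest rendition if nothing fits
    for i, (_, bitrate) in enumerate(LADDER):
        if bitrate <= bandwidth_mbps:
            best = i
    return best


def next_rendition(current: int, prev_bw: float, new_bw: float) -> int:
    """Pick the rendition index for segment N+1 from the bandwidth change."""
    change = (new_bw - prev_bw) / prev_bw
    if change < -0.20:
        # Drop at least one step, and further if the new bandwidth demands it.
        return max(0, min(current - 1, initial_rendition(new_bw)))
    if change > 0.30 and current + 1 < len(LADDER):
        # Step up only if the next rendition actually fits.
        if LADDER[current + 1][1] <= new_bw:
            return current + 1
    return current
```

Requiring a larger rise (30%) than drop (20%) before switching is a form of hysteresis: it makes the player quick to protect against stalls but slow to chase brief bandwidth spikes.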

7.4 CDN Multi-tier Decision Flow

graph TD
    A[User request video segment] --> B[DNS/GSLB<br/>Route to nearest Edge]
    B --> C{Edge cache<br/>has segment?}

    C -->|HIT| D[Serve from Edge<br/>Latency: 5-20ms]
    C -->|MISS| E{Regional cache<br/>has segment?}

    E -->|HIT| F[Serve from Regional<br/>Fill Edge cache<br/>Latency: 20-50ms]
    E -->|MISS| G{Origin cache<br/>has segment?}

    G -->|HIT| H[Serve from Origin<br/>Fill Regional + Edge caches<br/>Latency: 50-200ms]
    G -->|MISS| I[Fetch from Blob Storage<br/>Fill Origin + Regional + Edge caches<br/>Latency: 100-500ms]

    D --> J[User watches video]
    F --> J
    H --> J
    I --> J

    J --> K{Video popular?}
    K -->|"Top 1%"| L[Proactive push<br/>to all edges]
    K -->|"Top 10%"| M[Long-lived cache<br/>at regional]
    K -->|"Bottom 90%"| N[Short cache TTL<br/>or no caching]
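
The HIT/MISS cascade in this diagram is a read-through cache chain. The sketch below models it with in-memory dictionaries; the `CacheTier` and `BlobStorage` classes are hypothetical stand-ins, and eviction, TTLs, and popularity-based policies are deliberately left out.

```python
class BlobStorage:
    """Source of truth; always has the object (a missing key raises KeyError)."""
    name = "blob"

    def __init__(self, objects: dict[str, bytes]):
        self.objects = objects

    def get(self, key: str) -> tuple[bytes, str]:
        return self.objects[key], self.name


class CacheTier:
    """One cache level (edge, regional, or origin) backed by a parent tier."""

    def __init__(self, name: str, parent):
        self.name = name
        self.parent = parent
        self.cache: dict[str, bytes] = {}

    def get(self, key: str) -> tuple[bytes, str]:
        """Return (data, name of the tier that had it), filling on the way back."""
        if key in self.cache:
            return self.cache[key], self.name  # HIT at this tier
        data, served_by = self.parent.get(key)  # MISS: ask the next tier up
        self.cache[key] = data                  # fill so the next user hits here
        return data, served_by
```

Chaining `edge -> regional -> origin -> blob` reproduces the diagram: the first request for a segment travels to blob storage, and every tier on the return path keeps a copy.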

8. Aha Moments & Pitfalls

8.1 Aha Moments: Surprising Insights

| # | Aha moment | Explanation |
| --- | --- | --- |
| 1 | CDN cost is the #1 expense | Not servers, not storage: CDN bandwidth accounts for 50-70% of total cost. Every design decision should revolve around reducing CDN cost. |
| 2 | Video is not “sent”; it is “cut into thin slices and spoon-fed” | Streaming means sending thousands of small segments, each 2-10 seconds long. The player decides the resolution for each segment. It does not download the whole file and then play it. |
| 3 | Adaptive Bitrate is the key to UX | Without ABR, users must choose a resolution themselves, and a wrong choice means buffering. ABR adjusts automatically, so the video keeps playing and only the quality changes. This is why YouTube rarely buffers. |
| 4 | The long-tail strategy determines profitability | 90% of videos get only 5% of views. Caching everything on the CDN would bankrupt you. The trick: cache only the 10% of content that is popular and serve the remaining 90% from origin (10x cheaper). |
| 5 | Transcoding is CPU-heavy and suits spot instances | Transcoding is a batch job that can be retried. Spot instances are 70-90% cheaper, which saves millions of USD per year. This is acceptable because transcoding is async and does not affect users directly. |
| 6 | Encode once, serve a million times | Encoding VP9/AV1 costs 10-50x more than H.264 but saves 20-30% bandwidth on every view. For a video with 10 million views, the savings are enormous. |
| 7 | DAG architecture for transcoding | It is not “1 job = 1 video.” One video = many separate tasks (split, encode, merge, thumbnail). A DAG allows parallelism, per-task retry, and priority scheduling. |
| 8 | Pre-signed URLs are a must-have | Upload: keeps the API server from becoming a bottleneck. Playback: controls who can watch and for how long, and prevents unauthorized link sharing. |
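
The idea behind a pre-signed URL (item 8) is that the server signs the path and an expiry time, so storage or the CDN can verify a request without calling the API server. The sketch below shows the general HMAC pattern only. It is not the actual signing scheme of S3, GCS, or any CDN, and `SECRET`, `sign_url`, and `verify_url` are hypothetical names.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical shared key; never hard-code in production


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}:{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Append an expiry timestamp and an HMAC signature to the path."""
    query = urlencode({"expires": expires_at, "sig": _signature(path, expires_at, secret)})
    return f"{path}?{query}"


def verify_url(url: str, now: int, secret: bytes = SECRET) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(parsed.path, expires_at, secret)
    return hmac.compare_digest(sig, expected)  # constant-time comparison
```

Because the path is part of the signed message, a link issued for one segment cannot be edited to fetch another, and the expiry bounds how long a shared link stays usable.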

8.2 Pitfalls: Common Interview Mistakes

| # | Pitfall | Explanation | How to avoid |
| --- | --- | --- | --- |
| 1 | Not separating Upload and Streaming | The two flows have completely different requirements. A shared design cannot be optimized for either. | Always draw 2 separate pipelines |
| 2 | Forgetting CDN cost | Saying “use a CDN” without discussing cost optimization makes the interviewer think you do not understand real-world scale | Always mention the long-tail strategy, multi-CDN, and codec optimization |
| 3 | Only saying “transcode the video” | Not explaining the DAG, parallel processing, and priority queues makes the interviewer think you only know the surface level | Describe the DAG: split → encode each segment → merge. Talk about priority and spot instances |
| 4 | Forgetting Adaptive Bitrate | Saying “stream video to the user” without mentioning ABR invites the question “what if the network is slow?” | Always talk about HLS/DASH, ABR, and segment-based streaming |
| 5 | Not mentioning resumable upload | With 2-50GB files, uploads are often interrupted. No resumable upload means poor UX | Talk about chunked upload, checksums, and retry |
| 6 | Monolithic design | Putting everything into one service means individual parts cannot scale independently | Upload service, transcoding service, streaming (CDN), metadata service: keep them separate |
| 7 | Forgetting error handling | “Video uploaded, transcoded, watchable” is too idealistic. In reality: network failures, full disks, codec errors | Talk about retry, exponential backoff, dead letter queue, and alerting |
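
Pitfall 7 names retry, exponential backoff, and a dead letter queue; the sketch below shows how the three fit together for a single transcoding task. The function names and defaults are illustrative assumptions, the dead letter queue is just a list, and a production version would normally add random jitter to the delays.

```python
import time
from typing import Any, Callable


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))


def run_with_retry(
    task: Callable[[Any], Any],
    payload: Any,
    dead_letter: list,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run task(payload); retry with backoff, then park the payload in the DLQ."""
    for attempt in range(max_retries + 1):
        try:
            return task(payload)
        except Exception as exc:  # a real worker would catch narrower error types
            if attempt == max_retries:
                # Out of retries: keep the payload for inspection and alerting.
                dead_letter.append((payload, repr(exc)))
                return None
            sleep(backoff_delay(attempt))
```

Passing `sleep` as a parameter keeps the function testable: a test can record the delays instead of actually waiting.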

8.3 Interview Tips: Presentation Strategy

| Step | Time | What to say |
| --- | --- | --- |
| 1. Clarify | 3-5 minutes | Ask: upload + streaming? Scale? Resolution? Paid vs free? |
| 2. High-level | 5-7 minutes | Draw 2 pipelines: Upload (pre-signed URL → storage → transcode → CDN) + Streaming (CDN → ABR) |
| 3. Deep dive | 15-20 minutes | Pick 2-3 topics to deep dive: transcoding DAG, CDN multi-tier, ABR. Do not try to cover everything |
| 4. Cost & Scale | 5 minutes | Back-of-envelope for storage + bandwidth. Emphasize that CDN cost is #1 |
| 5. Wrap up | 3 minutes | Talk about DRM, error handling, monitoring. Suggest extensions: live streaming, recommendation |

Pro tip: When an interviewer asks you to “Design YouTube,” they do not expect you to design everything. They want to see you pick the right problems to deep dive into and explain clearly why.


9. Wrap Up — Step 4: Extensions & Trade-offs

9.1 Possible Extensions (if time permits)

| Extension | Short description |
| --- | --- |
| Live Streaming | RTMP ingest → real-time transcode → HLS/DASH out. Unlike VOD: latency matters and there is no pre-transcoding |
| Recommendation Engine | Collaborative filtering + content-based. Input: watch history, likes, searches. Output: suggested videos |
| Comment System | Threaded comments, real-time updates, spam detection. Sharding by video_id |
| Analytics Dashboard | Creator analytics: views, watch time, demographics. Uses a data warehouse (BigQuery) |
| Multi-language | Auto-generated captions (Speech-to-Text), translation, subtitle management |
| Monetization | Ad insertion (pre-roll, mid-roll), subscription tiers, super chat |
| Offline Viewing | Download videos for offline viewing. DRM must still work offline (license pre-fetched) |

9.2 Trade-offs Summary

| Decision | Option A | Option B | YouTube's choice |
| --- | --- | --- | --- |
| Transcoding timing | Eager (encode all resolutions immediately) | Lazy (encode on request) | Eager for popular codecs, Lazy for rare resolutions |
| CDN strategy | Push (push to the CDN in advance) | Pull (CDN fetches on demand) | Push for popular, Pull for long-tail |
| Storage | 1 region | Multi-region | Multi-region with cross-region replication |
| Codec priority | H.264 only (fast, compatible) | Multi-codec (H.264 + VP9 + AV1) | Multi-codec: saves bandwidth for popular content |
| Upload | Through the API server | Pre-signed URL directly to storage | Pre-signed URL: reduces API server load |
| Metadata consistency | Strong consistency | Eventual consistency | Strong for video status, Eventual for view count |
| Queue architecture | 1 shared queue | Multi-queue with priority | Multi-queue: separate P0/P1/P2/P3 queues |

| Topic | Link | Relevance |
| --- | --- | --- |
| CDN and DNS | Tuan-03-Networking-DNS-CDN | CDN multi-tier, DNS-based routing, GSLB |
| Cache Strategy | Tuan-06-Cache-Strategy | CDN cache policy, Redis metadata cache, cache invalidation |
| Message Queue | Tuan-08-Message-Queue | Transcoding queue, priority queue, event-driven architecture |
| Database Sharding | Tuan-07-Database-Sharding-Replication | Metadata DB sharding by video_id, replication for reads |
| Rate Limiter | Tuan-09-Rate-Limiter | Upload rate limiting, API protection |
| Monitoring | Tuan-13-Monitoring-Observability | Streaming quality metrics, alerting tiers |
| Security | Tuan-15-Data-Security-Encryption | DRM, encryption, pre-signed URL |
| Back-of-envelope | Tuan-02-Back-of-the-envelope | Capacity estimation methodology |

“YouTube is not one system; it is dozens of systems working together. Upload pipeline, transcoding DAG, CDN network, metadata service, recommendation engine… Each part could be a system design problem of its own. The key in an interview is knowing how to focus on the most important parts: CDN cost optimization and adaptive bitrate streaming.”