Week 04: API Design (REST, gRPC, GraphQL)

“An API is a restaurant's menu. The customer (client) doesn't need to know how the kitchen (server) cooks. They only need to know what to order, how to place the request, and that the dish comes back in the right format.”

Tags: system-design api rest grpc graphql alex-xu
Student: Hieu
Prerequisite: Tuan-03-Networking-DNS-CDN
Related: Tuan-05-Load-Balancer · Tuan-09-Rate-Limiter · Tuan-11-Microservices-Pattern · Tuan-14-AuthN-AuthZ-Security · Tuan-18-Design-News-Feed

1. Context & Why

Hieu, imagine you walk into a restaurant. The menu is the API:

The menu lists the dishes → the API lists its endpoints (resources)
Dish descriptions + prices → request/response schema + documentation
How you order (call a waiter, scan a QR code, tap an ordering screen) → HTTP methods (GET, POST, PUT, DELETE)
How the food is served (plate, bowl, takeaway box) → response format (JSON, Protobuf, XML)
House rules (no outside food, at most 10 people per table) → rate limiting, authentication, validation

If the menu is messy and unclear, customers order the wrong dish, the waiter misunderstands, and the kitchen cooks the wrong thing: chaos. Likewise, a badly designed API means clients misread it, integrations are painful, bugs multiply, and maintenance becomes a nightmare.

An API is a contract between client and server. Once it is published, a breaking change to that contract breaks every client that depends on it, like a restaurant changing its menu without telling the regulars.

Why does API design matter in System Design?

In a System Design Interview, once requirements and estimation are settled, the next step is to define the API. The interviewer wants to see that:

You understand the interfaces between components, not just how to draw boxes and arrows
You can pick the right API paradigm: REST for public APIs, gRPC for internal microservices, GraphQL for mobile-first clients
You think about backward compatibility: API versioning, deprecation strategy
You account for security: authentication, authorization, input validation
You understand the trade-offs: latency vs flexibility, bandwidth vs developer experience

Alex Xu's approach: in his design chapters, Alex Xu typically places an “API Design” step right after the requirements. That is because the API is the skeleton of the system, the backbone that everything else hangs on.

2. Deep Dive: Core Concepts

2.1 REST (Representational State Transfer)

REST is not a protocol. It is an architectural style that Roy Fielding proposed in his doctoral dissertation in 2000.

The 6 REST Constraints

Constraint	Explanation	Example
Client-Server	Separates the UI from data storage	A React frontend calling an Express backend
Stateless	Each request carries all the information needed; the server keeps no session state	A JWT sent with every request
Cacheable	Responses must declare whether they can be cached	`Cache-Control: max-age=3600`
Uniform Interface	A consistent interface built on resources + HTTP methods	`GET /users/123` instead of `getUser?id=123`
Layered System	The client cannot tell whether it is talking to the server directly or through a proxy	API Gateway → Load Balancer → Server
Code on Demand (optional)	The server may send executable code to the client	JavaScript served from a CDN

Richardson Maturity Model: Measuring How “RESTful” an API Is

Leonard Richardson proposed four maturity levels (0 to 3) for REST APIs:

Level	Name	Description	Example
Level 0	The Swamp of POX	A single endpoint, POST for everything	`POST /api` with body `{"action": "getUser", "id": 123}`
Level 1	Resources	Split into multiple resources, but still only POST	`POST /users/123` with body `{"action": "get"}`
Level 2	HTTP Verbs	Uses the right HTTP methods + status codes	`GET /users/123` → `200 OK`
Level 3	HATEOAS	Responses contain hyperlinks to related resources	`{"id": 123, "links": {"orders": "/users/123/orders"}}`

In practice: most production APIs reach Level 2. Level 3 (HATEOAS) is rarely implemented in full because of the added overhead, but the theory comes up often in interviews.

Resource Naming Conventions

Rule	Correct	Incorrect	Why
Use plural nouns	`/users`	`/getUsers`, `/user`	A resource is a “what”, not an “action”
Use nested resources for relationships	`/users/123/orders`	`/getUserOrders?userId=123`	Makes the hierarchy explicit
Use kebab-case	`/user-profiles`	`/userProfiles`, `/user_profiles`	The most common convention for URLs
No verbs in the URL	`POST /orders`	`POST /createOrder`	The HTTP method is already the verb
Use query params for filtering	`/orders?status=pending&sort=date`	`/orders/pending/sort-by-date`	Path for hierarchy, query for filters
Don't expose implementation details	`/users/123`	`/db/table/users/row/123`	Clients don't need to know the internals

HTTP Methods: Mapping to CRUD

Method	CRUD	Idempotent?	Safe?	Example
`GET`	Read	Yes	Yes	`GET /users/123`: fetch a user
`POST`	Create	No	No	`POST /users`: create a new user
`PUT`	Replace	Yes	No	`PUT /users/123`: replace the whole user
`PATCH`	Partial Update	No (*)	No	`PATCH /users/123`: update a few fields
`DELETE`	Delete	Yes	No	`DELETE /users/123`: delete the user

(*) PATCH can be idempotent if it is implemented that way (for example, set field = value rather than increment), but the spec does not guarantee it.

What does idempotent mean? Calling the API many times with the same input leaves the server in the same state as calling it once. GET /users/123 called 10 times returns the same result. POST /orders called 10 times creates 10 different orders! That is why POST is not idempotent.

HTTP Status Codes: A Shared Language

Range	Meaning	Key codes
2xx	Success	`200 OK`, `201 Created`, `204 No Content`
3xx	Redirection	`301 Moved Permanently`, `304 Not Modified`
4xx	Client Error	`400 Bad Request`, `401 Unauthorized`, `403 Forbidden`, `404 Not Found`, `409 Conflict`, `422 Unprocessable Entity`, `429 Too Many Requests`
5xx	Server Error	`500 Internal Server Error`, `502 Bad Gateway`, `503 Service Unavailable`, `504 Gateway Timeout`

Common anti-pattern: returning 200 OK for everything and putting the error code in the body:

// WRONG: don't do this
{
  "status": 200,
  "error": true,
  "message": "User not found"
}

Correct: return 404 Not Found with an error body:

// CORRECT
// HTTP 404 Not Found
{
  "error": {
    "code": "USER_NOT_FOUND",
    "message": "User with ID 123 does not exist",
    "details": {
      "resource": "User",
      "id": "123"
    }
  }
}

Pagination: Cursor vs Offset

When a dataset is large, it cannot be returned in a single response. There are two strategies:

Offset-based pagination (the traditional approach):

GET /posts?offset=20&limit=10

Pros	Cons
Simple, easy to implement	Poor performance at large offsets (`OFFSET 1000000` is very slow)
Can jump to any page	Inconsistent results when rows are inserted or deleted between two requests
Easy to display “page 1/100”	The DB has to scan past the offset rows before returning anything

Cursor-based pagination (recommended for feeds/timelines):

GET /posts?cursor=eyJpZCI6MTAwfQ==&limit=10

The cursor is usually the Base64-encoded ID or timestamp of the last item returned. Here `eyJpZCI6MTAwfQ==` decodes to `{"id":100}`.

Pros	Cons
Stable performance (uses an indexed `WHERE id > cursor`)	Cannot jump to an arbitrary page
Consistent while data changes	More complex to implement
Scales to large datasets	The client has to keep cursor state

Rule of thumb: use cursors for infinite scroll, news feeds, and chat history. Use offsets for admin dashboards that need to jump between pages.

Standard response pattern for pagination:

{
  "data": [...],
  "pagination": {
    "next_cursor": "eyJpZCI6MTEwfQ==",
    "has_more": true,
    "limit": 10
  }
}
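Hieu, to make the `WHERE id > cursor` idea concrete, here is a minimal sketch of keyset pagination over an in-memory array. The helper names (`encodeCursor`, `decodeCursor`, `pageAfter`) and the sample data are hypothetical, and the sketch assumes positive integer ids; a real service would run the equivalent query against an indexed column.

```javascript
// Hypothetical helpers for illustration only; ids are assumed to be positive integers.

// Wrap the last-seen id in an opaque Base64 cursor.
function encodeCursor(lastId) {
  return Buffer.from(JSON.stringify({ id: lastId })).toString('base64');
}

// Unwrap a cursor; returns null when the cursor is malformed.
function decodeCursor(cursor) {
  try {
    const parsed = JSON.parse(Buffer.from(cursor, 'base64').toString('utf8'));
    return Number.isInteger(parsed.id) ? parsed.id : null;
  } catch {
    return null;
  }
}

// In-memory stand-in for:
//   SELECT ... WHERE id > :afterId ORDER BY id LIMIT :limit + 1
// Fetching one extra row tells us whether another page exists.
function pageAfter(rows, cursor, limit) {
  const afterId = cursor ? decodeCursor(cursor) : 0;
  if (afterId === null) throw new Error('INVALID_CURSOR');
  const sorted = [...rows].sort((a, b) => a.id - b.id);
  const slice = sorted.filter((r) => r.id > afterId).slice(0, limit + 1);
  const hasMore = slice.length > limit;
  const data = slice.slice(0, limit);
  return {
    data,
    pagination: {
      next_cursor: hasMore ? encodeCursor(data[data.length - 1].id) : null,
      has_more: hasMore,
      limit,
    },
  };
}

// Usage: walk 25 sample posts in pages of 10.
const posts = Array.from({ length: 25 }, (_, i) => ({ id: i + 1, title: `Post ${i + 1}` }));
const first = pageAfter(posts, null, 10);
const second = pageAfter(posts, first.pagination.next_cursor, 10);
const third = pageAfter(posts, second.pagination.next_cursor, 10);
console.log(second.data[0].id, third.pagination.has_more);
```

Because each page is defined by "everything after id X", inserting or deleting rows between two requests does not shift the following pages, which is the consistency benefit listed in the table.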

API Versioning: URL vs Header

When an API introduces breaking changes, it needs versioning so that existing clients don't break.

Strategy	Example	Pros	Cons
URL Path	`/v1/users`, `/v2/users`	Explicit, easy to route, easy to cache	Uglier URLs, multiple versions to maintain
Query Param	`/users?version=2`	Easy to implement	Easy to forget, harder to cache
Header	`Accept: application/vnd.api+json;version=2`	Clean URLs, uses standard HTTP mechanisms	Harder for clients to debug, CDN caching gets more complex
Content Negotiation	`Accept: application/vnd.company.v2+json`	The most RESTful	Too complex for most teams

Common practice: URL path versioning (/v1/, /v2/) is the most widely used approach (Google, Stripe, GitHub) because it is simple and explicit. Header versioning is used by some providers (Microsoft; GitHub also supports it).

Idempotency: Why Does It Matter?

Scenario: the client sends POST /orders → the network times out → the client doesn't know whether the order was created → it retries → a duplicate order is created!

Solution: Idempotency Key

The client generates a unique key (UUID) before sending the request
It sends the key in a header: Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
The server checks whether the key has already been processed:
- No → process the request, store the key + response in a cache
- Yes → return the stored response without processing again

Stripe's API uses this pattern for its payment requests. It is a best practice for any API with side effects (creating orders, transferring money, sending email).
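The server-side check can be sketched in a few lines. This is a simplified, single-process illustration: the `Map` stands in for Redis or a database table with a TTL, and `createOrder` and `handlePostOrder` are hypothetical names. It does not handle two concurrent requests carrying the same key, which a real implementation would need to guard with an atomic "set if absent" operation.

```javascript
// Hypothetical in-memory store: idempotency key -> stored response.
const processed = new Map();
let orderSeq = 0;

// The side-effecting operation that must not run twice for the same key.
function createOrder(body) {
  orderSeq += 1;
  return { status: 201, body: { id: `order_${orderSeq}`, ...body } };
}

function handlePostOrder(idempotencyKey, body) {
  if (!idempotencyKey) {
    return { status: 400, body: { error: 'Idempotency-Key header is required' } };
  }
  // Key already seen: replay the stored response, do not create again.
  if (processed.has(idempotencyKey)) {
    return processed.get(idempotencyKey);
  }
  // First time: do the work, then remember the response under the key.
  const response = createOrder(body);
  processed.set(idempotencyKey, response);
  return response;
}

// Usage: a retry with the same key returns the original order.
const key = '550e8400-e29b-41d4-a716-446655440000';
const firstTry = handlePostOrder(key, { product: 'Book' });
const retry = handlePostOrder(key, { product: 'Book' });
console.log(firstTry.body.id === retry.body.id, orderSeq);
```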

2.2 gRPC (gRPC Remote Procedure Calls)

gRPC is a framework developed by Google. It uses Protocol Buffers (protobuf) as both its IDL (Interface Definition Language) and its serialization format, and it runs on HTTP/2.

REST vs gRPC

Criterion	REST	gRPC
Protocol	HTTP/1.1 (typically)	HTTP/2 (required)
Data format	JSON (text)	Protobuf (binary)
Payload size	Larger (roughly 2-10x)	Small, compact
Speed	Slower (text parsing)	Often faster (2-10x is the commonly quoted range; it depends on the workload)
Streaming	Not native (workarounds: SSE, WebSocket)	Native bidirectional streaming
Browser support	Full support	Needs a gRPC-Web proxy
Documentation	OpenAPI/Swagger	Proto files are self-documenting
Human-readable	Yes (JSON)	No (binary)
Code generation	Needs separate tooling	Built in for 10+ languages
Use case	Public APIs, web/mobile clients	Internal microservices, low latency

Protocol Buffers: Schema-First Approach

Protobuf is a strongly typed, binary serialization format. Comparing JSON and Protobuf:

// JSON: 82 bytes (human-readable but larger)
{
  "id": 12345,
  "name": "Hieu Nguyen",
  "email": "[email protected]",
  "age": 25
}

// Protobuf schema: declares the structure
message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
  int32 age = 4;
}
// Serialized: ~35 bytes (binary, not human-readable, but ~2.3x smaller)

gRPC Streaming Modes

Mode	Description	Use case
Unary	1 request → 1 response	Same as an ordinary REST call
Server Streaming	1 request → stream of responses	Real-time stock prices, log tailing
Client Streaming	Stream of requests → 1 response	File upload, batch processing
Bidirectional Streaming	Stream of requests ↔ stream of responses	Chat, gaming, collaborative editing

When should you use gRPC?

Internal microservices communication: low latency, small bandwidth footprint
Polyglot environments: auto-generated client/server code for many languages
High-performance systems: gaming, real-time trading, IoT
Streaming data: video processing pipelines, event streaming

When should you NOT use gRPC?

Public-facing APIs: browsers have no native support, and tooling is weaker than for REST
Simple CRUD: the overhead of protobuf + HTTP/2 isn't worth it
Teams new to it: the learning curve is steeper than REST
When easy debugging/testing matters: you can't simply curl a gRPC endpoint

2.3 GraphQL

GraphQL was developed at Facebook (2012, open-sourced in 2015) to address over-fetching and under-fetching on mobile.

The Problems GraphQL Solves

Over-fetching (REST):

GET /users/123
→ Returns 50 fields when the client only needs name and avatar
→ Wasted bandwidth, especially on mobile 3G

Under-fetching (REST):

GET /users/123          → User info
GET /users/123/posts    → User's posts
GET /posts/456/comments → Comments on first post
→ 3 round trips! On mobile, that means slow

GraphQL solves both:

query {
  user(id: 123) {
    name
    avatar
    posts(first: 5) {
      title
      comments(first: 3) {
        text
        author { name }
      }
    }
  }
}
# 1 request, fetching only the fields that are actually needed

GraphQL Pros & Cons

Pros	Cons
The client picks exactly the data it needs	More complex to implement on the server than REST
A single endpoint (`/graphql`)	Caching is harder (every query is different)
Strongly typed schema	N+1 query problem
Self-documenting (introspection)	Rate limiting is harder (the cost of a query isn't known up front)
Fewer requests	Security: malicious deeply nested queries can overload the server (DoS)
Good for mobile (saves bandwidth)	Tooling & ecosystem less mature than REST's

N+1 Query Problem: The Classic Pitfall

When the client queries:

query {
  posts(first: 10) {
    title
    author {
      name
    }
  }
}

Naive implementation:

1 query: SELECT * FROM posts LIMIT 10
+
10 queries: SELECT * FROM users WHERE id = ? (one per post)
= 11 queries total! (N+1 problem)

Solution: DataLoader pattern (batching + caching)

1 query: SELECT * FROM posts LIMIT 10
1 query: SELECT * FROM users WHERE id IN (1, 2, 3, ..., 10)
= 2 queries total

Facebook’s DataLoader library implements this pattern. It collects all the individual loads issued within the same tick and sends them as one batched query.
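The batching step can be illustrated without the library. The sketch below is a simplified, synchronous stand-in, not the DataLoader API: the in-memory tables, the `queryCount` counter, and the function names are all hypothetical, and it only shows how collecting the author ids first turns N lookups into one.

```javascript
// Hypothetical in-memory tables and a query counter standing in for a SQL database.
const userTable = new Map([
  [1, { id: 1, name: 'An' }],
  [2, { id: 2, name: 'Binh' }],
]);
const postTable = [
  { id: 101, title: 'REST basics', authorId: 1 },
  { id: 102, title: 'gRPC streaming', authorId: 2 },
  { id: 103, title: 'GraphQL schemas', authorId: 1 },
  { id: 104, title: 'API gateways', authorId: 2 },
];
let queryCount = 0;

// Each function below represents one round trip to the database.
function findPosts(limit) { queryCount += 1; return postTable.slice(0, limit); }             // SELECT ... LIMIT ?
function findUserById(id) { queryCount += 1; return userTable.get(id); }                     // WHERE id = ?
function findUsersByIds(ids) { queryCount += 1; return ids.map((id) => userTable.get(id)); } // WHERE id IN (...)

// Naive resolver: one user lookup per post (N+1).
function resolveNaive(limit) {
  return findPosts(limit).map((p) => ({ title: p.title, author: findUserById(p.authorId) }));
}

// Batched resolver: collect unique author ids, fetch them in one query, then join in memory.
function resolveBatched(limit) {
  const posts = findPosts(limit);
  const uniqueIds = [...new Set(posts.map((p) => p.authorId))];
  const usersById = new Map(findUsersByIds(uniqueIds).map((u) => [u.id, u]));
  return posts.map((p) => ({ title: p.title, author: usersById.get(p.authorId) }));
}

queryCount = 0;
const naive = resolveNaive(4);
const naiveQueries = queryCount;

queryCount = 0;
const batched = resolveBatched(4);
const batchedQueries = queryCount;

console.log({ naiveQueries, batchedQueries });
```

Both resolvers return the same data; only the number of round trips differs. The real library adds what this sketch leaves out: deferring the batch until the end of the current tick and caching loaded keys per request.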

When should you use GraphQL?

Mobile-first applications: saves bandwidth, fewer round trips
Aggregation layer: the frontend needs data from multiple microservices
Rapid iteration: the frontend team picks its own data without backend endpoint changes
Complex, interconnected data: social graphs, e-commerce catalogs

When should you NOT use GraphQL?

Simple CRUD APIs: the overhead isn't worth it
File upload/download: GraphQL is not optimal for binary data
Real-time streaming: subscriptions exist, but gRPC streaming is stronger
Public APIs for third parties: REST is easier to understand and better documented

2.4 API Gateway Pattern

An API Gateway is the single entry point for all client requests. It acts as a reverse proxy with many additional responsibilities.

Why do you need an API Gateway?

Without a gateway:

Clients must know the address of every microservice
Every service must implement its own auth, rate limiting, and logging
CORS and SSL termination must be configured on every service
Service discovery becomes complex for the client

With a gateway:

Clients only know one endpoint: api.example.com
Cross-cutting concerns (auth, rate limiting, logging) live in one place
Backend services can change without the client noticing

API Gateway Functions

Function	Description
Request Routing	Route `/users/` → User Service, `/orders/` → Order Service
Authentication/Authorization	Verify the JWT or API key before forwarding
Rate Limiting	Limit requests per client/IP/API key
Load Balancing	Distribute requests across service instances
SSL Termination	Handle HTTPS; backends use HTTP internally
Request/Response Transformation	Add/remove headers, transform the body format
Caching	Cache popular GET responses
Circuit Breaking	Trip the circuit when a backend service is down
Logging & Monitoring	Centralized access logs and metrics
API Composition	Aggregate responses from multiple services into one

2.5 Rate Limiting at Gateway Level

Rate limiting at the gateway is the first line of defense against abuse and DDoS.

Rate Limiting Algorithms

Algorithm	Description	Pros	Cons
Token Bucket	A bucket holds tokens; each request consumes 1 token; tokens are refilled at a fixed rate	Allows bursts, smooth	Must track a token count per client
Leaky Bucket	Fixed-size queue; requests are processed at a constant rate	Stable output rate	Bursts get dropped, not flexible
Fixed Window	Count requests in a fixed window (e.g. per minute)	Simple	Spikes at the boundary (a full quota at second 59 plus another at second 01 = 2x the limit)
Sliding Window Log	Record the timestamp of every request and count within a sliding window	Accurate	High memory use (stores every timestamp)
Sliding Window Counter	Combines the fixed window with a weighted count from the previous window	Balances accuracy vs memory	Approximate

Details: Tuan-09-Rate-Limiter
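As a quick illustration of the first row of the table, here is a minimal single-client token bucket. The class name, constructor parameters, and the injectable clock are assumptions made for this sketch; a gateway would keep one bucket per client in shared storage such as Redis.

```javascript
// Minimal token bucket sketch. `now` is injectable so the behavior can be
// demonstrated without waiting in real time.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now) {
    this.capacity = capacity;         // maximum burst size
    this.refillPerSec = refillPerSec; // sustained rate
    this.now = now;
    this.tokens = capacity;           // start full
    this.lastRefill = now();
  }

  // Returns true if the request is allowed, false if it should be rejected (429).
  tryConsume() {
    const t = this.now();
    const elapsedSec = (t - this.lastRefill) / 1000;
    // Refill lazily based on elapsed time, never above capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage with a fake clock: capacity 3, refill 1 token per second.
let fakeTimeMs = 0;
const bucket = new TokenBucket(3, 1, () => fakeTimeMs);
const burst = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
fakeTimeMs += 2000; // two seconds later, two tokens have been refilled
const afterWait = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
console.log(burst, afterWait);
```

The burst of three is allowed immediately, which is the "allows bursts" property; after that, throughput is bounded by the refill rate.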

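Rate Limiting Headers (429 Too Many Requests is defined in RFC 6585; the X-RateLimit-* names are a de facto convention that draft-ietf-httpapi-ratelimit-headers aims to standardize)

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1672531200
Retry-After: 60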

2.6 Request/Response Design Patterns

Envelope Pattern (Response wrapper)

{
  "status": "success",
  "data": {
    "user": { "id": 123, "name": "Hieu" }
  },
  "meta": {
    "request_id": "req_abc123",
    "timestamp": "2026-03-18T10:00:00Z"
  }
}

Error Response Pattern (RFC 7807 Problem Details, since obsoleted by RFC 9457)

{
  "type": "https://api.example.com/errors/insufficient-funds",
  "title": "Insufficient Funds",
  "status": 422,
  "detail": "Account balance is 100, but transaction requires 150",
  "instance": "/transactions/txn_abc123",
  "trace_id": "trace_xyz789"
}

Bulk/Batch Operations

Instead of sending 100 separate requests:

// POST /users/batch
{
  "operations": [
    { "method": "POST", "body": { "name": "User A" } },
    { "method": "POST", "body": { "name": "User B" } },
    { "method": "POST", "body": { "name": "User C" } }
  ]
}
 
// Response
{
  "results": [
    { "status": 201, "data": { "id": 1, "name": "User A" } },
    { "status": 201, "data": { "id": 2, "name": "User B" } },
    { "status": 409, "error": { "message": "Duplicate name" } }
  ]
}

Long-running Operations (Async Pattern)

For operations that take longer than a few seconds:

POST /reports/generate → 202 Accepted
  Location: /reports/jobs/job_123
  Retry-After: 30

GET /reports/jobs/job_123 → 200 OK
{
  "status": "processing",
  "progress": 45,
  "estimated_completion": "2026-03-18T10:05:00Z"
}

// Once the job has finished:
GET /reports/jobs/job_123 → 303 See Other
  Location: /reports/rpt_456

3. Estimation: API Payload Sizing & Gateway Throughput

3.1 API Payload Size Estimation

Typical Payload Sizes

API Type	Avg Request Size	Avg Response Size	Notes
REST JSON (simple)	200 bytes - 1 KB	500 bytes - 5 KB	User CRUD
REST JSON (complex)	1 KB - 10 KB	5 KB - 50 KB	Nested resources, lists
REST JSON (list/page)	100 bytes	10 KB - 100 KB	Pagination response
gRPC Protobuf	50 bytes - 500 bytes	100 bytes - 2 KB	~2-10x smaller than JSON
GraphQL	200 bytes - 2 KB	500 bytes - 20 KB	The query string is larger than a REST request

JSON vs Protobuf Size Comparison

For a User object with 10 fields, assuming on average a 15-byte key and a 20-byte value per field:

$$Size_{JSON} \approx \sum (\text{key\_length} + \text{value\_length} + \text{syntax\_overhead})$$

$$Size_{JSON} = 10 \times (15 + 20 + 6) = 410 \text{ bytes}$$

Syntax overhead: "key": "value", → the quotes, colon, and comma add up to ~6 bytes/field

$$Size_{Protobuf} \approx \sum (\text{tag} + \text{length} + \text{value})$$

$$Size_{Protobuf} = 10 \times (1 + 1 + 20) = 220 \text{ bytes}$$

$$\text{Size ratio} = \frac{Size_{Protobuf}}{Size_{JSON}} = \frac{220}{410} \approx 0.54 = 54\%$$

Conclusion: under these assumptions, Protobuf saves ~46% of the bandwidth compared with JSON. For an internal system making millions of requests/s between microservices, that is a significant difference.
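The same back-of-the-envelope arithmetic can be wrapped in two small functions so you can vary the assumptions. The function names and default parameters below are hypothetical; they simply encode the per-field formulas used above (6 bytes of JSON syntax per field, 1 tag byte and 1 length byte per Protobuf field).

```javascript
// Rough per-object size estimates, following the per-field formulas above.
function estimateJsonBytes(fields, avgKeyLen, avgValueLen, syntaxOverhead = 6) {
  return fields * (avgKeyLen + avgValueLen + syntaxOverhead);
}

function estimateProtobufBytes(fields, avgValueLen, tagBytes = 1, lengthBytes = 1) {
  return fields * (tagBytes + lengthBytes + avgValueLen);
}

const jsonBytes = estimateJsonBytes(10, 15, 20);
const protoBytes = estimateProtobufBytes(10, 20);
const ratio = protoBytes / jsonBytes;
const savingsPercent = Math.round((1 - ratio) * 100);
console.log({ jsonBytes, protoBytes, savingsPercent });

// For a real payload, measure instead of estimating:
const measuredJsonBytes = Buffer.byteLength(JSON.stringify({ id: 12345, name: 'Hieu Nguyen', age: 25 }), 'utf8');
console.log({ measuredJsonBytes });
```

Treat the estimate as an upper-level sanity check only: real Protobuf sizes depend on field types (varints for small integers are often a single byte), and real JSON sizes depend on key names and whitespace.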

3.2 API Gateway Throughput Estimation

Scenario: E-commerce platform, 50M DAU

Assumptions

Parameter	Value
DAU	50M
Avg API calls/user/day	30 (browse, search, cart, checkout)
Avg request size	1 KB
Avg response size	5 KB
Peak multiplier	5x (flash sale)
Read:Write ratio	8:1

QPS Calculation

$$\text{Total requests/day} = 50\text{M} \times 30 = 1.5\text{B requests/day}$$

$$QPS_{avg} = \frac{1.5\text{B}}{86{,}400 \text{ s}} \approx 17{,}361 \text{ req/s}$$

$$QPS_{peak} = 17{,}361 \times 5 = 86{,}805 \text{ req/s} \approx 87\text{K req/s}$$

Bandwidth at the Gateway

$$Bandwidth_{in(peak)} = 87{,}000 \times 1 \text{ KB} = 87 \text{ MB/s} = 696 \text{ Mbps}$$

$$Bandwidth_{out(peak)} = 87{,}000 \times 5 \text{ KB} = 435 \text{ MB/s} = 3.48 \text{ Gbps}$$

Alert: 3.48 Gbps outbound calls for multiple gateway instances behind a load balancer. Assuming a single Nginx instance handles roughly 50K req/s (the real figure depends heavily on hardware, TLS, and configuration), 87K req/s needs at least 2 gateway instances; with headroom, plan for 3-4.

Gateway Memory for Rate Limiting

With per-user rate limiting (50M users), each entry needs ~100 bytes (user_id + counters + timestamps):

$$Memory_{rate\_limit} = 50\text{M} \times 100 \text{ bytes} = 5 \text{ GB}$$

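Keep the rate limiting state in Redis so that all gateway instances share it. 5 GB fits comfortably on a single Redis node (typically 16-64 GB of RAM), so running a Redis cluster here is about availability and throughput rather than capacity.

Connection Pool Sizing

Assume each backend request takes 50 ms on average. By Little's Law, the number of requests in flight is arrival rate × latency: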

$$Connections_{needed} = QPS_{peak} \times avg\_latency = 87{,}000 \times 0.05 \text{ s} = 4{,}350 \text{ concurrent connections}$$

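The gateway's connection pool must be tuned to ≥ 4,350 in total. The stock nginx.conf typically sets worker_connections to 1024 per worker process (the built-in default is 512), so it has to be raised or the load spread across more workers and instances.

Gateway Estimation Summary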

Metric	Value
QPS (avg)	~17K req/s
QPS (peak, flash sale)	~87K req/s
Bandwidth in (peak)	~696 Mbps
Bandwidth out (peak)	~3.48 Gbps
Rate limit storage	~5 GB (Redis)
Concurrent connections	~4,350
Gateway instances needed	3-4 (Nginx)
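Hieu, it is worth being able to redo this estimation quickly with different inputs. The sketch below recomputes the table from the assumptions; the function name and parameter names are hypothetical, 1 KB is taken as 1,000 bytes, and the 50K req/s per-instance capacity is the same rough assumption used above. Because it does not round the peak QPS to 87K before multiplying, a few results come out slightly lower than the table (about 694 Mbps, 3.47 Gbps, and 4,340 connections).

```javascript
// Back-of-the-envelope gateway sizing from the stated assumptions.
function estimateGateway({
  dau,
  callsPerUserPerDay,
  reqBytes,
  resBytes,
  peakMultiplier,
  avgLatencySec,
  perInstanceQps,
}) {
  const requestsPerDay = dau * callsPerUserPerDay;
  const qpsAvg = requestsPerDay / 86400; // seconds per day
  const qpsPeak = qpsAvg * peakMultiplier;
  return {
    qpsAvg: Math.round(qpsAvg),
    qpsPeak: Math.round(qpsPeak),
    inMbps: (qpsPeak * reqBytes * 8) / 1e6,
    outGbps: (qpsPeak * resBytes * 8) / 1e9,
    // Little's Law: in-flight requests = arrival rate x latency
    concurrentConnections: Math.round(qpsPeak * avgLatencySec),
    // Bare minimum without headroom
    minInstances: Math.ceil(qpsPeak / perInstanceQps),
  };
}

const result = estimateGateway({
  dau: 50e6,
  callsPerUserPerDay: 30,
  reqBytes: 1000,
  resBytes: 5000,
  peakMultiplier: 5,
  avgLatencySec: 0.05,
  perInstanceQps: 50000,
});
console.log(result);
```

Changing `peakMultiplier` or `avgLatencySec` shows how sensitive the connection pool and instance count are to the flash-sale and latency assumptions, which is usually the part interviewers probe.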

4. Security — API Security Best Practices

4.1 API Authentication

API Keys

GET /api/v1/users
X-API-Key: sk_live_abc123def456

Pros	Cons
The simplest option	Identifies the application only, not the user
Easy to rotate	Easy to leak (committed to Git, logged to the console)
Suited to server-to-server calls	Should not be used for end-user authentication

Rule: an API key identifies the application (for example, “this request comes from the mobile app”), not the user. Combine an API key with an OAuth2 token for full auth.

OAuth 2.0 Bearer Token

GET /api/v1/users/me
Authorization: Bearer eyJhbGciOiJSUzI1NiIs...

The most common flow is the Authorization Code Flow (for web apps):

The client redirects the user to the Authorization Server
The user logs in and gives consent
The Authorization Server redirects back to the client with a code
The client exchanges the code for an access_token + refresh_token (server-to-server)
The client calls the API with the access_token

Security note: access tokens should be short-lived (15 minutes to 1 hour). Refresh tokens are long-lived (7-30 days) and must be stored securely. See Tuan-14-AuthN-AuthZ-Security.

JWT (JSON Web Token) — Structure

Header.Payload.Signature
eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiIxMjMiLCJyb2xlIjoiYWRtaW4ifQ.signature

Pros	Cons
Stateless: the server doesn't need to store sessions	An individual token cannot be revoked before it expires without extra state (such as a denylist)
Carries claims (role, permissions)	Larger payload than an opaque token
Verifiable with a public key (asymmetric)	If the signing key leaks, every token is compromised
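To see the Header.Payload.Signature structure for yourself, the sketch below splits the sample token and Base64url-decodes the first two parts. `decodeJwtUnverified` is a hypothetical helper name, and the important caveat is in that name: decoding is not verification. Anyone can read or forge these two parts, so a server must check the signature with a proper JWT library before trusting any claim.

```javascript
// Decode (NOT verify) a JWT to inspect its header and payload.
function decodeJwtUnverified(token) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('A JWT must have exactly three dot-separated parts');
  }
  const [headerB64, payloadB64, signatureB64] = parts;
  // JWT parts use the URL-safe Base64 alphabet without padding.
  const decodePart = (part) => JSON.parse(Buffer.from(part, 'base64url').toString('utf8'));
  return {
    header: decodePart(headerB64),
    payload: decodePart(payloadB64),
    signature: signatureB64, // left as-is; it is binary once decoded
  };
}

// The sample token from the section above (its signature part is a placeholder).
const sampleToken = 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiIxMjMiLCJyb2xlIjoiYWRtaW4ifQ.signature';
const decoded = decodeJwtUnverified(sampleToken);
console.log(decoded.header, decoded.payload);
```

The payload is only encoded, not encrypted, which is why sensitive data such as passwords should never be placed in JWT claims.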

4.2 Input Validation: The First Line of Defense

All client input is untrusted. Validate at all 3 layers:

Layer	What to validate	Tool
API Gateway	Schema validation, request size limit	JSON Schema, Kong plugin
Application	Business logic validation	Joi, Zod, class-validator
Database	Constraints, types	SQL constraints, triggers

Validation rules:

1. Whitelist > Blacklist (accept only known-good input rather than trying to block known-bad input)
2. Validate type, length, range, format, encoding
3. Reject early: fail fast at the gateway when possible
4. Never trust client-side validation alone
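A hand-rolled sketch of rules 1 and 2 for a "create user" payload is shown below. In a real service you would more likely use one of the libraries from the table; the function name, field limits, and the deliberately simple email pattern are assumptions for illustration only.

```javascript
// Allowlist of accepted fields: anything else (e.g. "role") is rejected, not ignored.
const ALLOWED_FIELDS = new Set(['name', 'email']);
// Intentionally simple pattern for the sketch; not a full RFC-compliant email check.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function validateCreateUser(body) {
  if (body === null || typeof body !== 'object' || Array.isArray(body)) {
    return { ok: false, errors: ['body must be a JSON object'] };
  }
  const errors = [];
  // Rule 1: accept only known-good fields.
  for (const key of Object.keys(body)) {
    if (!ALLOWED_FIELDS.has(key)) errors.push(`unknown field: ${key}`);
  }
  // Rule 2: validate type, length, and format.
  if (typeof body.name !== 'string' || body.name.length < 1 || body.name.length > 100) {
    errors.push('name must be a string of 1-100 characters');
  }
  if (typeof body.email !== 'string' || body.email.length > 254 || !EMAIL_PATTERN.test(body.email)) {
    errors.push('email must be a valid email address');
  }
  return { ok: errors.length === 0, errors };
}

const good = validateCreateUser({ name: 'Hieu', email: 'hieu@example.com' });
const massAssignment = validateCreateUser({ name: 'Hieu', email: 'hieu@example.com', role: 'admin' });
const operatorInjection = validateCreateUser({ name: 'Hieu', email: { $gt: '' } });
console.log(good.ok, massAssignment.errors, operatorInjection.errors);
```

Rejecting unknown fields outright also blocks the mass-assignment case where a client slips in `role: "admin"`, and the strict type check stops object-valued inputs from reaching a NoSQL query.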

4.3 Injection Attacks

SQL Injection

// DANGEROUS: string concatenation
const unsafeQuery = `SELECT * FROM users WHERE id = '${req.params.id}'`;
// Attacker: req.params.id = "1' OR '1'='1"
// → SELECT * FROM users WHERE id = '1' OR '1'='1' → returns every user!

// SAFE: parameterized query (the driver sends the value separately from the SQL text)
const safeQuery = `SELECT * FROM users WHERE id = $1`;
db.query(safeQuery, [req.params.id]);

NoSQL Injection

// DANGEROUS: MongoDB query injection
// POST /login with body:
{
  "username": "admin",
  "password": { "$gt": "" }
}
// → db.users.find({username: "admin", password: {$gt: ""}})
// → Returns the admin user, because every non-empty password string is > ""

// SAFE: validate the type first
if (typeof req.body.password !== 'string') {
  return res.status(400).json({ error: 'Invalid password format' });
}

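4.4 CORS (Cross-Origin Resource Sharing)

CORS is the mechanism browsers use to control cross-origin requests.

// Server response headers
Access-Control-Allow-Origin: https://app.example.com  // do NOT use * for authenticated APIs
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
Access-Control-Allow-Headers: Authorization, Content-Type
Access-Control-Max-Age: 86400  // cache the preflight for up to 24h (browsers may cap this lower)
Access-Control-Allow-Credentials: true

Common mistake: opening CORS to every origin on an authenticated API. Browsers reject Access-Control-Allow-Origin: * when credentials are included, so the tempting workaround is to echo back whatever Origin the request sent together with Access-Control-Allow-Credentials: true. Then ANY website can call your API with the user's credentials! Always whitelist specific origins.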
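A minimal sketch of origin whitelisting is shown below. `corsHeadersFor` is a hypothetical helper and `https://admin.example.com` is an assumed second origin added for illustration; in an Express app the same decision is usually delegated to a CORS middleware configured with an explicit origin list.

```javascript
// Explicit allowlist of origins that may call the API with credentials.
const ALLOWED_ORIGINS = new Set([
  'https://app.example.com',
  'https://admin.example.com', // assumed extra origin for the example
]);

// Returns the CORS response headers for a given request Origin header.
function corsHeadersFor(requestOrigin) {
  if (!requestOrigin || !ALLOWED_ORIGINS.has(requestOrigin)) {
    // No CORS headers at all: the browser will block the cross-origin read.
    return {};
  }
  return {
    'Access-Control-Allow-Origin': requestOrigin, // echo only allowlisted origins
    'Access-Control-Allow-Credentials': 'true',
    Vary: 'Origin', // caches must not reuse this response for a different origin
  };
}

const trusted = corsHeadersFor('https://app.example.com');
const untrusted = corsHeadersFor('https://evil.example.net');
console.log(trusted, untrusted);
```

The exact string comparison matters: substring or prefix checks (for example "ends with example.com") are a common source of bypasses.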

4.5 OWASP API Security Top 10 (2023)

#	Risk	Explanation	Mitigation
API1	Broken Object Level Authorization	User A accesses User B's data via `/users/B_id`	Check ownership on every request
API2	Broken Authentication	Weak auth mechanisms	OAuth2, MFA, rate-limit login
API3	Broken Object Property Level Authorization	Mass assignment: a user sets `role: admin` through the API	Whitelist allowed fields
API4	Unrestricted Resource Consumption	No limit on request size or page size	Max payload size, max page size
API5	Broken Function Level Authorization	A regular user calls admin endpoints	RBAC, per-route auth middleware
API6	Unrestricted Access to Sensitive Business Flows	Bots buying up all the concert tickets, coupon abuse	CAPTCHA, rate limiting, behavior analysis
API7	Server Side Request Forgery (SSRF)	The API fetches a user-supplied URL → access to the internal network	Whitelist allowed domains, block internal IPs
API8	Security Misconfiguration	Debug mode in production, default credentials	Security headers, disable unnecessary methods
API9	Improper Inventory Management	Old API versions still running with nobody monitoring them	API registry, deprecation policy
API10	Unsafe Consumption of APIs	Trusting data from a third-party API without validating it	Validate ALL external data

API1 (BOLA) tops the list as the most common and most dangerous flaw. Example: GET /api/v1/invoices/12345. If the server doesn't check “does invoice 12345 belong to the requesting user?”, the result is a data breach.
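The ownership check itself is small, which is why forgetting it is so costly. Below is a minimal sketch with an in-memory invoice store; the data, the `getInvoice` name, and the choice to answer 404 (rather than 403) for invoices owned by someone else are assumptions of this example. Returning 404 avoids confirming that the invoice exists at all.

```javascript
// Hypothetical in-memory invoice store.
const invoices = new Map([
  ['12345', { id: '12345', ownerId: 'user_a', total: 100 }],
  ['67890', { id: '67890', ownerId: 'user_b', total: 250 }],
]);

// Handler core for GET /api/v1/invoices/:id.
// `currentUser` must come from the verified token or session, never from the request body.
function getInvoice(currentUser, invoiceId) {
  const invoice = invoices.get(invoiceId);
  // Object-level authorization: the invoice must exist AND belong to the caller.
  if (!invoice || invoice.ownerId !== currentUser.id) {
    return { status: 404, body: { error: { code: 'INVOICE_NOT_FOUND' } } };
  }
  return { status: 200, body: invoice };
}

const own = getInvoice({ id: 'user_a' }, '12345');
const someoneElses = getInvoice({ id: 'user_a' }, '67890');
console.log(own.status, someoneElses.status);
```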

5. DevOps — API Gateway & Operations

5.1 API Gateway Options

Gateway	Type	Pros	Cons
Kong	Open-source / Enterprise	Rich plugin ecosystem, extensible with Lua	Resource-heavy, learning curve
AWS API Gateway	Managed	Serverless, integrates with the AWS ecosystem	Vendor lock-in, cold-start latency
Nginx	Open-source	Extremely performant, lightweight	Manual configuration, few built-in features
Envoy	Open-source (CNCF)	Service-mesh ready, native gRPC	Complex; most at home in Kubernetes/service-mesh setups
Traefik	Open-source	Auto-discovery (Docker, K8s), easy to configure	Lower performance than Nginx
Azure API Management	Managed	Developer portal, built-in analytics	Vendor lock-in

5.2 OpenAPI/Swagger — API Documentation

The OpenAPI Specification (OAS) is the standard for describing REST APIs. Swagger is the most popular toolset for it.

# openapi.yaml
openapi: 3.0.3
info:
  title: User Service API
  version: 1.0.0
  description: API for managing users
 
servers:
  - url: https://api.example.com/v1
 
paths:
  /users:
    get:
      summary: List users
      operationId: listUsers
      tags: [Users]
      parameters:
        - name: cursor
          in: query
          schema:
            type: string
          description: Pagination cursor
        - name: limit
          in: query
          schema:
            type: integer
            minimum: 1
            maximum: 100
            default: 20
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                properties:
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/User'
                  pagination:
                    $ref: '#/components/schemas/Pagination'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '429':
          $ref: '#/components/responses/RateLimited'
 
    post:
      summary: Create user
      operationId: createUser
      tags: [Users]
      parameters:
        - name: Idempotency-Key
          in: header
          required: true
          schema:
            type: string
            format: uuid
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateUserRequest'
      responses:
        '201':
          description: User created
        '409':
          description: Duplicate (idempotency key already used)
 
components:
  schemas:
    User:
      type: object
      properties:
        id:
          type: string
          format: uuid
        name:
          type: string
          maxLength: 100
        email:
          type: string
          format: email
        created_at:
          type: string
          format: date-time
 
    CreateUserRequest:
      type: object
      required: [name, email]
      properties:
        name:
          type: string
          minLength: 1
          maxLength: 100
        email:
          type: string
          format: email
 
    Pagination:
      type: object
      properties:
        next_cursor:
          type: string
          nullable: true
        has_more:
          type: boolean
        limit:
          type: integer
 
  responses:
    Unauthorized:
      description: Missing or invalid authentication
      content:
        application/json:
          schema:
            type: object
            properties:
              error:
                type: object
                properties:
                  code:
                    type: string
                    example: UNAUTHORIZED
                  message:
                    type: string
 
    RateLimited:
      description: Too many requests
      headers:
        Retry-After:
          schema:
            type: integer
        X-RateLimit-Limit:
          schema:
            type: integer
        X-RateLimit-Remaining:
          schema:
            type: integer
 
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
 
security:
  - bearerAuth: []

5.3 API Versioning Strategy — Operational Perspective

Deprecation Timeline (best practice):

v1 release → v2 release → v1 deprecated (6 months notice) → v1 sunset (12 months)

When deprecating:

# Response headers for a deprecated API
Deprecation: true
Sunset: Sat, 18 Sep 2027 00:00:00 GMT
Link: <https://api.example.com/v2/users>; rel="successor-version"

Kong configuration for versioning:

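# kong.yml: route-based versioning (declarative config)
_format_version: "3.0"  # declarative files need a format version; "3.0" assumes Kong 3.x
services: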
  - name: user-service-v1
    url: http://user-service-v1:3000
    routes:
      - name: users-v1
        paths:
          - /v1/users
 
  - name: user-service-v2
    url: http://user-service-v2:3001
    routes:
      - name: users-v2
        paths:
          - /v2/users

5.4 Health Check Endpoints

Every API service must expose health check endpoints for the load balancer and monitoring:

// Liveness: "Is the service running?"
// GET /health/live → 200 OK
// Used by: Kubernetes liveness probe, load balancer

// Readiness: "Is the service ready to receive traffic?"
// GET /health/ready → 200 OK or 503 Service Unavailable
// Used by: Kubernetes readiness probe
// Checks: DB connection, Redis connection, downstream dependencies

// Detailed health (expose to internal monitoring only, NOT publicly)
// GET /health/detail
{
  "status": "healthy",
  "version": "1.5.2",
  "uptime": "72h34m",
  "checks": {
    "database": { "status": "healthy", "latency_ms": 2 },
    "redis": { "status": "healthy", "latency_ms": 1 },
    "user-service": { "status": "degraded", "latency_ms": 450 }
  }
}

Kubernetes probe configuration:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
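spec:
  # apps/v1 Deployments require a selector that matches the pod template labels.
  # The label value below is illustrative.
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
        - name: api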
          image: api-service:1.5.2
          ports:
            - containerPort: 3000
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 2

6. Code Examples

6.1 Express.js REST API — Pagination + Idempotency Key

// server.js — Express REST API with cursor pagination and idempotency
const express = require('express');
const { v4: uuidv4 } = require('uuid');
const Redis = require('ioredis');
 
const app = express();
app.use(express.json({ limit: '10kb' })); // Limit payload size (Security)
 
const redis = new Redis(process.env.REDIS_URL || 'redis://localhost:6379');
 
// ============================================================
// Middleware: Request ID (for tracing/debugging)
// ============================================================
app.use((req, res, next) => {
  req.requestId = req.headers['x-request-id'] || uuidv4();
  res.setHeader('X-Request-Id', req.requestId);
  next();
});
 
// ============================================================
// Middleware: Rate Limiting (simple fixed-window counter via Redis)
// ============================================================
async function rateLimiter(req, res, next) {
  const key = `rate_limit:${req.ip}`;
  const limit = 100;          // 100 requests
  const windowSec = 60;       // per 60 seconds

  try {
    // INCR + EXPIRE implements a fixed-window counter, not a token bucket.
    // The two calls are not atomic; a Lua script or MULTI/EXEC would close that gap.
    const current = await redis.incr(key);
    if (current === 1) {
      await redis.expire(key, windowSec);
    }

    res.setHeader('X-RateLimit-Limit', limit);
    res.setHeader('X-RateLimit-Remaining', Math.max(0, limit - current));

    if (current > limit) {
      // TTL can be -1 if the key lost its expiry; never advertise a non-positive wait.
      const ttl = Math.max(await redis.ttl(key), 1);
      res.setHeader('Retry-After', ttl);
      return res.status(429).json({
        error: {
          code: 'RATE_LIMITED',
          message: `Too many requests. Retry after ${ttl} seconds.`,
        },
      });
    }
    return next();
  } catch (err) {
    // Express 4 does not catch rejected promises from async middleware.
    return next(err);
  }
}
 
app.use(rateLimiter);
 
// ============================================================
// Mock Database (in-memory for demo)
// ============================================================
const orders = [];
for (let i = 1; i <= 150; i++) {
  orders.push({
    id: `order_${String(i).padStart(4, '0')}`,
    user_id: `user_${(i % 10) + 1}`,
    product: `Product ${i}`,
    amount: Math.floor(Math.random() * 10000) + 100,
    status: ['pending', 'confirmed', 'shipped', 'delivered'][i % 4],
    created_at: new Date(Date.now() - i * 3600000).toISOString(),
  });
}
 
// ============================================================
// GET /api/v1/orders — Cursor-based Pagination
// ============================================================
app.get('/api/v1/orders', (req, res) => {
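  // Clamp limit to [1, 100]; fall back to 20 when missing or not a number
  const requestedLimit = parseInt(req.query.limit, 10);
  const limit = Math.min(Math.max(requestedLimit || 20, 1), 100);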
  const cursor = req.query.cursor;
  const status = req.query.status; // Optional filter
 
  let filtered = orders;
  if (status) {
    filtered = filtered.filter((o) => o.status === status);
  }
 
  // Decode cursor: Base64-encoded JSON of the form { after_id }
  let startIndex = 0;
  if (cursor) {
    try {
      const decoded = JSON.parse(Buffer.from(cursor, 'base64').toString());
      startIndex = filtered.findIndex((o) => o.id === decoded.after_id);
      if (startIndex === -1) {
        return res.status(400).json({
          error: { code: 'INVALID_CURSOR', message: 'Cursor is invalid or expired' },
        });
      }
      startIndex += 1; // Start after the cursor item
    } catch {
      return res.status(400).json({
        error: { code: 'INVALID_CURSOR', message: 'Malformed cursor' },
      });
    }
  }
 
  const page = filtered.slice(startIndex, startIndex + limit);
  const hasMore = startIndex + limit < filtered.length;
  const nextCursor = hasMore
    ? Buffer.from(JSON.stringify({ after_id: page[page.length - 1].id })).toString('base64')
    : null;
 
  res.json({
    data: page,
    pagination: {
      next_cursor: nextCursor,
      has_more: hasMore,
      limit,
      total: filtered.length,
    },
  });
});
 
// ============================================================
// POST /api/v1/orders — Idempotency Key Pattern
// ============================================================
app.post('/api/v1/orders', async (req, res) => {
  // 1. Validate Idempotency-Key header
  const idempotencyKey = req.headers['idempotency-key'];
  if (!idempotencyKey) {
    return res.status(400).json({
      error: {
        code: 'MISSING_IDEMPOTENCY_KEY',
        message: 'Idempotency-Key header is required for POST requests',
      },
    });
  }
 
  // 2. Check if this key was already processed
  const cacheKey = `idempotency:${idempotencyKey}`;
  const cached = await redis.get(cacheKey);
  if (cached) {
    const cachedResponse = JSON.parse(cached);
    res.setHeader('X-Idempotency-Replayed', 'true');
    return res.status(cachedResponse.status).json(cachedResponse.body);
  }
 
  // 3. Validate request body
  const { user_id, product, amount } = req.body;
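  // Check presence only; amount = 0 falls through to the 422 check below
  if (!user_id || !product || amount === undefined || amount === null) {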
    return res.status(400).json({
      error: {
        code: 'VALIDATION_ERROR',
        message: 'user_id, product, and amount are required',
      },
    });
  }
 
  if (typeof amount !== 'number' || amount <= 0) {
    return res.status(422).json({
      error: {
        code: 'INVALID_AMOUNT',
        message: 'Amount must be a positive number',
      },
    });
  }
 
  // 4. Create order
  const newOrder = {
    id: `order_${uuidv4().split('-')[0]}`,
    user_id,
    product,
    amount,
    status: 'pending',
    created_at: new Date().toISOString(),
  };
  orders.unshift(newOrder);
 
  // 5. Cache response with idempotency key (TTL: 24 hours)
  const response = { status: 201, body: { data: newOrder } };
  await redis.set(cacheKey, JSON.stringify(response), 'EX', 86400);
 
  res.status(201).json(response.body);
});
 
// ============================================================
// Health Check Endpoints
// ============================================================
app.get('/health/live', (req, res) => {
  res.json({ status: 'ok' });
});
 
app.get('/health/ready', async (req, res) => {
  try {
    await redis.ping();
    res.json({ status: 'ready', redis: 'connected' });
  } catch (err) {
    res.status(503).json({ status: 'not_ready', redis: 'disconnected' });
  }
});
 
// ============================================================
// Start Server
// ============================================================
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`API Server running on port ${PORT}`);
});
 
module.exports = app;
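
The cursor handling inside the GET handler above can be pulled into two small helpers. This is only a sketch: encodeCursor and decodeCursor are hypothetical names, and it assumes base64url instead of the plain base64 used above, so the cursor can travel in a query string without extra escaping.

```javascript
// Hypothetical helpers: opaque cursor = base64url(JSON({ after_id })).
function encodeCursor(afterId) {
  return Buffer.from(JSON.stringify({ after_id: afterId })).toString('base64url');
}

// Returns the after_id string, or null when the cursor is malformed.
function decodeCursor(cursor) {
  try {
    const decoded = JSON.parse(Buffer.from(cursor, 'base64url').toString('utf8'));
    return decoded && typeof decoded.after_id === 'string' ? decoded.after_id : null;
  } catch {
    return null;
  }
}
```

The handler would then treat a null result as a 400 INVALID_CURSOR, which keeps the decoding rules in one place.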

6.2 gRPC Proto File + Server

// proto/order_service.proto
syntax = "proto3";
 
package order;
 
option go_package = "github.com/example/order-service/pb";
 
// ============================================================
// Messages
// ============================================================
 
message Order {
  string id = 1;
  string user_id = 2;
  string product = 3;
  int64 amount_cents = 4;          // Store money as cents to avoid float issues
  OrderStatus status = 5;
  string created_at = 6;
}
 
enum OrderStatus {
  ORDER_STATUS_UNSPECIFIED = 0;
  ORDER_STATUS_PENDING = 1;
  ORDER_STATUS_CONFIRMED = 2;
  ORDER_STATUS_SHIPPED = 3;
  ORDER_STATUS_DELIVERED = 4;
  ORDER_STATUS_CANCELLED = 5;
}
 
message CreateOrderRequest {
  string user_id = 1;
  string product = 2;
  int64 amount_cents = 3;
  string idempotency_key = 4;      // Built into the message
}
 
message CreateOrderResponse {
  Order order = 1;
  bool was_replayed = 2;            // True if idempotency key was reused
}
 
message GetOrderRequest {
  string order_id = 1;
}
 
message ListOrdersRequest {
  string user_id = 1;
  int32 page_size = 2;             // Max 100
  string page_token = 3;           // Cursor for pagination
  OrderStatus status_filter = 4;   // Optional filter
}
 
message ListOrdersResponse {
  repeated Order orders = 1;
  string next_page_token = 2;
  bool has_more = 3;
}
 
// Server streaming — real-time order status updates
message WatchOrderRequest {
  string order_id = 1;
}
 
message OrderStatusUpdate {
  string order_id = 1;
  OrderStatus old_status = 2;
  OrderStatus new_status = 3;
  string updated_at = 4;
}
 
// ============================================================
// Service Definition
// ============================================================
 
service OrderService {
  // Unary RPCs
  rpc CreateOrder(CreateOrderRequest) returns (CreateOrderResponse);
  rpc GetOrder(GetOrderRequest) returns (Order);
  rpc ListOrders(ListOrdersRequest) returns (ListOrdersResponse);
 
  // Server streaming — client subscribes to order status changes
  rpc WatchOrderStatus(WatchOrderRequest) returns (stream OrderStatusUpdate);
}

// grpc-server.js — gRPC Server implementation (Node.js)
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');
const { v4: uuidv4 } = require('uuid');
 
const PROTO_PATH = './proto/order_service.proto';
 
const packageDefinition = protoLoader.loadSync(PROTO_PATH, {
  keepCase: true,
  longs: String,
  enums: String,
  defaults: true,
  oneofs: true,
});
const orderProto = grpc.loadPackageDefinition(packageDefinition).order;
 
// In-memory storage (replace with DB in production)
const orders = new Map();
const idempotencyCache = new Map();
const statusWatchers = new Map(); // order_id -> [callback]
 
// ============================================================
// RPC Implementations
// ============================================================
 
function createOrder(call, callback) {
  const { user_id, product, amount_cents, idempotency_key } = call.request;
 
  // Idempotency check
  if (idempotency_key && idempotencyCache.has(idempotency_key)) {
    const existingOrder = idempotencyCache.get(idempotency_key);
    return callback(null, { order: existingOrder, was_replayed: true });
  }
 
  // Validation
  // amount_cents arrives as a string because the loader uses longs: String
  if (!user_id || !product || !(Number(amount_cents) > 0)) {
    return callback({
      code: grpc.status.INVALID_ARGUMENT,
      message: 'user_id, product, and positive amount_cents are required',
    });
  }
 
  const order = {
    id: `order_${uuidv4().split('-')[0]}`,
    user_id,
    product,
    amount_cents,
    status: 'ORDER_STATUS_PENDING',
    created_at: new Date().toISOString(),
  };
 
  orders.set(order.id, order);
  if (idempotency_key) {
    idempotencyCache.set(idempotency_key, order);
  }
 
  callback(null, { order, was_replayed: false });
}
 
function getOrder(call, callback) {
  const order = orders.get(call.request.order_id);
  if (!order) {
    return callback({
      code: grpc.status.NOT_FOUND,
      message: `Order ${call.request.order_id} not found`,
    });
  }
  callback(null, order);
}
 
function listOrders(call, callback) {
  const { user_id, page_size, page_token, status_filter } = call.request;
  const limit = Math.min(page_size || 20, 100);
 
  let result = Array.from(orders.values());
 
  if (user_id) {
    result = result.filter((o) => o.user_id === user_id);
  }
  if (status_filter && status_filter !== 'ORDER_STATUS_UNSPECIFIED') {
    result = result.filter((o) => o.status === status_filter);
  }
 
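  // Cursor pagination: page_token is the id of the last item on the previous page
  let startIdx = 0;
  if (page_token) {
    const tokenIdx = result.findIndex((o) => o.id === page_token);
    if (tokenIdx === -1) {
      // Reject unknown tokens instead of silently restarting from the first page
      return callback({
        code: grpc.status.INVALID_ARGUMENT,
        message: 'page_token is invalid or expired',
      });
    }
    startIdx = tokenIdx + 1;
  }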
 
  const page = result.slice(startIdx, startIdx + limit);
  const hasMore = startIdx + limit < result.length;
 
  callback(null, {
    orders: page,
    next_page_token: hasMore ? page[page.length - 1].id : '',
    has_more: hasMore,
  });
}
 
// Server streaming — push order status updates to client
function watchOrderStatus(call) {
  const { order_id } = call.request;
  const order = orders.get(order_id);
  if (!order) {
    call.emit('error', {
      code: grpc.status.NOT_FOUND,
      message: `Order ${order_id} not found`,
    });
    return;
  }
 
  // Register watcher; keep a reference so the same function can be removed later
  const watcher = (update) => call.write(update);
  if (!statusWatchers.has(order_id)) {
    statusWatchers.set(order_id, []);
  }
  statusWatchers.get(order_id).push(watcher);
 
  // Clean up on client disconnect (comparing against call.write would never match)
  call.on('cancelled', () => {
    const watchers = statusWatchers.get(order_id) || [];
    statusWatchers.set(order_id, watchers.filter((w) => w !== watcher));
  });
}
 
// ============================================================
// Start gRPC Server
// ============================================================
function main() {
  const server = new grpc.Server();
 
  server.addService(orderProto.OrderService.service, {
    CreateOrder: createOrder,
    GetOrder: getOrder,
    ListOrders: listOrders,
    WatchOrderStatus: watchOrderStatus,
  });
 
  const PORT = process.env.GRPC_PORT || '50051';
  server.bindAsync(`0.0.0.0:${PORT}`, grpc.ServerCredentials.createInsecure(), (err) => {
    if (err) {
      console.error('Failed to bind gRPC server:', err);
      return;
    }
    console.log(`gRPC Server running on port ${PORT}`);
  });
}
 
main();
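
Nothing in the server above ever invokes the registered watchers, so a subscribed client would not receive any update. The sketch below shows one possible shape for the missing publish step. WatcherRegistry and its methods are hypothetical names, and in a real handler the callback would wrap call.write.

```javascript
// Hypothetical in-memory registry: order_id -> Set of watcher callbacks.
class WatcherRegistry {
  constructor() {
    this.watchers = new Map();
  }

  // Registers a callback and returns a function that removes exactly that callback.
  subscribe(orderId, watcher) {
    if (!this.watchers.has(orderId)) this.watchers.set(orderId, new Set());
    const set = this.watchers.get(orderId);
    set.add(watcher);
    return () => {
      set.delete(watcher);
      // Drop the map entry only if it still points at this (now empty) set.
      if (set.size === 0 && this.watchers.get(orderId) === set) {
        this.watchers.delete(orderId);
      }
    };
  }

  // Pushes one status update to every watcher of the order; returns how many were notified.
  publish(orderId, oldStatus, newStatus) {
    const update = {
      order_id: orderId,
      old_status: oldStatus,
      new_status: newStatus,
      updated_at: new Date().toISOString(),
    };
    const set = this.watchers.get(orderId) || new Set();
    for (const watcher of set) watcher(update);
    return set.size;
  }
}
```

Inside watchOrderStatus this could be wired as `const unsubscribe = registry.subscribe(order_id, (u) => call.write(u)); call.on('cancelled', unsubscribe);`, with publish called wherever an order's status changes.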

6.3 Nginx Rate Limiting Configuration

# /etc/nginx/nginx.conf — API Gateway with Rate Limiting
 
# ============================================================
# Rate Limiting Zones (shared memory)
# ============================================================
http {
    # Zone 1: Per IP — 10 requests/second, burst 20
    # 10m shared memory ≈ 160,000 IP addresses
    limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;
 
    # Zone 2: Per API Key — 100 requests/second
    # Extract API key from header
    map $http_x_api_key $api_key {
        default         $http_x_api_key;
        ""              $binary_remote_addr;  # Fallback to IP if no key
    }
    limit_req_zone $api_key zone=per_api_key:10m rate=100r/s;
 
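    # Zone 3: Global, 50,000 requests/second total
    # Note: this zone is only defined here; it takes effect once a location adds "limit_req zone=global ...;"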
    limit_req_zone $server_name zone=global:1m rate=50000r/s;
 
    # Connection limiting — max 100 concurrent connections per IP
    limit_conn_zone $binary_remote_addr zone=per_ip_conn:10m;
 
    # Custom error responses for rate limiting
    limit_req_status 429;
    limit_conn_status 429;
 
    # Logging format with rate limit info
    log_format api_log '$remote_addr - $request_id [$time_local] '
                       '"$request" $status $body_bytes_sent '
                       '"$http_referer" "$http_user_agent" '
                       'rt=$request_time '
                       'api_key=$http_x_api_key';
 
    # ============================================================
    # Upstream Backend Services
    # ============================================================
    upstream user_service {
        least_conn;
        server user-service-1:3000 weight=5;
        server user-service-2:3000 weight=5;
        server user-service-3:3000 backup;
 
        keepalive 64;  # Connection pooling
    }
 
    upstream order_service {
        least_conn;
        server order-service-1:3000;
        server order-service-2:3000;
 
        keepalive 64;
    }
 
    upstream grpc_order_service {
        server order-grpc-1:50051;
        server order-grpc-2:50051;
    }
 
    # ============================================================
    # API Gateway Server
    # ============================================================
    server {
        listen 443 ssl http2;
        server_name api.example.com;
 
        # SSL/TLS Configuration
        ssl_certificate     /etc/nginx/certs/api.example.com.crt;
        ssl_certificate_key /etc/nginx/certs/api.example.com.key;
        ssl_protocols       TLSv1.2 TLSv1.3;
        ssl_ciphers         HIGH:!aNULL:!MD5;
 
        # Security Headers
        add_header X-Content-Type-Options nosniff always;
        add_header X-Frame-Options DENY always;
        add_header X-XSS-Protection "1; mode=block" always;
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
 
        # Request size limit (prevent large payload attacks)
        client_max_body_size 10m;
        client_body_timeout 10s;
        client_header_timeout 10s;
 
        # Connection limiting
        limit_conn per_ip_conn 100;
 
        access_log /var/log/nginx/api_access.log api_log;
 
        # Generate unique request ID
        add_header X-Request-Id $request_id always;
 
        # ============================================================
        # CORS Configuration
        # ============================================================
        set $cors_origin "";
        if ($http_origin ~* "^https://(app|admin)\.example\.com$") {
            set $cors_origin $http_origin;
        }
 
        add_header Access-Control-Allow-Origin $cors_origin always;
        add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
        add_header Access-Control-Allow-Headers "Authorization, Content-Type, X-API-Key, Idempotency-Key, X-Request-Id" always;
        add_header Access-Control-Max-Age 86400 always;
 
        # Preflight requests
        if ($request_method = 'OPTIONS') {
            return 204;
        }
 
        # ============================================================
        # API Routes with Rate Limiting
        # ============================================================
 
        # User Service (REST)
        location /v1/users {
            limit_req zone=per_ip burst=20 nodelay;
            limit_req zone=per_api_key burst=50 nodelay;
 
            proxy_pass http://user_service;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_set_header X-Request-Id $request_id;
 
            proxy_http_version 1.1;
            proxy_set_header Connection "";  # Enable keepalive
 
            proxy_connect_timeout 5s;
            proxy_read_timeout 30s;
        }
 
        # Order Service (REST)
        location /v1/orders {
            limit_req zone=per_ip burst=10 nodelay;
            limit_req zone=per_api_key burst=30 nodelay;
 
            proxy_pass http://order_service;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Request-Id $request_id;
 
            proxy_http_version 1.1;
            proxy_set_header Connection "";
 
            proxy_connect_timeout 5s;
            proxy_read_timeout 30s;
        }
 
        # gRPC Proxy (for internal/special clients)
        location /order.OrderService/ {
            grpc_pass grpc://grpc_order_service;
            grpc_set_header X-Request-Id $request_id;
 
            # gRPC-specific timeouts
            grpc_read_timeout 300s;
            grpc_send_timeout 300s;
        }
 
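        # Health Check (no limit_req in this location, so no rate limiting applies;
        # "limit_req off" is not a valid directive and would fail the config test)
        location /health {
            proxy_pass http://user_service;
        }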
 
        # Block common attack paths
        location ~* \.(php|asp|aspx|jsp|cgi)$ {
            return 403;
        }
 
        # Custom 429 error page
        error_page 429 @rate_limited;
        location @rate_limited {
            default_type application/json;
            return 429 '{"error":{"code":"RATE_LIMITED","message":"Too many requests. Please retry later."}}';
        }
    }
 
    # Redirect HTTP to HTTPS
    server {
        listen 80;
        server_name api.example.com;
        return 301 https://$host$request_uri;
    }
}

7. Architecture Diagram — API Gateway

flowchart TD
    subgraph "Clients"
        WEB["Web App<br/>(React/Vue)"]
        MOB["Mobile App<br/>(iOS/Android)"]
        EXT["Third-party<br/>Partners"]
        INT["Internal<br/>Microservices"]
    end

    subgraph "Edge Layer"
        CDN["CDN<br/>(CloudFront/Cloudflare)"]
        WAF["WAF<br/>(Web Application Firewall)"]
        LB["Load Balancer<br/>(ALB / NLB)"]
    end

    subgraph "API Gateway Layer"
        GW1["API Gateway #1<br/>(Nginx/Kong)"]
        GW2["API Gateway #2<br/>(Nginx/Kong)"]
    end

    subgraph "Gateway Responsibilities"
        direction LR
        AUTH["Authentication<br/>(JWT verify)"]
        RL["Rate Limiting<br/>(Token Bucket)"]
        ROUTE["Request<br/>Routing"]
        LOG["Logging &<br/>Metrics"]
        CACHE["Response<br/>Cache"]
    end

    subgraph "Backend Services"
        US["User Service<br/>(REST / Express)"]
        OS["Order Service<br/>(gRPC / Go)"]
        PS["Product Service<br/>(REST / Spring)"]
        NS["Notification Service<br/>(gRPC / Python)"]
        GQL["GraphQL<br/>Aggregation Layer"]
    end

    subgraph "Data Stores"
        PG["PostgreSQL"]
        RD["Redis<br/>(Cache + Rate Limit)"]
        MQ["Message Queue<br/>(Kafka/RabbitMQ)"]
    end

    WEB -->|HTTPS / REST| CDN
    MOB -->|HTTPS / REST + GraphQL| CDN
    EXT -->|HTTPS / REST + API Key| WAF
    INT -->|gRPC / mTLS| LB

    CDN --> WAF
    WAF --> LB
    LB --> GW1
    LB --> GW2

    GW1 --- AUTH
    GW1 --- RL
    GW1 --- ROUTE
    GW1 --- LOG
    GW1 --- CACHE
    GW2 --- AUTH
    GW2 --- RL
    GW2 --- ROUTE
    GW2 --- LOG
    GW2 --- CACHE

    GW1 -->|REST| US
    GW1 -->|gRPC| OS
    GW1 -->|REST| PS
    GW1 -->|gRPC| NS
    GW1 -->|GraphQL| GQL
    GW2 -->|REST| US
    GW2 -->|gRPC| OS
    GW2 -->|REST| PS
    GW2 -->|gRPC| NS
    GW2 -->|GraphQL| GQL

    GQL --> US
    GQL --> OS
    GQL --> PS

    US --> PG
    OS --> PG
    PS --> PG
    US --> RD
    OS --> RD
    NS --> MQ

    RL -.->|Read/Write counters| RD
    CACHE -.->|Cache responses| RD

    style GW1 fill:#e1f5fe,stroke:#0288d1,stroke-width:2px
    style GW2 fill:#e1f5fe,stroke:#0288d1,stroke-width:2px
    style AUTH fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style RL fill:#fce4ec,stroke:#c62828,stroke-width:2px
    style WAF fill:#fce4ec,stroke:#c62828,stroke-width:2px

8. Aha Moments & Pitfalls

Aha Moments

#1 - API is a Product: An API is not just a technical artifact; it is a product that developers (internal or external) have to "buy" and "use". If the API is hard to understand or hard to use, developers will find workarounds or rebuild it from scratch. Developer Experience (DX) matters as much as User Experience (UX).

#2 - REST ≠ HTTP JSON API: Many people think "HTTP + JSON = RESTful". Not so. REST defines six architectural constraints (one of them, code-on-demand, is optional). Most APIs that call themselves "RESTful" only reach Level 2 of the Richardson Maturity Model. That is perfectly fine for production: do not over-engineer HATEOAS if you do not need it.

#3 - Choose the right tool: REST for public APIs and browser clients. gRPC for internal microservices that need low latency. GraphQL for mobile-first apps that need flexible queries. There is no silver bullet; large systems often use all three.

#4 - Idempotency saves lives: In distributed systems, network failure is a certainty, not a possibility. If an API has side effects (creating an order, transferring money) and no idempotency, sooner or later a user will be charged twice. Stripe knows this, which is why its API accepts an Idempotency-Key header on POST requests so that clients can retry safely.
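
To make the retry behavior concrete, here is a minimal in-memory sketch. It assumes a Map in place of Redis, and createOrderOnce is a hypothetical name. It also ignores concurrent requests that share a key, which a real implementation would need to handle.

```javascript
// Hypothetical in-memory stand-in for the Redis-backed idempotency cache.
const processed = new Map(); // idempotency key -> stored response
let ordersCreated = 0;

function createOrderOnce(idempotencyKey, payload) {
  // A retry with the same key gets the stored response; no second order is created.
  if (processed.has(idempotencyKey)) {
    return { ...processed.get(idempotencyKey), replayed: true };
  }
  ordersCreated += 1;
  const response = {
    status: 201,
    body: { id: `order_${ordersCreated}`, ...payload, status: 'pending' },
  };
  processed.set(idempotencyKey, response);
  return { ...response, replayed: false };
}
```

A client that times out and resends the same key gets the original order back instead of creating (and paying for) a second one.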

#5 - API Gateway is a Swiss Army Knife: Instead of every microservice implementing its own auth, rate limiting, logging, and CORS, put them all at the Gateway. It is the Single Responsibility Principle applied to infrastructure: services handle business logic, the Gateway handles cross-cutting concerns.

Pitfalls: Common Mistakes

Pitfall 1: Over-fetching & Chatty APIs

Wrong: A mobile app calls 15 REST endpoints to render one home page; 15 round trips over 4G add up to 3-5 seconds. Right: Use the BFF (Backend for Frontend) pattern or GraphQL to aggregate. Or create a composite endpoint: GET /api/v1/home-feed returns all the data the home page needs.
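
The composite-endpoint idea can be sketched as a single handler that fans out on the server side. The three service functions below are hypothetical stand-ins for real backend calls, and getHomeFeed is an assumed name for the handler core.

```javascript
// Hypothetical service clients; each stands in for one REST call the app used to make.
const services = {
  getProfile: async (userId) => ({ id: userId, name: 'demo' }),
  getFeed: async (userId) => [{ post_id: 'p1' }, { post_id: 'p2' }],
  getNotifications: async (userId) => ({ unread: 3 }),
};

// Composite handler: calls the backends in parallel and returns one payload.
async function getHomeFeed(userId, deps = services) {
  const [profile, feed, notifications] = await Promise.all([
    deps.getProfile(userId),
    deps.getFeed(userId),
    deps.getNotifications(userId),
  ]);
  return { profile, feed, notifications };
}
```

The mobile client now pays for one round trip over the slow link, while the fan-out happens inside the data center where latency is low.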

Pitfall 2: Breaking Changes Without Versioning

Wrong: Rename the field username → user_name and deploy; every mobile client still running the old app crashes because it expects username. Right: Add the new field and keep the old one (backward compatible). Or bump the version to /v2/ and deprecate /v1/ on a 6-12 month timeline. Never remove, only add.
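
A backward-compatible rename might look like the sketch below. toUserResponse and the internal userName property are hypothetical names used only for illustration.

```javascript
// Hypothetical serializer: adds the new field while keeping the old one for old clients.
function toUserResponse(user) {
  return {
    id: user.id,
    user_name: user.userName, // new name, used by updated clients
    username: user.userName,  // deprecated alias, kept so old app versions keep working
  };
}
```

Old clients keep reading username, new clients read user_name, and the alias can be dropped only in a later major version.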

Pitfall 3: Exposing internal IDs

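Wrong: GET /users/42 exposes an auto-increment ID, so an attacker can enumerate 41, 43, 44... and scrape all user data (an IDOR attack, which OWASP lists as API1: Broken Object Level Authorization). Right: Use a UUID or an obfuscated ID: GET /users/usr_a1b2c3d4. It leaks no ordering and cannot be guessed. Unguessable IDs only make enumeration harder, though; the server must still check on every request that the caller is allowed to access that object.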
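
A minimal sketch of both halves, opaque IDs plus an object-level authorization check, is shown below. newUserId, canReadUser, and getUser are hypothetical names, and answering 404 for "not yours" is one possible design choice rather than a rule.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical helper: public IDs carry no ordering information.
function newUserId() {
  return `usr_${randomUUID().replace(/-/g, '')}`;
}

// Object-level authorization: callers may read only their own record, unless they are admins.
function canReadUser(caller, targetUserId) {
  return caller.role === 'admin' || caller.id === targetUserId;
}

// Sketch of a handler core: returns an HTTP-like { status, body } pair.
function getUser(caller, targetUserId, usersById) {
  const user = usersById.get(targetUserId);
  // Answer 404 for both "missing" and "not yours" so IDs cannot be probed.
  if (!user || !canReadUser(caller, targetUserId)) {
    return { status: 404, body: { error: { code: 'NOT_FOUND' } } };
  }
  return { status: 200, body: { data: user } };
}
```

Even if an attacker learns a valid ID, the authorization check is what actually stops the read.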

Pitfall 4: Not Limiting Response Size

Wrong: GET /posts?limit=999999 makes the server try to load about a million records and crash with OOM. Right: Enforce a server-side max limit (Math.min(requestedLimit, 100)). Always paginate by default. Do not trust the client.
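
The clamp can live in one small helper so every list endpoint applies the same rule. clampLimit and its default values are assumptions for illustration.

```javascript
// Hypothetical helper: turns an untrusted ?limit= value into a safe page size.
function clampLimit(raw, { defaultLimit = 20, maxLimit = 100 } = {}) {
  const parsed = Number.parseInt(raw, 10);
  // Missing, non-numeric, zero, or negative values fall back to the default.
  if (!Number.isFinite(parsed) || parsed < 1) return defaultLimit;
  return Math.min(parsed, maxLimit);
}
```

Note that Math.min alone is not enough: a negative or non-numeric limit also has to be handled.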

Pitfall 5: N+1 Queries in GraphQL

Wrong: A naive GraphQL resolver fetches data one item at a time: 1 query for the list + N queries for the details = the N+1 problem, which overloads the database. Right: Use DataLoader to batch. Loads requested in the same tick are combined into one query: WHERE id IN (...).
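
The batching idea can be illustrated with a tiny loader written from scratch. This is a sketch in the spirit of DataLoader, not the real library: createBatchLoader is a hypothetical name, and caching and key de-duplication are left out.

```javascript
// Minimal batching loader (not the real DataLoader library).
// batchFn receives an array of keys and must return a promise of values in the same order.
function createBatchLoader(batchFn) {
  let queue = []; // pending { key, resolve, reject } entries for the current tick

  function flush() {
    const batch = queue;
    queue = [];
    batchFn(batch.map((entry) => entry.key))
      .then((values) => batch.forEach((entry, i) => entry.resolve(values[i])))
      .catch((err) => batch.forEach((entry) => entry.reject(err)));
  }

  return {
    load(key) {
      return new Promise((resolve, reject) => {
        queue.push({ key, resolve, reject });
        // Schedule one flush per tick, when the first key is queued.
        if (queue.length === 1) process.nextTick(flush);
      });
    },
  };
}
```

Each resolver still calls load(id) as if it fetched a single item, but the database sees one WHERE id IN (...) query per tick instead of N separate ones.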

Pitfall 6: Rate Limiting by IP Only

Wrong: Rate limit per IP = 100 req/s. An attacker with a botnet of 10,000 IPs bypasses it completely. Right: Rate limit along several dimensions: IP + API key + user ID + endpoint. Combine this with a WAF and behavior analysis.
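
A multi-dimension limiter can be sketched with one counter per dimension. This sketch assumes an in-memory fixed-window counter with made-up limits. createLimiter and LIMITS are hypothetical names, and old windows are never evicted here.

```javascript
// Hypothetical limits per dimension: requests allowed per fixed window.
const LIMITS = { ip: 100, apiKey: 1000, user: 200 };

function createLimiter(limits = LIMITS, windowMs = 1000) {
  const counters = new Map(); // "dimension:value:windowStart" -> count (never evicted in this sketch)

  // Returns { allowed, blockedBy }; a request must pass every dimension it carries.
  return function check(request, now = Date.now()) {
    const windowStart = Math.floor(now / windowMs) * windowMs;
    const dimensions = [
      ['ip', request.ip],
      ['apiKey', request.apiKey],
      ['user', request.userId],
    ].filter(([, value]) => value);

    for (const [name, value] of dimensions) {
      const key = `${name}:${value}:${windowStart}`;
      const count = (counters.get(key) || 0) + 1;
      counters.set(key, count);
      if (count > limits[name]) return { allowed: false, blockedBy: name };
    }
    return { allowed: true, blockedBy: null };
  };
}
```

With this shape, a botnet that rotates IPs but reuses one API key is still stopped by the apiKey dimension.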

Pitfall 7: Logging sensitive data

Wrong: Logging full request/response bodies puts passwords, credit card numbers, and PII into the logs, which is a compliance violation (GDPR, PCI-DSS). Right: Redact sensitive fields before logging. Log only the request ID, status code, latency, and endpoint. Never log credentials or PII.
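
Redaction before logging can be sketched as a recursive copy that masks a deny-list of field names. The list below is an assumption for illustration, not a complete inventory of sensitive fields, and redact is a hypothetical helper.

```javascript
// Hypothetical deny-list of field names whose values must never reach the logs.
const SENSITIVE_KEYS = new Set(['password', 'card_number', 'cvv', 'authorization', 'token']);

// Returns a deep copy with sensitive values replaced; the input is not mutated.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) =>
        SENSITIVE_KEYS.has(key.toLowerCase()) ? [key, '[REDACTED]'] : [key, redact(inner)]
      )
    );
  }
  return value;
}
```

A logging middleware would call redact on anything it is about to write, so a forgotten field fails safe only if it is on the list; an allow-list of loggable fields is the stricter alternative.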

9. Internal Links: Connecting the Knowledge

Week	Topic	Relation to API Design
Tuan-01-Scale-From-Zero-To-Millions	Scaling fundamentals	The API Gateway is the first component that needs to scale horizontally
Tuan-02-Back-of-the-envelope	Estimation	Estimating QPS and bandwidth for API Gateway capacity planning
Tuan-03-Networking-DNS-CDN	DNS, CDN	DNS routing for API endpoints, CDN caching for GET responses
Tuan-05-Load-Balancer	Load Balancing	The LB sits in front of the API Gateway instances
Tuan-06-Cache-Strategy	Caching	API response caching, ETag/If-None-Match
Tuan-07-Database-Sharding-Replication	Database	API pagination maps to DB queries, cursor = DB index
Tuan-08-Message-Queue	Message Queue	Async API patterns, webhook delivery
Tuan-09-Rate-Limiter	Rate Limiting	Deep dive into algorithms at the Gateway level
Tuan-11-Microservices-Pattern	Microservices	The API Gateway is a core pattern, gRPC for inter-service communication
Tuan-13-Monitoring-Observability	Monitoring	API metrics (QPS, latency, error rate), distributed tracing
Tuan-14-AuthN-AuthZ-Security	Auth	OAuth2, JWT, API key management
Tuan-15-Data-Security-Encryption	Security	TLS termination, data encryption, input validation
Tuan-18-Design-News-Feed	Case Study	Cursor-based API pagination for the news feed, GraphQL for aggregation

References

Alex Xu, System Design Interview — Chapter: Design a Rate Limiter, API Design sections in all case studies
Roy Fielding, Architectural Styles and the Design of Network-based Software Architectures (2000) — REST dissertation
Leonard Richardson, Richardson Maturity Model
Google, API Design Guide — https://cloud.google.com/apis/design
Stripe API Documentation — Gold standard for REST API design
gRPC Documentation — https://grpc.io/docs/
GraphQL Specification — https://spec.graphql.org/
OWASP API Security Top 10 (2023) — https://owasp.org/API-Security/
RFC 7807 — Problem Details for HTTP APIs
Tuan-03-Networking-DNS-CDN — HTTP protocol fundamentals
Tuan-09-Rate-Limiter — Rate limiting deep dive
Tuan-14-AuthN-AuthZ-Security — Authentication & Authorization

Next week: Tuan-05-Load-Balancer, distributing traffic so that no server gets overloaded
