Week 09: Rate Limiter
"A system without a rate limiter is like a stadium without ticket gates — anyone can get in, and the result is a stampede."
Tags: system-design rate-limiter security devops alex-xu Prerequisites: Tuan-02-Back-of-the-envelope · Tuan-05-Load-Balancer · Tuan-06-Cache-Strategy Related: Tuan-14-AuthN-AuthZ-Security · Tuan-15-Data-Security-Encryption · Tuan-13-Monitoring-Observability · Tuan-11-Microservices-Pattern
1. Context & Why
An everyday analogy
Hieu, imagine going to a concert at a 50,000-seat stadium. At the ticket gate, the staff only let a fixed number of people through per minute — say 200 people/minute. If 10,000 people rush in at once with no control:
- The gates get overwhelmed (server overload)
- People trample each other (cascading failure)
- Even VIP ticket holders get stuck (legitimate users affected)
- Gate-crashers slip through easily (malicious traffic gets through)
The Rate Limiter is that ticket gate. It caps how many requests a client can send to the server within a given time window. Exceed the limit → the server returns HTTP 429 "Too Many Requests" and asks the client to wait.
Why do we need a Rate Limiter?
| Problem | Without a Rate Limiter | With a Rate Limiter |
|---|---|---|
| DDoS attack | Server goes down | Only the attacker is blocked; normal users keep access |
| Brute force login | Attacker tries millions of passwords | Blocked after 5-10 failed attempts |
| API abuse | One client sends 1M req/s and starves everyone else | Every client gets a fair quota |
| Accidental loop | A client bug sends requests forever | Gets rate limited; the server stays safe |
| Cost control | Cloud bill explodes from runaway requests | Spend stays under control |
Why does Alex Xu put it in Chapter 4?
Because the Rate Limiter is a basic component that every production system needs. Before designing any system (URL Shortener, Chat, News Feed), you need to know how to protect it from abuse. It sits between the Load Balancer (Week 05) and the application logic — the first line of defense.
2. Deep Dive — Rate Limiting Algorithms
2.1 Token Bucket Algorithm
How it works: Picture a bucket holding tokens. Each request needs 1 token to be processed. Tokens are poured into the bucket at a steady pace (the refill rate). When the bucket is empty → the request is rejected.
Parameters:
- Bucket size: the maximum number of tokens — this is what allows bursts
- Refill rate: how many tokens are added per second
Example: Bucket size = 10, refill rate = 2 tokens/s
- T=0: The bucket holds 10 tokens. The client sends 10 requests back to back → all are processed, bucket = 0
- T=1: The bucket is refilled with 2 tokens. The client sends 3 requests → 2 are processed, 1 is rejected
- T=6: The bucket is full again at 10 tokens (capped at bucket size; 2 tokens/s for 5s). The client can burst again
Pros:
- Allows burst traffic (useful for real-world usage patterns)
- Memory efficient: only last_refill_time and tokens need to be stored per user
- Amazon and Stripe use Token Bucket
Cons:
- Two parameters to tune (bucket size + refill rate) — not very intuitive
- Bursts can overload the backend if not handled carefully
2.2 Leaky Bucket Algorithm
How it works: Picture a bucket with a hole in the bottom. Requests pour in from the top and flow out (get processed) at a fixed rate through the hole. When the bucket is full → new requests spill over (rejected).
Parameters:
- Bucket size: the maximum number of requests in the queue
- Outflow rate: how many requests are processed per second
Example: Bucket size = 10, outflow rate = 2 req/s
- The client sends 10 requests at once → all enter the queue
- The system processes a steady 2 req/s, taking 5s to drain the queue
- An 11th request is rejected because the bucket is full
Pros:
- Fixed, stable output rate — great for downstream services
- Shopify uses Leaky Bucket for API rate limiting
Cons:
- No bursts allowed — every request has to wait in the queue
- Old requests can occupy the queue, causing new ones to be rejected
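The same idea as a queue-based sketch (timestamps are passed in explicitly to keep it deterministic; a toy, not production code):

```python
from collections import deque

class LeakyBucket:
    """Bounded queue that drains at a fixed outflow rate."""

    def __init__(self, bucket_size: int, outflow_rate: float):
        self.bucket_size = bucket_size
        self.outflow_rate = outflow_rate   # requests processed per second
        self.queue = deque()
        self.last_leak = 0.0

    def _leak(self, now: float) -> None:
        # Drain the whole requests that have "flowed out" since last_leak
        leaked = int((now - self.last_leak) * self.outflow_rate)
        if leaked > 0:
            for _ in range(min(leaked, len(self.queue))):
                self.queue.popleft()
            self.last_leak = now

    def allow(self, now: float) -> bool:
        self._leak(now)
        if len(self.queue) < self.bucket_size:
            self.queue.append(now)
            return True
        return False
```

Mirroring the example: 10 simultaneous requests fill the queue, the 11th spills over, and after 5 seconds at 2 req/s the queue has fully drained.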
2.3 Fixed Window Counter
How it works: Divide time into fixed windows (e.g., one per minute). Count the requests in each window. Over the limit → reject.
Example: Limit = 100 req/min
- 14:00:00 - 14:00:59: count requests. Once the count hits 100 → reject the rest
- 14:01:00: the counter resets to 0 and counting starts over
Pros:
- The simplest to implement
- Extremely low memory: just 1 counter per window per user
Cons (IMPORTANT):
- Boundary burst problem: if a client sends 100 requests at second 59 of minute 1 and another 100 at second 0 of minute 2 → 200 requests within 2 seconds, double the limit!
- This is why Fixed Window is not used for security-critical systems
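A few lines are enough to implement the counter and demonstrate the boundary burst problem (names are ours):

```python
class FixedWindowCounter:
    """One counter per fixed time window; resets at every window boundary."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.counters: dict[int, int] = {}   # window index -> request count

    def allow(self, now: float) -> bool:
        win = int(now // self.window)        # which fixed window `now` falls into
        count = self.counters.get(win, 0)
        if count >= self.limit:
            return False
        self.counters[win] = count + 1
        return True

fw = FixedWindowCounter(limit=100, window_seconds=60)
burst_1 = sum(fw.allow(59.0) for _ in range(100))   # second 59 of window 0
burst_2 = sum(fw.allow(60.0) for _ in range(100))   # second 0 of window 1
# burst_1 + burst_2 == 200 accepted requests within ~1 second — double the limit
```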
2.4 Sliding Window Log
How it works: Store the timestamp of every request in a sorted set. When a new request arrives, remove the timestamps older than the window size and count what remains. Over the limit → reject.
Example: Limit = 100 req/min
- A new request arrives at 14:01:30
- Remove all timestamps before 14:00:30
- Count the timestamps remaining in [14:00:30, 14:01:30]
- If >= 100 → reject
Pros:
- Perfectly accurate — no boundary burst problem
- The rate limit is enforced exactly at every point in time
Cons:
- Memory hungry: every timestamp must be stored. With 1M users at 1,000 req/window each → 1B timestamps
- A Redis ZSET per user can grow very large
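The log variant can be sketched with a deque standing in for the Redis ZSET (an in-memory toy; timestamps passed explicitly):

```python
from collections import deque

class SlidingWindowLog:
    """Keep one timestamp per request; exact, but memory grows with request rate."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.log: deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Evict timestamps that fell out of the sliding window
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) >= self.limit:
            return False
        self.log.append(now)
        return True
```

Unlike the fixed window counter, 100 requests at second 59 followed by more at second 60 are correctly rejected — the 100 timestamps from second 59 are still inside the sliding window.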
2.5 Sliding Window Counter
How it works: A hybrid of Fixed Window Counter and Sliding Window Log. Keep counters for the current and previous windows, and compute a weighted estimate based on how far into the current window we are.
Formula:
weighted_count = previous_count × (1 − elapsed/window) + current_count
Example: Limit = 100 req/min, window = 1 minute
- Previous window (14:00 - 14:01): 84 requests
- Current window (14:01 - 14:02): 36 requests
- Current time: 14:01:15 (25% of the current window has elapsed)
- weighted_count = 84 × (1 − 0.25) + 36 = 63 + 36 = 99
The next request brings the count to 100 → exactly at the limit; the one after that is rejected.
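The weighted-count arithmetic in this example can be checked in a couple of lines:

```python
def weighted_count(prev_count: int, curr_count: int, elapsed_fraction: float) -> float:
    """Sliding window counter estimate: weight the previous window's count
    by the fraction of it still covered by the sliding window."""
    return prev_count * (1 - elapsed_fraction) + curr_count

# Previous window: 84 requests; current window: 36; 25% of the window elapsed
print(weighted_count(84, 36, 0.25))  # 84 * 0.75 + 36 = 99.0
```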
Pros:
- Memory efficient like Fixed Window (just 2 counters per user)
- Nearly exact — smooths out the boundary problem
- Cloudflare uses Sliding Window Counter
Cons:
- It is an approximation, not 100% exact
- In practice the error is < 1% — acceptable
2.6 Algorithm comparison
| Algorithm | Memory | Accuracy | Burst support | Complexity | Use when |
|---|---|---|---|---|---|
| Token Bucket | Low (2 vars/user) | High | Yes (configurable) | Low | General-purpose APIs; AWS, Stripe |
| Leaky Bucket | Low (queue + pointer) | High | No | Low | Stable output needed; Shopify |
| Fixed Window Counter | Very low (1 counter/user) | Low (boundary burst) | Yes (at boundaries) | Very low | Internal services, non-critical |
| Sliding Window Log | Very high (all timestamps) | Exact | No | High | Security-critical, brute force prevention |
| Sliding Window Counter | Low (2 counters/user) | Near exact (~99%) | Partial | Medium | Production APIs; Cloudflare |
Practical takeaway: Most production systems use Token Bucket or Sliding Window Counter. Fixed Window is only for internal/non-critical use. Sliding Window Log is only for cases demanding exact enforcement (e.g., financial transactions).
2.7 Distributed Rate Limiting (Redis-based)
In a distributed system with multiple API server instances, the rate limiter needs shared state. Redis is the most popular choice because of:
- Atomic operations: INCR, EXPIRE, Lua scripting
- In-memory storage: latency < 1ms
- Single-threaded execution: no race conditions within a single command
Architecture:
Client → Load Balancer → API Server 1 ─┐
                       → API Server 2 ─┤→ Redis Cluster (shared counters)
                       → API Server 3 ─┘
Problems with distributed rate limiting:
- Race condition: 2 servers both read counter = 99 (limit = 100), both increment to 100 and allow → 101 requests in reality. Solution: use a Lua script for an atomic read-check-increment.
- Redis failure: if Redis goes down, the rate limiter stops working. Solution: fail-open (allow everything) or fail-closed (reject everything) depending on policy. Use Redis Sentinel/Cluster for HA.
- Latency overhead: every request has to query Redis. Solution: local cache + periodic sync, or co-locate Redis with the API servers.
2.8 Rate Limiting at different layers
| Layer | Location | Tooling | Characteristics |
|---|---|---|---|
| Client-side | Browser/Mobile app | Throttle library, debounce | Easily bypassed; UX protection only |
| CDN/Edge | Cloudflare, AWS CloudFront | Built-in rate limiting | Stops DDoS before it reaches the origin |
| API Gateway | Kong, AWS API Gateway, Nginx | Rate limiting plugin/module | Centralized, easy to manage |
| Application | Code inside the service | Redis-based custom logic | Flexible, business-logic aware |
| Database | Connection pool, query limiter | pgbouncer, MySQL proxy | Protects the DB from overload |
Best practice: rate limit at multiple layers (defense in depth). The CDN stops DDoS, the API Gateway stops per-user abuse, and the application stops business-logic violations.
2.9 HTTP 429 + Retry-After Header
When the rate limit is triggered, the server responds with:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 30
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1678886400
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded. Please retry after 30 seconds.",
    "retry_after": 30
  }
}
The important headers:
- Retry-After: how many seconds the client should wait before retrying
- X-RateLimit-Limit: total requests allowed in the window
- X-RateLimit-Remaining: requests remaining
- X-RateLimit-Reset: Unix timestamp when the window resets
Aha Moment: Well-behaved clients read Retry-After and implement exponential backoff. Badly behaved clients retry immediately → those clients need a stronger rate limiter (progressive penalties).
2.10 Rate Limit by IP / User / API Key
| Method | When to use | Limitations |
|---|---|---|
| By IP | Anonymous traffic, DDoS mitigation | Shared IPs (NAT, VPN, corporate proxy) → many users affected at once |
| By User ID | Authenticated APIs | Requires authn before rate limiting; cannot stop pre-auth attacks |
| By API Key | Public APIs, 3rd-party integrations | API keys can be shared or leaked |
| Compound | IP + User + Endpoint | Most precise, but more complex |
In practice: a compound key works best. Example: rate limit key = {user_id}:{endpoint}:{minute}. That way, user A calling /api/search 100 times/minute does not eat into user A's quota for /api/profile.
2.11 Tiered Rate Limiting (Free vs Premium)
| Tier | Rate Limit | Burst | Features |
|---|---|---|---|
| Free | 60 req/min | 10 req burst | Basic endpoints only |
| Basic ($29/mo) | 600 req/min | 50 req burst | All endpoints |
| Pro ($99/mo) | 3,000 req/min | 200 req burst | All endpoints + priority queue |
| Enterprise (custom) | 30,000 req/min | 1,000 req burst | Dedicated pool, custom limits |
Implementation: each API key carries a tier field. The rate limiter looks up the tier → applies the matching config. Using a Redis hash:
HSET ratelimit:config:free max_requests 60 window_seconds 60 burst 10
HSET ratelimit:config:pro max_requests 3000 window_seconds 60 burst 200
2.12 API Gateway Rate Limiting
Kong Gateway
# kong.yml - Rate Limiting Plugin
plugins:
- name: rate-limiting
config:
second: 10 # 10 req/s
minute: 100 # 100 req/min
hour: 5000 # 5000 req/h
policy: redis # Use Redis for distributed state
redis_host: redis-cluster.internal
redis_port: 6379
redis_timeout: 2000
fault_tolerant: true # Fail-open if Redis goes down
hide_client_headers: false # Return X-RateLimit-* headers
limit_by: consumer # Rate limit per consumer (user)
AWS API Gateway
{
"usagePlan": {
"name": "ProPlan",
"description": "Rate limit for Pro tier",
"throttle": {
"rateLimit": 50,
"burstLimit": 200
},
"quota": {
"limit": 100000,
"period": "MONTH"
}
}
}
Remarks: API Gateway rate limiting is very convenient but not as flexible as a custom implementation. When you need rate limits tied to business logic (e.g., limiting orders per day), you still need an application-level rate limiter.
3. Estimation — Sizing the Rate Limiter
Reference: Tuan-02-Back-of-the-envelope, sdi.anhvy.dev — Rate Limiter
3.1 Deriving rate limit thresholds from QPS estimation
Assumptions: the API has 10M DAU, with each user averaging 100 requests/day.
Computing the per-user rate limit:
Explanation:
safety_factor = 10 because users concentrate their usage into a few minutes (it is not spread evenly over the day). burst_allowance = 20 allows a burst when a page loads.
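The arithmetic can be reconstructed roughly as follows — the 15 req/min per-user limit is our assumption, chosen to be consistent with the 2.5M QPS worst-case figure below:

```python
dau = 10_000_000
req_per_user_per_day = 100

avg_req_per_min = req_per_user_per_day / (24 * 60)  # ~0.07 req/min if spread evenly
safety_factor = 10                                   # usage is bursty, not uniform
burst_allowance = 20                                 # a page load fires many API calls

# A sustained per-user limit on the order of avg * safety, rounded up (assumed):
per_user_limit_per_min = 15                          # ~0.25 req/s

# Worst case: every user maxes out their limit at the same time
worst_case_qps = dau * per_user_limit_per_min / 60
print(f"{worst_case_qps:,.0f} QPS")                  # 2,500,000 QPS
```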
Sanity check: if all 10M users were to hit their per-user rate limit at the same time, the system must handle 2.5M QPS in the worst case → horizontal scaling + CDN protection required.
3.2 Redis memory for the Sliding Window Counter
Assumptions: 10M users, each needing 2 counters (current + previous window) + metadata.
Takeaway: only ~1 GB for 10M users! A single Redis node (16 GB) handles it comfortably. With Sliding Window Log (storing every timestamp), the footprint grows:
Still acceptable, but 3x the Sliding Window Counter.
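A back-of-the-envelope reconstruction of the memory math (the per-key byte counts are order-of-magnitude assumptions; actual Redis overhead varies by version and encoding):

```python
users = 10_000_000

# Sliding Window Counter: 2 small counter keys per user
bytes_per_counter_key = 50   # assumed: key string + integer value + Redis overhead
counter_gb = users * 2 * bytes_per_counter_key / 1e9
print(f"counter: ~{counter_gb:.0f} GB")   # ~1 GB

# Sliding Window Log: one ZSET entry per request still inside the window
avg_entries_per_user = 15    # assumed average in-window request count
bytes_per_entry = 20         # timestamp member + score, order of magnitude
log_gb = users * avg_entries_per_user * bytes_per_entry / 1e9
print(f"log: ~{log_gb:.0f} GB")           # ~3 GB, i.e. 3x the counter approach
```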
3.3 Rate limiter overhead on latency
Redis round-trip time (same AZ): ~0.5ms
With an API average latency of 50ms, the total becomes ~50.5ms — roughly +1% overhead.
Acceptable. However, if Redis is in a different AZ (cross-AZ latency ~1-2ms), the total rises to ~51-52ms, i.e. +2-4% per request.
Solution: co-locate Redis with the API servers in the same AZ, or use a local cache with periodic sync (more throughput, less accuracy).
With a Lua script (atomic operation):
the script runs server-side on Redis, so it adds no extra network round trips, although its execution can take longer than a single command.
4. Security — Protecting the system with a Rate Limiter
4.1 DDoS Mitigation
DDoS attack vectors and the rate limiting response:
| Attack Type | Characteristics | Rate Limiting Strategy |
|---|---|---|
| Volumetric (UDP flood, DNS amplification) | Millions of req/s from a botnet | Layer 3/4: CDN + ISP-level filtering (Cloudflare, AWS Shield) |
| Protocol (SYN flood, Ping of Death) | Exploits network protocols | Layer 4: connection rate limiting, SYN cookies |
| Application (HTTP flood, Slowloris) | Legitimate-looking requests | Layer 7: per-IP + per-endpoint rate limiting |
Multi-layer DDoS defense:
- CDN/Edge (Cloudflare): challenge suspicious IPs, block known bad actors
- Load Balancer: connection rate limiting, geographic blocking
- API Gateway: per-IP rate limiting (100 req/min/IP)
- Application: per-user rate limiting + anomaly detection
Important: a rate limiter cannot fight DDoS on its own. It must be combined with a CDN, a WAF (Web Application Firewall), and ISP-level mitigation. The rate limiter is the last line of defense, not the first.
4.2 Brute Force Prevention
Example: protecting a login endpoint from password brute force.
A tiered strategy:
| Failed attempts | Response |
|---|---|
| 1-3 | Allow normally, return "Invalid credentials" |
| 4-5 | Add a CAPTCHA |
| 6-10 | Rate limit: 1 attempt/30s + CAPTCHA |
| 11-20 | Rate limit: 1 attempt/5min |
| >20 | Lock the account for 30 minutes, notify the user by email |
Rate limit key: login_attempt:{ip}:{username} — combining IP and username guards against both:
- An attacker using 1 IP to brute force many accounts
- An attacker using many IPs to brute force 1 account
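The escalation table can be encoded as a simple lookup (a sketch; the thresholds mirror the tier table above, and the dict shape is ours):

```python
def login_penalty(failed_attempts: int) -> dict:
    """Map consecutive failed logins to an escalating response."""
    if failed_attempts <= 3:
        return {"action": "allow", "captcha": False}
    if failed_attempts <= 5:
        return {"action": "allow", "captcha": True}
    if failed_attempts <= 10:
        return {"action": "throttle", "captcha": True, "min_interval_s": 30}
    if failed_attempts <= 20:
        return {"action": "throttle", "captcha": True, "min_interval_s": 300}
    return {"action": "lock", "lock_minutes": 30, "notify_user": True}
```

In a real system the `failed_attempts` counter would live in Redis under the compound `login_attempt:{ip}:{username}` key and reset on a successful login.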
4.3 Credential Stuffing Protection
Credential stuffing = an attacker uses username/password lists leaked from other websites to attempt logins at scale.
Telltale signs:
- Login attempts from many different IPs but following the same pattern
- Abnormal failure rate (>95% failed logins)
- Identical user agents/fingerprints
Rate limiting for credential stuffing:
If normal login traffic = 500 req/s, set a global limit of 750 req/s. When exceeded → trigger enhanced security:
- Require CAPTCHA for all logins
- Increase the delay between login attempts
- Alert the security team
4.4 API Abuse Detection
Common abuse patterns:
| Pattern | Description | Detection |
|---|---|---|
| Scraping | Continuous requests to listing/search endpoints | High request rate on specific endpoints |
| Enumeration | Trying sequential IDs (/users/1, /users/2, …) | Sequential access pattern |
| Data exfiltration | Downloading large volumes of data | Abnormal bandwidth per user |
| Price manipulation | Sending thousands of order requests | Order rate >> normal |
Solution: rate limiting combined with anomaly detection. Don't just count requests — analyze the patterns:
- Ratio between endpoints (normal user: 80% read, 20% write; attacker: 95% reads on one specific endpoint)
- Request timing (humans have jitter, bots have an even cadence)
- Response consumption (normal users read responses, bots ignore them)
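For the endpoint-ratio signal, a toy detector might look like this (the 0.9 threshold is an assumed tuning knob, not a standard value):

```python
def looks_like_scraper(endpoint_counts: dict[str, int],
                       threshold: float = 0.9) -> bool:
    """Flag clients whose traffic is overwhelmingly concentrated on a single
    endpoint: normal users spread requests across endpoints, scrapers hammer
    one listing/search endpoint."""
    total = sum(endpoint_counts.values())
    if total == 0:
        return False
    return max(endpoint_counts.values()) / total >= threshold

print(looks_like_scraper({"/api/search": 950, "/api/profile": 50}))           # True
print(looks_like_scraper({"/api/search": 40, "/api/profile": 30, "/api/feed": 30}))  # False
```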
4.5 Rate Limiting Bypass Techniques and Prevention
| Bypass Technique | Description | Prevention |
|---|---|---|
| IP Rotation | Proxy pools / VPNs to rotate IPs constantly | Rate limit by fingerprint (TLS, headers) + behavior analysis |
| Distributed attack | Botnet spanning many IPs | Global rate limiting + CAPTCHA + anomaly detection |
| Slowloris | Extremely slow requests to hold connections open | Connection timeouts + concurrent connection limits |
| Header spoofing | Forging X-Forwarded-For to fake the IP | Trust only internal proxies; use X-Real-IP from a trusted proxy |
| API key sharing | Many attackers sharing one API key | Monitor usage patterns per key; revoke if suspicious |
| Account rotation | Creating many accounts to split the rate limit | Rate limit per IP and per account; limit account creation |
Aha Moment: No rate limiter is perfect. Attackers will always look for bypasses. The goal is to raise the cost of the attack until it is no longer economical — not to block 100%.
5. DevOps — Deploying and operating the Rate Limiter
5.1 Redis-based Rate Limiter Deployment
# docker-compose.yml - Rate Limiter Infrastructure
version: '3.8'
services:
redis-ratelimit:
image: redis:7-alpine
command: >
redis-server
--maxmemory 2gb
--maxmemory-policy allkeys-lru
--appendonly no
--save ""
--tcp-backlog 511
--timeout 0
--tcp-keepalive 300
ports:
- "6380:6379"
deploy:
resources:
limits:
memory: 2.5g
cpus: '2'
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 3
networks:
- ratelimit-net
redis-sentinel-1:
image: redis:7-alpine
command: redis-sentinel /etc/redis/sentinel.conf
volumes:
- ./sentinel.conf:/etc/redis/sentinel.conf
depends_on:
- redis-ratelimit
networks:
- ratelimit-net
# API Server with rate limiting
api-server:
build: .
environment:
- REDIS_RATELIMIT_HOST=redis-ratelimit
- REDIS_RATELIMIT_PORT=6379
- RATE_LIMIT_DEFAULT=100/min
- RATE_LIMIT_AUTH=10/min
depends_on:
redis-ratelimit:
condition: service_healthy
networks:
- ratelimit-net
networks:
ratelimit-net:
driver: bridge
# sentinel.conf - Redis Sentinel for HA
sentinel monitor ratelimit-master redis-ratelimit 6379 2
sentinel down-after-milliseconds ratelimit-master 5000
sentinel failover-timeout ratelimit-master 10000
sentinel parallel-syncs ratelimit-master 1
5.2 Monitoring Rate Limit Hits — Prometheus + Grafana
# prometheus-alerts.yml
groups:
- name: rate_limiting
rules:
# Alert when rate limit hits spike suddenly
- alert: RateLimitHitSpike
expr: >
rate(rate_limit_hits_total[5m]) >
rate(rate_limit_hits_total[1h] offset 1d) * 3
for: 5m
labels:
severity: warning
annotations:
summary: "Rate limit hits are 3x the same hour yesterday"
description: "Current: {{ $value }}/s. Possibly DDoS or API abuse."
# Alert when too many users are being rate limited
- alert: TooManyUsersRateLimited
expr: >
rate(rate_limit_hits_total{result="rejected"}[5m]) /
rate(http_requests_total[5m]) > 0.1
for: 10m
labels:
severity: critical
annotations:
summary: ">10% of requests are rate limited — the limit may be too strict"
# Alert when rate limiter Redis latency is high
- alert: RateLimiterLatencyHigh
expr: >
histogram_quantile(0.99,
rate(rate_limiter_duration_seconds_bucket[5m])
) > 0.005
for: 5m
labels:
severity: warning
annotations:
summary: "Rate limiter P99 latency > 5ms"
# Alert when the rate limiter Redis is down
- alert: RateLimiterRedisDown
expr: redis_up{instance=~".*ratelimit.*"} == 0
for: 1m
labels:
severity: critical
annotations:
summary: "Rate limiter Redis instance down!"
Grafana Dashboard panels:
| Panel | PromQL | Purpose |
|---|---|---|
| Rate Limit Hits/s | rate(rate_limit_hits_total[1m]) | Trend of rate limit hits |
| Rejection Rate (%) | rate(rate_limit_hits_total{result="rejected"}[5m]) / rate(http_requests_total[5m]) * 100 | % of requests rejected |
| Top Rate Limited Users | topk(10, sum by (user_id)(rate(rate_limit_hits_total{result="rejected"}[1h]))) | Find abusive users |
| Top Rate Limited Endpoints | topk(10, sum by (endpoint)(rate(rate_limit_hits_total{result="rejected"}[1h]))) | Find endpoints under attack |
| Redis Latency P99 | histogram_quantile(0.99, rate(rate_limiter_duration_seconds_bucket[5m])) | Rate limiter performance |
| Redis Memory Usage | redis_memory_used_bytes{instance=~".*ratelimit.*"} | Capacity planning |
5.3 Alerting on Spike Patterns
Spike detection strategies:
- Absolute threshold: rate > 10,000/s → alert
- Relative to baseline: rate > 3x the average for the same hour yesterday
- Rate of change: deriv(rate_limit_hits_total[5m]) > 100 (abnormally fast growth)
- Anomaly detection: use Grafana ML or an external system (Datadog Anomaly Monitors)
# Combining the strategies
- alert: SuspiciousTrafficPattern
expr: >
(
rate(http_requests_total{path="/api/login"}[5m]) > 50
and
rate(http_requests_total{path="/api/login", status="401"}[5m]) /
rate(http_requests_total{path="/api/login"}[5m]) > 0.9
)
for: 2m
labels:
severity: critical
annotations:
summary: "Possible brute force attack: >90% login failures at >50 req/s"
5.4 Nginx Rate Limiting Module
# /etc/nginx/conf.d/rate-limiting.conf
# --- Zone definitions ---
# Shared memory zone for rate limiting
# 10m = 10MB shared memory ≈ 160,000 IP addresses
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;
limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;
# Rate limit by API key (extracted from header)
map $http_x_api_key $api_key_zone {
default "anonymous";
~^.+$ $http_x_api_key;
}
limit_req_zone $api_key_zone zone=api_key:10m rate=100r/min;
# --- Connection limiting ---
limit_conn_zone $binary_remote_addr zone=conn_per_ip:10m;
server {
listen 80;
server_name api.example.com;
# Global connection limit
limit_conn conn_per_ip 20;
# Custom error page for 429
error_page 429 = @rate_limited;
location @rate_limited {
default_type application/json;
return 429 '{"error":"rate_limited","retry_after":30}';
}
# --- General API ---
location /api/ {
limit_req zone=general burst=20 nodelay;
limit_req zone=api_key burst=50 nodelay;
limit_req_status 429;
limit_req_log_level warn;
# Pass rate limit headers back to the client
add_header X-RateLimit-Limit 10;
add_header X-RateLimit-Burst 20;
proxy_pass http://backend;
}
# --- Login endpoint (strict) ---
location /api/auth/login {
limit_req zone=login burst=3 nodelay;
limit_req_status 429;
limit_req_log_level error;
proxy_pass http://backend;
}
# --- Search endpoint (moderate) ---
location /api/search {
limit_req zone=api burst=10 delay=5;
# delay=5: 5 requests are processed immediately,
# the next 5 are delayed (queued),
# anything beyond that is rejected
limit_req_status 429;
proxy_pass http://backend;
}
}
Explanation of burst and nodelay:
- burst=20: allow 20 requests over the rate before rejecting
- nodelay: process burst requests immediately (no queueing)
- delay=5: the first 5 requests are processed immediately, the rest are queued
6. Code — Production-grade Rate Limiter
6.1 Python: Sliding Window Rate Limiter with Redis
"""
Production-grade Sliding Window Counter Rate Limiter
Uses Redis + Lua scripting for atomic operations
"""
import time
import hashlib
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Optional
import redis
logger = logging.getLogger(__name__)
class RateLimitTier(Enum):
FREE = "free"
BASIC = "basic"
PRO = "pro"
ENTERPRISE = "enterprise"
@dataclass(frozen=True)
class RateLimitConfig:
max_requests: int # Max requests per window
window_seconds: int # Window size (seconds)
burst_size: int # Allowed burst (Token Bucket component)
@classmethod
def for_tier(cls, tier: RateLimitTier) -> "RateLimitConfig":
configs = {
RateLimitTier.FREE: cls(max_requests=60, window_seconds=60, burst_size=10),
RateLimitTier.BASIC: cls(max_requests=600, window_seconds=60, burst_size=50),
RateLimitTier.PRO: cls(max_requests=3000, window_seconds=60, burst_size=200),
RateLimitTier.ENTERPRISE: cls(max_requests=30000, window_seconds=60, burst_size=1000),
}
return configs[tier]
@dataclass
class RateLimitResult:
allowed: bool
limit: int
remaining: int
reset_at: float # Unix timestamp when the window resets
retry_after: Optional[int] # Seconds to wait (None if allowed)
@property
def headers(self) -> dict[str, str]:
"""Return HTTP headers following the RateLimit headers draft RFC."""
headers = {
"X-RateLimit-Limit": str(self.limit),
"X-RateLimit-Remaining": str(max(0, self.remaining)),
"X-RateLimit-Reset": str(int(self.reset_at)),
}
if not self.allowed and self.retry_after:
headers["Retry-After"] = str(self.retry_after)
return headers
# Lua script: atomic sliding window counter
# Runs entirely on the Redis server — no race conditions
SLIDING_WINDOW_LUA = """
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local current_window = math.floor(now / window)
local previous_window = current_window - 1
local elapsed = now - (current_window * window)
local current_key = key .. ":" .. current_window
local previous_key = key .. ":" .. previous_window
local previous_count = tonumber(redis.call("GET", previous_key) or "0")
local current_count = tonumber(redis.call("GET", current_key) or "0")
-- Sliding window counter formula
local weighted_count = previous_count * (1 - elapsed / window) + current_count
if weighted_count >= limit then
-- Rate limited
local ttl = window - elapsed
return {0, limit, math.floor(limit - weighted_count), math.ceil(now + ttl), math.ceil(ttl)}
end
-- Allowed: increment current window counter
redis.call("INCR", current_key)
redis.call("EXPIRE", current_key, window * 2) -- TTL = 2 windows to keep the previous one
redis.call("EXPIRE", previous_key, window * 2)
local new_count = weighted_count + 1
local remaining = math.floor(limit - new_count)
local reset_at = (current_window + 1) * window
return {1, limit, remaining, math.ceil(reset_at), 0}
"""
class SlidingWindowRateLimiter:
"""
Production-grade sliding window counter rate limiter.
Features:
- Atomic Redis operations via Lua script (no race conditions)
- Tiered rate limiting (Free/Basic/Pro/Enterprise)
- Compound key support (IP + user + endpoint)
- Graceful degradation when Redis is down (fail-open)
- Metrics emission for Prometheus
"""
def __init__(
self,
redis_client: redis.Redis,
fail_open: bool = True,
metrics_callback=None,
):
self._redis = redis_client
self._fail_open = fail_open
self._metrics = metrics_callback
self._lua_sha: Optional[str] = None
def _ensure_script(self) -> str:
"""Load the Lua script into Redis (cached)."""
if self._lua_sha is None:
self._lua_sha = self._redis.script_load(SLIDING_WINDOW_LUA)
return self._lua_sha
def _build_key(
self,
identifier: str,
endpoint: Optional[str] = None,
) -> str:
"""Build the Redis key from identifier + endpoint."""
parts = ["rl", identifier]
if endpoint:
# Hash the endpoint so the key stays short
ep_hash = hashlib.md5(endpoint.encode()).hexdigest()[:8]
parts.append(ep_hash)
return ":".join(parts)
def check(
self,
identifier: str,
config: RateLimitConfig,
endpoint: Optional[str] = None,
) -> RateLimitResult:
"""
Check and record a single request.
Args:
identifier: User ID, API key, hoac IP address
config: Rate limit configuration
endpoint: Optional endpoint path cho per-endpoint limiting
Returns:
RateLimitResult with the allowed status and headers
"""
key = self._build_key(identifier, endpoint)
try:
sha = self._ensure_script()
now = time.time()
result = self._redis.evalsha(
sha,
1, # number of keys
key,
now,
config.window_seconds,
config.max_requests,
)
allowed, limit, remaining, reset_at, retry_after = result
rate_result = RateLimitResult(
allowed=bool(allowed),
limit=limit,
remaining=remaining,
reset_at=reset_at,
retry_after=retry_after if retry_after > 0 else None,
)
# Emit metrics
if self._metrics:
self._metrics(
identifier=identifier,
endpoint=endpoint or "global",
allowed=rate_result.allowed,
)
return rate_result
except redis.ConnectionError:
logger.error("Redis connection failed for rate limiting")
if self._fail_open:
# Fail-open: allow requests while Redis is down
return RateLimitResult(
allowed=True,
limit=config.max_requests,
remaining=config.max_requests,
reset_at=time.time() + config.window_seconds,
retry_after=None,
)
else:
# Fail-closed: reject everything while Redis is down
return RateLimitResult(
allowed=False,
limit=config.max_requests,
remaining=0,
reset_at=time.time() + 60,
retry_after=60,
)
except redis.RedisError as e:
logger.error(f"Redis error in rate limiter: {e}")
if self._fail_open:
return RateLimitResult(
allowed=True,
limit=config.max_requests,
remaining=config.max_requests,
reset_at=time.time() + config.window_seconds,
retry_after=None,
)
raise
# === Usage ===
if __name__ == "__main__":
r = redis.Redis(host="localhost", port=6379, decode_responses=False)
limiter = SlidingWindowRateLimiter(redis_client=r, fail_open=True)
config = RateLimitConfig.for_tier(RateLimitTier.FREE)
for i in range(65):
result = limiter.check(
identifier="user:12345",
config=config,
endpoint="/api/search",
)
if not result.allowed:
print(f"Request {i+1}: REJECTED | retry_after={result.retry_after}s")
print(f" Headers: {result.headers}")
break
else:
print(f"Request {i+1}: OK | remaining={result.remaining}")
6.2 Node.js: Express Rate Limiting Middleware
// middleware/rate-limiter.js
// Production-grade Express middleware with a Redis backend
const Redis = require("ioredis");
const crypto = require("crypto");
// === Tier Configs ===
const TIER_CONFIGS = {
free: { maxRequests: 60, windowSeconds: 60, burstSize: 10 },
basic: { maxRequests: 600, windowSeconds: 60, burstSize: 50 },
pro: { maxRequests: 3000, windowSeconds: 60, burstSize: 200 },
enterprise: { maxRequests: 30000, windowSeconds: 60, burstSize: 1000 },
};
// === Lua Script (same logic as Python version) ===
const SLIDING_WINDOW_LUA = `
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local current_window = math.floor(now / window)
local previous_window = current_window - 1
local elapsed = now - (current_window * window)
local current_key = key .. ":" .. current_window
local previous_key = key .. ":" .. previous_window
local previous_count = tonumber(redis.call("GET", previous_key) or "0")
local current_count = tonumber(redis.call("GET", current_key) or "0")
local weighted_count = previous_count * (1 - elapsed / window) + current_count
if weighted_count >= limit then
local ttl = window - elapsed
return {0, limit, math.floor(limit - weighted_count), math.ceil(now + ttl), math.ceil(ttl)}
end
redis.call("INCR", current_key)
redis.call("EXPIRE", current_key, window * 2)
redis.call("EXPIRE", previous_key, window * 2)
local new_count = weighted_count + 1
local remaining = math.floor(limit - new_count)
local reset_at = (current_window + 1) * window
return {1, limit, remaining, math.ceil(reset_at), 0}
`;
class RateLimiter {
constructor({ redisClient, failOpen = true, prefix = "rl" }) {
this.redis = redisClient;
this.failOpen = failOpen;
this.prefix = prefix;
this.scriptSha = null;
}
async ensureScript() {
if (!this.scriptSha) {
this.scriptSha = await this.redis.script("LOAD", SLIDING_WINDOW_LUA);
}
return this.scriptSha;
}
buildKey(identifier, endpoint) {
const parts = [this.prefix, identifier];
if (endpoint) {
const hash = crypto
.createHash("md5")
.update(endpoint)
.digest("hex")
.slice(0, 8);
parts.push(hash);
}
return parts.join(":");
}
async check(identifier, config, endpoint = null) {
const key = this.buildKey(identifier, endpoint);
try {
const sha = await this.ensureScript();
const now = Date.now() / 1000;
const [allowed, limit, remaining, resetAt, retryAfter] =
await this.redis.evalsha(
sha,
1,
key,
now,
config.windowSeconds,
config.maxRequests
);
return {
allowed: Boolean(allowed),
limit,
remaining: Math.max(0, remaining),
resetAt,
retryAfter: retryAfter > 0 ? retryAfter : null,
};
} catch (err) {
console.error("[RateLimiter] Redis error:", err.message);
if (this.failOpen) {
return {
allowed: true,
limit: config.maxRequests,
remaining: config.maxRequests,
resetAt: Date.now() / 1000 + config.windowSeconds,
retryAfter: null,
};
}
throw err;
}
}
}
// === Express Middleware Factory ===
function rateLimitMiddleware(options = {}) {
  const {
    redisUrl = "redis://localhost:6379",
    failOpen = true,
    keyExtractor = defaultKeyExtractor,
    tierExtractor = defaultTierExtractor,
    onRejected = defaultOnRejected,
  } = options;
  const redisClient = new Redis(redisUrl);
  const limiter = new RateLimiter({ redisClient, failOpen });

  return async function rateLimit(req, res, next) {
    try {
      const identifier = keyExtractor(req);
      const tier = tierExtractor(req);
      const config = TIER_CONFIGS[tier] || TIER_CONFIGS.free;
      const endpoint = `${req.method}:${req.route?.path || req.path}`;
      const result = await limiter.check(identifier, config, endpoint);

      // Set rate limit headers
      res.set("X-RateLimit-Limit", String(result.limit));
      res.set("X-RateLimit-Remaining", String(result.remaining));
      res.set("X-RateLimit-Reset", String(result.resetAt));

      if (!result.allowed) {
        res.set("Retry-After", String(result.retryAfter));
        return onRejected(req, res, result);
      }
      next();
    } catch (err) {
      console.error("[RateLimiter] Middleware error:", err.message);
      // limiter.check() already fails open on Redis errors; for any other
      // unexpected error here, fail open too rather than block traffic
      next();
    }
  };
}
// === Default Helpers ===
function defaultKeyExtractor(req) {
  // Priority: user ID > API key > IP
  if (req.user?.id) return `user:${req.user.id}`;
  if (req.headers["x-api-key"]) return `key:${req.headers["x-api-key"]}`;
  const ip = req.ip || req.headers["x-forwarded-for"] || "unknown";
  return `ip:${ip}`;
}

function defaultTierExtractor(req) {
  return req.user?.tier || "free";
}

function defaultOnRejected(req, res, result) {
  return res.status(429).json({
    error: {
      code: "RATE_LIMITED",
      message: "Rate limit exceeded. Please retry later.",
      retry_after: result.retryAfter,
    },
  });
}
// === Usage with Express ===
// const express = require('express');
// const app = express();
//
// app.use(rateLimitMiddleware({
//   redisUrl: process.env.REDIS_URL || 'redis://localhost:6379',
//   failOpen: true,
// }));
//
// app.get('/api/data', (req, res) => {
//   res.json({ data: 'hello' });
// });

module.exports = { RateLimiter, rateLimitMiddleware, TIER_CONFIGS };

6.3 Nginx Rate Limiting Config (Production)
# /etc/nginx/conf.d/rate-limiting.conf
# Production-ready Nginx rate limiting configuration

# === Shared memory zones ===
# $binary_remote_addr = 4 bytes (IPv4) or 16 bytes (IPv6)
# A 10m zone holds ≈ 160,000 IPv4 addresses

# General API: 10 requests/second per IP
limit_req_zone $binary_remote_addr zone=api_per_ip:10m rate=10r/s;

# Auth endpoints: 1 request/second per IP (strict)
limit_req_zone $binary_remote_addr zone=auth_per_ip:10m rate=1r/s;

# Per API key: 100 requests/minute
map $http_x_api_key $limit_key {
    default $binary_remote_addr;
    "~^.+$" $http_x_api_key;
}
limit_req_zone $limit_key zone=per_api_key:10m rate=100r/m;

# Connection limiting per IP
limit_conn_zone $binary_remote_addr zone=conn_per_ip:5m;

# === Logging ===
log_format rate_limit '$remote_addr - $remote_user [$time_local] '
                      '"$request" $status $body_bytes_sent '
                      '"$http_referer" "$http_user_agent" '
                      'limit_req_status=$limit_req_status';
server {
    listen 443 ssl http2;
    server_name api.example.com;

    # Connection limits
    limit_conn conn_per_ip 30;
    limit_conn_log_level warn;

    # Custom 429 response
    limit_req_status 429;
    access_log /var/log/nginx/rate_limit.log rate_limit;

    # --- Standard API endpoints ---
    location /api/ {
        limit_req zone=api_per_ip burst=20 nodelay;
        limit_req zone=per_api_key burst=50 nodelay;
        proxy_pass http://api_upstream;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Rate-Limited $limit_req_status;
    }

    # --- Auth endpoints (strict rate limiting) ---
    location /api/auth/ {
        limit_req zone=auth_per_ip burst=3 nodelay;
        proxy_pass http://api_upstream;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # --- Health check (no rate limiting) ---
    # Note: "limit_req off;" is not valid Nginx syntax. Since limit_req is only
    # declared inside the /api/ locations above, this location is simply unlimited.
    location /health {
        return 200 '{"status":"ok"}';
        add_header Content-Type application/json;
    }

    # --- Error handling ---
    error_page 429 @rate_limited;
    location @rate_limited {
        default_type application/json;
        add_header Retry-After 30 always;
        add_header X-RateLimit-Limit 10 always;
        return 429 '{"error":{"code":"RATE_LIMITED","message":"Too many requests","retry_after":30}}';
    }
}

6.4 Lua Script for Atomic Redis Rate Limiting
-- rate_limiter.lua
-- Lua script that runs on Redis for the token bucket algorithm
-- Usage: EVALSHA <sha> 1 <key> <now> <rate> <capacity> <requested>
-- Returns: {allowed (0/1), tokens_remaining, retry_after_ms}
local key = KEYS[1]
local now = tonumber(ARGV[1])        -- Current timestamp (ms)
local rate = tonumber(ARGV[2])       -- Tokens per second
local capacity = tonumber(ARGV[3])   -- Max tokens (bucket size)
local requested = tonumber(ARGV[4])  -- Tokens requested (usually 1)

-- Fetch the current state from Redis
local bucket = redis.call("HMGET", key, "tokens", "last_refill")
local tokens = tonumber(bucket[1])
local last_refill = tonumber(bucket[2])

-- First call: initialize a full bucket
if tokens == nil then
  tokens = capacity
  last_refill = now
end

-- Compute how many tokens to refill
local elapsed_ms = math.max(0, now - last_refill)
local new_tokens = elapsed_ms * rate / 1000  -- rate is tokens/second
tokens = math.min(capacity, tokens + new_tokens)

local allowed = 0
local retry_after_ms = 0
if tokens >= requested then
  -- Enough tokens: allow the request
  allowed = 1
  tokens = tokens - requested
else
  -- Not enough tokens: compute the wait time
  local deficit = requested - tokens
  retry_after_ms = math.ceil(deficit / rate * 1000)
end

-- Save the new state
redis.call("HMSET", key, "tokens", tostring(tokens), "last_refill", tostring(now))
-- TTL = 2x the time needed to refill a full bucket (so idle keys don't live forever)
local ttl_seconds = math.ceil(capacity / rate * 2)
redis.call("EXPIRE", key, ttl_seconds)

return {allowed, math.floor(tokens), retry_after_ms}

# Using the Lua script from Python
import time

import redis

r = redis.Redis(host="localhost", port=6379)

# Load the script once
with open("rate_limiter.lua", "r") as f:
    TOKEN_BUCKET_SHA = r.script_load(f.read())

def token_bucket_check(
    user_id: str,
    rate: float = 10.0,   # 10 tokens/s
    capacity: int = 50,   # Max 50 tokens
    requested: int = 1,
) -> dict:
    """Token bucket rate limit check via Lua script."""
    key = f"tb:{user_id}"
    now_ms = int(time.time() * 1000)
    allowed, remaining, retry_after_ms = r.evalsha(
        TOKEN_BUCKET_SHA,
        1,          # number of keys
        key,        # KEYS[1]
        now_ms,     # ARGV[1]
        rate,       # ARGV[2]
        capacity,   # ARGV[3]
        requested,  # ARGV[4]
    )
    return {
        "allowed": bool(allowed),
        "remaining": remaining,
        "retry_after_ms": retry_after_ms,
    }

# Test
for i in range(55):
    result = token_bucket_check("user:42", rate=10, capacity=50)
    status = "OK" if result["allowed"] else f"REJECTED (retry in {result['retry_after_ms']}ms)"
    print(f"Request {i+1}: {status} | remaining={result['remaining']}")

7. System Design Diagrams
7.1 Rate Limiter Architecture in the API Gateway
flowchart TD
    Client([Client]) -->|Request| CDN[CDN / Edge<br/>Cloudflare / AWS CloudFront]
    CDN -->|Layer 3-4 filtering<br/>DDoS mitigation| LB[Load Balancer<br/>Nginx / ALB]
    LB --> GW[API Gateway<br/>Kong / AWS API GW]
    subgraph "Rate Limiting Pipeline"
        GW --> EXTRACT[Extract Identity<br/>IP / API Key / User ID / JWT]
        EXTRACT --> LOOKUP[Lookup Tier Config<br/>Free / Basic / Pro / Enterprise]
        LOOKUP --> CHECK{Rate Limit Check<br/>via Redis}
        CHECK -->|Allowed| AUTH[Authentication<br/>& Authorization]
        CHECK -->|Rejected| REJECT[HTTP 429<br/>+ Retry-After header]
    end
    subgraph "Redis Rate Limit Store"
        REDIS[(Redis Cluster)]
        CHECK <-->|Lua Script<br/>Atomic check + increment| REDIS
        REDIS --- R1[Sliding Window Counters]
        REDIS --- R2[Token Bucket State]
        REDIS --- R3[Blocked IPs Set]
    end
    AUTH --> APP[Application Server]
    APP --> DB[(Database)]
    REJECT -->|429 + headers| Client
    subgraph "Monitoring"
        REDIS -->|Metrics| PROM[Prometheus]
        GW -->|rate_limit_hits_total| PROM
        PROM --> GRAF[Grafana Dashboard]
        GRAF -->|Alert| OPS[Ops Team / PagerDuty]
    end
    style CHECK fill:#ff9800,stroke:#333,stroke-width:2px
    style REJECT fill:#f44336,stroke:#333,stroke-width:2px,color:#fff
    style AUTH fill:#4caf50,stroke:#333,stroke-width:2px
    style REDIS fill:#d32f2f,stroke:#333,stroke-width:2px,color:#fff
7.2 Token Bucket Visualization
sequenceDiagram
    participant C as Client
    participant RL as Rate Limiter
    participant B as Token Bucket<br/>(capacity=5, rate=2/s)
    participant S as Server

    Note over B: Bucket: [*][*][*][*][*]<br/>tokens = 5 (full)
    C->>RL: Request 1
    RL->>B: consume(1)
    B-->>RL: OK (tokens=4)
    RL->>S: Forward request
    S-->>C: 200 OK
    C->>RL: Request 2
    RL->>B: consume(1)
    B-->>RL: OK (tokens=3)
    RL->>S: Forward request
    S-->>C: 200 OK
    C->>RL: Request 3, 4, 5 (burst)
    RL->>B: consume(3)
    B-->>RL: OK (tokens=0)
    RL->>S: Forward 3 requests
    S-->>C: 200 OK x3
    Note over B: Bucket: [ ][ ][ ][ ][ ]<br/>tokens = 0 (empty!)
    C->>RL: Request 6
    RL->>B: consume(1)
    B-->>RL: REJECTED (tokens=0)
    RL-->>C: 429 Too Many Requests<br/>Retry-After: 1
    Note over B: +1 second passes...<br/>Refill: 2 tokens/s
    Note over B: Bucket: [*][*][ ][ ][ ]<br/>tokens = 2
    C->>RL: Request 7 (after 1s)
    RL->>B: consume(1)
    B-->>RL: OK (tokens=1)
    RL->>S: Forward request
    S-->>C: 200 OK
7.3 Distributed Rate Limiting — Race Condition and Solution
sequenceDiagram
    participant S1 as API Server 1
    participant S2 as API Server 2
    participant R as Redis

    Note over S1,R: === WITHOUT Lua Script (Race Condition) ===
    S1->>R: GET counter (= 99)
    S2->>R: GET counter (= 99)
    Note over S1: 99 < 100, allow!
    Note over S2: 99 < 100, allow!
    S1->>R: INCR counter (= 100)
    S2->>R: INCR counter (= 101)
    Note over R: Counter = 101<br/>LIMIT VIOLATED!

    Note over S1,R: === WITH Lua Script (Atomic) ===
    S1->>R: EVALSHA lua_script
    Note over R: Lua: GET=99, 99<100<br/>INCR → 100, return ALLOW
    R-->>S1: ALLOWED (remaining=0)
    S2->>R: EVALSHA lua_script
    Note over R: Lua: GET=100, 100>=100<br/>return REJECT
    R-->>S2: REJECTED (retry_after=45s)
    Note over R: Counter = 100<br/>LIMIT RESPECTED!
8. Aha Moments & Pitfalls
Aha Moment #1: Race Conditions in Distributed Rate Limiting
When 10 API servers query Redis concurrently, the read-then-write pattern creates a race condition: two servers both read counter = 99, both allow the request, and the counter ends up at 101. The only correct fixes: a Lua script, or Redis MULTI/EXEC (with WATCH). Never do a GET followed by a separate INCR.
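The interleaving can be sketched deterministically in pure Python — an in-memory stand-in where a `threading.Lock` plays the role of Redis's single-threaded Lua execution (the counter and `atomic_check` names are illustrative):

```python
import threading

LIMIT = 100

# The broken read-then-write pattern, interleaved the way two API servers can
# interleave against Redis:
counter = 99
read_1 = counter             # server 1: GET -> 99, decides "99 < 100, allow"
read_2 = counter             # server 2: GET -> 99 (before server 1 writes back)
if read_1 < LIMIT:
    counter += 1             # server 1: INCR -> 100
if read_2 < LIMIT:
    counter += 1             # server 2: INCR -> 101
print(counter)               # 101 — limit violated

# What the Lua script buys you: check + increment as one indivisible operation.
def atomic_check(state: dict, lock: threading.Lock) -> bool:
    with lock:               # stands in for Redis executing the whole script atomically
        if state["value"] < LIMIT:
            state["value"] += 1
            return True
        return False

state = {"value": 99}
lock = threading.Lock()
print(atomic_check(state, lock))   # True  — 99 -> 100
print(atomic_check(state, lock))   # False — counter stays at 100
print(state["value"])              # 100
```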
Aha Moment #2: Clock skew can break your rate limiter
In a distributed system, server clocks can drift apart by anywhere from a few milliseconds to a few seconds. If the rate limiter uses the application server's timestamp instead of the Redis server's, two servers can sit in two different windows at the same real instant. Fix: always use `redis.call("TIME")` inside the Lua script, or otherwise take the timestamp from the Redis server. Never trust a client-supplied timestamp.
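A toy illustration of the skew problem with a fixed window — the timestamps are made up; the point is that two skewed clocks can straddle a window boundary at the same real instant:

```python
WINDOW = 60  # seconds per fixed window

def window_id(ts: float) -> int:
    """Which fixed window a timestamp falls into."""
    return int(ts // WINDOW)

true_instant = 1_700_000_099.0          # hypothetical "real" time, 1s before a boundary
server_a_clock = true_instant           # server A's clock is accurate
server_b_clock = true_instant + 2.0     # server B's clock drifts 2 seconds ahead

# Each server trusting its own clock: same real instant, different windows
print(window_id(server_a_clock) == window_id(server_b_clock))  # False

# Both reading one authoritative clock (the idea behind redis.call("TIME")):
authoritative = true_instant
print(window_id(authoritative) == window_id(authoritative))    # True — they agree
```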
Aha Moment #3: Rate limiting internal services — be careful!
Hieu, this is an extremely common mistake: putting rate limits on every service, including internal service-to-service communication. The result: when traffic spikes, Service A gets rate-limited by Service B, producing a cascading failure worse than having no rate limit at all. Rule: rate limit at the edge (API Gateway) for external traffic. Internal services should use a circuit breaker (see Tuan-11-Microservices-Pattern) instead of a rate limiter.
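To make the contrast concrete, here is a minimal, illustrative circuit-breaker sketch (the `CircuitBreaker` class and its threshold are invented for this example; real breakers such as resilience4j also add a half-open state and a reset timeout):

```python
class CircuitBreaker:
    """Minimal sketch: trip open after N consecutive failures."""

    def __init__(self, failure_threshold: int = 3):
        self.failure_threshold = failure_threshold
        self.failures = 0
        self.state = "closed"  # closed = traffic flows; open = calls fail fast

    def call(self, fn):
        if self.state == "open":
            # Unlike a rate limiter, the breaker sheds load based on the
            # callee's health, not on the caller's request rate.
            raise RuntimeError("circuit open: failing fast instead of piling on load")
        try:
            result = fn()
            self.failures = 0          # any success resets the failure count
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.state = "open"
            raise

# Service B is down; after 3 failed calls, Service A stops hammering it.
cb = CircuitBreaker(failure_threshold=3)

def flaky():
    raise ConnectionError("Service B unavailable")

for _ in range(3):
    try:
        cb.call(flaky)
    except ConnectionError:
        pass

print(cb.state)  # "open" — further calls fail fast without touching Service B
```

The key difference: a rate limiter rejects based on how much the caller asks; a circuit breaker rejects based on how sick the callee is, which is exactly what you want during an internal traffic spike.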
Aha Moment #4: Over-aggressive limits hurt UX
A startup set a rate limit of 10 req/min for the free tier. Result: a user loads the home page (5 API calls) + clicks 1 link (3 calls) + scrolls (3 calls) = 11 calls → rate limited after just 2 seconds of use. The user leaves and never comes back. Rule: always test your rate limits by using the app like a normal user and counting the requests. The limit should be >= 3x the normal usage pattern.
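The arithmetic from that story, as a quick sanity-check sketch (the numbers are the hypothetical ones above):

```python
# Request budget check for a proposed free-tier limit
normal_session_calls = 5 + 3 + 3       # home page + one click + one scroll = 11 calls
proposed_limit_per_min = 10

# A single normal session already exceeds the proposed limit:
print(normal_session_calls > proposed_limit_per_min)   # True — normal users get 429'd

# Applying the ">= 3x normal usage" rule of thumb:
recommended_minimum = 3 * normal_session_calls
print(recommended_minimum)                             # 33 req/min as a floor
```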
Pitfall #1: Fixed Window Boundary Burst
With a Fixed Window Counter and a limit of 100/min, an attacker sends 100 requests at second 59 and 100 more at second 60 → 200 requests within 2 seconds. This is why Fixed Window should never be used for security-critical endpoints. Use a Sliding Window Counter or Token Bucket instead.
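A small pure-Python sketch of why the sliding window counter closes this hole — the weighted estimate is the standard sliding-window-counter approximation; the function names are illustrative:

```python
# Boundary-burst attack: 100 requests at t=59s, another burst just after t=60s,
# with a limit of 100 per 60-second window.
LIMIT, WINDOW = 100, 60

def fixed_window_allows(counts: dict, t: float) -> bool:
    """Fixed window: only looks at the counter of the current window."""
    return counts.get(int(t // WINDOW), 0) < LIMIT

def sliding_window_estimate(prev_count: int, curr_count: int, t: float) -> float:
    """Sliding window counter: weight the previous window by how much of it
    still overlaps the trailing 60 seconds."""
    elapsed = t % WINDOW
    weight = (WINDOW - elapsed) / WINDOW
    return prev_count * weight + curr_count

# Fixed window: the two bursts land in different windows, so all 200 get through.
counts = {0: 100}                           # window 0 already absorbed 100 requests
print(fixed_window_allows(counts, 60.5))    # True — the second 100-burst is allowed

# Sliding window counter at t=60.5s: the previous window still weighs in heavily.
estimate = sliding_window_estimate(prev_count=100, curr_count=0, t=60.5)
print(round(estimate, 1))                   # 99.2 — effectively one request from the limit
print(estimate + 1 > LIMIT)                 # True — the new burst is rejected almost at once
```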
Pitfall #2: Fail-Open vs Fail-Closed — choosing wrong can kill you
Fail-open (allow everything when Redis is down): safe for UX, but DDoS traffic slips through. Fail-closed (reject everything when Redis is down): safe for security, but Redis down = the whole system is down. Solution: fail-open + Redis HA (Sentinel/Cluster) + a fallback local rate limiter (in-memory, less accurate, but better than nothing).
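A minimal sketch of that in-memory fallback — a per-process sliding log; `LocalFallbackLimiter` is an illustrative name, and note it is "less accurate" precisely because each server enforces the full limit independently (a real fallback would also evict idle keys):

```python
import time
from collections import defaultdict, deque

class LocalFallbackLimiter:
    """In-memory sliding-log limiter used only while Redis is unreachable."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.log = defaultdict(deque)   # identifier -> timestamps of recent requests

    def allow(self, identifier: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.log[identifier]
        # Drop timestamps that have fallen out of the window
        while q and now - q[0] >= self.window_seconds:
            q.popleft()
        if len(q) < self.max_requests:
            q.append(now)
            return True
        return False

limiter = LocalFallbackLimiter(max_requests=3, window_seconds=60)
print([limiter.allow("ip:1.2.3.4", now=t) for t in (0, 1, 2, 3)])
# [True, True, True, False] — the 4th request inside the window is rejected
print(limiter.allow("ip:1.2.3.4", now=61))  # True — the t=0 entry has expired
```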
Pitfall #3: Forgetting to rate limit webhooks and background jobs
Hieu rate limits only the HTTP API and forgets that webhooks (Stripe, GitHub) and background workers generate load too. A webhook retry storm can look exactly like a DDoS. Solution: rate limit every entry point into the system, not just the user-facing API.
Pitfall #4: No rate limit on account registration
An attacker creates 100,000 accounts → each account gets its own rate limit → the limit is bypassed entirely. Solution: rate limit account creation by IP (5 accounts/IP/day) + CAPTCHA + email verification.
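The per-IP daily cap can be sketched as a counter keyed by (IP, day) — in production this would typically be a Redis INCR with a roughly one-day TTL; the dict and names here are illustrative:

```python
from datetime import date, timedelta

# Sketch: per-IP daily signup cap. One counter per (IP, calendar day).
SIGNUPS_PER_IP_PER_DAY = 5
signup_counts = {}

def signup_allowed(ip: str, today: date) -> bool:
    key = (ip, today.isoformat())    # e.g. ("203.0.113.7", "2024-01-15")
    count = signup_counts.get(key, 0)
    if count >= SIGNUPS_PER_IP_PER_DAY:
        return False                 # 6th signup from this IP today: blocked
    signup_counts[key] = count + 1
    return True

d = date(2024, 1, 15)
print([signup_allowed("203.0.113.7", d) for _ in range(6)])
# [True, True, True, True, True, False]
print(signup_allowed("203.0.113.7", d + timedelta(days=1)))  # True — new day, new counter
```

Note this only raises the attacker's cost (they can rotate IPs), which is why the source pairs it with CAPTCHA and email verification.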
9. Internal Links & References
Prerequisites (read first)
- Tuan-02-Back-of-the-envelope — estimation skills needed to size rate limit thresholds
- Tuan-05-Load-Balancer — the rate limiter usually sits behind the Load Balancer
- Tuan-06-Cache-Strategy — Redis is the backbone of distributed rate limiting
Directly related
- Tuan-14-AuthN-AuthZ-Security — rate limiting as part of the security strategy
- Tuan-15-Data-Security-Encryption — protecting the API keys used for rate limiting
- Tuan-13-Monitoring-Observability — monitoring rate limit metrics
- Tuan-11-Microservices-Pattern — circuit breaker vs rate limiter for internal services
- Tuan-03-Networking-DNS-CDN — CDN-level rate limiting and DDoS mitigation
Applied in Case Studies
- Tuan-16-Design-URL-Shortener — rate limit URL creation to fight spam
- Tuan-17-Design-Chat-System — rate limit messages to prevent spam/flooding
- Tuan-18-Design-News-Feed — rate limit feed refreshes to reduce load
- Tuan-19-Design-Notification-System — rate limit notification sending
- Tuan-20-Design-Key-Value-Store — rate limit writes to protect storage
References
- Alex Xu, System Design Interview — Chapter 4: Design a Rate Limiter
- sdi.anhvy.dev — Rate Limiter patterns & algorithms
- Cloudflare Blog: How we built rate limiting capable of scaling to millions of domains
- Stripe Engineering: Rate limiters and load shedders
- Kong Documentation: Rate Limiting Plugin
Previous week: Tuan-08-Message-Queue — Message Queue as a buffer for traffic spikes · Next week: Tuan-10-Consistent-Hashing — distributing data evenly across nodes