Case Study 01 — Data Layer cho E-commerce

“Design data layer cho e-commerce: 1M products, 10M users, 1K orders/sec đỉnh, inventory concurrency, recommendations, search. Apply mọi thứ đã học.”

Tags: case-study ecommerce database-design Prerequisites: Tất cả 16 weekly chapters Liên quan: Tuan-02-Schema-Design-Normalization · Tuan-05-Transactions-Locking · Tuan-15-Vector-DB-AI

1. Requirements

Functional

User signup, login, profile
Product catalog: 1M SKUs, attributes by category
Search: keyword + filter + sort
Cart, checkout, order
Payment integration (Stripe)
Inventory: real-time stock
Reviews & ratings
Recommendations
Admin (product mgmt, order mgmt)
Analytics dashboard

Non-functional

10M users, 1M products, 100M orders/year
Peak 1K orders/sec (Black Friday)
Search latency P99 < 200ms
Product page P99 < 100ms
Order placement P99 < 500ms
Inventory accurate (no oversell)
Multi-region (US, EU)
Compliance: GDPR, PCI-DSS

2. Data Stores Choice

graph TB
    Client[Web/Mobile] --> LB[Load Balancer]
    LB --> App[App Servers]

    App --> PG[(Postgres<br/>Source of truth)]
    App --> Redis[(Redis<br/>cache, session, rate limit)]
    App --> ES[(Elasticsearch<br/>search)]
    App --> Vector[(pgvector<br/>recommendations)]
    App --> S3[(S3<br/>product images)]

    PG -.CDC.-> Kafka[(Kafka)]
    Kafka --> ES
    Kafka --> Redis
    Kafka --> CH[(ClickHouse<br/>analytics)]

    App --> Stripe[Stripe API<br/>payments]

    style PG fill:#c8e6c9
    style Redis fill:#fff9c4
    style ES fill:#fff9c4
    style CH fill:#fff9c4

Justifications:

Postgres: ACID for orders, payments. Single source of truth. Use partitioning + read replica for scale.
Redis: Cache, session, rate limit, cart (in-memory ~30 min TTL)
Elasticsearch: Full-text search + filter aggregations (1M products)
pgvector: Product recommendation via embedding (since data already in Postgres)
ClickHouse: Analytics (revenue, top products, funnel)
Kafka + Debezium: Sync from Postgres to other stores

3. Postgres Schema

3.1 Core entities

-- Users
CREATE TABLE users (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email citext NOT NULL,
    name text NOT NULL,
    created_at timestamptz DEFAULT now(),
    deleted_at timestamptz
);
CREATE UNIQUE INDEX idx_users_email_active ON users(email) WHERE deleted_at IS NULL;
 
-- Addresses, payment methods (similar pattern)
 
-- Products
CREATE TABLE products (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    sku text UNIQUE NOT NULL,
    name text NOT NULL,
    description text,
    category_id bigint REFERENCES categories(id),
    price_cents bigint NOT NULL CHECK (price_cents >= 0),
    currency char(3) NOT NULL DEFAULT 'USD',
    status text CHECK (status IN ('draft', 'active', 'archived')),
    attributes jsonb DEFAULT '{}',
    tags text[],
    embedding vector(1536),  -- for similarity
    created_at timestamptz DEFAULT now(),
    updated_at timestamptz DEFAULT now()
);
CREATE INDEX idx_products_category ON products(category_id) WHERE status = 'active';
CREATE INDEX idx_products_tags ON products USING gin(tags);
CREATE INDEX idx_products_embedding ON products USING hnsw(embedding vector_cosine_ops);
 
-- Inventory (separate for concurrency)
CREATE TABLE inventory (
    product_id bigint PRIMARY KEY REFERENCES products(id),
    quantity int NOT NULL CHECK (quantity >= 0),
    reserved int NOT NULL DEFAULT 0 CHECK (reserved >= 0),
    updated_at timestamptz DEFAULT now()
);
 
-- Orders (partitioned by month)
CREATE TABLE orders (
    id bigint GENERATED ALWAYS AS IDENTITY,
    user_id bigint NOT NULL REFERENCES users(id),
    status text NOT NULL CHECK (status IN ('pending', 'paid', 'shipping', 'delivered', 'cancelled', 'refunded')),
    currency char(3) NOT NULL,
    subtotal_cents bigint NOT NULL,
    tax_cents bigint NOT NULL,
    shipping_cents bigint NOT NULL,
    total_cents bigint GENERATED ALWAYS AS (subtotal_cents + tax_cents + shipping_cents) STORED,
    payment_intent_id text,
    placed_at timestamptz NOT NULL DEFAULT now(),
    paid_at timestamptz,
    PRIMARY KEY (placed_at, id)  -- composite for partitioning
) PARTITION BY RANGE (placed_at);
 
CREATE TABLE orders_2026_q2 PARTITION OF orders FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');
-- create future partitions via pg_partman
 
CREATE INDEX idx_orders_user_placed ON orders (user_id, placed_at DESC);
CREATE INDEX idx_orders_status_placed ON orders (status, placed_at);
 
-- Order items with snapshot
CREATE TABLE order_items (
    id bigint GENERATED ALWAYS AS IDENTITY,
    order_placed_at timestamptz NOT NULL,
    order_id bigint NOT NULL,
    product_id bigint NOT NULL REFERENCES products(id),
    qty int NOT NULL CHECK (qty > 0),
    product_name_snapshot text NOT NULL,
    product_sku_snapshot text NOT NULL,
    unit_price_cents bigint NOT NULL,
    line_total_cents bigint GENERATED ALWAYS AS (unit_price_cents * qty) STORED,
    PRIMARY KEY (order_placed_at, id),
    FOREIGN KEY (order_placed_at, order_id) REFERENCES orders(placed_at, id) ON DELETE CASCADE
) PARTITION BY RANGE (order_placed_at);
 
-- Reviews
CREATE TABLE reviews (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    product_id bigint REFERENCES products(id),
    user_id bigint REFERENCES users(id),
    rating smallint CHECK (rating BETWEEN 1 AND 5),
    title text,
    body text,
    created_at timestamptz DEFAULT now(),
    UNIQUE (product_id, user_id)
);
CREATE INDEX idx_reviews_product ON reviews (product_id, created_at DESC);
 
-- Outbox for CDC
CREATE TABLE outbox (
    id bigserial PRIMARY KEY,
    aggregate_type text,
    aggregate_id text,
    event_type text,
    payload jsonb,
    created_at timestamptz DEFAULT now()
);

3.2 Key decisions

Decision	Justification
`bigint identity` PK	OLTP single-region writes, no UUID overhead
`citext` email	Case-insensitive uniqueness
`price_cents bigint`	Exact arithmetic
Generated `total_cents`	Auto-compute
Orders partitioned by month	100M rows/year manageable
Snapshot price in order_items	Immutable order history
`embedding` in products	Recommendation lookups
`outbox` table	Atomic event publishing

4. Inventory Concurrency

4.1 The challenge

100 users try to buy last 1 unit simultaneously. Need exactly 1 succeeds.

4.2 Approach: Atomic decrement + check

-- Reserve at cart checkout
UPDATE inventory
SET reserved = reserved + 1
WHERE product_id = $1 AND (quantity - reserved) >= 1
RETURNING quantity - reserved AS remaining;
 
-- If 0 rows → out of stock

UPDATE ... WHERE atomic. No explicit lock needed.

4.3 Reservation timeout

Reservation expires if cart abandoned:

CREATE TABLE inventory_reservations (
    id bigserial PRIMARY KEY,
    product_id bigint REFERENCES products(id),
    user_id bigint,
    qty int,
    expires_at timestamptz NOT NULL
);
 
-- Cron job: release expired
DELETE FROM inventory_reservations WHERE expires_at < now() RETURNING product_id, qty;
-- Update inventory.reserved accordingly

4.4 On checkout payment success

BEGIN;
-- Convert reservation to actual decrement
UPDATE inventory SET quantity = quantity - 1, reserved = reserved - 1
    WHERE product_id = $1;
DELETE FROM inventory_reservations WHERE id = $2;
-- Insert order, order_items
-- Insert outbox event
COMMIT;

Read inventory from Redis cache, refresh every 1s
Write goes to Postgres
Eventually consistent for “remaining: 5” display
Strong consistency for actual checkout (Postgres atomic UPDATE)

5. Cart Management

5.1 Where to store cart?

Options:

A. Redis hash: user:42:cart
B. Postgres carts table
C. Postgres for logged-in, Redis for anonymous

Recommended: Redis for active carts (30-day TTL), Postgres for “saved for later”.

HSET cart:user:42 product:101 2 product:102 1
EXPIRE cart:user:42 2592000   # 30 days

Pros:

Fast (5ms)
Auto-expire (no cleanup)
Cart updates atomic

Cons:

Lost if Redis goes down (acceptable trade)
Migration to checkout = read Redis, write Postgres order

5.2 Cart → Order

async def checkout(user_id):
    cart = redis.hgetall(f"cart:user:{user_id}")
    if not cart: raise EmptyCart
 
    # Reserve inventory for each item
    reservations = []
    for product_id, qty in cart.items():
        reserved = await reserve_inventory(product_id, int(qty))
        if not reserved:
            # Rollback prior reservations
            for r in reservations: await release(r)
            raise OutOfStock(product_id)
        reservations.append((product_id, qty))
 
    # Create payment intent at Stripe
    intent = stripe.PaymentIntent.create(...)
 
    # Save pending order
    async with db.transaction():
        order_id = await create_order(user_id, intent.id, items)
        await create_outbox_event("OrderPlaced", order_id, ...)
 
    return order_id, intent.client_secret

After Stripe webhook confirms payment:

async def stripe_webhook(event):
    if event.type == 'payment_intent.succeeded':
        async with db.transaction():
            await update_order_status(event.data.order_id, 'paid')
            await commit_inventory(reservations)
            await create_outbox_event("OrderPaid", order_id)
 
    # Clear cart
    redis.delete(f"cart:user:{user_id}")

6. Search Architecture

6.1 Elasticsearch sync

graph LR
    PG[(Postgres products)] -.Debezium CDC.-> Kafka[(Kafka)]
    Kafka --> Sync[ES sync service]
    Sync --> ES[(Elasticsearch)]

ES index document:

{
    "id": 42,
    "sku": "...",
    "name": "iPhone 15 Pro",
    "description": "...",
    "category_path": ["electronics", "phones", "iphone"],
    "price_cents": 99900,
    "currency": "USD",
    "tags": ["new", "premium"],
    "attributes": {"color": "blue", "storage": "256GB"},
    "rating_avg": 4.7,
    "rating_count": 1234,
    "stock_status": "in_stock",
    "search_vector": "iphone pro premium phone"
}

6.2 Search query

POST /products/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "iphone case",
          "fields": ["name^3", "description", "tags^2"],
          "type": "best_fields"
        }
      },
      "filter": [
        {"term": {"category_path": "phones"}},
        {"range": {"price_cents": {"gte": 1000, "lte": 100000}}},
        {"term": {"stock_status": "in_stock"}}
      ]
    }
  },
  "aggs": {
    "categories": {"terms": {"field": "category_path", "size": 10}},
    "price_ranges": {"range": {"field": "price_cents", "ranges": [{"to": 5000}, {"from": 5000, "to": 50000}, {"from": 50000}]}}
  },
  "sort": [{"_score": "desc"}, {"rating_avg": "desc"}]
}

6.3 Search-as-you-type

ES completion suggester for autocomplete. Sub-50ms response.

7. Product Recommendations

7.1 “Similar products” via pgvector

Generate embedding for each product (name + description + category) using OpenAI or BGE.

-- On product create/update, compute embedding and store
UPDATE products SET embedding = $1 WHERE id = $2;
 
-- Find similar
SELECT id, name, price_cents
FROM products
WHERE status = 'active' AND id != $1
ORDER BY embedding <=> (SELECT embedding FROM products WHERE id = $1)
LIMIT 10;

7.2 Collaborative filtering

For “users who bought X also bought Y”:

-- Daily batch job, store in materialized view
CREATE MATERIALIZED VIEW product_co_purchases AS
SELECT
    oi1.product_id AS product_a,
    oi2.product_id AS product_b,
    count(*) AS co_purchase_count
FROM order_items oi1
JOIN order_items oi2 USING (order_placed_at, order_id)
WHERE oi1.product_id < oi2.product_id
GROUP BY oi1.product_id, oi2.product_id
HAVING count(*) > 5;
 
-- Query (cached in Redis with 1-day TTL)
SELECT product_b FROM product_co_purchases WHERE product_a = $1 ORDER BY co_purchase_count DESC LIMIT 10;

7.3 Personalized recommendations

User embedding from history → vector search → personalized products.

async def personalized_recs(user_id):
    user_history = get_recent_purchases(user_id)  # last 20
    user_embedding = np.mean([p.embedding for p in user_history], axis=0)
    return vector_search(user_embedding, exclude=user_history.product_ids, top_k=20)

Cache in Redis 30 min TTL.

8. Order Lifecycle

stateDiagram-v2
    [*] --> Pending: place order
    Pending --> Paid: Stripe success
    Pending --> Cancelled: timeout or user cancel
    Paid --> Shipping: fulfillment
    Shipping --> Delivered: shipping done
    Delivered --> [*]
    Paid --> Refunded: refund request

Each state transition:

Updates orders.status
Writes outbox event
CDC propagates to ES (for “my orders” search), to analytics

9. Analytics Pipeline

graph LR
    PG[(Postgres orders)] -.Debezium.-> Kafka
    App[(App events)] -.kafka producer.-> Kafka
    Kafka --> CH[(ClickHouse)]
    CH -.materialized.-> CHAgg[(Hourly/Daily aggs)]
    CHAgg --> BI[Grafana/Metabase]

ClickHouse tables:

CREATE TABLE events (
    ts DateTime,
    event_type LowCardinality(String),
    user_id UInt64,
    product_id UInt64,
    order_id UInt64,
    revenue_cents UInt64,
    payload String
) ENGINE = MergeTree() PARTITION BY toYYYYMM(ts) ORDER BY (ts, event_type);
 
CREATE MATERIALIZED VIEW daily_revenue
ENGINE = SummingMergeTree() ORDER BY (date, category) AS
SELECT toDate(ts) AS date, payload['category'] AS category, sum(revenue_cents) AS revenue
FROM events
WHERE event_type = 'order_paid'
GROUP BY date, category;

Dashboards: revenue by day/category, top products, funnel (view → cart → purchase), AOV (average order value).

10. Caching Strategy

10.1 Cache layers

graph TB
    Client[Client] --> CDN[CDN<br/>static images, product pages SSG]
    CDN --> App[App server]
    App --> AppCache[App-local cache<br/>5min TTL]
    App --> Redis[(Redis<br/>distributed cache)]
    Redis --> PG[(Postgres)]

10.2 What to cache where

Data	Cache layer	TTL
Product page (rendered)	CDN	5 min
Product details	Redis	5 min
User profile	Redis	30 min
Cart	Redis	30 days (sticky)
Inventory display	Redis	1 sec
Search results	Redis	30 sec
Recommendations	Redis	30 min
Aggregated counts (reviews)	Postgres (denormalized)	n/a

10.3 Cache invalidation

Via CDC events:

Product updated → invalidate product:42 cache
Review added → recompute product.rating_avg → invalidate product cache

# CDC consumer
def on_product_change(event):
    redis.delete(f"product:{event.id}")
    # ES sync handled separately

11. Multi-region Strategy

11.1 Read replicas in EU

Primary in US-East. Read replica in EU-West.

EU users:

Reads → EU replica (latency ~10ms vs 150ms)
Writes → US primary (write latency 150ms accepted at checkout)
Inventory check → must be primary (consistency)

11.2 Future: Multi-active

Move to Aurora DSQL or CockroachDB for true multi-region writes if growth justifies.

11.3 Data residency

EU users data physically in EU:

Separate tenant per region
Or Postgres FDW for cross-region read
Or accept “stored in US, replica in EU for reads”

GDPR requires care. Talk to legal.

12. Capacity Planning

12.1 Data sizing

Users: 10M × 5KB = 50GB
Products: 1M × 10KB (with description) = 10GB
Orders: 100M × 5KB = 500GB
Order items: 500M × 1KB = 500GB
Total Postgres: ~1.5TB working set, 2-3TB with indexes

12.2 Throughput

Read: 100K req/s peak (product page, search, cart)
Write: 1K orders/sec peak

Postgres r6i.4xlarge (16 vCPU, 128GB RAM): handles. Plus read replicas for scale-out reads.

12.3 Redis sizing

Active sessions + carts + cache: ~50GB RAM. Use ElastiCache cluster (3-node primary + 3 replicas).

12.4 ES sizing

1M products × ~5KB indexed = 5GB. Comfortable on 2-3 nodes for HA.

12.5 ClickHouse sizing

5 years history: 500M orders × 1KB = 500GB compressed. 1-2 nodes sufficient.

13. Failure Modes

13.1 Postgres primary down

Promote read replica via Patroni. RTO 30s.

13.2 Redis down

Cache miss → DB → app slower but works. Set sensible timeouts.

13.3 ES down

Search degraded. Fallback to basic Postgres search? Or display error.

13.4 Stripe down

Webhook delayed. Reconciliation job handles eventually.

13.5 Kafka down

Debezium pauses. WAL accumulates (monitor!). Recover when Kafka up.

14. Observability

pg_stat_statements: top queries
pgBouncer: pool health
Redis: hit rate, evictions
ES: query latency, indexing rate
ClickHouse: ingestion lag
Application: APM (Datadog/Honeycomb)
Business: dashboards on order rate, revenue, errors

Alerts:

Order rate drop >50% → page on-call
Stripe webhook lag >5 min
Inventory negative (impossible should alert)
Replication lag >10s
pg connection count >80% max

15. Implementation Phases

gantt
    title E-commerce Data Layer Build
    section MVP
    Postgres + basic schema    : m1, 0, 2w
    Redis cache + cart          : m2, after m1, 1w
    Stripe integration          : m3, after m2, 2w
    ES search                   : m4, after m3, 2w

    section Growth
    pgBouncer                   : g1, after m4, 1w
    CDC to ES/CH                : g2, after g1, 2w
    Vector recommendations      : g3, after g2, 2w
    Read replica                : g4, after g3, 1w

    section Scale
    Partitioning orders         : s1, after g4, 2w
    Multi-region                : s2, after s1, 4w
    ClickHouse analytics        : s3, after s1, 3w

16. Tiếp theo

Case-Design-Data-SaaS-Multi-tenant

Cập nhật: 2026-05-16

lthieu's notes

Explorer

Case-Design-Data-Ecommerce