Tuần 16 — ORM, N+1, CQRS, Multi-Tenancy

“DB nhanh không cứu app slow. ORM abstracts DB nhưng cũng abstracts performance. Hiểu pattern application-level = giữ vững 50% performance dù DB tốt hay tệ.”

Tags: database orm cqrs multi-tenancy rls application-patterns Thời lượng: 7 ngày (4-6h/ngày) Prerequisites: Tuan-04-Query-Optimization-EXPLAIN · Tuan-05-Transactions-Locking Liên quan: Case-Design-Data-SaaS-Multi-tenant · Tuan-Bonus-Multi-Tenancy-SaaS-Patterns (SD course)

1. Context & Why

1.1 Tại sao application patterns quan trọng

graph LR
    A[Slow app] --> B{Where is bottleneck?}
    B -->|DB query| C[Tune query, add index]
    B -->|Network round-trips| D[Batch loading, DataLoader]
    B -->|App CPU| E[Profile code]
    B -->|N+1| F[Eager loading, prefetch]

    style D fill:#fff9c4
    style F fill:#fff9c4

Many “DB problems” are actually app patterns. Tuần 16 closes the loop.

1.2 Mục tiêu tuần

1.3 Tham chiếu

Patterns of Enterprise Application Architecture — Martin Fowler
Domain-Driven Design — Eric Evans
Implementing DDD — Vaughn Vernon
Event Sourcing — Greg Young talks/articles
Building Multi-Tenant Applications — AWS whitepaper
Prisma, TypeORM, SQLAlchemy, GORM documentation

2. ORM Landscape 2024-2026

2.1 ORM categories

graph TB
    subgraph "Active Record style"
        AR1[Rails ActiveRecord]
        AR2[Eloquent - PHP]
        AR3[Django ORM]
    end

    subgraph "Data Mapper"
        DM1[SQLAlchemy - Python]
        DM2[Doctrine - PHP]
        DM3[Hibernate - Java]
    end

    subgraph "Modern / Light"
        M1[Prisma - TS/JS]
        M2[Drizzle - TS/JS]
        M3[TypeORM - TS/JS]
        M4[GORM - Go]
        M5[sqlx + sqlc - Go]
    end

    subgraph "Query builders / SQL-first"
        Q1[Kysely - TS]
        Q2[jOOQ - Java]
        Q3[sqlx - Rust]
        Q4[Knex - JS]
    end

2.2 ORM trade-offs

	Pros	Cons
Heavy ORM (TypeORM, Hibernate)	Productivity, magic	Performance black box, complex
Light ORM (Prisma, Drizzle)	Type safety, less magic	Less feature
Query builder (Kysely, jOOQ)	Closer to SQL, type safe	More code
Raw SQL + serializer (sqlc, pg)	Maximum control	More boilerplate

2.3 Prisma (TypeScript leader 2024)

// schema.prisma
model User {
  id        Int      @id @default(autoincrement())
  email     String   @unique
  name      String
  orders    Order[]
  createdAt DateTime @default(now())
}
 
model Order {
  id     Int    @id @default(autoincrement())
  total  Float
  user   User   @relation(fields: [userId], references: [id])
  userId Int
}

const users = await prisma.user.findMany({
  where: { email: { contains: '@x.com' } },
  include: { orders: { where: { total: { gt: 100 } } } },
  take: 10,
});

Pros:

Type-safe (generated types)
Migration manager
Studio (DB GUI)
Good DX

Cons:

Generated client = compile step
Complex queries awkward (raw SQL escape hatch)
Performance overhead for some ops

2.4 Drizzle (rising 2024)

SQL-like TypeScript ORM.

const result = await db.select()
  .from(users)
  .innerJoin(orders, eq(orders.userId, users.id))
  .where(and(eq(users.country, 'VN'), gt(orders.total, 100)))
  .limit(10);

Closer to SQL, less magic. Faster than Prisma.

2.5 SQLAlchemy 2.0 (Python)

from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
 
class User(DeclarativeBase):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(unique=True)
    orders: Mapped[list["Order"]] = relationship(back_populates="user")
 
# Async
async with session.begin():
    stmt = select(User).where(User.email.like("%@x.com")).limit(10)
    result = await session.scalars(stmt)

SQLAlchemy Core (lower-level) vs ORM. 2.0 brings async + clearer API.

2.6 GORM (Go)

type User struct {
    ID    uint
    Email string `gorm:"unique"`
    Orders []Order
}
 
var users []User
db.Where("country = ?", "VN").Preload("Orders").Find(&users)

2.7 sqlc — code-gen from SQL (Go)

-- queries.sql
-- name: GetUser :one
SELECT * FROM users WHERE id = $1;
 
-- name: ListOrdersByUser :many
SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC LIMIT $2;

sqlc generate

Generates Go code:

user, err := queries.GetUser(ctx, 42)
orders, err := queries.ListOrdersByUser(ctx, 42, 10)

Best of both: SQL first, type-safe code.

2.8 ORM choice 2024-2026

Personal opinion:

TypeScript: Drizzle > Prisma > TypeORM (in that order for new projects)
Python: SQLAlchemy 2.0 async > Django ORM
Go: sqlc > GORM
Rust: sqlx > Diesel
Java: jOOQ > Hibernate (modern); Hibernate still dominant legacy

3. N+1 — The Sin

3.1 The pattern

const users = await prisma.user.findMany();  // 1 query
for (const user of users) {
    const orders = await prisma.order.findMany({ where: { userId: user.id } });
    // 1 query per user → N queries
    // Total: 1 + N queries (N+1)
}

100 users → 101 queries. Each ~5ms → 500ms total.

3.2 Fix: Eager loading

const users = await prisma.user.findMany({
    include: { orders: true }  // single query with JOIN
});
// 1 query (or 2: users + orders WHERE userId IN (...))

3.3 Fix: Batch loading (DataLoader)

For GraphQL or non-SQL contexts:

import DataLoader from 'dataloader';
 
const orderLoader = new DataLoader(async (userIds: number[]) => {
    const orders = await prisma.order.findMany({
        where: { userId: { in: userIds } }
    });
    // group by userId, return in same order
    return userIds.map(id => orders.filter(o => o.userId === id));
});
 
// Usage
const userOrders = await Promise.all(users.map(u => orderLoader.load(u.id)));
// Internally batches all IDs into ONE query

DataLoader: batches + caches within request.

3.4 Detection

In dev, log every SQL query. Look for repeated queries.

Prisma: log: ['query'] SQLAlchemy: engine = create_engine(url, echo=True) Postgres: log_min_duration_statement = 0 then count

Tool: pg_stat_statements top calls. If mean_exec_time tiny but calls huge → N+1 suspect.

Lazy loading default → invisible N+1
ORM for ... in entity.children triggers fetch per access
GraphQL resolver per field → naive N+1
Polymorphic relations → multi-query

3.6 ORM-specific fixes

SQLAlchemy:

users = session.query(User).options(joinedload(User.orders)).all()
# Or
users = session.query(User).options(selectinload(User.orders)).all()
# joinedload = LEFT JOIN; selectinload = WHERE id IN (...)

ActiveRecord:

users = User.includes(:orders).all

Django:

users = User.objects.prefetch_related('orders').all()

4. Repository, Unit of Work

4.1 Repository pattern

Abstract DB access behind interface. Domain code doesn’t know SQL.

interface UserRepository {
    findById(id: number): Promise<User>;
    findByEmail(email: string): Promise<User | null>;
    save(user: User): Promise<void>;
}
 
class PostgresUserRepository implements UserRepository {
    async findById(id: number) {
        const row = await db.query('SELECT * FROM users WHERE id = $1', [id]);
        return User.fromRow(row);
    }
    // ...
}

Pros:

Test domain without real DB (mock repository)
Swap DB implementation
Centralize query logic

Cons:

Boilerplate
ORM already does this somewhat
Over-abstraction risk

4.2 Unit of Work

Track changes in transaction, commit atomically.

async function transferMoney(fromId, toId, amount) {
    await prisma.$transaction(async (tx) => {
        await tx.account.update({ where: { id: fromId }, data: { balance: { decrement: amount } } });
        await tx.account.update({ where: { id: toId }, data: { balance: { increment: amount } } });
    });
}

ORM typically implements this. Custom UoW useful for complex domain.

4.3 Domain-Driven Design glance

Entity: has identity (User with ID)
Value object: identified by attributes (Money, Address)
Aggregate: cluster of entities with one root, transactional boundary
Repository: persist aggregates
Service: stateless operations across aggregates

DDD heavy for small apps. Useful when domain complex.

5. CQRS — Command Query Responsibility Segregation

5.1 Concept

Separate write (command) and read (query) models.

graph TB
    Client --> Command[Command<br/>e.g. PlaceOrder]
    Command --> Aggregate[Aggregate / Write Model<br/>strict validation, normalized]
    Aggregate --> WriteDB[(Write DB - Postgres)]
    WriteDB -.events.-> ReadModel[Read Model Builder]
    ReadModel --> ReadDB[(Read DB - denormalized<br/>cache, ES, materialized)]

    Client --> Query
    Query --> ReadDB

    style Command fill:#fff9c4
    style Query fill:#c8e6c9

5.2 When CQRS helps

Read pattern very different from write
Read >>> write (10:1+)
Complex aggregates for write, simple flat for read
Different scaling needs

5.3 When NOT

Simple CRUD apps — overkill
Small team — complexity tax
No clear read/write asymmetry

5.4 Implementation levels

Light CQRS: separate read/write methods, same DB.

class OrderService {
    async placeOrder(cmd: PlaceOrderCommand) { /* command */ }
    async getOrderDetail(id: number) { /* query */ }
}

Medium: write to normalized, read from materialized view.

Heavy: separate DB for read (Postgres for write, ES or denorm DB for read).

5.5 Eventual consistency

CQRS often introduces lag between write commit and read availability.

sequenceDiagram
    User->>Write: PlaceOrder
    Write->>WriteDB: INSERT
    WriteDB-->>Write: OK
    Write->>EventBus: OrderPlaced event
    Write-->>User: 201 Created

    User->>Read: GetOrder (immediately)
    Read->>ReadDB: SELECT
    ReadDB-->>Read: NOT FOUND (event not processed yet)

    Note over EventBus,ReadDB: Async update
    EventBus->>ReadDB: Update read model

UX consideration: show optimistic UI, refetch, or read-after-write to write DB.

6. Event Sourcing

6.1 Concept

Source of truth = events, not current state. State = fold(events).

Traditional:

accounts table: {id: 1, balance: 100}
UPDATE balance = 50
accounts table: {id: 1, balance: 50}    -- lost history

Event-sourced:

events stream:
  AccountCreated(id=1)
  Deposited(id=1, amount=100)
  Withdrawn(id=1, amount=50)

State = events.reduce((s, e) => apply(s, e), initial)
     → {id: 1, balance: 50}

6.2 Benefits

Full audit log (free)
Time travel: state at any point
Replay to debug or rebuild model
Event-driven natural

6.3 Costs

Schema evolution of events tricky
Read requires fold (or snapshot + replay)
Higher complexity
Storage grows linearly

6.4 Snapshots

events 1..1000 → snapshot_1000 = state at event 1000
events 1001..2000 → state = snapshot_1000 then fold(1001..2000)

6.5 Frameworks

EventStoreDB — purpose-built
Apache Kafka + Kafka Streams — log + processor
Axon Framework (Java)
Eventuous (.NET)
DIY on Postgres + outbox table

6.6 When to use

flowchart TD
    A[Consider event sourcing?] --> B{Domain naturally event-driven?<br/>Audit critical?<br/>Time-travel needed?}
    B -->|Yes| C{Team experienced with patterns?}
    C -->|Yes| D[Event sourcing OK]
    C -->|No| E[Start simpler, evolve]
    B -->|No| F[Don't]

    style D fill:#c8e6c9
    style F fill:#c8e6c9

Most apps: don’t need event sourcing. Use simple audit log + CDC.

7. Multi-Tenancy Patterns

7.1 3 Models

graph TB
    subgraph "A. Shared DB / Shared Schema"
        AT1[tenant_id column<br/>every row tagged]
        AT2[Single DB, single schema]
        AT3[Cheap, easy scale<br/>Noisy neighbor risk]
    end

    subgraph "B. Shared DB / Separate Schema"
        BT1[Schema per tenant<br/>tenant_42, tenant_43, ...]
        BT2[Single DB instance]
        BT3[Better isolation<br/>Migration complexity]
    end

    subgraph "C. Separate DB per Tenant"
        CT1[Full isolation]
        CT2[Each tenant own DB instance]
        CT3[Max security<br/>Operational cost high]
    end

7.2 Decision matrix

	Shared (RLS)	Schema-per	DB-per
Cost	Cheapest	Medium	Expensive
Isolation	Logical	Strong	Strongest
Per-tenant customize	Hard	Some	Full
Migration	Easy (1x)	Hard (Nx)	Hard (Nx)
Backup per tenant	Hard	Per schema	Easy
Number tenants	1K-100K	10-1000	1-100
Compliance needs	Low-Medium	Medium-High	Highest (HIPAA, SOX)
Example	Notion, Linear	Some Salesforce	Banking SaaS

7.3 Hybrid pattern (2024 common)

graph TB
    subgraph "Tenant pool 1 - small tenants"
        P1[(DB 1)]
        P1 --> RLS1[RLS by tenant_id]
        T1[tenant 1]
        T2[tenant 2]
        T3[tenant 100]
        T1 -.-> RLS1
        T2 -.-> RLS1
        T3 -.-> RLS1
    end

    subgraph "Tenant pool 2"
        P2[(DB 2)]
        T4[tenant 101]
        T5[tenant 200]
    end

    subgraph "Big tenants"
        DB3[(DB 3 - tenant Acme)]
        DB4[(DB 4 - tenant Globex)]
    end

Small tenants share DB with RLS. Big tenants get dedicated DB.

8. Row Level Security (RLS) Production

8.1 Setup

-- Every table has tenant_id
CREATE TABLE orders (
    id bigserial PRIMARY KEY,
    tenant_id bigint NOT NULL,
    user_id bigint NOT NULL,
    total numeric NOT NULL
);
 
CREATE INDEX idx_orders_tenant ON orders(tenant_id);
 
-- Enable RLS
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
 
-- Policy
CREATE POLICY orders_tenant_isolation ON orders
    USING (tenant_id = current_setting('app.current_tenant_id')::bigint);

8.2 App sets context per request

async def request_handler(request):
    tenant_id = extract_tenant(request)  # from JWT/subdomain/header
 
    async with db.transaction():
        await db.execute("SET LOCAL app.current_tenant_id = $1", str(tenant_id))
        # All subsequent queries auto-filtered
        results = await db.fetch("SELECT * FROM orders LIMIT 10")
        # Returns only orders for tenant_id

SET LOCAL — scope to transaction. Auto-reset on commit/rollback. Works with PgBouncer transaction pooling.

8.3 Bypass for admin

-- Superuser bypasses RLS by default
-- For monitor user, FORCE:
ALTER TABLE orders FORCE ROW LEVEL SECURITY;
 
-- For special role:
CREATE ROLE app_admin BYPASSRLS;

8.4 Migration safety

When adding tenant_id to existing table:

Add column nullable
Backfill
Enable RLS with permissive policy initially
Add NOT NULL
Tighten policy

8.5 Performance considerations

Every query has implicit WHERE tenant_id = ... filter
Index every table by (tenant_id, ...) as composite
Partition by tenant_id for very large tables

8.6 RLS pitfalls

Forget to set app.current_tenant_id → empty results (silent bug)
Forget RLS on new table → leak
Direct DB access (DBA) bypass RLS

→ Test: write integration test that asserts cross-tenant query returns empty.

8.7 RLS for soft-delete

ALTER TABLE users ENABLE ROW LEVEL SECURITY;
 
CREATE POLICY filter_deleted ON users
    USING (deleted_at IS NULL OR current_setting('app.show_deleted', true)::boolean);

Cleaner than WHERE deleted_at IS NULL everywhere.

9. Tenant Routing

9.1 Identification

From request:

Subdomain: acme.app.com → tenant Acme
Path: /t/acme/orders → tenant Acme
JWT claim: {"tenant": "acme"}
Custom header: X-Tenant-ID: 42

9.2 Connection routing

Different patterns:

Pattern A: same conn pool, set tenant_id per request (RLS).

Pattern B: pool per tenant (schema-per-tenant):

pool = pools[tenant_id]  # connection to right schema

Pattern C: tenant → DB instance routing (db-per-tenant):

db_url = tenant_db_map[tenant_id]
async with get_db(db_url).acquire() as conn:
    ...

9.3 Tenant catalog

Master DB / table tracking tenants:

CREATE TABLE tenants (
    id bigserial PRIMARY KEY,
    name text NOT NULL,
    slug text UNIQUE,
    db_host text,
    schema_name text,
    status text,
    created_at timestamptz
);

App boots: load catalog, route requests accordingly.

10. Tenant Migration

10.1 Move tenant from shared → dedicated

When tenant grows. Steps:

Provision new DB
Snapshot tenant data (export by tenant_id)
Import to new DB
Brief downtime: switch routing
Verify
Delete from old DB after grace period

10.2 Schema migration across all tenants

Shared schema: 1 migration covers all. Easy.

Schema-per-tenant: N migrations needed.

Tooling: Run migration in loop over tenants
Risk: partial state if some fail
Mitigation: idempotent migrations, retry

DB-per-tenant: each tenant’s deploy pipeline runs migration.

10.3 Per-tenant rollout

New feature requires schema change. Roll out per tenant for safety:

for tenant in tenants_in_phase_1:
    enable_feature(tenant)
    # monitor

11. Connection Pooling Per Tenant

11.1 Problem

Shared pool with thousands of tenants → noisy neighbor: one tenant slow query blocks others.

11.2 Patterns

A. Single pool + statement_timeout — minimal isolation, but timeout prevents runaway.

B. Pool per workload tier:

Pool A: critical API
Pool B: background jobs
Pool C: reporting

C. Pool per big tenant:

Top 10 tenants: dedicated pool of 5 conns each
Rest: shared pool of 50

D. Supavisor multi-tenant pool (Supabase) — designed for SaaS:

Native tenant-aware pooling
Per-tenant rate limits

11.3 Resource quotas

Postgres 14+:

ALTER USER tenant_42_user SET statement_timeout = '5s';
ALTER USER tenant_42_user SET idle_in_transaction_session_timeout = '30s';

Per-tenant role with limits.

12. Patterns by Use Case

12.1 Audit log

-- Audit table per tenant or shared?
CREATE TABLE audit_log (
    id bigserial,
    tenant_id bigint NOT NULL,
    user_id bigint,
    action text,
    target_id bigint,
    metadata jsonb,
    occurred_at timestamptz DEFAULT now()
);
 
CREATE INDEX ON audit_log (tenant_id, occurred_at DESC);

12.2 Feature flags per tenant

CREATE TABLE tenant_features (
    tenant_id bigint,
    feature text,
    enabled boolean DEFAULT false,
    config jsonb,
    PRIMARY KEY (tenant_id, feature)
);

App checks: if (tenantFeatures[tenant][feature]) { ... }

12.3 Billing per tenant

Track usage (events, queries, storage), aggregate, bill.

12.4 Search per tenant

Elasticsearch index per tenant (small) or shared index with filter on tenant_id (large).

13. Anti-patterns

Pattern	Why bad	Fix
`application.tenant_id = something` forgotten	Empty results bug	Test cross-tenant isolation in CI
Cross-tenant queries (accidental)	Data leak	RLS or always parameterized tenant_id
Schema-per-tenant for 1000s of tenants	Migration nightmare	Shared with RLS
DB-per-tenant for 100K SaaS	Operational hell	Hybrid pool
ORM lazy loading default	N+1 invisible	Eager / DataLoader
Use ORM for analytics	Slow, wrong tool	Raw SQL or warehouse
Event sourcing for CRUD	Overkill	Plain CRUD
CQRS for 10-endpoint app	Complexity tax	Direct queries
No connection pooling per tier	Noisy neighbor	Pool per tier/tenant
Tenant ID as string everywhere	Confusion, joins awkward	bigint surrogate + slug separate

14. Lab

14.1 Day 1: ORM comparison

Same simple CRUD in Prisma, Drizzle, sqlc. Compare boilerplate, performance, type safety.

14.2 Day 2: N+1 detection

Build endpoint returning users with orders. Naive (N+1). Add Prisma include. Compare query log. Add DataLoader.

14.3 Day 3: Repository pattern

Refactor ORM-heavy code to use Repository. Run tests.

14.4 Day 4: CQRS light

Build feature with separate write/read services. Use Postgres LISTEN/NOTIFY or outbox for read model.

14.5 Day 5: RLS setup

Setup multi-tenant DB with RLS. Try queries with/without setting context. Confirm isolation.

14.6 Day 6: Tenant routing

Implement subdomain-based tenant routing in Express/FastAPI. Test 3 tenants in parallel.

14.7 Day 7: Capstone

Build complete SaaS skeleton:

Tenant signup
Tenant catalog
RLS-protected tables
Background jobs per-tenant
Audit log
Feature flags

15. Self-check

ORM categories — 4 styles, pick mỗi cái khi nào?
N+1 — describe + 3 fix methods?
DataLoader vs Eager loading — pick mỗi khi nào?
CQRS — khi nào justified? Khi nào overkill?
Event sourcing — pros + cons?
Multi-tenant 3 patterns — pick cho 100 tenants × 1GB each?
RLS setup — 4 steps?
SET LOCAL vs SET — vì sao SET LOCAL for transaction pooling?
Tenant routing — 4 methods?
Schema migration across 1000 tenants — strategy?

16. Tiếp theo

Phase 5 complete. Main weekly chapters DONE. Tiếp theo: Bonus chapters (8 topics) + Case studies (5 designs).

Tuần 16 hoàn thành. App patterns close the loop. Cập nhật: 2026-05-16

lthieu's notes

Explorer

Tuan-16-ORM-CQRS-Multi-Tenancy