Index Decision Tree
Quick reference: bài toán nào dùng index nào.
Tags: attachment cheatsheet indexing Liên quan: Tuan-03-Indexing-Mastery
1. Master Decision Tree
flowchart TD Q[Bạn có query slow?] --> A{Bottleneck<br/>seq scan?} A -->|No| Skip[Không cần index mới] A -->|Yes| B{Query type?} B -->|Equality + range scalar| BTREE[B-tree<br/>default] B -->|Array containment| GIN[GIN] B -->|JSONB containment| GIN B -->|Full-text search| GIN_FTS[GIN on tsvector] B -->|Fuzzy text similarity| TRGM[GIN/GiST + pg_trgm] B -->|Geometric / spatial| GIST[GiST + PostGIS] B -->|Range overlap| GIST B -->|Append-only time-series huge| BRIN[BRIN] B -->|Vector similarity| HNSW[HNSW / IVFFlat pgvector] B -->|IP prefix tree| SPGIST[SP-GiST] B -->|Multi-column equality bag| BLOOM[Bloom filter ext] style BTREE fill:#c8e6c9 style GIN fill:#c8e6c9 style HNSW fill:#fff9c4
2. B-tree — When + What Column Order
When:
WHERE col = XequalityWHERE col > XrangeWHERE col BETWEEN X AND YWHERE col LIKE 'prefix%'WHERE col IS NULL/IS NOT NULLORDER BY col- JOIN on col
Multi-column order — ESR rule:
- Equality columns first
- Sort columns next
- Range columns last
-- Query
SELECT * FROM orders WHERE user_id = $1 AND status = 'pending' ORDER BY created_at DESC;
-- Optimal: (user_id, status, created_at DESC)
CREATE INDEX ON orders (user_id, status, created_at DESC);3. Specialized Index Cheatsheet
| Need | Index | Example |
|---|---|---|
| Tags / labels array | GIN | CREATE INDEX ON t USING gin(tags); WHERE tags @> ARRAY['rust'] |
| JSONB partial match | GIN | CREATE INDEX ON t USING gin(meta); WHERE meta @> '{"k":"v"}' |
| JSONB specific path | GIN with path_ops | CREATE INDEX ON t USING gin((meta->'k')) |
| Full-text English | GIN tsvector | CREATE INDEX ON t USING gin(to_tsvector('english', body)) |
| ILIKE / fuzzy | pg_trgm GIN | CREATE INDEX ON t USING gin(name gin_trgm_ops) |
| Geo within distance | PostGIS GiST | CREATE INDEX ON places USING gist(location) |
| K-nearest geo | PostGIS GiST | Same, used with <-> |
| Range overlap | GiST | CREATE INDEX ON bookings USING gist(during) |
| Time-series append | BRIN | CREATE INDEX ON events USING brin(created_at) |
| Vector embedding | HNSW (pgvector) | CREATE INDEX ON docs USING hnsw(embedding vector_cosine_ops) |
4. Covering / Partial / Expression
Covering (INCLUDE)
CREATE INDEX ON orders (user_id) INCLUDE (total, created_at);
-- Enable index-only scan if VM cleanWhen: hot query selecting few columns alongside indexed key.
Partial
CREATE INDEX ON users (email) WHERE deleted_at IS NULL;
CREATE INDEX ON orders (created_at) WHERE status = 'pending';When: filter always present + small subset of table.
Expression
CREATE INDEX ON users (lower(email));
CREATE INDEX ON events ((data->>'country'));When: query uses function — must match exact expression.
5. Anti-patterns Quick Check
- ❌ Index on low-cardinality alone (
status3 values) — seq scan faster - ❌ Index every FK column blindly — write cost
- ❌
LIKE '%suffix'expecting B-tree — won’t help, use trigram - ❌ Function in WHERE not matching expression index
- ❌ Many indexes on hot-write table — kills throughput
- ❌ INCLUDE huge TEXT column — index bloats
6. Workflow When Query Slow
flowchart TD A[Slow query] --> B[EXPLAIN ANALYZE] B --> C{Bottleneck node?} C -->|Seq Scan| D[Add index matching WHERE] C -->|Index Scan + Filter big| E[Add INCLUDE / composite] C -->|Sort spill| F[Match index ORDER] C -->|Hash spill| G[Increase work_mem] C -->|Heap fetches in IO scan| H[VACUUM table] D --> Test[Test new plan] E --> Test F --> Test G --> Test H --> Test Test --> I{Better?} I -->|Yes| J[Keep, monitor] I -->|No| K[Try alternative]
7. Index Maintenance
-- Bloat check
SELECT * FROM pgstattuple('idx_orders_user'); -- avg_leaf_density >50% OK
-- Rebuild (online)
REINDEX INDEX CONCURRENTLY idx_orders_user;
-- Unused (idx_scan = 0 over time)
SELECT indexrelname FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%_pkey';
-- Build new (online)
CREATE INDEX CONCURRENTLY idx_new ON orders(...);8. Multi-DB Index Quick Compare
| Postgres | MySQL InnoDB | MongoDB | DynamoDB | |
|---|---|---|---|---|
| Default | B-tree | B+ clustered (PK) | B-tree | Hash + sort |
| Multi-column | Yes (ESR) | Yes (leftmost) | Yes | LSI/GSI |
| Partial | Yes | Functional only | Yes | Sparse GSI |
| Expression/Function | Yes | Yes (8.0+) | Yes | No |
| Full-text | GIN tsvector | InnoDB FTS | Text index | No native |
| Geo | GiST + PostGIS | Spatial | 2dsphere | No native |
| Vector | HNSW pgvector | No native | Atlas Vector | No |
| Concurrent build | CONCURRENTLY | Online DDL | Background | N/A |
Cập nhật: 2026-05-16