Bonus 3 — Embedded DBs: SQLite, DuckDB, RocksDB, LanceDB

“Server DB không phải answer cho mọi case. SQLite chạy trên 4 tỷ device. DuckDB là ‘SQLite for analytics’. RocksDB là engine behind nhiều big-data products. Hiểu khi nào embedded thắng server.”

Tags: database sqlite duckdb rocksdb embedded lancedb Thời lượng: 5-6 ngày Prerequisites: Tuan-01-DB-Internals-Refresh Liên quan: Tuan-14-OLAP-Columnar-ClickHouse · Tuan-15-Vector-DB-AI


1. Embedded DB Definition

  • Runs in-process with app (linked library)
  • No network roundtrip
  • Single-writer typical (some support multi-writer)
  • File-based persistence
  • App owns lifecycle (start/stop/backup)
graph LR
    subgraph "Server DB"
        App1[App] -.network.-> Server[(DB Server<br/>separate process)]
        Server --> Disk1[Disk]
    end

    subgraph "Embedded DB"
        App2[App + DB library] --> File[(DB file)]
    end

    style App2 fill:#c8e6c9

2. SQLite — Most Deployed DB on Earth

2.1 Where SQLite runs

  • Every iPhone, Android device
  • All major browsers (Chrome, Firefox, Safari)
  • Boeing/Airbus airplane systems
  • Most apps with local data (Notion local cache, Slack offline, IntelliJ, VSCode)
  • Test DBs, CI pipelines
  • Embedded devices, IoT

Estimated 4+ billion deployments. More than every other DB combined.

2.2 Why SQLite is great

  • Public domain (most permissive license possible)
  • Single file = single DB (easy backup, transport)
  • ACID (with WAL mode), full SQL
  • Battle-tested 30 years
  • ~150KB binary
  • Quality: 100% MC/DC test coverage (NASA-level), TH3 test harness has hundreds of millions of test cases

2.3 Limitations

  • Single writer at a time (multi-reader OK)
  • No network (in-process only — kindof, via socket forwarders)
  • Theoretical max ~281 TB (4KB pages × 2^48), practical less due to single-writer
  • Not for high-concurrency server workload
  • No row-level locking (database lock for writers)
  • No procedures / triggers as rich as Postgres

2.4 SQLite at scale 2024-2026

Litestream

Continuous WAL replication SQLite → S3 (or compatible).

# Background daemon replicates WAL changes
litestream replicate /path/to/db.sqlite s3://bucket/db

Recovery on container start:

litestream restore -o /path/to/db.sqlite s3://bucket/db

Pattern: SQLite + Litestream = read-heavy web app on cheap container.

LiteFS

Fly.io’s distributed SQLite layer with leader-follower replication.

Status update 2024: Fly.io pivoted away from LiteFS as primary product. Project still maintained but less active. Use cautiously for new projects — Litestream more reliable.

Cloudflare D1

Managed SQLite at edge. Multi-region replication. Pay-per-query model.

Good for: low-latency global apps where data fits SQLite scale.

Turso

Managed SQLite (LibSQL fork) with replication + edge. Postgres-style billing.

2.5 SQLite mode tuning

PRAGMA journal_mode = WAL;        -- allow concurrent reads while writing
PRAGMA synchronous = NORMAL;       -- faster, slightly less safe
PRAGMA cache_size = -200000;       -- 200MB cache (negative = KB)
PRAGMA foreign_keys = ON;          -- NOT default!
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 30000000000;    -- 30GB memory-mapped I/O
PRAGMA busy_timeout = 5000;        -- wait 5s for lock

WAL mode crucial for any concurrent app.

2.6 SQLite extensions

  • JSON1 — JSON functions (built-in modern)
  • FTS5 — full-text search
  • R*Tree — spatial
  • SpatiaLite — full GIS
  • sqlite-vec — vector search (2024)
  • sqlite-vss — vector via FAISS
  • rqlite — distributed SQLite over Raft

2.7 SQLite vector search (2024)

sqlite> .load sqlite-vec0
sqlite> CREATE VIRTUAL TABLE vec USING vec0(embedding float[1536]);
sqlite> INSERT INTO vec VALUES (1, '[0.1, 0.2, ...]');
sqlite> SELECT rowid, distance FROM vec WHERE embedding MATCH '[0.1, 0.2, ...]' ORDER BY distance LIMIT 10;

For small (< 1M vectors), in-process: fast + simple + no server.

2.8 When SQLite

  • Mobile, desktop apps (offline-first)
  • Single-server web app low traffic
  • Read-heavy with Litestream
  • CI test DB (faster than Postgres)
  • Local dev replace heavy Postgres
  • Edge deployment (Cloudflare D1)
  • AI agent state (LangChain, AutoGen)

2.9 When NOT SQLite

  • Multi-process write (multi-server web)
  • High concurrent write (>100/s sustained)
  • Need rich features (RLS, advanced types)
  • Need network access

3. DuckDB — SQLite for Analytics

3.1 Concept

In-process columnar OLAP. C++ library, embeddable from Python, R, Node, Rust, Java, Go.

import duckdb
con = duckdb.connect("analytics.db")  # or :memory:
con.execute("CREATE TABLE events AS SELECT * FROM 'events.parquet'")
df = con.execute("SELECT date, count(*) FROM events GROUP BY date").df()

Vs ClickHouse: server vs embedded. Vs SQLite: row vs columnar; analytics vs transactional.

3.2 Use cases

  • Notebook analytics (Jupyter, R)
  • ETL transformations (instead of Spark for medium data)
  • Local file analytics on Parquet/CSV/JSON
  • Embedded analytics in apps
  • BI tool backends (Metabase, Lightdash)
  • LLM agents doing data tasks
  • Test fixture for analytical queries

3.3 Strengths

  • Single binary
  • No server needed
  • Read Parquet/Arrow/CSV/JSON directly
  • Full SQL: joins, window functions, CTEs
  • Vectorized execution
  • Often faster than Postgres for analytics on local files
  • Plays well with Pandas/Polars dataframes

3.4 Limitations

  • Single-process (mostly)
  • Memory pressure on huge data (streaming exec but some ops need RAM)
  • Not designed for OLTP
  • Single writer

3.5 DuckDB Features 2024-2026

Extensions ecosystem 2024-2026:

  • httpfs — read S3/HTTPS directly
  • postgres — query Postgres tables as DuckDB tables
  • mysql — query MySQL
  • sqlite — query SQLite (interop)
  • json — JSON support
  • parquet — Parquet read/write
  • iceberg — Apache Iceberg (2024)
  • delta — Delta Lake (2024)
  • spatial — GIS
  • fts — full-text search
  • vss — vector similarity (2024)
  • excel — read xlsx
INSTALL httpfs;
LOAD httpfs;
SELECT * FROM 's3://bucket/events.parquet' LIMIT 10;

3.6 MotherDuck

Cloud DuckDB service. Hybrid local + cloud:

  • Local: instant queries on local data
  • Cloud: scale to TB on managed compute
  • Federated: query both with single SQL

Pattern: dev local, prod cloud. Same code.

3.7 DuckDB vs ClickHouse

DuckDBClickHouse
DeploymentEmbedded libraryServer cluster
ConcurrencySingle-processMulti-user, many connections
Data sizeSingle machine (TB)Cluster (PB)
Best forAnalyst notebooks, embedded analytics, ETLProduction analytics service
Setuppip install duckdbCluster setup
CostFree, no infraServer cost

3.8 Reading from cloud

-- Direct from S3
SELECT count(*) FROM 's3://bucket/events/*.parquet';
 
-- From Postgres
INSTALL postgres;
ATTACH 'postgres://...' AS pg;
SELECT count(*) FROM pg.users;

3.9 Performance tricks

-- Set threads
SET threads = 8;
 
-- Memory limit
SET memory_limit = '4GB';
 
-- Out-of-core execution (spill to disk)
SET temp_directory = '/tmp/duckdb';
 
-- Profile
EXPLAIN ANALYZE SELECT ...;

4. RocksDB — KV Storage Primitive

4.1 What RocksDB is

LSM-tree key-value storage engine. Embeddable C++ library. Forked from LevelDB (Google) by Facebook.

Not a database per se — a storage engine that other databases build on.

4.2 Who uses RocksDB

  • CockroachDB — uses Pebble (RocksDB inspired, Go rewrite)
  • TiDB / TiKV — uses RocksDB underneath
  • MyRocks — MySQL storage engine alternative to InnoDB (Facebook)
  • YugabyteDB — DocDB built on RocksDB
  • Cassandra (some setups via sidecar)
  • Apache Kafka — state stores
  • Apache Flink — RocksDB state backend
  • Apache Spark structured streaming
  • Solana validator — account state
  • Many distributed systems

4.3 LSM-tree mechanics

graph TB
    Write[Write] --> WAL[Write-Ahead Log]
    Write --> MemTable[MemTable in RAM<br/>sorted skip list]
    MemTable -->|flush when full| L0[Level 0 SST files]
    L0 -->|compaction| L1[Level 1]
    L1 -->|compaction| L2[Level 2]
    L2 -->|compaction| LN[Level N]

    Read[Read] --> MemTable
    Read --> L0
    Read --> L1
    Read --> L2
    Read --> LN
  • Write: append to WAL (durability) + MemTable (sorted in-memory)
  • MemTable full: flush to L0 SST file on disk
  • Background compaction: merges files level by level, deduplicates

4.4 Trade-offs vs B-tree

B-tree (InnoDB, Postgres)LSM (RocksDB)
WriteIn-place (random I/O)Append (sequential I/O)
Read1-2 disk I/O (log N)Possibly N levels
Write throughputLowerHigher
Read throughputHigherLower
Write ampLowerHigher (compaction rewrites)
Space ampLowerHigher (multiple versions)
Best forOLTP balancedWrite-heavy, append-mostly

4.5 RocksDB tuning knobs

  • write_buffer_size — MemTable size
  • max_write_buffer_number — how many MemTables
  • level0_file_num_compaction_trigger — when to compact L0
  • target_file_size_base — SST file size
  • bloom_filter_bits_per_key — bloom filter density (reduce reads)
  • compression — Snappy / LZ4 / Zstd

Each knob trades read vs write vs space.

4.6 When use RocksDB directly

Rare for app code. Common for:

  • Building storage layer of distributed system
  • Embedded high-write KV (queue, state)
  • Workloads needing tunable LSM
  • KV store inside app for state (Flink, Kafka Streams)

App devs: probably never directly. Important to know it exists when reading “X DB uses RocksDB”.


5. LanceDB — Embedded Vector DB

5.1 Concept

Lance file format + LanceDB query engine. Apache Arrow-based. Embedded mode.

import lancedb
db = lancedb.connect("./data")
table = db.create_table("docs", data=[
    {"id": 1, "embedding": [0.1, 0.2, ...], "text": "..."}
])
results = table.search([0.1, 0.2, ...]).limit(10).to_pandas()

5.2 Strengths

  • Single library, no server
  • Versioning + time travel (like Iceberg)
  • Vector + scalar filter
  • Fast on local SSD
  • S3 support (sync to cloud)

5.3 Use case

  • AI agents with local memory
  • Local RAG (over your notes, docs)
  • ML feature store (embedded mode)
  • Prototyping vector workflows

5.4 vs pgvector / Qdrant

LanceDBpgvectorQdrant
DeployEmbeddedPostgres extServer
VersioningBuilt-inNoNo
Setuppip installPostgresDocker
ScaleLocal-firstMedServer-side
ProductionOK for read-mostlyGood generalBest for filtered search

6. Other Embedded Options

6.1 LMDB

Lightning Memory-Mapped Database. B-tree, memory-mapped.

Use cases:

  • State stores (etcd v2 used it)
  • Caches in-process
  • Bitcoin Core
  • OpenLDAP backend

Strengths: ACID, very fast reads, single-writer/multi-reader.

6.2 BoltDB / bbolt

Go-native B+tree KV store. Simple, single-writer.

Used by: etcd v3, Consul, InfluxDB v1.

Strengths: pure Go (no cgo), simple API.

6.3 BadgerDB

Go-native LSM. RocksDB equivalent for Go ecosystem.

Used by: Dgraph, IPFS.

Strengths: faster pure-Go alternative to RocksDB binding.

6.4 LevelDB

Google’s original LSM library. RocksDB descendant.

Used by: IPFS, blockchain projects, Chrome IndexedDB internals.

6.5 sled

Modern Rust embedded DB. Lock-free, B-epsilon tree. Promising but less battle-tested.

6.6 Redb

Pure Rust embedded KV store. Active development 2024.


7. Embedded vs Server Decision

flowchart TD
    A[Need DB] --> B{Multi-process / multi-host access?}
    B -->|Yes| C[Server DB]
    B -->|No - single app| D{Concurrency level?}
    D -->|Low to medium <100 writes/s| E[Embedded OK]
    D -->|High concurrent writes| C
    A --> F{Network latency matters?}
    F -->|Yes - sub-ms required| E
    F -->|No| C
    A --> G{Operational simplicity?}
    G -->|Maximum simple| E
    G -->|OK ops team| C

    style E fill:#c8e6c9
    style C fill:#fff9c4

8. Production Patterns

8.1 SQLite + Litestream for low-traffic SaaS

graph LR
    App[App container] --> SQLite[(SQLite file)]
    SQLite -.replicate WAL.-> Litestream[Litestream]
    Litestream -.upload.-> S3[(S3)]

    S3 -.restore on container start.-> SQLite

Patterns:

  • Containerized app with SQLite local
  • Litestream replicates WAL to S3 continuously (latency ~1s)
  • Crash/restart → restore latest from S3
  • Cost: 50/month managed Postgres
  • Latency: 0 (in-process)

Limits: single-writer. OK for: personal SaaS, marketing site, dashboards, prototype.

8.2 DuckDB for embedded analytics

App has local analytics needs (admin dashboard, in-app reports):

# In Python app
import duckdb
con = duckdb.connect(":memory:")
con.execute("CREATE VIEW orders AS SELECT * FROM read_postgres('...', 'orders')")
report = con.execute("SELECT date, count(*) FROM orders GROUP BY date").df()

No separate analytical DB. Fast.

8.3 SQLite as test fixture

# Production: Postgres
# Test: SQLite (faster setup, parallel test isolated)
DB_URL = "sqlite:///:memory:" if TESTING else "postgresql://..."

Caveat: SQL dialect differences. SQLAlchemy abstracts. Or use real Postgres in tests (testcontainers) for fidelity.

8.4 LanceDB for local LLM agent

# AI agent with local memory + RAG
import lancedb
db = lancedb.connect("./agent_memory")
mem = db.open_table("memory")
 
# Retrieve relevant memories
relevant = mem.search(query_embedding).limit(5).to_pandas()
 
# Use as LLM context

No server, no API key, all local. Perfect for dev.

8.5 DuckDB in data pipeline

# Replace Spark for medium data
import duckdb
duckdb.sql("""
    COPY (
        SELECT date_trunc('day', ts) as day, count(*) as events
        FROM read_json_auto('logs/*.json')
        GROUP BY day
    ) TO 'output.parquet' (FORMAT PARQUET)
""")

Single laptop processes 100GB/hour. No cluster.


9. Tooling

  • DBeaver — SQLite, DuckDB GUI
  • sqlite-web — SQLite web UI
  • DuckDB CLI — interactive shell
  • DBBrowser for SQLite — friendly GUI
  • DataGrip — paid commercial
  • MotherDuck — cloud DuckDB UI

10. Anti-patterns

PatternWhy bad
SQLite for high-concurrent write web appSingle writer locks app
SQLite without WAL modeConcurrent reads blocked
SQLite + foreign_keys = OFF (default!)Referential integrity ignored
DuckDB for OLTP (frequent small writes)LSM-style not designed
DuckDB as long-running serviceNot designed multi-user concurrent
Direct RocksDB usage in app codeReinventing wheel
LMDB without sizing rightMap file grows unbounded if not sized
Treating embedded as drop-in for PostgresFeature gap
Backup by file copy while runningWAL not consistent; use VACUUM INTO or hot backup tool

11. Lab

Day 1: SQLite WAL benchmarking

Run pgbench-equivalent workload on SQLite WAL vs default journal. Compare TPS.

# SQLite + bench
sqlite3 test.db
PRAGMA journal_mode = WAL;
-- ... run workload

Day 2: Litestream backup/restore

Setup app writing to SQLite + Litestream to local S3 (minio). Kill, restore.

litestream replicate ./db.sqlite file://./backup
litestream restore -o ./db.sqlite file://./backup

Day 3: DuckDB pipeline

Load 100M-row dataset (NYC Taxi). Build aggregation. Compare with Postgres on same data + same query.

Day 4: DuckDB + S3

Read Parquet directly from S3:

INSTALL httpfs; LOAD httpfs;
SELECT count(*) FROM 's3://bucket/data.parquet';

Day 5: RocksDB intro

Use Python/Go RocksDB binding for KV store. Tune compaction. Watch SSTs build.

Day 6: LanceDB RAG

Build local RAG with LanceDB + sentence-transformers + local LLM. Query offline.

Day 7: Migration exercise

Move small Postgres DB to SQLite + Litestream. Estimate cost savings. Compare query latency.


12. Self-check

  1. SQLite + WAL mode — vì sao mandatory cho concurrent?
  2. Litestream giải quyết vấn đề gì với SQLite?
  3. DuckDB vs ClickHouse — pick mỗi cái khi nào?
  4. RocksDB là DB hay storage engine? Ai dùng nó?
  5. LSM vs B-tree — write/read trade-off?
  6. SQLite max DB size theoretical?
  7. LanceDB vs pgvector — pros/cons embedded approach?
  8. SQLite at edge (Cloudflare D1) — use case?
  9. DuckDB extensions — name 5?
  10. Khi nào embedded thắng server?

13. Tiếp theo

Tuan-Bonus-CDC-Debezium

Cập nhật: 2026-05-16