References — Master Index

Reputable books, papers, blogs, and courses for Database Mastery, organized by topic.

Tags: reference bibliography Related: MOC-Database-Mastery · Roadmap


1. Foundational Books

Database internals

  • Database Internals — Alex Petrov (O’Reilly, 2019). The bible for storage engines, transactions, and distributed systems.
  • Designing Data-Intensive Applications — Martin Kleppmann (O’Reilly, 2017). The gold standard for data systems thinking.
  • The Internals of PostgreSQL — Hironobu Suzuki (free online). https://www.interdb.jp/pg/
  • PostgreSQL 14 Internals — Egor Rogov (free PDF). https://postgrespro.com/community/books/internals
  • Readings in Database Systems (Red Book). http://www.redbook.io/. Curated paper list.

SQL & query

  • Use The Index, Luke! — Markus Winand (free online). https://use-the-index-luke.com/. Must-read.
  • SQL Antipatterns — Bill Karwin (2nd ed 2022). Anti-pattern catalog.
  • SQL Performance Explained — Markus Winand.
  • The Art of PostgreSQL — Dimitri Fontaine.

Specific DBs

  • High Performance MySQL — Botros, Tinley (4th ed). Earlier editions by Schwartz, Zaitsev, Tkachenko.
  • Cassandra: The Definitive Guide — Carpenter, Hewitt (3rd ed 2020).
  • MongoDB: The Definitive Guide — Bradshaw, Brazil, Chodorow (3rd ed).
  • Redis in Action — Josiah Carlson. Older, but its Redis fundamentals are still solid.
  • The DynamoDB Book — Alex DeBrie. https://www.dynamodbbook.com/

2. Papers (Foundational)

Storage & engines

  • The Log-Structured Merge-Tree (LSM-Tree) — O’Neil et al, 1996.
  • LevelDB / RocksDB design docs (Google, Facebook).
  • The Bw-Tree — Levandoski et al, Microsoft.

Transactions

  • A Critique of ANSI SQL Isolation Levels — Berenson et al, 1995.
  • Serializable Snapshot Isolation in PostgreSQL — Ports & Grittner, VLDB 2012.
  • Spanner: Google’s Globally Distributed Database — Corbett et al, OSDI 2012.
  • CockroachDB: The Resilient Geo-Distributed SQL Database — Taft et al, SIGMOD 2020.

Distributed systems (DB-relevant)

  • Dynamo: Amazon’s Highly Available Key-value Store — DeCandia et al, SOSP 2007.
  • Bigtable: A Distributed Storage System for Structured Data — Chang et al, OSDI 2006.
  • The Snowflake Elastic Data Warehouse — Dageville et al, SIGMOD 2016.
  • F1: A Distributed SQL Database That Scales — Shute et al, VLDB 2013.
  • In Search of an Understandable Consensus Algorithm (Raft) — Ongaro & Ousterhout, USENIX 2014.
  • Paxos Made Simple — Lamport, 2001.

Indexes & data structures

  • Skiplist — Pugh, 1990.
  • HNSW (Hierarchical Navigable Small World) — Malkov & Yashunin, 2016. Vector index.
  • DiskANN — Subramanya et al, Microsoft 2019.
  • R-tree (spatial) — Guttman, 1984.

Analytics

  • C-Store: A Column-oriented DBMS — Stonebraker et al, VLDB 2005.
  • Vectorwise/MonetDB papers — columnar execution.

CDC / Streaming

  • The Log: What Every Software Engineer Should Know About Real-time Data’s Unifying Abstraction — Jay Kreps.
  • Kafka: a Distributed Messaging System for Log Processing — Kreps et al, NetDB 2011.

3. Documentation

Postgres

MySQL

Redis / Valkey

DynamoDB

MongoDB

Cassandra / ScyllaDB

Elasticsearch / OpenSearch

ClickHouse

  • ClickHouse docs. https://clickhouse.com/docs
  • Aleksey Milovidov talks — YouTube, CMU DB Group talks
  • Altinity blog — ClickHouse in production
  • PostHog engineering — ClickHouse use case

Vector DBs

Lakehouse / Iceberg

CDC / Debezium


4. Courses

University

  • CMU 15-445 Database Systems — Andy Pavlo. https://15445.courses.cs.cmu.edu/. Gold standard free course.
  • CMU 15-721 Advanced Database Systems — Andy Pavlo. Modern OLAP and distributed systems.
  • MIT 6.824 Distributed Systems — Robert Morris.
  • Stanford CS 245 Database Systems — Peter Bailis.

Online / Industry

  • MongoDB University — free.
  • ScyllaDB University — free.
  • Snowflake University — free.
  • DataStax Academy — free.
  • Databricks Academy — partially free.

5. Talks (YouTube)

  • Bruce Momjian — PostgreSQL internals slides + talks. https://momjian.us/main/presentations/internals.html
  • Andy Pavlo — Database Lectures (CMU). YouTube playlist.
  • Aleksey Milovidov — ClickHouse internals. Multiple talks.
  • Salvatore Sanfilippo — Redis design. Various.
  • Martin Kleppmann — DDIA talks. ICDE, Strange Loop.
  • Brendan Gregg — Performance. USENIX. (Performance fundamentals)
  • Aphyr (Kyle Kingsbury) — Jepsen talks. Distributed safety analysis.

6. Blogs (Engineering Deep Dives)


7. Tools Reference

Postgres specific

Migration tools

Schema design

  • dbdiagram.io — visual schema
  • DBML — DSL for schemas
  • Prisma Schema — ORM-first

Modeling

  • NoSQL Workbench (AWS) — DynamoDB design
  • MongoDB Compass — schema analyzer

8. Newsletters


9. Communities

  • r/PostgreSQL, r/Database
  • Postgres Discord, Slack (Postgres community)
  • Hacker News — search for “Postgres”, “ClickHouse”
  • Stack Overflow tags — postgresql, sql, indexing

10. Production Stories (Must-Read Case Studies)

  • Stripe Engineering Blog — Postgres at fintech scale
  • Notion Engineering — application-level Postgres sharding
  • GitHub Engineering — MySQL migration (Vitess), gh-ost
  • Shopify Engineering — MySQL Vitess sharding
  • Figma Engineering — Postgres horizontal partitioning
  • Discord Engineering — Cassandra → ScyllaDB
  • Mailchimp — Postgres XID wraparound incident
  • Sentry 2015 XID wraparound — incident report
  • Cloudflare engineering blog — D1 (SQLite at edge)

Updated: 2026-05-16. Extended incrementally as new material turns up.