References — Master Index
Reputable books, papers, blogs, and courses for Database Mastery. Organized by topic.
Tags: reference bibliography Related: MOC-Database-Mastery · Roadmap
1. Foundational Books
Database internals
- Database Internals — Alex Petrov (O’Reilly, 2019). The bible for storage engines, transactions, and distributed systems.
- Designing Data-Intensive Applications — Martin Kleppmann (O’Reilly, 2017). Gold standard for data systems thinking.
- The Internals of PostgreSQL — Hironobu Suzuki (free online). https://www.interdb.jp/pg/
- PostgreSQL 14 Internals — Egor Rogov (free PDF). https://postgrespro.com/community/books/internals
- Readings in Database Systems (Red Book) — http://www.redbook.io/. Curated paper list.
SQL & query
- Use The Index, Luke! — Markus Winand (free online). https://use-the-index-luke.com/. Must-read.
- SQL Antipatterns — Bill Karwin (2nd ed 2022). Anti-pattern catalog.
- SQL Performance Explained — Markus Winand.
- The Art of PostgreSQL — Dimitri Fontaine.
Specific DBs
- High Performance MySQL — Botros, Tinley (4th ed, 2021). The 3rd ed (2012) is by Schwartz, Zaitsev, Tkachenko.
- Cassandra: The Definitive Guide — Carpenter, Hewitt (3rd ed 2020).
- MongoDB: The Definitive Guide — Bradshaw, Brazil, Chodorow (3rd ed).
- Redis in Action — Josiah Carlson. Older, but the Redis fundamentals are solid.
- The DynamoDB Book — Alex DeBrie. https://www.dynamodbbook.com/
2. Papers (Foundational)
Storage & engines
- The Log-Structured Merge-Tree (LSM-Tree) — O’Neil et al, 1996.
- LevelDB / RocksDB design docs (Google, Facebook).
- The Bw-Tree — Levandoski et al, Microsoft.
Transactions
- A Critique of ANSI SQL Isolation Levels — Berenson et al, 1995.
- Serializable Snapshot Isolation in PostgreSQL — Ports & Grittner, VLDB 2012.
- Spanner: Google’s Globally Distributed Database — Corbett et al, OSDI 2012.
- CockroachDB: The Resilient Geo-Distributed SQL Database — Taft et al, SIGMOD 2020.
Distributed systems (DB-relevant)
- Dynamo: Amazon’s Highly Available Key-value Store — DeCandia et al, SOSP 2007.
- Bigtable: A Distributed Storage System for Structured Data — Chang et al, OSDI 2006.
- The Snowflake Elastic Data Warehouse — Dageville et al, SIGMOD 2016.
- F1: A Distributed SQL Database That Scales — Shute et al, VLDB 2013.
- In Search of an Understandable Consensus Algorithm (Raft) — Ongaro & Ousterhout, USENIX 2014.
- Paxos Made Simple — Lamport, 2001.
Indexes & data structures
- Skiplist — Pugh, 1990.
- HNSW (Hierarchical Navigable Small World) — Malkov & Yashunin, 2016. Vector index.
- DiskANN — Subramanya et al, Microsoft 2019.
- R-tree (spatial) — Guttman, 1984.
Analytics
- C-Store: A Column-oriented DBMS — Stonebraker et al, VLDB 2005.
- Vectorwise/MonetDB papers — columnar execution.
CDC / Streaming
- The Log: What Every Software Engineer Should Know About Real-time Data’s Unifying Abstraction — Jay Kreps.
- Kafka: a Distributed Messaging System for Log Processing — Kreps et al, NetDB 2011.
3. Documentation
Postgres
- Official docs — https://www.postgresql.org/docs/current/
- pganalyze blog — https://pganalyze.com/blog (deep dives, must-read)
- Citus Data blog — https://www.citusdata.com/blog/
- Crunchy Data blog — https://www.crunchydata.com/blog
- EDB blog — https://www.enterprisedb.com/blog
- Cybertec blog — https://www.cybertec-postgresql.com/en/blog/
- Postgres Weekly — https://postgresweekly.com/
MySQL
- MySQL docs — https://dev.mysql.com/doc/
- Percona blog — https://www.percona.com/blog/
- PlanetScale blog — https://planetscale.com/blog
- MySQL Performance Blog — historical Percona content
Redis / Valkey
- Redis docs — https://redis.io/docs/
- Valkey docs — https://valkey.io/
- antirez (Salvatore Sanfilippo) blog — http://antirez.com/
DynamoDB
- AWS DynamoDB docs — https://docs.aws.amazon.com/amazondynamodb/
- Alex DeBrie blog — https://www.alexdebrie.com/
- AWS database blog — https://aws.amazon.com/blogs/database/
MongoDB
- MongoDB docs — https://www.mongodb.com/docs/
- MongoDB University — free courses.
Cassandra / ScyllaDB
- Cassandra docs — https://cassandra.apache.org/doc/
- ScyllaDB University — https://university.scylladb.com/ (free, deep)
- The Last Pickle blog — Cassandra deep dives.
- DataStax Academy — free courses.
Elasticsearch / OpenSearch
- Elastic docs — https://www.elastic.co/guide
- OpenSearch docs — https://opensearch.org/docs/
- Elasticsearch: The Definitive Guide — free online (older but core)
ClickHouse
- ClickHouse docs — https://clickhouse.com/docs
- Alexey Milovidov talks — YouTube, CMU DB Group talks
- Altinity blog — ClickHouse production
- PostHog engineering — ClickHouse use case
Vector DBs
- pgvector — https://github.com/pgvector/pgvector
- Qdrant docs — https://qdrant.tech/documentation/
- Pinecone Learning Center — https://www.pinecone.io/learn/
- MTEB Leaderboard — https://huggingface.co/spaces/mteb/leaderboard
- Faiss tutorial — Facebook AI
Lakehouse / Iceberg
- Apache Iceberg docs — https://iceberg.apache.org/docs/
- Apache Iceberg spec — https://iceberg.apache.org/spec/
- Netflix engineering blog — Iceberg origins
- Tabular blog — https://tabular.io/blog/ (Tabular acquired by Databricks 2024)
- Databricks Delta Lake docs — https://docs.delta.io/
CDC / Debezium
- Debezium docs — https://debezium.io/documentation/
- Confluent blog — https://www.confluent.io/blog/
- Gunnar Morling blog — former Debezium project lead
4. Courses
University
- CMU 15-445 Database Systems — Andy Pavlo. https://15445.courses.cs.cmu.edu/. Gold standard free course.
- CMU 15-721 Advanced Database Systems — Andy Pavlo. Modern OLAP, distributed.
- MIT 6.824 Distributed Systems — Robert Morris.
- Stanford CS 245 Database Systems — Peter Bailis.
Online / Industry
- MongoDB University — free.
- ScyllaDB University — free.
- Snowflake University — free.
- DataStax Academy — free.
- Databricks Academy — partial free.
5. Talks (YouTube)
- Bruce Momjian — PostgreSQL internals slides + talks. https://momjian.us/main/presentations/internals.html
- Andy Pavlo — Database Lectures (CMU). YouTube playlist.
- Alexey Milovidov — ClickHouse internals. Multiple talks.
- Salvatore Sanfilippo — Redis design. Various.
- Martin Kleppmann — DDIA talks. ICDE, Strange Loop.
- Brendan Gregg — Systems performance fundamentals. USENIX talks.
- Aphyr (Kyle Kingsbury) — Jepsen talks. Distributed safety analysis.
6. Blogs (Engineering Deep Dives)
- Brandur Leach — https://brandur.org/ (Postgres at scale, Stripe-style)
- High Scalability — https://highscalability.com/
- The Pragmatic Engineer — https://newsletter.pragmaticengineer.com/
- Engineering at Meta / Discord / Notion / Linear / Figma — case studies
- Aphyr’s blog — distributed systems safety
- Marc Brooker (AWS) — https://brooker.co.za/blog/
- Murat Demirbas — https://muratbuffalo.blogspot.com/
7. Tools Reference
Postgres specific
- pgBackRest — https://pgbackrest.org/
- WAL-G — https://github.com/wal-g/wal-g
- PgBouncer — https://www.pgbouncer.org/
- Patroni (HA) — https://patroni.readthedocs.io/
- pg_repack — https://github.com/reorg/pg_repack
- pg_partman — https://github.com/pgpartman/pg_partman
- pgaudit — https://github.com/pgaudit/pgaudit
- pg_stat_statements — built-in
- explain.depesz.com — paste & analyze
- pev2 — https://github.com/dalibo/pev2
Migration tools
- Flyway — https://flywaydb.org/
- Liquibase — https://www.liquibase.org/
- golang-migrate — https://github.com/golang-migrate/migrate
- sqlx migrate — Rust
- Alembic — Python (SQLAlchemy)
- Atlas — https://atlasgo.io/
- gh-ost — https://github.com/github/gh-ost
- pt-online-schema-change — Percona
Schema design
- dbdiagram.io — visual schema
- DBML — DSL for schemas
- Prisma Schema — ORM-first
Modeling
- NoSQL Workbench (AWS) — DynamoDB design
- MongoDB Compass — schema analyzer
8. Newsletters
- Postgres Weekly — https://postgresweekly.com/
- DB Weekly — https://dbweekly.com/
- NoSQL Weekly — older, but the archive is still useful
- The Pragmatic Engineer — broader engineering
- Data Engineering Weekly — Ananth Packkildurai
9. Communities
- r/PostgreSQL, r/Database
- Postgres Discord, Slack (Postgres community)
- Hacker News — search for “Postgres”, “ClickHouse”
- Stack Overflow tags — postgresql, sql, indexing
10. Production Stories (Must-Read Case Studies)
- Stripe Engineering Blog — Postgres at fintech scale
- Notion Engineering — Postgres application-level sharding
- GitHub Engineering — MySQL migration (Vitess), gh-ost
- Shopify Engineering — MySQL Vitess sharding
- Figma Engineering — Postgres horizontal partitioning
- Discord Engineering — Cassandra → ScyllaDB
- Mailchimp — Postgres XID wraparound incident
- Sentry 2015 XID wraparound — incident report
- Cloudflare engineering blog — D1 (SQLite at edge)
Updated: 2026-05-16. Built up incrementally as new material turns up.