Course overview

From Single-Node Tables to Distributed Storage Platforms

30 modules
116 lessons
Part 1

Appendices

  1. Appendix A - Diagram Templates by StepSign in

  2. Appendix B - Technology Mapping GuideSign in

  3. Appendix C - Readiness Assessments (Step N to Step N+1)Sign in

  4. Appendix D - GlossarySign in

Part 2

Course Setup and the Incremental Ladder

  1. Course Setup and the Incremental LadderSign in

  2. Why "Rows to Replicas": databases as system memory, and how guarantees get expensive fastSign in

  3. How to Use This Course: steps as "storage slices" (model -> mechanism -> guarantees -> failures)Sign in

  4. The Incremental Ladder (Step 0 -> Step 7): what each rung addsSign in

  5. The Course Lenses: data model and API, indexing and execution, transactions and concurrency, replication and distribution, recovery, operations, evolutionSign in

Part 3

Mental Models: What a Database System Is

  1. Mental Models: What a Database System IsSign in

  2. System Types: OLTP vs OLAP vs key-value vs object storage (and what "system of record" means)Sign in

  3. DB vs Storage Engine vs Cache: who owns truth, who owns speedSign in

  4. Guarantees as Product Features: latency, throughput, freshness, durabilitySign in

Part 4

Workloads and Access Patterns

  1. Workloads and Access PatternsSign in

  2. Read/Write Shapes: point lookups, range scans, joins, aggregatesSign in

  3. Logical vs Physical: schemas, views, projections, denormalizationSign in

  4. Constraint Mapping: latency SLOs, throughput targets, storage growth, failure expectationsSign in

Part 5

Diagramming and Notation for Databases

  1. Diagramming and Notation for DatabasesSign in

  2. Schema Shapes: ER diagrams, document shapes, key spacesSign in

  3. Index and Plan Diagrams: B+ trees/LSM at a conceptual level; operator pipelinesSign in

  4. Cluster Topologies: leaders/replicas/shards/routers; trust and failure boundariesSign in

Part 6

Step 0 Data Modeling: Relational, Document, Key–Value

  1. Step 0 Data Modeling: Relational, Document, Key–ValueSign in

  2. Relational Basics: keys, constraints, normalization vs denormalization (conceptual)Sign in

  3. Document Modeling: embedding vs referencing; shape evolution pressuresSign in

  4. Key-Value Modeling: key design, scans, and "opaque value" tradeoffsSign in

Part 7

Step 0 Query Interfaces and Baseline Execution

  1. Step 0 Query Interfaces and Baseline ExecutionSign in

  2. Query Languages: SQL-ish SELECT/JOIN/GROUP BY; document filters; KV get/put/scanSign in

  3. Full Scans as Baseline: filters, projections, orderingSign in

  4. Materialization Choices: early vs late materialization (high-level intuition)Sign in

Part 8

Step 1 Index Structures

  1. Step 1 Index StructuresSign in

  2. B+ Trees (Conceptual): what they optimize, what they costSign in

  3. Hash Indexes: where they win, where they fail (ranges, ordering)Sign in

  4. Clustered vs Non-Clustered; Primary vs Secondary IndexesSign in

Part 9

Step 1 Planning, Operators, and Performance

  1. Step 1 Planning, Operators, and PerformanceSign in

  2. Index Design and Maintenance: prefixes, composite keys, write amplificationSign in

  3. Query Planning (Conceptual): logical vs physical plans; selectivity and cardinality intuitionSign in

  4. Execution Operators: scans, joins (nested/merge/hash), sorts, aggregates (conceptual)Sign in

  5. Query Anti-Patterns: N+1, unbounded scans, missing indexes, ad hoc query chaosSign in

Part 10

Step 2 Transaction Guarantees

  1. Step 2 Transaction GuaranteesSign in

  2. ACID (Conceptual): what app developers actually get from each letterSign in

  3. Transaction Lifecycle: begin -> read/write -> commit/rollback; savepointsSign in

  4. Transaction Boundaries in Applications: where invariants live (and where they leak)Sign in

Part 11

Step 2 Isolation Levels and Application Patterns

  1. Step 2 Isolation Levels and Application PatternsSign in

  2. Isolation Levels (Conceptual): read uncommitted -> serializableSign in

  3. Anomalies: dirty/non-repeatable/phantoms and why "mostly works" is dangerousSign in

  4. App Patterns: idempotency, retries, invariant enforcement, saga-like compensationsSign in

Part 12

Step 3 Lock-Based Concurrency

  1. Step 3 Lock-Based ConcurrencySign in

  2. Shared vs Exclusive Locks; what blocks what (conceptual)Sign in

  3. Granularity: row/page/table/partition; escalation and hierarchiesSign in

  4. Practical Conflict Reduction: shorten transactions, order operations, avoid hot rowsSign in

Part 13

Step 3 MVCC, OCC, and Conflict Handling

  1. Step 3 MVCC, OCC, and Conflict HandlingSign in

  2. MVCC (Conceptual): snapshots, visibility rules, cleanup/vacuum pressureSign in

  3. OCC (Conceptual): read -> compute -> validate -> commit; where it shines/falls downSign in

  4. Deadlocks and Starvation: detection, timeouts, avoidance strategiesSign in

  5. Distributed Preview: why cross-node coordination changes everything (latency + partial failure)Sign in

Part 14

Step 4 Replication Fundamentals

  1. Step 4 Replication FundamentalsSign in

  2. Logical vs Physical Replication: change streams, log shipping, snapshots (conceptual)Sign in

  3. Why Replicate: HA, read scaling, locality, disaster recoverySign in

  4. The Enemy: lag, divergence windows, and stale readsSign in

Part 15

Step 4 Replication Topologies

  1. Step 4 Replication TopologiesSign in

  2. Single-Leader: write path, read options, read-your-writes pitfallsSign in

  3. Failover (Conceptual): detecting failure, promoting leaders, split-brain hazardsSign in

  4. Multi-Leader and Leaderless (Conceptual): conflicts, resolution pressure, operational complexitySign in

Part 16

Step 4 Sharding and Routing

  1. Step 4 Sharding and RoutingSign in

  2. Partitioning: horizontal vs vertical; range vs hash vs directorySign in

  3. Rebalancing and Hotspots: uneven keys, hot partitions, adaptive strategiesSign in

  4. Routing Layers: proxies/service discovery; client-side vs server-side routing; metadata control planeSign in

Part 17

Step 5 Failure Modes and Consistency Models

  1. Step 5 Failure Modes and Consistency ModelsSign in

  2. Failure Modes: node loss, slow nodes, partitions, partial vs total outagesSign in

  3. Consistency Models (Conceptual): strong/eventual/causal/session guaranteesSign in

  4. CAP-ish Thinking: tradeoffs as design choices, not labels; graceful degradation under partitionsSign in

Part 18

Step 5 Quorums and “App-Level Truth”

  1. Step 5 Quorums and “App-Level Truth”Sign in

  2. Quorum Intuition: W+R>N and what it buys you (high-level)Sign in

  3. Tunable Consistency: latency vs safety vs throughput; where knobs backfireSign in

  4. Application Design for Imperfect Consistency: semantic merges, idempotency, UX patterns for eventual correctnessSign in

Part 19

Step 6 WAL and Durability

  1. Step 6 WAL and DurabilitySign in

  2. Write-Ahead Logging: log-then-data ordering and why it mattersSign in

  3. Redo vs Undo vs Logical Logs (Conceptual): what each makes easy/hardSign in

  4. Group Commit and fsync Strategy: throughput vs latency tradeoffsSign in

Part 20

Step 6 Recovery and Engine Internals

  1. Step 6 Recovery and Engine InternalsSign in

  2. Checkpointing and Snapshots: recovery speed vs write overheadSign in

  3. Crash Recovery Flows: what happens after power loss (conceptual walkthroughs)Sign in

  4. Maintenance as a Feature: background work, compaction/vacuum, "hidden" system loadSign in

Part 21

Step 6 Storage Engine Families and Layout

  1. Step 6 Storage Engine Families and LayoutSign in

  2. B-Tree Engines: pages, splits/merges, fragmentation; random vs sequential implicationsSign in

  3. LSM Engines: memtables/SSTables, compaction strategies; write/read amplification tradeoffsSign in

  4. Layout and Compression: row vs column (conceptual), encoding, OLTP vs OLAP alignmentSign in

  5. Data Lifecycle: TTLs, archival, cold storage, capacity planningSign in

Part 22

Step 7 Shared-Nothing and Distributed Execution

  1. Step 7 Shared-Nothing and Distributed ExecutionSign in

  2. Shared-Nothing Locality: why data placement is performanceSign in

  3. Distributed Query Planning (Conceptual): scatter/gather, pushing down filters/aggregatesSign in

  4. Cross-Shard Joins/Sorts/Groups: why "simple SQL" gets complicated fastSign in

Part 23

Step 7 Distributed Transactions and Coordination

  1. Step 7 Distributed Transactions and CoordinationSign in

  2. 2PC Concepts: what it guarantees and why it is operationally heavySign in

  3. Consensus-Based Commits (High-Level): where coordination moves and what it costsSign in

  4. Avoiding Distributed Transactions: denormalization, async workflows, sagas, per-entity ownershipSign in

Part 24

Step 7 Global Indexes, Metadata, and Geo Distribution

  1. Step 7 Global Indexes, Metadata, and Geo DistributionSign in

  2. Global Secondary Indexes: maintaining them without killing write throughputSign in

  3. Metadata Services: partition maps, schema state, routing truthSign in

  4. Multi-Region Architectures: active-passive vs active-active; residency and latency constraintsSign in

Part 25

Step 7 Operating Database Platforms

  1. Step 7 Operating Database PlatformsSign in

  2. Observability: latency/throughput, queue depth, compaction pressure, replication lagSign in

  3. Multi-Tenancy: isolation models, noisy neighbors, limits, governance/billingSign in

  4. Schema Evolution and Migrations: online backfills, dual writes, safe rolloutsSign in

  5. Security and Compliance (Data-Centric): roles, encryption (conceptual), audit logs, access trackingSign in

  6. Reference Architectures: OLTP + replicas + cache; lake/warehouse pipeline; tunable-consistency KV storeSign in

Part 26

Data Modeling and API Design Patterns

  1. Data Modeling and API Design PatternsSign in

  2. Lessons on modeling for access patterns, invariants placement, denormalization boundaries, read/write isolation strategiesSign in

Part 27

Indexing and Query Performance Patterns

  1. Indexing and Query Performance PatternsSign in

  2. Lessons on composite keys, covering indexes, pagination shapes, join strategies, plan stability, and performance anti-patternsSign in

Part 28

Transaction and Consistency Patterns

  1. Transaction and Consistency PatternsSign in

  2. Lessons on idempotency and retries, outbox/inbox-style integration, conflict resolution semantics, session guarantees, eventual UXSign in

Part 29

Replication, Sharding, and Resilience Patterns

  1. Replication, Sharding, and Resilience PatternsSign in

  2. Lessons on failover playbooks, hotspot mitigation, resharding and rebalancing, quorum tuning, and degraded-mode designSign in

Part 30

Operations, Observability, and Evolution Patterns

  1. Operations, Observability, and Evolution PatternsSign in

  2. Lessons on SLOs for databases, capacity planning, backup and restore drills, migration playbooks, tenancy guardrails, and auditabilitySign in