Course overview

From Single Thread to Multi-Cloud

47 modules
193 lessons
Part 1

Course Setup and the Incremental Ladder

  1. Course Setup and the Incremental Ladder

  2. Why "Threads to Clusters"

  3. How to Use This Course

  4. The Incremental Ladder (Step 0 to Step 7)

  5. The Course Lenses

Part 2

Mental Models: Threads, Processes, Nodes, Clusters

  1. Mental Models: Threads, Processes, Nodes, Clusters

  2. Core Definitions

  3. Scaling Models

  4. Failure Domains

Part 3

Architectures as Layers

  1. Architectures as Layers

  2. The Layered View

  3. Concern Placement

  4. Why You Don't Skip Rungs

Part 4

Diagramming and Notation

  1. Diagramming and Notation

  2. Canonical Symbols and Legends

  3. Notation Styles

  4. Reading and Writing Architecture Diagrams

Part 5

Step 0 Architecture: Single-Threaded Systems

  1. Step 0 Architecture: Single-Threaded Systems

  2. Classical Monolith Shape

  3. Event Flow and Blocking I/O

  4. Packaging and Manual Deployment

Part 6

Step 0 Operations: Local Data, Logging, Configuration

  1. Step 0 Operations: Local Data, Logging, Configuration

  2. Local Persistence Models

  3. Configuration and Secrets on One Host

  4. Debugging and Local Observability

Part 7

Step 1 Compute: Threads and Async

  1. Step 1 Compute: Threads and Async

  2. Concurrency Primitives: Work Queues, Thread Pools, Async Runtimes

  3. Shared Memory Hazards: Locks, Deadlocks, Contention, False Sharing

  4. Canonical Patterns: Producer/Consumer, Reactor/Event Loop, Futures/Promises

Part 8

Step 1 Operations: Packaging and Running Concurrent Apps

  1. Step 1 Operations: Packaging and Running Concurrent Apps

  2. CPU-bound vs I/O-bound: Choosing Concurrency Strategies

  3. Runtime Tuning: Thread Counts, Pools, and Saturation Behavior

  4. Profiling and Debugging Concurrency: Practical Diagnosis and Anti-Pattern Recognition

Part 9

Step 2 Architecture: Multi-Process Systems

  1. Step 2 Architecture: Multi-Process Systems

  2. Process Decomposition: Web Server, Worker, Scheduler as Separate Processes

  3. IPC Patterns: Pipes, Unix Sockets, Shared Memory, Localhost TCP

  4. Supervision and Lifecycle: Init Systems and Supervisors (Systemd-like Patterns)

Part 10

Step 2 Operations: Distribution, Security, Observability on One Host

  1. Step 2 Operations: Distribution, Security, Observability on One Host

  2. Packaging Process Topologies: Bundles, Installers, and Dependency Alignment

  3. Local Perimeter Thinking: Loopback Security and Host Firewall Basics

  4. Structured Logs and Host Metrics: Preparing for the Container Leap

Part 11

Containers as the New Process

  1. Containers as the New Process

  2. Isolation Mechanics: Namespaces, cgroups, Container Boundaries

  3. Designing Container Cuts: Mapping Multi-Process Apps to Containers

  4. Sidecar vs Single-Container: Trade-offs and Operational Consequences

Part 12

Image Build, Packaging, Distribution

  1. Image Build, Packaging, Distribution

  2. Dockerfile/OCI Design: Layers, Base Images, Multi-Stage Builds

  3. Registries and Tagging: Immutability, Promotion, Provenance (SBOM as Baseline)

  4. Reproducibility: Dev-to-Prod Workflows and Artifact Discipline

Part 13

Single-Host Container Networking and Security

  1. Single-Host Container Networking and Security

  2. Bridge vs Host Networking: Port Mapping and Local Routing

  3. Local Naming/DNS: Service Naming on One Host

  4. Least Privilege Containers: Users, Filesystem Permissions, Minimal Images

Part 14

Operating Containerized Single-Host Systems

  1. Operating Containerized Single-Host Systems

  2. Multi-Container Topologies: Compose-like Orchestration Patterns

  3. Health, Restarts, Failover: Liveness, Readiness, Restart Policies

  4. Container Observability: Logs, Metrics, Tracing Basics Inside Containers

Part 15

Cluster Primitives

  1. Cluster Primitives

  2. Workload Building Blocks: Pods/Tasks, Deployments, Jobs, DaemonSets

  3. Control Plane and Scheduling: Placement, Resourcing, Node Pools

  4. Requests/Limits and Bin Packing: Performance, Stability, Noisy Neighbors

Part 16

Cluster Networking and Service Discovery

  1. Cluster Networking and Service Discovery

  2. East-West Traffic: Pod Networks and Service Abstractions

  3. Cluster DNS and Naming: Conventions and Failure Behavior

  4. L4 vs L7 Inside the Cluster: Load Balancing and Routing Decisions

Part 17

Ingress, Edge, External Access

  1. Ingress, Edge, External Access

  2. Ingress Controllers and Gateways: Edge Patterns and Responsibilities

  3. TLS Termination and mTLS: Secure Traffic Inside and Outside the Cluster

  4. Public vs Private Ingress: Allowlists, WAF Integration, Exposure Control

Part 18

Packaging for Clusters

  1. Packaging for Clusters

  2. Manifests and Charts: Helm/Kustomize Mental Models

  3. Versioning and Release Mechanics: Promotion and Rollback Strategy

  4. Config and Secrets at Scale: Operational Models and Drift Control

Part 19

Data, State, Storage in a Single Cluster

  1. Data, State, Storage in a Single Cluster

  2. Stateful Workloads: PVCs, Storage Classes, StatefulSets

  3. DB Inside vs Outside: Trade-offs and Operational Posture

  4. Cache Placement: Cluster-Local vs External Tiers

Part 20

Observability and Reliability in a Single Cluster

  1. Observability and Reliability in a Single Cluster

  2. Central Telemetry: Logging, Metrics, Tracing Stacks and Patterns

  3. Probes and Autoscaling: Readiness/Liveness, HPA Patterns, Disruption Budgets

  4. Incident Operations: Canary, Blue/Green, and Recovery Workflows

Part 21

Why Multi-Cluster

  1. Why Multi-Cluster

  2. Isolation Models: Per-Tenant, Per-Team, Per-Env Motivations

  3. Trade-offs vs One Mega-Cluster: Complexity, Cost, Failure Isolation

  4. When Multi-Cluster Is Justified: Thresholds and Triggers

Part 22

Topologies: Cell-Based and Hub-and-Spoke

  1. Topologies: Cell-Based and Hub-and-Spoke

  2. Cells/Shards vs Shared Control: Design Choices and Consequences

  3. Ingress Models: Per-Cluster Ingress vs Shared Ingress Layers

  4. Regional Segmentation: Network Segmentation Patterns in One Region

Part 23

Cross-Cluster Networking and Discovery

  1. Cross-Cluster Networking and Discovery

  2. Private Networking: VPC/VNet Peering and Private Links

  3. Federation and Mesh: DNS, Mesh Federation, and Discovery Patterns

  4. Routing Strategies: Failover, Shadowing, Regional Load Balancing

Part 24

Data and Caching Across Clusters

  1. Data and Caching Across Clusters

  2. Shared vs Per-Cluster Datastores: Governance and Blast Radius

  3. Cache Tiers: Cluster-Local vs Shared Cache Backbones

  4. Event Buses: Messaging as the Cross-Cluster Integration Plane

Part 25

CI/CD, Packaging, Governance

  1. CI/CD, Packaging, Governance

  2. Artifact Promotion: Images and Configuration Across Clusters

  3. GitOps and Pipelines: Multi-Cluster Deployment Mechanics

  4. Policy as Code: Admission Control, Scanning, and Compliance

Part 26

Regions and Failure Domains

  1. Regions and Failure Domains

  2. Regions/AZs as Boundaries: What Can Fail Together

  3. Active-Active vs Active-Passive: Availability Models

  4. RTO/RPO: Defining Recovery Objectives and Constraints

Part 27

Global Traffic Management and DNS

  1. Global Traffic Management and DNS

  2. Global DNS Policies: Latency, Geo, and Failover Strategies

  3. Anycast and CDN Edges: Routing Implications and Trade-offs

  4. Health-Based Failover: Combining DNS and L7 Routing

Part 28

Data Replication and Consistency

  1. Data Replication and Consistency

  2. Strong vs Eventual: What You Can Promise Globally

  3. Topologies: Leader-Follower, Multi-Leader, Conflict Resolution

  4. Replication Failure Modes: Lag, Split-Brain, Reconciliation

Part 29

Caching and Performance at Global Scale

  1. Caching and Performance at Global Scale

  2. Edge vs Regional Caches: Placement and Coherence

  3. Invalidation Strategies: TTLs, Hints, Stampede Mitigation

  4. Read-Mostly vs Write-Heavy: Performance Posture and Constraints

Part 30

Security, Identity, Compliance Across Regions

  1. Security, Identity, Compliance Across Regions

  2. Data Residency: Region-Specific Compliance Impacts

  3. Federated Identity: Region-Aware AuthZ and Policy

  4. Key Management: KMS/HSM Patterns and Secure Distribution

Part 31

Operating Multi-Region Systems

  1. Operating Multi-Region Systems

  2. Failover Playbooks: Drains, Failback, Cutovers

  3. Game Days and DR Rehearsals: Operational Validation

  4. Global Observability: SLOs per Region, Aggregation, Incident Coordination

Part 32

Why Multi-Cloud

  1. Why Multi-Cloud

  2. Real Motivations vs Myths: Risk, Locality, Negotiation, Capability Gaps

  3. When Multi-Region Is Enough: Avoiding Unnecessary Complexity

  4. Anti-Goals: What Multi-Cloud Should Not Be Used to Solve

Part 33

Abstraction Layers and Control Planes

  1. Abstraction Layers and Control Planes

  2. Cloud-Agnostic vs Cloud-Native: The Portability Trade Space

  3. Common Control Planes: Orchestration and Policy Patterns

  4. Contracts and APIs: Minimizing Lock-In Through Explicit Interfaces

Part 34

Networking Across Clouds

  1. Networking Across Clouds

  2. Connectivity Options: VPN, Direct Connect, Overlays

  3. Routing and DNS: Naming and Traffic Management Across Providers

  4. Cost and Latency: Egress, Bottlenecks, and Optimization Posture

Part 35

Identity, Access, Policy Federation

  1. Identity, Access, Policy Federation

  2. Cross-Cloud SSO: Federated Identity Fundamentals

  3. Consistent Authorization: RBAC/ABAC Across Providers

  4. Policy as Code at Multi-Cloud Scope: Enforcement and Auditing

Part 36

Data Portability and Gravity

  1. Data Portability and Gravity

  2. Data Gravity: why data dominates architecture decisions

  3. Replication and DR: cross-cloud backup and recovery models

  4. Portability Boundaries: what must be portable vs what can be per-cloud

Part 37

Packaging and Distribution for Multi-Cloud

  1. Packaging and Distribution for Multi-Cloud

  2. Portable Artifacts: images, manifests, infra-as-code discipline

  3. Multi-Cloud Pipelines: promotion and verification

  4. Extensions: provider-specific vs provider-neutral patterns

Part 38

Compute and Concurrency Patterns

  1. Compute and Concurrency Patterns

  2. Thread Pools, Work Queues, and Saturation Boundaries

  3. Async I/O, Event Loops, and Backpressure

  4. Actors, Green Threads, and Isolation-by-Mailbox

  5. Request/Response vs Event-Driven vs Batch: Choosing the Work Model

  6. Scaling Patterns Across the Ladder: When "More Instances" Fails

Part 39

Integration, Messaging, Event-Driven Architectures

  1. Integration, Messaging, Event-Driven Architectures

  2. Integration Boundaries: App Layer vs Data Layer vs Infra Layer

  3. Queues, Topics, Streams: Semantics and Operational Trade-offs

  4. Change Data Capture and the Event Backbone

  5. The Outbox Pattern: Making Side Effects Durable

  6. Sagas and Distributed Workflows: Coordinating Without a Global Transaction

Part 40

Caching and Performance Patterns

  1. Caching and Performance Patterns

  2. Cache-Aside, Read-Through, Write-Through, Write-Behind: What You Promise

  3. Hot Keys, Hot Partitions, and Load Skew

  4. Cache Stampede and Thundering Herd: Prevention and Mitigation

  5. Local -> Cluster -> Global: How Cache Boundaries Evolve Up the Ladder

  6. Consistency Hints: TTLs, Invalidation, and "Good Enough" Correctness

Part 41

Load Balancing and Traffic Shaping Patterns

  1. Load Balancing and Traffic Shaping Patterns

  2. L4 vs L7 Load Balancing: Connection vs Request Semantics

  3. Blue/Green and Canary: Release Safety as a Boundary Design

  4. Shadow Traffic and A/B Testing: Measurement Without Breaking Users

  5. Rate Limiting and Quotas: Protecting Shared Systems

  6. Backpressure, Circuit Breakers, and Overload Control

Part 42

Security Architecture and Zero Trust

  1. Security Architecture and Zero Trust

  2. Authentication vs Authorization: What Each Boundary Enforces

  3. RBAC, ABAC, and Policy Evaluation: Consistency Across Steps

  4. Perimeter to Microsegmentation: Network Segmentation Patterns

  5. Secrets Distribution and Rotation: Secure Bootstrapping Over Time

  6. Service Mesh and Zero Trust: When It Helps, When It Hurts

Part 43

Observability, SLOs, Operational Maturity

  1. Observability, SLOs, Operational Maturity

  2. Metrics, Logs, Traces: Signals and Failure Boundaries

  3. SLIs and SLOs: Turning "Reliability" Into a Contract

  4. Error Budgets and Release Policy: Governing Change with Data

  5. Alerting and On-Call Design: Avoiding Paging as a Monitoring Strategy

  6. Operational Maturity by Step: Readiness Criteria Across the Ladder

Part 44

Diagram Templates by Step

  1. Diagram Templates by Step

  2. Diagram Templates by Step

Part 45

Technology Mapping Guide

  1. Technology Mapping Guide

  2. Technology Mapping Guide

Part 46

Readiness Assessments: Moving from Step N to Step N+1

  1. Readiness Assessments: Moving from Step N to Step N+1

  2. Readiness Assessments: Moving from Step N to Step N+1

Part 47

Glossary: Canonical Definitions (and the Boundaries They Imply)

  1. Glossary: Canonical Definitions (and the Boundaries They Imply)

  2. Glossary: Canonical Definitions (and the Boundaries They Imply)