Course overview

How to Design Internet-Scale Networks

30 modules
122 lessons
—
Part 1

Course Setup and the Incremental Ladder

  1. Course Setup and the Incremental Ladder

  2. Why Sockets to Satellites

  3. How to Use This Course

  4. The Incremental Ladder (Step 0 to Step 7)

  5. The Course Lenses

  6. Diagram Legend and Notation Types

Part 2

Mental Models: Layers, Paths, and Control

  1. Mental Models: Layers, Paths, and Control

  2. The Internet as a Layered System

  3. Control Plane vs Data Plane

  4. Reliability Emerges from Composition

Part 3

Transport & Network Basics (Conceptual)

  1. Transport & Network Basics (Conceptual)

  2. Sockets, Ports, and Processes

  3. TCP vs UDP

  4. IP Packets and MTU

Part 4

Diagramming Networked Systems

  1. Diagramming Networked Systems

  2. Socket Flow Diagrams

  3. Topology Diagrams

  4. Layer Overlays

Part 5

Step 0 Endpoints: Local Networking and Simple LANs

  1. Step 0 Endpoints: Local Networking and Simple LANs

  2. Single-Host Networking

  3. Simple LANs and Default Gateways

  4. Basic Address Assignment

Part 6

Step 0 Transport Realities: Connections in Practice

  1. Step 0 Transport Realities: Connections in Practice

  2. TCP Lifecycle at a High Level: handshake, teardown, and why TIME_WAIT-ish realities matter

  3. Middlebox Awareness: connection tracking, NAT intuition, and stateful filtering effects

  4. Timeouts, Retries, and Reuse: correctness under loss, and how apps accidentally DDoS themselves

Part 7

Step 0 Delivery: Minimal Network for a Service

  1. Step 0 Delivery: Minimal Network for a Service

  2. The Minimal Service Network: one host, one service, basic firewall rules, and observability hooks

  3. Failure Surfaces: packet loss, port blocks, MTU weirdness, and "it's always DNS" (eventually)

  4. Validation in Small: repeatable tests, traffic captures conceptually, and safe change habits

Part 8

Step 1 Addressing: Subnets and Practical IP Design

  1. Step 1 Addressing: Subnets and Practical IP Design

  2. Prefixes and Subnets: conceptual masks, allocation, and why IP planning is architecture

  3. Private vs Public Addressing: translation boundaries and where complexity accumulates

  4. Dual Stack Basics: IPv6 framing, migration posture, and compatibility realities

Part 9

Step 1 Layer 2: Switching, VLANs, and Segmentation

  1. Step 1 Layer 2: Switching, VLANs, and Segmentation

  2. MACs and Broadcast Domains: why L2 scales until it doesn't

  3. VLAN Segmentation: separating traffic by purpose, security zone, and operational ownership

  4. Loop Avoidance Concepts: spanning-tree-like intuition and failure modes of "just plug it in"

Part 10

Step 1 Layer 3: Routing Inside an Organization

  1. Step 1 Layer 3: Routing Inside an Organization

  2. Routing Tables and Prefixes: static routes vs dynamic routing families (IGP-level view)

  3. Default Routes and Gateways: designing the "exit" and preventing accidental hairpins

  4. High Availability Basics: redundant routers, simple failover, and where state bites you

Part 11

Step 1 Delivery: Enterprise LAN/WAN Design Slice

  1. Step 1 Delivery: Enterprise LAN/WAN Design Slice

  2. Core-Distribution-Access: common topology patterns and why hierarchy helps operations

  3. Zoning by Function and Risk: prod vs corp, PCI-ish zones, and blast-radius management

  4. NAT/Firewalls as Architectural Boundaries: what they simplify, what they complicate, and how apps adapt

Part 12

Step 2 DNS Fundamentals

  1. Step 2 DNS Fundamentals

  2. Why Indirection Matters: names as stable handles; IPs as implementation details

  3. Recursive vs Authoritative: caching, recursion, and the latency/availability trade space

  4. TTLs and Propagation: designing for change without lying to yourself about time

Part 13

Step 2 Records, Zones, and Architecture Patterns

  1. Step 2 Records, Zones, and Architecture Patterns

  2. Record Types as Primitives: A/AAAA/CNAME/TXT/SRV-like roles conceptually

  3. Zones and Delegation: scaling ownership across orgs and environments

  4. DNS in Application Architecture: multi-record load distribution and DNS-based failover patterns

Part 14

Step 2 Service Discovery Strategies

  1. Step 2 Service Discovery Strategies

  2. Internal DNS and Split-Horizon Concepts: internal vs external naming, and boundary safety

  3. Naming Conventions at Scale: service/region/env hierarchies that survive re-orgs

  4. When DNS Isn't Enough: dedicated discovery systems conceptually and the cost of "more control"

Part 15

The Internet as Autonomous Systems

  1. The Internet as Autonomous Systems

  2. ASNs and the Global Graph: how packets cross many networks

  3. Prefixes and Reachability: why "announcing space" is power

  4. Failure at the Edges: leaks, misconfigs, and why local mistakes go global

Part 16

BGP Fundamentals and Policy (Conceptual)

  1. BGP Fundamentals and Policy (Conceptual)

  2. Sessions and Route Exchange: peers, advertisements, withdrawals at a high level

  3. Path Selection Intuition: attributes and policy shaping without knob-level depth

  4. Inbound vs Outbound Control: what you can influence, what you mostly can't

Part 17

Peering, Transit, and Routing Security

  1. Peering, Transit, and Routing Security

  2. Peering vs Transit: economic drivers that become topology

  3. IXPs and Peering Fabrics: why shared interconnects change cost and latency

  4. Routing Security Basics: hijacks/leaks, RPKI intuition, and monitoring posture

Part 18

Step 4 Edge Mental Models

  1. Step 4 Edge Mental Models

  2. Latency as Geography: why POPs exist and how proximity becomes product quality

  3. Edge POP Footprints: placement heuristics and failure domains

  4. Origin Shielding: using edge layers to protect core services

Part 19

Step 4 Anycast and Caching

  1. Step 4 Anycast and Caching

  2. Anycast Addressing: one IP, many locations; "nearest" via routing outcomes

  3. Cache Hierarchies: hit rates, parent/child caches, and controlling blast radius

  4. Invalidation and Freshness: correctness vs cache efficiency as a living trade

Part 20

Step 4 Beyond Caching: Edge Services

  1. Step 4 Beyond Caching: Edge Services

  2. TLS Termination and WAF at the Edge: shifting work outward (and new trust boundaries)

  3. Edge Compute Concepts: running logic close to users without turning POPs into snowflakes

  4. Multi-CDN Strategies: traffic steering, portability, and operational complexity

Part 21

Step 5 Load Balancing Fundamentals

  1. Step 5 Load Balancing Fundamentals

  2. L4 vs L7 Load Balancing: connection routing vs request routing as different control levers

  3. Health Checks and Out-of-Rotation: truth, lies, and delayed failure detection

  4. Algorithms and Weights: round robin, least connections, and why "fair" is not always stable

Part 22

Step 5 Global Traffic Steering

  1. Step 5 Global Traffic Steering

  2. Global Load Balancing Patterns: geo steering, failover, and multi-region posture

  3. Anycast + L7 Steering: combining routing and application logic without fighting yourself

  4. Session Affinity and State: sticky sessions, stateless design, and mobility of traffic

Part 23

Step 5 Capacity and Failure Engineering

  1. Step 5 Capacity and Failure Engineering

  2. Traffic Engineering Goals: performance vs cost vs resilience in explicit trade-offs

  3. Load Shedding and Admission Control: graceful degradation at edge and origin

  4. Big Event Drills: peak planning, region evacuation patterns, and controlled rollback mindsets

Part 24

Step 6 Threat Models and Defensive Design

  1. Step 6 Threat Models and Defensive Design

  2. Attack Classes at a High Level: volumetric, protocol, and application-layer pressures

  3. Defense-in-Depth Topologies: edge, mid-tier, origin segmentation and blast-radius control

  4. Flash Crowds vs Attacks: designing systems that survive both without guessing intent

Part 25

Step 6 DDoS Mitigation Building Blocks

  1. Step 6 DDoS Mitigation Building Blocks

  2. Rate Limiting and Filtering: where to enforce, what signals to use, and failure safety

  3. Scrubbing and Diversion: traffic reroute concepts and operational implications

  4. Anycast as Absorption: distributing attack load and the debugging complexity it introduces

Part 26

Step 6 Security Operations and Governance

  1. Step 6 Security Operations and Governance

  2. Security Telemetry: flows/logs/metrics and the minimum viable situational awareness

  3. Incident Response for Network Events: runbooks, coordination with providers/peers, and rollback discipline

  4. Policy and Abuse Handling: enforcement, privacy boundaries, and sustainable operations

Part 27

Step 7 Long-Haul and Backbone Design (Conceptual)

  1. Step 7 Long-Haul and Backbone Design (Conceptual)

  2. Metro vs Long-Haul vs Subsea: diversity, latency, and capacity as first-class constraints

  3. Owning vs Leasing: when control and predictability justify cost

  4. Failure Domains at Backbone Scale: link cuts, regional events, and diversity planning

Part 28

Step 7 Satellite and Non-Terrestrial Links (Conceptual)

  1. Step 7 Satellite and Non-Terrestrial Links (Conceptual)

  2. GEO/MEO/LEO Trade-offs: latency, jitter, handoffs, and link variability as design inputs

  3. Integrating Satellite into Broader Networks: routing posture, backhaul, and service expectations

  4. Designing for Extreme Latency: timeouts, buffering, and user experience under long RTTs

Part 29

Step 7 Internet-Scale Architecture and Operations

  1. Step 7 Internet-Scale Architecture and Operations

  2. Regions, POPs, and Backbones: stitching edge and core into coherent failure domains

  3. Multi-Cloud and Multi-Backbone: hybrid connectivity strategies and observability challenges

  4. Operating Global Networks: NOCs, staged rollouts, change management, and continuous improvement culture

Part 30

Step 7 Reference Architectures and Maturity

  1. Step 7 Reference Architectures and Maturity

  2. Global SaaS Network: multi-region ingress, edge acceleration, and operational safety rails

  3. Streaming/Video Delivery Network: throughput, caching strategy, and peak-event readiness

  4. Gaming and Global API Platforms: latency sensitivity, routing choices, and fairness under congestion