Foundations: What a Cloud Platform Is
Why "Service to Serverless"
How to Use This Course
The Incremental Ladder (Step 0 to Step 7)
The Course Lenses
Diagram Legend and Notation Types
From Datacenters to "As-a-Service" Abstractions
IaaS, PaaS, and FaaS as a Responsibility Spectrum
Shared Responsibility as a Boundary Contract
Virtualization Fundamentals for a “Mini-Cloud”
Hypervisors as the First Isolation Boundary
VMs, Images, and Snapshots as a Packaging and Lifecycle Model
Overcommit, Live Migration, and Early Storage Boundaries
Containers as the Density Layer (on VMs)
Containers vs VMs: Isolation and Lifecycle Tradeoffs
Images, Layers, and Registries as Distribution Boundaries
Manual Scheduling and Handcrafted Cluster Configs: What Can Fail Together
Baby SDN: Virtual Networks and Security Groups
Virtual Networks, Subnets, Routing, and Internet Egress
Security Groups as Coarse Firewall Rules and Policy Attachment Points
Load Balancers at a Conceptual Level: Service IPs and Failure Masking
First Control Plane: API + UI for a Tiny Cloud
Create and Delete VMs and Networks via API: The First System of Record
Dashboard and CLI as Control-Plane Clients: Shaping Safe Workflows
State Stores as "Source of Truth": Desired State, Drift, and Repair
IAM Fundamentals: Identities, Roles, Policies
Users vs Service Identities: Who Can Act, and How You Attribute Actions
RBAC and Scopes: Mapping Roles and Permissions to Resources
Policy Evaluation at a High Level: Enforcement Points and Failure Modes
Resource Hierarchies and Delegated Administration
Orgs to Folders to Projects to Resources: The Governance Tree
Permission Inheritance and Scoping: Minimizing Blast Radius by Default
Delegated Administration: Operating at Scale Without Central Bottlenecks
Auth, Federation, SSO, and Credential Strategy
Federation with External IdPs (Conceptual): Identity Boundaries and Trust
SSO to Console and APIs: Consistent Authentication Paths
Short-Lived Credentials and Key Management (High Level): Reducing Standing Access
Audit Logging and Governance Hooks
Control-Plane Audit Logs: Who/What/Where/When, and Why It Matters Operationally
SIEM Integration Concepts: Turning Events into Detection and Response
Guardrails vs Hard Blocks: Governance Policies as Platform Contracts
Tenancy Models and Shared Responsibility
Physical vs Logical Tenancy: Choosing Where Isolation Lives
Account and Project Boundaries: Scoping Identity and Resources
Soft vs Hard Multitenancy: Tradeoffs in Cost, Risk, and Operability
Isolation Boundaries: Network, Compute, Storage
Network Isolation: VPC-Like Constructs, Peering, and Private Connectivity
Compute Isolation: Quotas, Noisy Neighbors, and Fairness Controls
Storage Isolation: Encryption and Per-Tenant Key Ideas (Conceptual)
Multi-Tenant Data Planes and Blast Radius
Shared Control Plane; Shared vs Per-Tenant Data Planes: Choosing Boundaries
Designing Safe Shared Services: Gateways, Queues, and Databases (Conceptual)
Blast Radius Controls: Compartmentalization and Graceful Degradation
Compliance Zones and Regulated Tenants
Data Residency and Compliance Zones: Partitioning the Platform
Baselines and Blueprints per Tenant: Repeatable Controls
Evidence and Auditability: Proving What Happened and What Is Enforced
Multitenancy in Serverless and Managed Services
Hidden Sharing in Serverless Runtimes: Where Tenants Meet
Fairness and Tenant SLOs in Shared Compute Pools
Isolation Constraints for Managed Services: What Cannot Be Customized
Cluster Managers and Scheduling Basics
Node Pools and Capacity Pools: Defining Placement Domains
Binpacking, Anti-Affinity, and Guarantees: Scheduling Goals and Failure Tradeoffs
Queueing and Priorities: Who Gets Capacity Under Contention
Declarative Control Planes and Reconciliation Loops
Desired vs Observed State: What Declarative Means Operationally
Controllers and Reconcilers: Convergence, Retries, and Backoff
Versioned APIs and Compatibility: Evolving a Platform Safely
Service Discovery and Load Balancing
DNS and Service Registries: Naming as a Platform Boundary
L4 vs L7 Load Balancing (Conceptual): Where Policy and Retries Live
Health Checks and Out-of-Rotation Behavior: Isolating Partial Failure
Deployments, Rollouts, and Runtime Management
Rolling, Blue-Green, and Canary: Change as a Controlled Experiment
Config as Data: Secrets, Config Maps, and Flags as Boundaries
Draining, Pausing, and Rescheduling: Coordinating with Load and State
Serverless as Higher-Order Orchestration
Functions and Jobs plus Triggers: Execution as a Managed Boundary
Event-Based Autoscaling: Signals, Backpressure, and Thundering Herds
Limits of Serverless Abstractions: What You Still Have to Design
Observability at Platform Scale
Metrics, Logs, Traces, and Events: Four Signals and What They Answer
Standardizing Telemetry Across Services: Consistency for Operators
Cardinality and Observability Cost: When Visibility Becomes a Budget Risk
SLOs, SLIs, and Error Budgets
Platform SLOs vs Customer SLOs: Setting Realistic Contracts
Selecting SLIs: Latency, Availability, Correctness, and What They Hide
Error Budgets: Decision Tools for Balancing Change and Stability
Failure Domains and Resilient Topologies
Zones, Regions, and Global Control Planes: Naming Correlated Failure
Multi-AZ and Multi-Region Patterns: Redundancy as a Boundary Choice
Latency vs Resilience: Understanding What You Pay to Reduce Blast Radius
Incident Response and On-Call for Platforms
On-Call Patterns for Shared Platforms: Routing and Ownership
Runbooks, Playbooks, and Automation: Scaling Response Without Heroics
Post-Incident Reviews and Learning Loops: Turning Incidents Into Design Changes
Chaos Engineering and Failure Injection
Host, Network, and API Failure Modes: Choosing Experiments That Matter
Validating Autoscaling, Failover, and Self-Healing: Measuring the Control Loops
Folding Results Back Into Design: Resilience as a Continuous Refactor
Capacity Planning Fundamentals
Demand Modeling, Utilization, and Headroom: Translating Uncertainty into Buffers
Compute, Storage, and Network Capacity: Understanding Independent Bottlenecks
Procurement and Buffer Strategy: Lead Time as a Failure Domain
Allocation, Reservations, and Overcommit
Reserved vs Best-Effort Resources: Setting Tenant Expectations
CPU and Memory Overcommit Risks: Efficiency vs Correlated Failure
Placement and Anti-Noisy-Neighbor Controls: Guardrails for Fairness
Metering, Pricing, and Billing Pipelines
Billing Units: Choosing What You Measure and What Customers Can Predict
Metering Pipelines and Aggregation: Turning Events Into Durable Usage
Invoices and Showback/Chargeback: Trust, Disputes, and Corrections
Cost Controls and Optimization
Budgets, Alerts, and Quotas: Cost as a Control Plane Contract
Rightsizing, Autoscaling, and Capacity Optimization: Feedback Loops for Waste
Reserved and Discount Program Patterns: Incentives and Lock-In Tradeoffs
Sustainability and Efficiency as Platform Requirements
Energy and Carbon Considerations (Conceptual): Externalities as Constraints
Hardware Refresh and Density Strategy: Keeping the Fleet Efficient
"Cost Efficiency" as a Platform SLO: Operationalizing Efficiency
Cloud API Design Patterns
Resource-Oriented APIs: Modeling the Platform as a Graph of Resources
Idempotency, Pagination, Long-Running Operations: Building Safe Clients
Versioning and Deprecation: Changing Contracts Without Breaking Tenants
CLIs, SDKs, and Developer Experience
CLI Command Structure and Auth UX: Avoiding Sharp Edges
SDK Design: Thin Wrappers vs Higher-Level Abstractions
Docs, Samples, and Golden Paths: Teaching the Intended Workflow
Service Catalogs and Internal Marketplaces
Service Catalogs and Templates: Productizing Internal Capabilities
Provisioning Flows With Policy Checks: Self-Service With Guardrails
Tagging, Ownership Metadata, and Discovery: Making Operations Tractable
External Marketplaces and Partner Ecosystems
Third-Party Listings and Integrations: Expanding the Platform Safely
Billing and Entitlement Integration: Turning Usage into Agreements
Trust and Security Vetting: Controlling Supply Chain Risk
Governance and Policy Engines
Org-Wide Policies (Locations, Types, Sizes): Guardrails for Sprawl
Constraint Frameworks (Conceptual): Evaluating Policy at Scale
Governance With Good DX: Exceptions, Previews, and Safe Defaults
Global Platforms: Regions, Zones, and Control Plane Consistency
Region/Zone Design: Partitioning for Availability and Latency
Global APIs vs Per-Region Endpoints: Routing, Failover, and Contract Boundaries
Consistency Models for Control Planes: What "Global State" Can Mean
Core Service Families: Compute, Storage, Network
Compute Services: VMs, Containers, and Serverless as Products and Boundaries
Storage Services: Block, Object, File, and Databases from a Platform View
Network Services: Routing, Load Balancing, and Private Connectivity
Data and AI Platforms as “Platforms on the Platform”
Data Warehouses, Lakes, and Streaming (Conceptual): Shared Services With Strong Contracts
Managed ML Services Patterns: Multi-Tenant Training and Inference Boundaries
Governance and Lineage Hooks: Observing Data Movement Across Domains
Cloud Provider Security Posture
Provider Internal Security Responsibilities: Securing the Platform Itself
Tenant Isolation, Key Management, and Identity at Scale: Designing for Least Trust
Threat Modeling and Red Teams: Testing the Boundaries You Claim
Operating the Organization: Platform Teams and Rollouts
Platform Org Structures and Product Lines: Aligning Ownership With Boundaries
Change Management, Rollout Waves, and Feature Flags: Reducing Correlated Failure
Customer Comms: Status Pages, SLAs, and Public Postmortems
Evolution and Next-Gen Platforms
Backward Compatibility: Introducing New Abstractions Without Breaking Tenants
Deprecation and Migration Playbooks: Operationalizing Change Over Years
Long-Term Bets: Serverless, Edge, and the Next Boundary Shift
Capstone: Design a Cloud Platform Slice
Capstone Problem Statement and Constraints
Deliverables: Diagrams and a "Platform Spec" Document
Self-Review and Evaluation Rubric: Checking Boundary Claims Against Failures
Fairness and Tenant SLOs in Shared Compute Pools