CDN (Content Delivery Network) Analysis Note
This note captures the full step-by-step analysis for a CDN: origin truth, edge cache state, invalidation, routing policy, and bounded-stale edge delivery.
Step 1 — Normalize #
Assume the baseline prompt is:
- design a CDN
- users request content from nearby edge nodes
- origin content is authoritative
- edge nodes cache content and metadata
- content can be invalidated or refreshed
- system scales globally across many edge locations
| Requirement | Actor | Operation | State touched | Priority |
|---|---|---|---|---|
| Client requests content | Client | read projection | S1 read projection target `EdgeCacheEntry` | C1 |
| Origin publishes or updates content metadata | Client | overwrite state | S1 update target `OriginObject` | C1 |
| System fetches missing/stale content from origin | System | async process | S1 hidden write target `EdgeCacheEntry` | C1 |
| System invalidates edge cache entry | System | state transition | S1 update target `EdgeCacheEntry` | C1 |
| System updates routing/policy for edge selection | Admin | overwrite state | S1 update target `RoutingPolicy` | C1 |
| System propagates invalidation/config snapshot to edges | System | async process | S1 hidden write target `EdgeSnapshot` | C1 |
| Client reads CDN status/analytics | Client | read projection | S1 read projection target `CDNStatusView` | R2 |
Notes on normalization:
- client content request is `read projection`
- edge cache is a derived projection of origin truth
- origin update is `overwrite state`
- origin fetch is `async process`
- invalidation is a lifecycle transition on edge cache state
- routing/policy is control-plane overwrite
- snapshot propagation is async dissemination
This is a composition of:
- `Origin Projection + Edge Delivery Plane`
- `Control Plane + Data Plane`
Step 2 — Critical Path Selection #
| Requirement | Priority class | Why |
|---|---|---|
| Serve content from edge | C1 | wrong or excessively stale content breaks delivery correctness |
| Publish/update origin content metadata | C1 | changes authoritative content truth |
| Fetch missing/stale content from origin | C1 | edge correctness depends on valid refresh path |
| Invalidate edge cache entry | C1 | invalidation is the main safety path for freshness |
| Update routing/policy | C1 | changes where/how requests are served |
| Propagate invalidation/config snapshot | C1 | stale edge nodes can serve wrong content/policy |
| Read CDN status/analytics | R2 | operational only |
Critical paths:
- `P1` serve content
- `P2` update origin object metadata
- `P3` fetch/refresh edge cache entry
- `P4` invalidate edge cache entry
- `P5` update routing/policy
- `P6` propagate invalidation/config snapshot
Step 3 — Primary State Extraction #
| Candidate object label | Candidate source | Candidate needed for C1/R1? | Candidate decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
|---|---|---|---|---|---|---|---|---|---|
| OriginObject | direct noun | Yes | keep as candidate | entity | Yes | service | overwrite | instance | object_key |
| EdgeCacheEntry | hidden write target | Yes | keep as candidate | projection | Yes | service | overwrite | relation | edge_id + object_key |
| InvalidationState | lifecycle object | Yes | keep as candidate | process | Yes | service | state machine | instance | object_key or invalidation_id |
| RoutingPolicy | direct noun | Yes | keep as candidate | entity | Yes | service | overwrite | instance | hostname or distribution_id |
| EdgeSnapshot | hidden write target | Yes | keep as candidate | projection | Yes | service | overwrite | instance | edge_id |
| CDNStatusView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | distribution_id |
Minimal primary set:
- `OriginObject`
- `EdgeCacheEntry`
- `InvalidationState`
- `RoutingPolicy`
- `EdgeSnapshot`
Important modeling choices:
OriginObject #
Primary because:
- origin is authoritative truth for content version/metadata
EdgeCacheEntry #
Primary enough to model explicitly because:
- edge delivery depends on current cached version, expiry, validation state
InvalidationState #
Worth keeping explicit because:
- invalidation is an important lifecycle/control-plane process
- propagation and completion tracking matter
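As a minimal sketch of what "explicit lifecycle" could look like, the snippet below models an invalidation record as a small state machine guarded by a compare-and-set on `(state, version)`. The state names (`REQUESTED`, `PROPAGATING`, `COMPLETED`), the `ALLOWED` transition set, and the `transition` helper are assumptions made for illustration; the note does not name concrete states, and a real store would apply the check and the write atomically.

```python
from dataclasses import dataclass, replace
from enum import Enum


class InvState(Enum):
    # Hypothetical lifecycle states; the note does not define them.
    REQUESTED = "requested"
    PROPAGATING = "propagating"
    COMPLETED = "completed"


# Assumed forward-only transitions.
ALLOWED = {
    (InvState.REQUESTED, InvState.PROPAGATING),
    (InvState.PROPAGATING, InvState.COMPLETED),
}


@dataclass(frozen=True)
class InvalidationState:
    invalidation_id: str
    state: InvState
    version: int


def transition(current, expected_state, expected_version, new_state):
    """Return (record, applied). The record is unchanged when the guard fails."""
    # Guard 1: the caller must have seen the current (state, version).
    if (current.state, current.version) != (expected_state, expected_version):
        return current, False
    # Guard 2: the lifecycle must allow this move.
    if (current.state, new_state) not in ALLOWED:
        return current, False
    return replace(current, state=new_state, version=current.version + 1), True
```

A caller that loses the guard re-reads the record and decides whether its transition is still meaningful, which is what makes completion tracking safe under retries.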
EdgeSnapshot #
Important because:
- edge nodes serve from local config/policy/invalidation view
- propagation lag matters
Step 4 — Hard Invariants #
| Path | Tier | Type | Invariant statement |
|---|---|---|---|
| P1 serve content | HARD | eligibility | `serve_content` is valid only if selected edge cache entry is eligible under current cache state, routing policy, and freshness/invalidation rules for that request scope. |
| P2 update origin object metadata | HARD | ordering | Origin-object revisions are ordered by monotonic version within `object_key`. |
| P3 fetch/refresh edge cache entry | HARD | accounting | `EdgeCacheEntry(edge_id, object_key)` equals function of authoritative origin object state modulo bounded refresh/invalidation lag. |
| P4 invalidate edge cache entry | HARD | eligibility | `invalidate_entry` is valid only if current invalidation lifecycle and cache state allow transition for that object scope. |
| P5 update routing/policy | HARD | ordering | Routing-policy revisions are ordered by monotonic config version within distribution scope. |
| P6 propagate snapshot | HARD | freshness | `EdgeSnapshot(edge_id)` reflects authoritative invalidation/routing state within configured propagation bound. |
What matters most:
- edges must not serve invalidated or too-stale content beyond the stated bound
- edge cache state must track authoritative origin version semantics
- invalidation propagation must be versioned and bounded
- edge routing/policy must move monotonically forward
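The bounded-staleness rules above can be read as a single predicate evaluated at the edge. The sketch below is one possible shape, not a definitive one: the field names (`ttl_seconds`, `min_valid_version`, `max_snapshot_age`) and the `can_serve` helper are hypothetical, and it assumes invalidation is expressed as a per-object minimum version carried in the snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeCacheEntry:
    object_key: str
    content_version: int   # origin version this entry was filled from
    fetched_at: float      # seconds on the edge-local clock
    ttl_seconds: float     # assumed freshness bound for the entry


@dataclass(frozen=True)
class EdgeSnapshot:
    snapshot_version: int
    min_valid_version: dict  # object_key -> lowest version still servable
    received_at: float
    max_snapshot_age: float  # assumed propagation bound for the snapshot


def can_serve(entry, snapshot, now):
    """True only if snapshot and entry are both within their bounds."""
    # A snapshot older than the propagation bound cannot vouch for anything.
    if now - snapshot.received_at > snapshot.max_snapshot_age:
        return False
    # The entry itself must still be inside its freshness window.
    if now - entry.fetched_at > entry.ttl_seconds:
        return False
    # An invalidation raises the version floor for the object.
    floor = snapshot.min_valid_version.get(entry.object_key, 0)
    return entry.content_version >= floor
```

Checking the snapshot age first reflects the idea that an edge which has lost contact with the control plane should stop trusting its own invalidation view.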
Step 5 — Execution Context #
| Field | Value | Why |
|---|---|---|
| Topology | single service distributed | one logical CDN with origin-facing control plane and many edge nodes |
| Write coordination scope | per object scope | correctness is per object key, invalidation scope, and edge snapshot scope |
| Read consistency target | bounded stale allowed | edge delivery is projection-based, not strong-read-on-every-request |
| Holder model | none | no lease-like per-request ownership in the hot path |
| Compensation acceptable? | No | wrong content delivery/invalidation cannot be treated as compensable workflow |
Derived:
- `bounded_staleness_allowed = true`
- `exclusive_claim_required = false`
- `guarded_by_current_state = true`
This implies:
- origin truth plus edge projection
- versioned invalidation/config propagation
- local edge serving on hot path
Step 6 — Deterministic Mechanism Selection #
| Path | Write shape | Base mechanism | Required companions |
|---|---|---|---|
| P2 update origin object metadata | overwrite current value | CAS on version | object version |
| P3 fetch/refresh edge cache entry | overwrite current value | single writer edge refresh or CAS on version | content version/ETag |
| P4 invalidate edge cache entry | guarded state transition | CAS on (state, version) | invalidation version |
| P5 update routing/policy | overwrite current value | CAS on version | routing version |
| P6 propagate snapshot | overwrite current value | single writer snapshot publication | snapshot version |
Hot path P1 is a read path.
Why these fit:
- origin metadata and routing policy are versioned current-state config
- edge cache refresh is current-state overwrite from origin truth
- invalidation is a lifecycle transition
- edge snapshots are versioned overwrite dissemination
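To make "CAS on version" concrete for the overwrite paths (origin metadata and routing policy), here is a small in-memory sketch. `VersionedStore`, `Versioned`, and `VersionConflict` are illustrative names rather than anything the note prescribes, and the lock stands in for whatever atomic conditional write the real control-plane store offers. The sketch also treats `expected_version` as required, with `0` meaning "no value yet".

```python
import threading
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


@dataclass(frozen=True)
class Versioned:
    version: int
    value: dict


class VersionedStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}  # key -> Versioned

    def get(self, key):
        with self._lock:
            return self._items.get(key)

    def put(self, key, value, expected_version):
        """Overwrite only if expected_version matches the stored version."""
        with self._lock:
            current = self._items.get(key)
            current_version = current.version if current else 0
            if expected_version != current_version:
                raise VersionConflict(
                    f"{key}: expected {expected_version}, found {current_version}"
                )
            updated = Versioned(current_version + 1, value)
            self._items[key] = updated
            return updated
```

Because every accepted write bumps the version by exactly one, revisions within a key are totally ordered and a stale writer always loses.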
Step 7 — Read Model / Source of Truth #
| Concept | Truth | Read path | Rebuild path |
|---|---|---|---|
| C1 origin content metadata | OriginObject | read source directly | authoritative origin store |
| C2 edge cache entry | EdgeCacheEntry | read projection | refresh from origin object state |
| C3 invalidation lifecycle | InvalidationState | read source directly | authoritative invalidation state |
| C4 routing policy | RoutingPolicy | read source directly | authoritative policy store |
| C5 edge local snapshot | EdgeSnapshot | materialized view | rebuild from latest invalidation + routing state |
| C6 CDN status/analytics | derived | materialized view | recompute from primary state |
Important point:
For the hot path:
- edge node reads local cache entry plus local snapshot
- not origin or control plane on every request
This is the defining CDN split:
- origin is authoritative
- edge serves a bounded-stale projection
Step 8 — Failure Handling #
| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
|---|---|---|---|---|---|
| origin update | retry with object version | stale update loses CAS | committed origin state survives crash if persisted | invalidation/refresh propagation may lag | n/a |
| edge refresh | retry with version/ETag validation | latest valid refresh wins | refreshed cache survives edge restart if cache persisted, otherwise repopulates | origin fetch retry/backoff | n/a |
| invalidation | retry with invalidation version | stale invalidation transition loses CAS | committed invalidation survives control-plane crash if persisted | snapshot propagation may lag | n/a |
| routing update | retry with routing version | stale update loses CAS | committed policy survives crash if persisted | snapshot propagation may lag | n/a |
| snapshot propagation | retry with versioned snapshot | older snapshot loses to newer version | edge keeps last good snapshot until refresh | failed push retried or pulled | n/a |
| content serve | request retries are application-level | many edge nodes can serve concurrently from local state | edge crash only affects local request handling | n/a | stale snapshot/cache bounded by TTL/invalidation/version refresh |
What matters most:
- invalidation and snapshot versions must move monotonically
- edge cache must validate freshness against origin version semantics
- stale edge state is allowed only within bounded configured limits
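The monotonic-version rule for snapshots can be sketched from the edge's point of view as follows. `EdgeNode`, `apply_snapshot`, and `needs_pull` are hypothetical names; the sketch assumes each snapshot is a full replacement identified by an integer version and that the control plane can advertise its latest version cheaply.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeNode:
    edge_id: str
    snapshot_version: int = 0
    snapshot: dict = field(default_factory=dict)

    def apply_snapshot(self, version, payload):
        """Install a snapshot only if it is strictly newer than the current one."""
        if version <= self.snapshot_version:
            # Retried or reordered delivery: keep the last good snapshot.
            return False
        self.snapshot_version = version
        self.snapshot = dict(payload)
        return True

    def needs_pull(self, advertised_version):
        """Pull-on-version-miss: fetch when the advertised version is ahead."""
        return advertised_version > self.snapshot_version
```

Rejecting equal versions as well as older ones makes redelivery idempotent, so pushes can be retried freely.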
Step 9 — Scale Adjustments #
| Hotspot | Type | First response |
|---|---|---|
| massive read volume at edge | read hotspot | add more edge capacity and keep hot path fully local |
| invalidation storms | fan-out hotspot | batch invalidations and use versioned incremental edge updates |
| hot large objects / popular assets | read hotspot | tiered caching and object segmentation |
| origin refresh bursts on cache miss | contention hotspot | request coalescing and negative caching / stale-while-revalidate |
| routing/config churn | fan-out hotspot | incremental snapshot propagation and pull-on-version-miss |
| analytics/status reads | read hotspot | derived views only |
What scales well:
- edge reads scale horizontally
- origin truth stays centralized/narrow
- edge nodes serve locally from cache and snapshot
What fails first:
- invalidation fanout storms
- origin thundering herd on misses
- giant hot-object fetch bursts
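Request coalescing, the first response to the origin thundering herd, can be illustrated with a small single-flight helper: concurrent misses for the same key share one origin fetch. This is a sketch under assumptions, using threads and hypothetical names (`SingleFlight`, `_Call`, `waiters`); a production edge would more likely do this inside its proxy or event loop.

```python
import threading


class _Call:
    """Bookkeeping for one in-flight fetch."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None
        self.waiters = 0  # followers attached to this fetch


class SingleFlight:
    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> _Call

    def do(self, key, fetch):
        """Run fetch() once per key at a time; concurrent callers share the result."""
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call
            else:
                call.waiters += 1
        if leader:
            try:
                call.result = fetch()
            except Exception as exc:  # propagate the failure to followers too
                call.error = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()
        else:
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result
```

Only calls that overlap in time are coalesced; once the leader finishes, the next miss starts a new fetch, so this complements rather than replaces the cache itself.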
Canonical design conclusion:
- archetype composition:
  - `Origin Projection + Edge Delivery Plane`
  - `Control Plane + Data Plane`
- primary truth:
  - `OriginObject`
  - `EdgeCacheEntry`
  - `InvalidationState`
  - `RoutingPolicy`
  - `EdgeSnapshot`
- hot path:
  - local edge cache read + local routing/invalidation snapshot
- control plane:
  - authoritative origin metadata + invalidation + routing + snapshot publication
Concrete Substrate #
- origin metadata/content store as authoritative source
- control plane in `Go/Java`
- invalidation/routing state in strongly consistent control-plane store
- edge fleet with local cache and versioned config snapshots
- config/invalidation propagation via watch streams / push channels / pull-on-version-miss
- content fetch from origin with ETag/version validation
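The ETag-validated fetch in the last bullet follows the usual HTTP conditional-request pattern: the edge sends its cached ETag, and the origin answers 304 when nothing changed. The sketch below simulates that exchange in memory; `Origin.conditional_get`, `CacheEntry`, and `refresh` are made-up names, and no real HTTP client is involved.

```python
from dataclasses import dataclass


@dataclass
class OriginObject:
    etag: str
    body: bytes


class Origin:
    """In-memory stand-in for the origin store."""

    def __init__(self):
        self.objects = {}  # object_key -> OriginObject

    def conditional_get(self, key, if_none_match=None):
        obj = self.objects[key]
        if if_none_match == obj.etag:
            return 304, obj.etag, None  # not modified, no body sent
        return 200, obj.etag, obj.body


@dataclass
class CacheEntry:
    etag: str
    body: bytes
    fetched_at: float


def refresh(origin, cache, key, now):
    """Revalidate or refill cache[key]; returns what happened."""
    cached = cache.get(key)
    known_etag = cached.etag if cached else None
    status, etag, body = origin.conditional_get(key, known_etag)
    if status == 304:
        cached.fetched_at = now  # same content, freshness window restarts
        return "revalidated"
    cache[key] = CacheEntry(etag, body, now)
    return "filled" if cached is None else "replaced"
```

The revalidation branch transfers no body, which is why validating against the origin is far cheaper than refetching when content rarely changes.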
Operation Layer #
- `ServeContent(hostname, object_key, request_context)`
  - entry point: edge node
  - authoritative decider: local `EdgeCacheEntry` + `EdgeSnapshot`
  - transition: none on source truth
  - response: cached object, refresh trigger, or fetch-forwarded response
- `PutOriginObject(object_key, metadata, expected_version?)`
  - entry point: control-plane/origin API
  - authoritative decider: origin metadata store owner
  - transition: overwrite `OriginObject`
- `Invalidate(object_key or prefix, expected_version?)`
  - entry point: invalidation API
  - authoritative decider: invalidation-state owner
  - transition: guarded update to `InvalidationState`
- `PutRoutingPolicy(distribution_id, config, expected_version?)`
  - entry point: control-plane API
  - authoritative decider: routing-policy owner
  - transition: overwrite `RoutingPolicy`
- internal edge refresh
  - validate cached object version/ETag against origin
  - overwrite `EdgeCacheEntry`
- snapshot propagation
  - publish latest `EdgeSnapshot(version)` to edge nodes
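The three possible `ServeContent` responses (cached object, refresh trigger, fetch-forwarded response) can be sketched as one decision function. The soft/hard TTL split used here is an assumption borrowed from stale-while-revalidate style policies, and `Outcome`, `Entry`, and `decide` are hypothetical names rather than part of the operation list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SERVE_CACHED = "serve cached object"
    SERVE_AND_REFRESH = "serve cached object and trigger refresh"
    FORWARD_TO_ORIGIN = "fetch from origin and forward"


@dataclass(frozen=True)
class Entry:
    content_version: int
    age_seconds: float


def decide(entry: Optional[Entry], min_valid_version: int,
           soft_ttl: float, hard_ttl: float) -> Outcome:
    """Pick a response using only edge-local state."""
    # Miss, invalidated, or past the hard staleness bound: cannot serve locally.
    if (entry is None
            or entry.content_version < min_valid_version
            or entry.age_seconds > hard_ttl):
        return Outcome.FORWARD_TO_ORIGIN
    # Still within the bound but aging: serve now, refresh in the background.
    if entry.age_seconds > soft_ttl:
        return Outcome.SERVE_AND_REFRESH
    return Outcome.SERVE_CACHED
```

None of the branches consult the control plane, which keeps the decision consistent with "transition: none on source truth" for this operation.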
Entry Point vs Decider vs Responder #
| Path | Entry point | Authoritative decider | Physical responder | Logical responder |
|---|---|---|---|---|
| serve content | edge node | local cache entry + local snapshot | edge node | CDN |
| origin update | control-plane/origin API | origin metadata owner | control-plane node | CDN |
| invalidation | invalidation API | invalidation-state owner | control-plane node | CDN |
| routing update | control-plane API | routing-policy owner | control-plane node | CDN |
| edge refresh | edge node | local edge refresh worker + origin validation | edge node | CDN |
| snapshot propagation | edge / control plane | snapshot publisher | control/data-plane | CDN |
Concrete HLD #
Main components:
- origin metadata/content store
- control-plane API
- invalidation/routing state store
- snapshot distribution layer
- edge fleet with local cache
- edge refresh/validation logic
- derived CDN analytics/status views
Short interview version #
“I’d design the CDN as an origin-truth plus edge-projection system. Origin content metadata is authoritative, while edge nodes serve bounded-stale cached projections using local invalidation and routing snapshots. Invalidations and routing changes are versioned control-plane updates propagated to edges, and edges refresh content from origin using version or ETag validation instead of consulting origin on every request. The main correctness boundary is bounded stale delivery under explicit invalidation and refresh rules.”