Service Registry / Service Discovery #

This note models a service registry / service discovery system in which service instances register themselves and keep their registrations alive with heartbeats or leases, consumers resolve the current healthy endpoints, and the system propagates membership changes safely at scale.


Step 1 - Normalize #

Assume the baseline prompt is:

  • design a service registry / service discovery system
  • service instances register and deregister themselves
  • instances heartbeat or renew leases
  • clients discover healthy endpoints for a service
  • membership changes should propagate to clients quickly
  • system scales across many services and instances

Normalize into state-affecting paths.

| Requirement | Actor | Operation | State touched | Priority |
|---|---|---|---|---|
| Service instance registers endpoint | Client | state transition | S1; update target: ServiceInstanceState | C1 |
| Service instance renews lease / heartbeat | Client | state transition | S1; update target: ServiceInstanceState | C1 |
| Service instance deregisters endpoint | Client | state transition | S1; update target: ServiceInstanceState | C1 |
| System expires stale instance | System | async process | S1; hidden write target: ServiceInstanceState | C1 |
| Client resolves service endpoints | Client | read source | S1; read source target: ServiceMembershipState | R1 |
| System updates effective membership / health view | System | state transition | S1; update target: ServiceMembershipState | C1 |
| Client registers watch on service | Client | append event | S1; create target: DiscoveryWatchRegistration | R1 |
| System emits membership-change watch event | System | async process | S1; hidden write target: DiscoveryWatchEvent | R1 |
| System routes service/shard to current owner | System | read source | S1; read source target: PartitionMap | C1 |
| System reassigns shard ownership after node failure | System | state transition | S1; update target: PartitionOwnership | C1 |

Notes on normalization #

Important choices:

  • register/renew/deregister are lifecycle transitions
  • stale expiry is explicit because instance crashes are central correctness cases
  • client endpoint lookup is a read path over current membership truth
  • effective membership state is distinct from raw individual instance lifecycle
  • watch registration and watch events are separate from source truth

This system is fundamentally:

  • membership + lease + lookup

not:

  • log replay
  • queue delivery

Step 2 - Critical Path Selection #

| Requirement | Priority class | Why |
|---|---|---|
| Register endpoint | C1 | wrong membership truth breaks all downstream routing |
| Renew heartbeat / lease | C1 | stale renewals affect endpoint validity |
| Deregister endpoint | C1 | removal correctness affects traffic safety |
| Expire stale instance | C1 | crash recovery depends on safe expiry |
| Resolve endpoints | R1 | core serving path |
| Update effective membership view | C1 | consumers need correct healthy endpoint set |
| Register / deliver watch | R1 | important for propagation, but downstream of membership truth |
| Route to shard owner | C1 | wrong routing can split membership truth |
| Reassign shard ownership | C1 | failover must preserve membership correctness |

Baseline critical paths #

Main C1 paths:

  • P1 register endpoint
  • P2 renew lease
  • P3 deregister endpoint
  • P4 expire stale instance
  • P5 update effective membership
  • P6 route to shard owner
  • P7 reassign shard ownership

Main R1 paths:

  • P8 resolve endpoints
  • P9 watch registration and delivery

This design is driven by:

  • one authoritative current lifecycle per instance
  • current healthy endpoint set per service
  • lease expiry on crash
  • fast propagation to consumers

Step 3 - Primary State Extraction #

For a service registry, the minimal primary state is the individual instance lifecycle, effective membership view, client/session or lease validity, and routing/ownership state.

| Candidate object label | Candidate source | Candidate needed for C1/R1? | Candidate decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
|---|---|---|---|---|---|---|---|---|---|
| ServiceInstanceState | direct noun | Yes | keep as candidate | process | Yes | service | state machine | instance | service_id + instance_id |
| ServiceMembershipState | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | instance | service_id |
| ClientSession | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | session_id |
| DiscoveryWatchRegistration | direct noun | Yes | keep as candidate | relationship | Yes | service | append-only | relation | client_id + service_id |
| DiscoveryWatchEvent | hidden write target | No | keep as candidate | event | No | derived | append-only | collection | service_id |
| PartitionOwnership | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | shard_id |
| PartitionMap | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | collection | service shards |
| RegistryStatusView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | tenant or cluster |

Important modeling choices #

ServiceInstanceState #

This is the central instance-lifecycle object.

Likely fields:

  • service_id
  • instance_id
  • endpoint
  • zone/region
  • health/serving status
  • session_id
  • expiry
  • state

States:

  • REGISTERED
  • HEALTHY
  • UNHEALTHY
  • DEREGISTERED
  • EXPIRED
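
The field list and states above can be sketched as a small record type. This is only an illustrative shape, not a schema from any particular registry: the `epoch` field, the `health_status` string, the `is_live` helper, and the choice of seconds for `expiry` are assumptions added for the example.

```python
from dataclasses import dataclass
from enum import Enum


class InstanceState(Enum):
    REGISTERED = "REGISTERED"
    HEALTHY = "HEALTHY"
    UNHEALTHY = "UNHEALTHY"
    DEREGISTERED = "DEREGISTERED"
    EXPIRED = "EXPIRED"


# States in which the record no longer represents a live registration.
TERMINAL_STATES = {InstanceState.DEREGISTERED, InstanceState.EXPIRED}


@dataclass(frozen=True)
class ServiceInstanceState:
    service_id: str
    instance_id: str
    endpoint: str
    zone: str
    health_status: str        # assumption: free-form health/serving status
    session_id: str
    expiry: float             # assumption: absolute deadline in seconds
    state: InstanceState
    epoch: int = 0            # assumption: fencing counter used by the invariants

    @property
    def key(self):
        """Scope key: one lifecycle per (service_id, instance_id)."""
        return (self.service_id, self.instance_id)

    def is_live(self, now: float) -> bool:
        """True while the record is non-terminal and its lease has not lapsed."""
        return self.state not in TERMINAL_STATES and now < self.expiry
```

Keeping the record immutable (`frozen=True`) means every lifecycle change produces a new value, which pairs naturally with guarded transitions.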

ServiceMembershipState #

Primary because:

  • consumers usually read the effective healthy endpoint set per service
  • it is derived from instance state, yet it is still the authoritative current view for lookup

ClientSession #

Primary because:

  • registrations are often tied to a lease/session lifecycle

Minimal strict primary set #

The strongest minimal set is:

  • ServiceInstanceState
  • ServiceMembershipState
  • ClientSession
  • PartitionOwnership
  • PartitionMap

With:

  • DiscoveryWatchRegistration as an optional explicit primary object

Step 4 - Hard Invariants #

For a service registry / discovery system, the hard invariants are about one authoritative lifecycle per instance, valid renew/deregister only by current lease holder, and correct current healthy membership per service.

| Path | Tier | Type | Invariant statement |
|---|---|---|---|
| P1 register endpoint | HARD | uniqueness | Key (service_id, instance_id) maps to at most one current authoritative instance lifecycle within instance scope. |
| P1 register endpoint | HARD | eligibility | Action register_instance is valid only if current session is active and current instance state is registerable at decision time. |
| P2 renew lease | HARD | eligibility | Action renew_instance is valid only if current ServiceInstanceState is owned by the same session and lease/epoch matches at decision time. |
| P3 deregister endpoint | HARD | eligibility | Action deregister_instance is valid only if current ServiceInstanceState is owned by the same session and lease/epoch matches at decision time. |
| P4 expire stale instance | HARD | eligibility | Action expire_instance is valid only if current instance is still registered, expiry has passed, and lease/epoch is unchanged at decision time. |
| P5 update effective membership | HARD | accounting | ServiceMembershipState(service_id) equals the current authoritative set of eligible healthy instances for that service scope. |
| P6 route to shard owner | HARD | uniqueness | Key shard_id maps to at most one current authoritative owner within shard scope. |
| P7 reassign shard ownership | HARD | eligibility | Action reassign_shard is valid only if current owner is failed or relinquished and candidate owner is eligible and sufficiently current on shard_id at decision time. |
| P8 resolve endpoints | HARD | freshness | Lookup reflects authoritative membership and instance state within configured consistency bound. |
| P9 watch delivery | SOFT | freshness | Watch stream reflects authoritative membership changes within propagation bound. |

What matters most #

1. One authoritative lifecycle per instance #

This prevents split or stale endpoint truth.

2. Membership view must correspond to valid healthy instances #

Consumers should not receive endpoints that are expired or deregistered.

3. Renew/deregister are fenced #

Only the current registering session/epoch may continue to mutate the instance record.
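
A minimal sketch of that fencing check for a renewal, assuming the record stores the owning `session_id` and an integer `epoch`. The `InstanceRecord` type, the `FencedError` exception, and the `renew` signature are hypothetical names for illustration.

```python
from dataclasses import dataclass, replace


class FencedError(Exception):
    """Raised when the caller is not the current session/epoch holder."""


@dataclass(frozen=True)
class InstanceRecord:
    session_id: str
    epoch: int
    expiry: float


def renew(record: InstanceRecord, session_id: str, epoch: int,
          now: float, ttl: float) -> InstanceRecord:
    # Reject any caller whose session or epoch no longer matches the record.
    if record.session_id != session_id or record.epoch != epoch:
        raise FencedError("stale session or epoch")
    # Only the expiry moves; ownership fields stay unchanged.
    return replace(record, expiry=now + ttl)
```

A deregister would apply the same ownership check before moving the record to DEREGISTERED; a request carrying an older epoch fails in exactly the same way.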

4. Watch delivery is secondary to membership truth #

Clients should treat watch streams as propagation help, not as sole truth.


Step 5 - Execution Context #

For the strict baseline service registry:

| Field | Value | Why |
|---|---|---|
| Topology | single service, distributed | one logical discovery service spread across many nodes |
| Write coordination scope | per object scope | correctness is per instance, service membership, and shard ownership scope |
| Read consistency target | strong only | stale endpoint reads can route traffic to dead instances |
| Holder model | client | service instances temporarily hold registrations through sessions/leases |
| Compensation acceptable? | No | wrong endpoint membership can send production traffic to dead or unauthorized instances |

Derived implications #

  • holder_may_crash = true
    • service instances can crash while registered
  • cross_service_write = false
    • baseline keeps instance, membership, and ownership state in one logical service
  • bounded_staleness_allowed = false
    • correctness-critical resolution should use authoritative or tightly controlled fresh state
  • cross_service_atomicity_required = false
    • no multi-service transaction required in baseline
  • exclusive_claim_required = true
    • shard ownership and per-instance lease ownership must be exclusive
  • guarded_by_current_state = true
    • register, renew, deregister, and expiry all depend on current state

What this implies #

This pushes us toward:

  • one authoritative writer per service shard
  • lease-backed instance records
  • effective membership derived from authoritative instance state
  • watch propagation derived from committed membership changes

Step 6 - Deterministic Mechanism Selection #

| Path | Write shape | Base mechanism | Required companions |
|---|---|---|---|
| P1 register endpoint | guarded state transition | CAS on (state, version) or single writer per shard | session/lease id, epoch |
| P2 renew lease | guarded state transition | CAS on (state, version) | session/lease id, epoch |
| P3 deregister endpoint | guarded state transition | CAS on (state, version) | session/lease id, epoch |
| P4 expire stale instance | guarded state transition | leader-applied guarded transition | epoch, timeout scan |
| P5 update effective membership | overwrite current value | single writer recompute | membership version |
| P6 route to shard owner | exclusive claim | lease | fencing token, heartbeat |
| P7 reassign shard ownership | guarded state transition | CAS on (state, version) | fencing token, shard catch-up check |

Why these fit #

Register/renew/deregister #

These all depend on current session/epoch ownership and current lifecycle state, so guarded transitions fit.
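
One way to picture a guarded transition is a compare-and-set on (state, version). The sketch below uses an in-memory dictionary as a stand-in for the shard's authoritative store; `CasStore`, `VersionConflict`, and the convention that an absent key reads as `(None, 0)` are assumptions made for the example.

```python
import threading


class VersionConflict(Exception):
    """Raised when the stored (state, version) differs from what the caller read."""


class CasStore:
    """In-memory stand-in for a shard's authoritative instance-state store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (state, version)

    def get(self, key):
        with self._lock:
            # An absent key reads as (None, 0) so that "create" is also a CAS.
            return self._data.get(key, (None, 0))

    def compare_and_set(self, key, expected_state, expected_version, new_state):
        with self._lock:
            current = self._data.get(key, (None, 0))
            if current != (expected_state, expected_version):
                raise VersionConflict(f"{key}: expected "
                                      f"{(expected_state, expected_version)}, found {current}")
            new_version = expected_version + 1
            self._data[key] = (new_state, new_version)
            return new_version
```

A writer that read the record before a competing transition committed will present an outdated version and lose, which is the "stale writer loses guarded transition" behaviour the design relies on.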

Effective membership #

The current endpoint set per service is a current-value view derived from instance states, so overwrite fits.
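
A rough sketch of that overwrite-style recompute follows. It assumes that only instances in the HEALTHY state with an unexpired lease count as eligible, and that the view carries a monotonically increasing version; the `Instance` and `Membership` types and the `recompute_membership` function are illustrative names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Instance:
    instance_id: str
    endpoint: str
    state: str
    expiry: float


@dataclass(frozen=True)
class Membership:
    service_id: str
    version: int
    endpoints: Tuple[str, ...]


def recompute_membership(service_id: str, instances: Iterable[Instance],
                         previous: Optional[Membership], now: float) -> Membership:
    # Assumption: eligibility means HEALTHY and lease not yet expired.
    endpoints = tuple(sorted(
        i.endpoint for i in instances
        if i.state == "HEALTHY" and now < i.expiry
    ))
    if previous is not None and previous.endpoints == endpoints:
        return previous  # unchanged set: keep the version, emit nothing new
    next_version = 1 if previous is None else previous.version + 1
    return Membership(service_id, next_version, endpoints)
```

Because the function depends only on its inputs, rerunning it after a crash yields the same view, which is why recompute retries are safe.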

Routing #

One current owner per shard is required for correctness, so exclusive claim fits.
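
The exclusive claim can be illustrated with a lease table that hands out a fencing token whenever ownership changes. This is a single-process sketch only; in the design described here the table would live in the replicated control layer. `ShardLeaseTable` and its method names are hypothetical.

```python
class ShardLeaseTable:
    """Toy lease table: one owner per shard, new token on every ownership change."""

    def __init__(self):
        self._leases = {}  # shard_id -> (owner, token, expiry)

    def acquire(self, shard_id, node, now, ttl):
        """Return a fencing token if `node` holds the lease afterwards, else None."""
        owner, token, expiry = self._leases.get(shard_id, (None, 0, 0.0))
        lease_live = owner is not None and now < expiry
        if lease_live and owner != node:
            return None  # another node still holds a live lease
        if lease_live and owner == node:
            # Heartbeat by the current holder: extend, keep the same token.
            self._leases[shard_id] = (node, token, now + ttl)
            return token
        # Vacant or expired: grant ownership with a strictly larger token.
        token += 1
        self._leases[shard_id] = (node, token, now + ttl)
        return token

    def is_current(self, shard_id, token):
        """Storage-side check that rejects writes from a fenced (older) owner."""
        lease = self._leases.get(shard_id)
        return lease is not None and lease[1] == token
```

An old owner that wakes up after a pause still holds its earlier token, so any write it attempts fails the `is_current` check even though it believes it is the leader.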

Canonical substrate implied #

The baseline now points to:

  • sharded registry service
  • one authoritative owner per service shard
  • lease-backed instance records
  • current membership view per service
  • watch propagation from committed membership changes

Step 7 - Read Model / Source of Truth #

For a service registry, truth is direct source state for instance and membership data. Watches are derived.

| Concept | Truth | Read path | Rebuild path |
|---|---|---|---|
| C1 current instance lifecycle | ServiceInstanceState | read source directly | authoritative instance-state store |
| C2 current healthy endpoint set | ServiceMembershipState | read source directly | recompute from authoritative instance state |
| C3 current session / lease validity | ClientSession | read source directly | authoritative session store |
| C4 shard ownership | PartitionOwnership | read source directly | authoritative ownership store |
| C5 shard routing map | PartitionMap | read source directly | authoritative routing metadata |
| C6 watch stream | committed membership changes | materialized view | rebuild from authoritative instance/membership transitions |
| C7 dashboards / status | derived from instance and membership state | materialized view | recompute from authoritative state |

Important point #

For the core semantics:

  • resolution reads authoritative ServiceMembershipState
  • membership recomputes from authoritative instance lifecycle state
  • watches are derived propagation

Step 8 - Failure Handling #

| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
|---|---|---|---|---|---|
| P1 register endpoint | retry with same session/epoch safe | stale re-register loses guarded transition | committed registration survives crash if persisted | watch delivery may lag | stale instance/session blocked by epoch |
| P2 renew lease | retry with current session/epoch | stale renew loses guarded transition | committed renewal survives crash if persisted | watch delivery may lag | old epoch rejected |
| P3 deregister endpoint | retry with current session/epoch | stale deregister loses guarded transition | committed deregistration survives crash if persisted | watch delivery may lag | old epoch rejected |
| P4 expire stale instance | timeout scan retry safe | only one expiry transition should win for current expired state | scanner crash delays cleanup; next scan retries | watch delivery may lag | prior holder blocked once epoch/version advanced |
| P5 membership recompute | recompute retry safe from source inputs | single recompute/version wins | recompute reruns after crash | watch/update propagation may lag | n/a |
| P6 route to shard owner | retry after refreshing shard map | only one valid owner should exist | if owner changed, refreshed map points to new owner | n/a | stale owner rejected by fencing token |
| P7 reassign shard ownership | retry failover transition safely | only one reassignment wins current ownership state | promoted owner crash triggers later reassignment | n/a | old owner fenced and must not continue serving |
| P8 resolve endpoints | read retry safe | many readers coexist | node crash drops query only | n/a | stale read should be disallowed or tightly bounded |
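
The P6 retry behaviour (refresh the shard map, then retry) might look roughly like the following on the caller side. The `NotOwner` exception, the `fetch_map` and `send` callables, and the attempt limit are placeholders for whatever the real transport provides.

```python
class NotOwner(Exception):
    """Raised by a node that is no longer (or never was) the shard owner."""


def call_with_refresh(shard_id, cached_map, fetch_map, send, max_attempts=3):
    """Send a request to the cached owner; on rejection, refresh the map and retry."""
    for _ in range(max_attempts):
        owner = cached_map.get(shard_id)
        if owner is not None:
            try:
                return send(owner)
            except NotOwner:
                pass  # cached owner is stale; fall through to refresh
        # Replace the local view with the authoritative routing metadata.
        cached_map.clear()
        cached_map.update(fetch_map())
    raise RuntimeError(f"no owner accepted the request for {shard_id}")
```

The rejected attempt is harmless because the stale owner refuses the request instead of applying it, so the retry against the refreshed owner is the only one that takes effect.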

What matters most #

1. Lease/epoch fencing #

This prevents crashed or partitioned instances from continuing to mutate or appear healthy after losing authority.

2. Membership truth comes from instance truth #

The effective endpoint set must not outlive current instance validity.

3. Watch lag must not affect correctness #

Consumers should reconcile against current membership state when needed.
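
One possible client-side reconciliation pattern is sketched below. It assumes each watch event carries the full endpoint set together with the membership version, and that the client can call an authoritative `resolve` function returning `(version, endpoints)`; the `MembershipCache` class is an illustrative name.

```python
class MembershipCache:
    """Client-side view that treats watch events as hints and resyncs on gaps."""

    def __init__(self, resolve):
        self._resolve = resolve  # callable returning (version, endpoints) from source truth
        self.version, self.endpoints = resolve()
        self.resyncs = 0

    def on_watch_event(self, version, endpoints):
        if version <= self.version:
            return  # duplicate or out-of-order event: already reflected
        if version == self.version + 1:
            # Contiguous event: safe to apply directly.
            self.version, self.endpoints = version, endpoints
            return
        # Gap: at least one event was missed, so reread authoritative state.
        self.version, self.endpoints = self._resolve()
        self.resyncs += 1
```

With this shape a dropped or delayed watch event costs one extra resolve call instead of leaving the client with a silently wrong endpoint set.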


Step 9 - Scale Adjustments #

| Hotspot | Type | First response |
|---|---|---|
| hot services with many instances | contention hotspot | shard by service and isolate very large services |
| renewal traffic | write throughput hotspot | lengthen lease duration within acceptable failover bounds and batch renewals |
| membership watch fanout | fan-out hotspot | derive watch delivery from committed membership stream and decouple it from source truth |
| strong reads on popular services | read hotspot | cache only under strict freshness/versioning or colocate reads with authoritative owners |
| failover churn | contention hotspot | stabilize leadership and avoid aggressive reassignment |
| reconnect storms after outage | contention hotspot | stagger heartbeats and client watch/session restoration |
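
Two of these responses lend themselves to a quick back-of-the-envelope sketch: how renewal write load falls as the heartbeat interval grows, and how jitter spreads heartbeats out. The function names and the symmetric jitter scheme are assumptions chosen for illustration.

```python
import random


def renewal_rate(instances: int, heartbeat_interval_s: float) -> float:
    """Steady-state renewals per second if each instance renews once per interval."""
    return instances / heartbeat_interval_s


def jittered_interval(base_s: float, jitter_fraction: float,
                      rng: random.Random) -> float:
    """Pick the next heartbeat delay uniformly within base ± base * jitter_fraction."""
    spread = base_s * jitter_fraction
    return base_s + rng.uniform(-spread, spread)
```

For example, 30,000 instances renewing every 10 seconds produce 3,000 renewals per second, while a 30 second interval brings that down to 1,000 at the cost of slower crash detection. Jitter does not change the average load; it only keeps instances that restarted together from heartbeating in lockstep.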

What scales well #

A registry scales well as long as it holds relatively small coordination data.

It scales by:

  • sharding services and instances
  • keeping instance records compact
  • deriving current membership efficiently
  • treating watch delivery as secondary
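
As a small illustration of sharding by service, the sketch below maps a service_id to a shard with a stable hash and lets very large services be pinned to dedicated shards. The hash choice, the modulo placement, and the `dedicated` override map are assumptions; the design above does not prescribe a placement scheme.

```python
import hashlib
from typing import Mapping, Optional


def shard_for(service_id: str, num_shards: int,
              dedicated: Optional[Mapping[str, int]] = None) -> int:
    """Return the shard index for a service; pinned services bypass hashing."""
    if dedicated and service_id in dedicated:
        return dedicated[service_id]
    # A stable digest keeps placement identical across processes and restarts.
    digest = hashlib.sha256(service_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Simple modulo placement moves most services when `num_shards` changes, so a real deployment would more likely consult the PartitionMap than recompute placement on every request.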

What fails first #

Usually:

  • one or a few very large services
  • heartbeat storms
  • watch fanout spikes
  • clients depending on watches alone instead of source truth

Canonical design conclusion #

The mechanical outcome is:

  • primary state:
    • ServiceInstanceState
    • ServiceMembershipState
    • ClientSession
    • PartitionOwnership
    • PartitionMap
  • critical invariants:
    • one authoritative lifecycle per instance
    • renew/deregister valid only for current session/epoch
    • current membership equals healthy eligible instances
    • exclusive shard ownership for membership truth
  • mechanisms:
    • guarded register/renew/deregister/expiry transitions
    • overwrite current membership view
    • lease for ownership/session validity
    • fenced shard ownership
  • reads:
    • direct authoritative reads for resolution and membership truth
    • watches as derived notifications

Polished interview answer #

I’d design the service registry as a sharded strongly consistent membership service with one authoritative owner per service shard. Each instance registers an endpoint under a lease-backed ServiceInstanceState record, renews that lease with heartbeats, and is removed either explicitly or by expiry if it crashes. The registry maintains an authoritative current ServiceMembershipState per service, which is the set of healthy eligible endpoints derived from instance records. Clients resolve endpoints from that membership view and can subscribe to watch streams for change propagation, but correctness comes from authoritative membership state, not the watch stream. The main scaling levers are more shards, longer but bounded leases, efficient membership recompute, and decoupled watch fanout.


Concrete Substrate #

I’ll choose a sharded strongly consistent registry service with lease-backed instance records and derived membership views as the concrete baseline, because it matches the mechanics we derived:

  • guarded instance lifecycle transitions
  • lease-backed validity
  • current membership view
  • one owner per shard

Concrete tech family:

  • registry service in Go or Java
  • authoritative state in a replicated metadata store or service-owned Raft state machine
  • metadata/control:
    • built-in Raft consensus per shard or a small etcd-like control layer

Each shard leader stores:

  • ServiceInstanceState(service_id, instance_id)
  • ServiceMembershipState(service_id)
  • ClientSession(session_id)
  • watch registrations
  • expiry index

This is effectively the same substrate family as Consul/etcd-backed service discovery, with the product surface centered on membership lookup.


Operation Layer #

1. Register instance #

API

  • RegisterInstance(service_id, instance_id, endpoint, metadata, session_id, ttl)

Initiator

  • service instance / sidecar

Entry point

  • gateway or any registry node

Authoritative decider

  • current shard leader for service_id

Precondition

  • session active
  • instance state registerable

Transition

  • create or update ServiceInstanceState
  • recompute ServiceMembershipState(service_id)

Response

  • {registered: true, epoch, expiry}, where epoch is the value the caller later passes to Heartbeat and DeregisterInstance

2. Renew lease #

API

  • Heartbeat(service_id, instance_id, session_id, epoch, ttl)

Initiator

  • service instance / sidecar

Entry point

  • gateway or any node

Authoritative decider

  • shard leader

Precondition

  • current instance state owned by session_id
  • epoch matches current state

Transition

  • extend expiry

Response

  • {renewed: true, expiry}

3. Deregister instance #

API

  • DeregisterInstance(service_id, instance_id, session_id, epoch)

Initiator

  • service instance / sidecar

Entry point

  • gateway or any node

Authoritative decider

  • shard leader

Precondition

  • current instance state owned by session_id
  • epoch matches current state

Transition

  • REGISTERED/HEALTHY -> DEREGISTERED
  • recompute ServiceMembershipState(service_id)

Response

  • {deregistered: true}

4. Resolve endpoints #

API

  • Resolve(service_id)

Initiator

  • client / downstream service

Entry point

  • gateway, resolver, or any node

Authoritative decider

  • shard leader or tightly controlled fresh read path

Precondition

  • none

Transition

  • none

Response

  • current endpoint set

5. Expire stale instance #

API

  • internal background process

Initiator

  • system

Entry point

  • shard leader

Authoritative decider

  • shard leader

Precondition

  • current time > expiry
  • instance state and epoch unchanged

Transition

  • mark instance expired
  • recompute ServiceMembershipState(service_id)
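
The expiry scan over an expiry index can be sketched with a min-heap and lazy invalidation. Everything here is an in-memory stand-in: `ExpiryIndex`, the dictionary record layout, and the decision to mark renewed records HEALTHY on `upsert` are assumptions for the example, and the membership recompute step is left out.

```python
import heapq


class ExpiryIndex:
    """Min-heap of (expiry, key, epoch); stale heap entries are skipped on pop."""

    def __init__(self):
        self._heap = []
        self.records = {}  # key -> {"state", "epoch", "expiry"}

    def upsert(self, key, epoch, expiry):
        """Record a registration or renewal and index its new deadline."""
        self.records[key] = {"state": "HEALTHY", "epoch": epoch, "expiry": expiry}
        heapq.heappush(self._heap, (expiry, key, epoch))

    def expire_due(self, now):
        """Expire every record whose current deadline has passed; return their keys."""
        expired = []
        while self._heap and self._heap[0][0] < now:
            expiry, key, epoch = heapq.heappop(self._heap)
            rec = self.records.get(key)
            # Guard: the entry must still describe the record's current lease.
            if rec is None or rec["epoch"] != epoch or rec["expiry"] != expiry:
                continue
            if rec["state"] in ("DEREGISTERED", "EXPIRED"):
                continue
            rec["state"] = "EXPIRED"
            expired.append(key)
        return expired
```

A renewal simply pushes a new heap entry; the older entry is discarded when it surfaces because its expiry no longer matches the record, so a renewed instance is never expired by a leftover deadline.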

Entry Point vs Decider vs Responder #

| Path | Entry point | Authoritative decider | Physical responder | Logical responder |
|---|---|---|---|---|
| RegisterInstance | gateway / any node | shard leader | leader or front node | service registry |
| Heartbeat | gateway / any node | shard leader | leader or front node | service registry |
| DeregisterInstance | gateway / any node | shard leader | leader or front node | service registry |
| Resolve | resolver / any node | shard leader or strong read path | resolver node | service registry |
| expiry | shard leader | shard leader | internal | service registry |
| watch | watch endpoint | committed membership stream / shard leader | watch-serving node | service registry |
| shard failover | follower / coordination layer | shard quorum / lease store | new leader / control plane | service registry |

Concrete HLD #

Main components:

  • client / instance gateway
    • routes register, heartbeat, and resolve operations
  • shard leaders
    • authoritative owners of instance, session, and membership state
    • maintain expiry index
  • shard followers
    • replicate committed state
  • watch service
    • emits membership-change notifications from committed state transitions
  • metadata/control service
    • tracks shard ownership and routing
