Feature Flag / A-B Testing Delivery Analysis Note

This note captures the full step-by-step analysis for a feature flag and A/B testing delivery system: flag config, targeting rules, experiment allocation, config propagation, local evaluation, and optional exposure logging.

Step 1 — Normalize #

Assume the baseline prompt is:

  • design a feature flag / A-B testing delivery system
  • applications evaluate flags locally or through a low-latency service
  • admins update flags, rollout percentages, and targeting rules
  • experiments assign users to variants deterministically
  • system scales across many services/clients

| Requirement | Actor | Operation | State touched | Priority |
| --- | --- | --- | --- | --- |
| Client evaluates flag for request/user context | Client | read source | S1; read source target; EvaluationState | C1 |
| Admin creates/updates flag config | Admin | overwrite state | S1; update target; FlagConfig | C1 |
| Admin updates rollout/targeting rules | Admin | overwrite state | S1; update target; TargetingPolicy | C1 |
| System computes effective evaluation state | System | state transition | S1; update target; EvaluationState | C1 |
| System propagates config snapshot to SDKs/edge nodes | System | async process | S1; hidden write target; ConfigSnapshot | C1 |
| System records exposure event | System | append event | S1; create target; ExposureEvent | R2 |
| Client reads flag/admin status | Client | read projection | S1; read projection target; FlagStatusView | R2 |

Notes on normalization:

  • evaluation is a read path against current effective evaluation state
  • flag and targeting config are overwrite-state control-plane objects
  • effective evaluation state is a control-plane derived object
  • config propagation is async dissemination
  • exposure logging is append-only, but secondary unless explicitly required for billing/analytics correctness

This is primarily a Control Plane + Data Plane system, with optional secondary analytics/event-stream behavior.

Step 2 — Critical Path Selection #

| Requirement | Priority class | Why |
| --- | --- | --- |
| Evaluate flag for request/user | C1 | wrong evaluation breaks rollout correctness and experiment treatment |
| Update flag config | C1 | changes future evaluation truth |
| Update targeting/rollout policy | C1 | changes future routing into variants |
| Compute effective evaluation state | C1 | control-plane to data-plane correctness bridge |
| Propagate config snapshot | C1 | stale SDK/edge nodes can evaluate wrongly |
| Record exposure event | R2 | important for analytics and experiment analysis, but not core evaluation correctness in baseline |
| Read flag/admin status | R2 | operational only |

Critical paths:

  • P1 evaluate flag
  • P2 update flag config
  • P3 update targeting policy
  • P4 compute effective evaluation state
  • P5 propagate config snapshot

Optional/secondary:

  • P6 exposure logging

Step 3 — Primary State Extraction #

| Candidate object label | Candidate source | Candidate needed for C1/R1? | Candidate decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FlagConfig | direct noun | Yes | keep as candidate | entity | Yes | service | overwrite | instance | flag_id |
| TargetingPolicy | direct noun | Yes | keep as candidate | entity | Yes | service | overwrite | instance | flag_id or segment_scope |
| EvaluationState | hidden write target | Yes | keep as candidate | process | Yes | service | overwrite | instance | flag_id or environment_scope |
| ConfigSnapshot | hidden write target | Yes | keep as candidate | projection | Yes | service | overwrite | instance | client_id or node_id |
| ExposureEvent | hidden write target | No | keep as candidate | event | No | derived | append-only | collection | flag_id + subject_id + request_id |
| FlagStatusView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | environment_id |

Minimal primary set:

  • FlagConfig
  • TargetingPolicy
  • EvaluationState
  • ConfigSnapshot

Important modeling choices:

EvaluationState #

This is worth making explicit because:

  • it is the effective compiled form used by the hot path
  • it can include normalized targeting rules, percentage rollout config, variant weights, kill-switch flags, prerequisites, etc.
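
To make the "compiled form" idea concrete, here is a minimal sketch of a pure compile step from the two authoritative objects to EvaluationState. The field names, the 10,000-bucket resolution, and the integer weights are assumptions for illustration; the note does not prescribe a schema.

```python
from dataclasses import dataclass

BUCKETS = 10_000  # assumed bucket resolution (0.01% steps)


@dataclass(frozen=True)
class FlagConfig:  # hypothetical shape
    flag_id: str
    version: int
    enabled: bool            # False acts as a kill switch
    default_variant: str
    variant_weights: dict    # variant name -> positive integer weight


@dataclass(frozen=True)
class TargetingPolicy:  # hypothetical shape
    flag_id: str
    version: int
    rollout_percent: int     # 0..100
    allowed_segments: tuple = ()  # empty means "all segments"


@dataclass(frozen=True)
class EvaluationState:  # compiled form read by the hot path
    flag_id: str
    compiled_version: tuple  # (flag version, policy version)
    enabled: bool
    default_variant: str
    rollout_buckets: int     # subjects with bucket < this are in the rollout
    cumulative_weights: tuple  # ((variant, exclusive upper bucket), ...)
    allowed_segments: frozenset


def compile_evaluation_state(flag, policy):
    """Pure function of the authoritative inputs; safe to rerun after a crash."""
    if flag.flag_id != policy.flag_id:
        raise ValueError("flag and policy must share a flag_id")
    if not 0 <= policy.rollout_percent <= 100:
        raise ValueError("rollout_percent must be within 0..100")
    total = sum(flag.variant_weights.values())
    if total <= 0:
        raise ValueError("variant weights must sum to a positive number")
    cumulative, running = [], 0
    for name in sorted(flag.variant_weights):  # sorted for a stable layout
        running += flag.variant_weights[name]
        cumulative.append((name, running * BUCKETS // total))
    return EvaluationState(
        flag_id=flag.flag_id,
        compiled_version=(flag.version, policy.version),
        enabled=flag.enabled,
        default_variant=flag.default_variant,
        rollout_buckets=policy.rollout_percent * BUCKETS // 100,
        cumulative_weights=tuple(cumulative),
        allowed_segments=frozenset(policy.allowed_segments),
    )
```

Because the function has no hidden inputs, recomputing from the same FlagConfig and TargetingPolicy always yields the same EvaluationState, which is what lets the recompute path retry freely.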

ConfigSnapshot #

As in other control-plane/data-plane systems:

  • local SDKs or edge nodes should evaluate from a versioned snapshot
  • they should not query the control plane on every request

ExposureEvent #

Usually secondary in baseline correctness:

  • important for experiment analytics
  • not needed to decide variant selection if assignment is deterministic from config + context

Step 4 — Hard Invariants #

| Path | Tier | Type | Invariant statement |
| --- | --- | --- | --- |
| P1 evaluate flag | HARD | eligibility | evaluate_flag is valid only if result is derived from current eligible flag config, targeting policy, and rollout rules for the request context within the configured propagation bound. |
| P2 update flag config | HARD | ordering | Flag-config revisions are ordered by monotonic version within flag_id. |
| P3 update targeting policy | HARD | ordering | Targeting-policy revisions are ordered by monotonic version within policy scope. |
| P4 compute effective evaluation state | HARD | accounting | Effective EvaluationState(flag_id) equals function of authoritative flag config and targeting policy. |
| P5 propagate config snapshot | HARD | freshness | ConfigSnapshot(client_or_node) reflects authoritative evaluation state within configured propagation bound. |

If experiment assignment is deterministic, add:

| Path | Tier | Type | Invariant statement |
| --- | --- | --- | --- |
| P1 evaluate flag | HARD | uniqueness | For a fixed (flag_id, subject_key, config_version), variant assignment maps to at most one deterministic outcome within that evaluation scope. |

What matters most:

  • clients must not evaluate against arbitrarily stale config
  • deterministic bucketing must be stable for same subject and config version
  • config versions must move monotonically
  • effective state must be a faithful compiled form of authoritative rules
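
The stable-bucketing invariant can be sketched with a hash of the flag and subject. This is one common approach, not something the note mandates: the SHA-256 choice, the 10,000-bucket space, the `salt` parameter, and the function names are all assumptions. The hash deliberately excludes the config version, so a subject keeps the same bucket across versions and only the weight boundaries (which come from the config version) move.

```python
import hashlib

BUCKETS = 10_000  # assumed bucket resolution


def bucket_for(flag_id, subject_key, salt=""):
    """Map (flag, subject) to a stable bucket in [0, BUCKETS)."""
    material = f"{flag_id}:{salt}:{subject_key}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # First 8 bytes are plenty of entropy for a 10,000-way split.
    return int.from_bytes(digest[:8], "big") % BUCKETS


def assign_variant(flag_id, subject_key, cumulative_weights, salt=""):
    """cumulative_weights: ordered (variant, exclusive upper bucket) pairs
    taken from one config version; the last upper bound should equal BUCKETS."""
    bucket = bucket_for(flag_id, subject_key, salt)
    for variant, upper in cumulative_weights:
        if bucket < upper:
            return variant
    return cumulative_weights[-1][0]  # defensive fallback for short tables


weights_v1 = [("control", 5_000), ("treatment", 10_000)]
first = assign_variant("checkout_v2", "user-42", weights_v1)
second = assign_variant("checkout_v2", "user-42", weights_v1)
assert first == second  # same subject + same config version => same variant
```

Python's built-in `hash()` is avoided here because string hashing is randomized per process by default, which would break stability across evaluators.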

Step 5 — Execution Context #

| Field | Value | Why |
| --- | --- | --- |
| Topology | single service distributed | one logical flag-delivery system with many SDKs/edge nodes |
| Write coordination scope | per object scope | correctness is per flag/policy/environment/client snapshot scope |
| Read consistency target | bounded stale allowed | hot path typically uses local cached/snapshotted config |
| Holder model | none | no central lease-like ownership on hot path |
| Compensation acceptable? | No | wrong experiment treatment or feature exposure cannot be undone for correctness purposes |

Derived:

  • bounded_staleness_allowed = true
  • exclusive_claim_required = false
  • guarded_by_current_state = true

This implies:

  • authoritative control plane
  • versioned snapshot propagation to evaluators
  • local evaluation on hot path

Step 6 — Deterministic Mechanism Selection #

| Path | Write shape | Base mechanism | Required companions |
| --- | --- | --- | --- |
| P2 update flag config | overwrite current value | CAS on version | config version |
| P3 update targeting policy | overwrite current value | CAS on version | policy version |
| P4 compute effective evaluation state | overwrite current value | single writer control-plane recompute | compiled-state version |
| P5 propagate config snapshot | overwrite current value | single writer snapshot publication | config version |
| P6 exposure logging | append-only event | append log | event id or request id dedup if needed |

Hot path P1 is a read path.

Why these fit:

  • flag config and targeting are versioned current-state config
  • effective evaluation state is a recomputed current view
  • snapshots are versioned overwrite propagation
  • exposure logging, if included, is append-only
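
The "CAS on version" mechanism for P2 and P3 can be illustrated with an in-memory stand-in for the config store. The class and method names are hypothetical; a real deployment would express the same check as a conditional write in whichever store is chosen, rather than a process-local lock.

```python
import threading


class VersionConflict(Exception):
    """Raised when the caller's expected_version is no longer current."""


class ConfigStore:
    """In-memory sketch of a versioned overwrite store (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = {}  # flag_id -> (version, config)

    def get(self, flag_id):
        with self._lock:
            return self._rows.get(flag_id, (0, None))  # version 0 = absent

    def put(self, flag_id, config, expected_version):
        """Overwrite only if nobody else committed since the caller's read."""
        with self._lock:
            current_version, _ = self._rows.get(flag_id, (0, None))
            if current_version != expected_version:
                raise VersionConflict(
                    f"{flag_id}: expected v{expected_version}, found v{current_version}"
                )
            new_version = current_version + 1  # monotonic within flag_id
            self._rows[flag_id] = (new_version, config)
            return new_version


store = ConfigStore()
v1 = store.put("checkout_v2", {"enabled": True}, expected_version=0)
try:
    # A second admin who also read version 0 loses the race.
    store.put("checkout_v2", {"enabled": False}, expected_version=0)
except VersionConflict:
    pass
```

The losing writer has to re-read, reconcile, and retry with the new version, which is what keeps revisions ordered per flag.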

Step 7 — Read Model / Source of Truth #

| Concept | Truth | Read path | Rebuild path |
| --- | --- | --- | --- |
| C1 flag config | FlagConfig | read source directly | authoritative config store |
| C2 targeting policy | TargetingPolicy | read source directly | authoritative policy store |
| C3 effective evaluation state | EvaluationState | read source directly | recompute from flag config + policy |
| C4 evaluator snapshot | ConfigSnapshot | materialized view | rebuild from latest effective evaluation state |
| C5 flag/admin status | derived | materialized view | recompute from primary state |
| C6 exposure analytics | ExposureEvent if retained | append/event analytics path | replay from event stream |

Important point:

For the hot evaluation path:

  • SDK/edge/service reads local ConfigSnapshot
  • not control-plane config source on every request

That is the key control-plane/data-plane split for feature-flag delivery.
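
A minimal sketch of that hot path, assuming the snapshot is a plain dictionary already held in local memory. The snapshot layout, the `segment` context key, and the `reason` strings are hypothetical; only the response fields (enabled, variant, config_version, reason) come from this note.

```python
import hashlib


def _bucket(key, subject_key):
    digest = hashlib.sha256(f"{key}:{subject_key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def evaluate(snapshot, flag_id, context, subject_key):
    """Evaluate purely from the local snapshot; no network call is made."""
    version = snapshot["version"]

    def result(enabled, variant, reason):
        return {"enabled": enabled, "variant": variant,
                "config_version": version, "reason": reason}

    flag = snapshot["flags"].get(flag_id)
    if flag is None:
        return result(False, None, "flag_not_found")
    if not flag["enabled"]:
        return result(False, flag["default_variant"], "kill_switch")
    segments = flag["allowed_segments"]
    if segments and context.get("segment") not in segments:
        return result(False, flag["default_variant"], "not_targeted")
    # Separate hash inputs keep rollout membership independent of variant choice.
    if _bucket(flag_id + ":rollout", subject_key) >= flag["rollout_buckets"]:
        return result(False, flag["default_variant"], "outside_rollout")
    bucket = _bucket(flag_id + ":variant", subject_key)
    for variant, upper in flag["cumulative_weights"]:
        if bucket < upper:
            return result(True, variant, "rollout_match")
    return result(False, flag["default_variant"], "no_variant_matched")


snapshot = {
    "version": 12,
    "flags": {
        "checkout_v2": {
            "enabled": True,
            "default_variant": "control",
            "rollout_buckets": 10_000,  # 100% rollout
            "allowed_segments": [],     # empty means everyone
            "cumulative_weights": [("control", 5_000), ("treatment", 10_000)],
        }
    },
}
decision = evaluate(snapshot, "checkout_v2", {"segment": "beta"}, "user-42")
```

Returning `config_version` with every decision makes it possible to tell later which snapshot produced a given treatment.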

Step 8 — Failure Handling #

| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
| --- | --- | --- | --- | --- | --- |
| flag/policy update | retry with config version | stale update loses CAS | committed config survives control-plane crash if persisted | snapshot propagation may lag | n/a |
| effective-state recompute | retry safe from primary inputs | single recompute/version wins | recompute reruns after crash | snapshot propagation may lag | n/a |
| snapshot propagation | retry with versioned snapshot | older snapshot loses to newer version | SDK/edge keeps last good snapshot until refresh | failed push retried or pulled | n/a |
| flag evaluation | retries are application-level | many evaluators can serve concurrently using same snapshot version | one evaluator crash affects only local request handling | n/a | stale snapshot bounded by version/TTL refresh |
| exposure logging | retry with event id/request id | duplicate events coexist unless dedup applied | committed event survives if persisted | async analytics publication may lag | n/a |

What matters most:

  • evaluators reject older config versions
  • same subject + config version yields stable assignment
  • stale snapshots are bounded
  • exposure logging, if used, is idempotent enough for analytics
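
The first and third points can be sketched together as a small evaluator-side holder that refuses older versions and tracks how long ago it last heard from the control plane. The class name, the `max_age_seconds` bound, and the injectable clock are assumptions for illustration.

```python
import time


class SnapshotHolder:
    """Keeps the last good snapshot; illustrative, single-threaded sketch."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self._max_age = max_age_seconds
        self._clock = clock
        self._version = -1
        self._snapshot = None
        self._refreshed_at = None

    def offer(self, version, snapshot):
        """Accept a pushed or pulled snapshot unless it is older than ours."""
        if version < self._version:
            return False  # late or replayed delivery: keep the newer state
        if version > self._version:
            self._version, self._snapshot = version, snapshot
        # An equal version still confirms we are current, so refresh the timer.
        self._refreshed_at = self._clock()
        return True

    def current(self):
        if self._snapshot is None:
            raise LookupError("no snapshot received yet")
        return self._version, self._snapshot

    def is_stale(self):
        """True once the propagation bound has passed without a refresh."""
        if self._refreshed_at is None:
            return True
        return self._clock() - self._refreshed_at > self._max_age
```

What an evaluator does when `is_stale()` turns true (keep serving the last good snapshot, fall back to defaults, or force a pull) is a policy decision this sketch leaves open.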

Step 9 — Scale Adjustments #

| Hotspot | Type | First response |
| --- | --- | --- |
| very high evaluation QPS | read hotspot | push evaluation fully local to SDK/edge and keep snapshots compact |
| config churn from frequent flag updates | fan-out hotspot | batch updates and publish incremental snapshots |
| large targeting rule sets | read hotspot | compile rules into efficient evaluation structures and shard config by environment/app |
| huge client fleet reconnects | fan-out hotspot | backoff reconnects and support pull-on-version-miss |
| exposure-event volume | write throughput hotspot | keep exposure logging asynchronous and separate from evaluation hot path |
| admin/status reads | read hotspot | derived views only |

What scales well:

  • hot path is local evaluation from snapshot
  • control plane is narrow and versioned
  • propagation is incremental

What fails first:

  • config fanout storms
  • large rule trees
  • exposure analytics overload if tied too closely to the hot path
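
For the fan-out storm case, the "backoff reconnects" response is commonly implemented as exponential backoff with full jitter. The sketch below assumes that variant; the base delay, the cap, and the function name are illustrative values rather than recommendations from this note.

```python
import random


def reconnect_delay(attempt, base=0.5, cap=60.0, rng=random.random):
    """Seconds to wait before reconnect attempt number `attempt` (0-based).

    The ceiling doubles per attempt up to `cap`; the actual delay is drawn
    uniformly below the ceiling so a large fleet does not reconnect in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


# Illustrative schedule for the first few attempts of one client.
delays = [reconnect_delay(n) for n in range(5)]
```

Jitter matters more than the exact growth rate here: without it, every client that lost its stream at the same moment retries at the same moment.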

Canonical design conclusion:

  • archetype: Control Plane + Data Plane
  • primary truth:
    • FlagConfig
    • TargetingPolicy
    • EvaluationState
    • ConfigSnapshot
  • hot path:
    • local snapshot read + deterministic evaluation/bucketing
  • control plane:
    • authoritative config + compiled effective state + snapshot publication

Concrete Substrate #

  • control plane in Go/Java
  • authoritative config store in etcd, Postgres, or another strongly consistent config DB
  • config distribution via watch streams / streaming SDK updates / CDN-backed snapshot pull
  • local evaluators in SDKs, edge nodes, or a low-latency gateway service
  • optional exposure event pipeline via Kafka/PubSub/ClickHouse path

Operation Layer #

  1. Evaluate(flag_id, context, subject_key)
     • entry point: SDK / edge evaluator / local client library
     • authoritative decider: local ConfigSnapshot
     • transition: none on source truth
     • response: {enabled, variant, config_version, reason}
  2. PutFlagConfig(flag_id, config, expected_version?)
     • entry point: control-plane API
     • authoritative decider: config store owner
     • transition: overwrite FlagConfig
  3. PutTargetingPolicy(scope, config, expected_version?)
     • entry point: control-plane API
     • authoritative decider: policy store owner
     • transition: overwrite TargetingPolicy
  4. internal recompute
     • recompute EvaluationState from flag config + policy
  5. snapshot propagation
     • publish latest ConfigSnapshot(version) to SDKs/edge nodes
  6. RecordExposure(flag_id, subject_key, variant, config_version, request_id?)
     • entry point: async event pipeline
     • authoritative decider: analytics/event pipeline, not hot-path evaluator
     • transition: append ExposureEvent
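
The RecordExposure operation can be sketched as an append-only log with optional deduplication. This in-memory class is a hypothetical stand-in for the analytics pipeline; the dedup key of (flag_id, subject_key, request_id) mirrors the ExposureEvent scope from Step 3, and events without a request_id are appended without dedup.

```python
class ExposureLog:
    """Append-only exposure log with optional request-id dedup (sketch)."""

    def __init__(self):
        self._events = []
        self._seen = set()

    def record(self, flag_id, subject_key, variant, config_version, request_id=None):
        """Append one event; return False if it was a duplicate retry."""
        if request_id is not None:
            key = (flag_id, subject_key, request_id)
            if key in self._seen:
                return False  # retried delivery of an already-recorded event
            self._seen.add(key)
        self._events.append({
            "flag_id": flag_id,
            "subject_key": subject_key,
            "variant": variant,
            "config_version": config_version,
            "request_id": request_id,
        })
        return True

    def events(self):
        return list(self._events)  # copy so callers cannot mutate the log


log = ExposureLog()
log.record("checkout_v2", "user-42", "treatment", 12, request_id="req-1")
log.record("checkout_v2", "user-42", "treatment", 12, request_id="req-1")  # ignored
```

An unbounded `_seen` set is only workable in a sketch; a real pipeline would bound the dedup window or deduplicate downstream.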

Entry Point vs Decider vs Responder #

| Path | Entry point | Authoritative decider | Physical responder | Logical responder |
| --- | --- | --- | --- | --- |
| evaluate flag | SDK / local evaluator / edge node | local config snapshot | local evaluator | feature-flag service |
| flag/policy update | control-plane API | config/policy store owner | control-plane node | feature-flag service |
| effective-state recompute | control plane | recompute worker | internal | feature-flag service |
| snapshot propagation | SDK / edge node / control plane | snapshot publisher | control/data-plane | feature-flag service |
| exposure logging | async ingestion endpoint | event pipeline | ingestion node | feature-flag analytics subsystem |

Concrete HLD #

Main components:

  • control-plane API
  • flag + policy state store
  • effective-state compiler/recompute worker
  • snapshot distribution layer
  • SDK / edge evaluators
  • optional exposure analytics pipeline

Short interview version #

“I’d design the feature-flag and A/B testing system as a control-plane/data-plane setup. The control plane stores flag definitions and targeting rules, compiles them into an effective evaluation state, and publishes versioned snapshots to SDKs or edge evaluators. The hot path does not query the control plane; it evaluates locally from the snapshot and uses deterministic bucketing so the same user gets the same variant for a given config version. Exposure logging is asynchronous and separate from the evaluation hot path.”