
Archetype 10: Frontier + Claimable Run


What this archetype is

The system tracks a frontier of not-yet-covered work, claims slices of that frontier, and checkpoints covered progress.

Examples: crawler, migration scanner, batch backfill.

We will use a URL crawl frontier as the running example.


Layer 1: Entities and Postgres table design

Entities and the tables that back them:

  • FrontierState: crawl_frontier
  • BatchRunState: crawl_batch_runs
  • CheckpointState: crawl_checkpoints

create table crawl_frontier (
  frontier_id bigserial primary key,
  partition_id int not null,
  item_key text not null,
  status text not null default 'DISCOVERED',
  discovered_at timestamptz not null default now(),
  unique (partition_id, item_key)
);

create table crawl_batch_runs (
  batch_id uuid primary key,
  partition_id int not null,
  claimed_by text not null,
  lease_expires_at timestamptz not null,
  status text not null default 'CLAIMED',
  created_at timestamptz not null default now()
);

create table crawl_checkpoints (
  partition_id int primary key,
  last_safe_key text,
  updated_at timestamptz not null default now()
);
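
The schema keys every table by partition_id but does not say how an item is assigned to a partition. One possibility, sketched below as an assumption rather than the author's method, is a stable hash of item_key. The names NUM_PARTITIONS and partition_for are hypothetical.

```python
import zlib

NUM_PARTITIONS = 16  # hypothetical value; the schema does not fix a partition count


def partition_for(item_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an item_key to a partition_id in [0, num_partitions)."""
    # zlib.crc32 is deterministic across processes, unlike the built-in hash(),
    # which is salted per process for strings by default.
    return zlib.crc32(item_key.encode("utf-8")) % num_partitions
```

Because the mapping depends only on the key, every producer computes the same partition_id for the same URL, so the unique (partition_id, item_key) constraint also deduplicates rediscovered URLs.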

Layer 2: Write path mechanics

Claim work

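-- $1: partition to claim from.
-- skip locked makes concurrent claimers pass over rows another transaction has already locked.
select frontier_id, item_key
from crawl_frontier
where partition_id = $1
  and status = 'DISCOVERED'
order by frontier_id
limit 100
for update skip locked;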

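Then mark the rows claimed in the same transaction, before the row locks taken by the select are released at commit:

update crawl_frontier
set status = 'CLAIMED'
where frontier_id = any($2)
  and status = 'DISCOVERED';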
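
To illustrate why the select and the status update have to happen as one atomic step, here is a small in-memory sketch. It is a toy model, not the Postgres behavior itself: a plain lock stands in for the transaction and serializes claimers, whereas skip locked would let them proceed past locked rows. InMemoryFrontier and run_workers are hypothetical names.

```python
import threading
from dataclasses import dataclass


@dataclass
class FrontierRow:
    frontier_id: int
    item_key: str
    status: str = "DISCOVERED"


class InMemoryFrontier:
    """Toy stand-in for crawl_frontier; the lock plays the role of the transaction."""

    def __init__(self, keys):
        self._rows = [FrontierRow(i + 1, k) for i, k in enumerate(keys)]
        self._lock = threading.Lock()

    def claim(self, limit):
        # Selecting and flipping the status happen under one lock,
        # so no other claimer can observe the rows in between.
        with self._lock:
            picked = [r for r in self._rows if r.status == "DISCOVERED"][:limit]
            for row in picked:
                row.status = "CLAIMED"
            return [row.frontier_id for row in picked]


def run_workers(frontier, workers, limit):
    """Drain the frontier with several threads and return each worker's claims."""
    results = [[] for _ in range(workers)]

    def work(i):
        while True:
            batch = frontier.claim(limit)
            if not batch:
                return
            results[i].extend(batch)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the select and the update under one lock, every frontier_id ends up in exactly one worker's result list. Splitting claim into two separately locked steps would let two workers pick the same rows.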

Advance checkpoint

insert into crawl_checkpoints (partition_id, last_safe_key)
values ($1, $2)
on conflict (partition_id) do update
set last_safe_key = excluded.last_safe_key,
    updated_at = now();
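
The upsert stores whatever key it is given, so the caller has to decide which key is actually safe. A minimal sketch of one way to choose it follows, assuming the partition's items have a known scan order and that "safe" means every item up to and including that key is done. safe_checkpoint and its parameters are hypothetical.

```python
def safe_checkpoint(ordered_keys, done, current=None):
    """Return the last key of the longest fully-done prefix, never moving backward.

    ordered_keys: item keys in the partition's scan order (assumed to be a total order).
    done: set of keys whose work has completed.
    current: the existing last_safe_key, or None if there is no checkpoint yet.
    """
    start = 0
    if current is not None:
        # Raises ValueError if the stored checkpoint is not a known key.
        start = ordered_keys.index(current) + 1
    last_safe = current
    for key in ordered_keys[start:]:
        if key not in done:
            break  # first gap: nothing past this point is safe yet
        last_safe = key
    return last_safe
```

For example, with keys a, b, c, d and only a, b, d done, the result is b: d has finished, but advancing past the unfinished c would skip it after a restart.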

Layer 3: Fault tolerance

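Failure modes to guard against:

  • frontier advanced too far: the checkpoint moves past items that have not finished
  • same frontier item claimed twice
  • progress not advanced after success: completed work is repeated after a restart
  • uncovered work skipped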
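
Duplicate claims and replays after a lost checkpoint both mean the same item can reach a worker more than once. One guard, sketched here under the assumption of a done set keyed by (partition_id, item_key), is to skip items that are already recorded. DoneSet and process_batch are hypothetical names, and the in-memory set stands in for a table.

```python
class DoneSet:
    """Toy stand-in for a dedup table keyed by (partition_id, item_key)."""

    def __init__(self):
        self._done = set()

    def contains(self, partition_id, item_key):
        return (partition_id, item_key) in self._done

    def add(self, partition_id, item_key):
        self._done.add((partition_id, item_key))


def process_batch(items, done, handler):
    """Run handler once per item not yet in the done set; return the keys handled."""
    ran = []
    for partition_id, item_key in items:
        if done.contains(partition_id, item_key):
            continue  # replayed or doubly claimed item: skip it
        handler(item_key)
        # A crash between handler() and add() still replays this one item,
        # so the handler should itself be idempotent.
        done.add(partition_id, item_key)
        ran.append(item_key)
    return ran
```

Replaying a batch then only touches the items that were not recorded the first time.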

Layer 4: Scale

Default hotspots:

  • frontier hot row / hot partition
  • batch-claim bursts
  • skewed range/work distribution
  • checkpoint lag

Common mitigations:

  • partition the frontier aggressively
  • lease expiry and reclamation
  • done sets / dedup tables for replay safety
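
Lease expiry and reclamation can be sketched against the crawl_batch_runs columns as follows. This is an illustration under assumptions: the EXPIRED status value and the reclaim_expired and renew helpers are hypothetical, and returning the batch's items to DISCOVERED would additionally need a link from frontier rows to their batch, which the schema above does not define.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class BatchRun:
    batch_id: str
    partition_id: int
    claimed_by: str
    lease_expires_at: datetime
    status: str = "CLAIMED"


def reclaim_expired(runs, now):
    """Mark CLAIMED runs whose lease has passed as EXPIRED and return their ids."""
    reclaimed = []
    for run in runs:
        if run.status == "CLAIMED" and run.lease_expires_at <= now:
            run.status = "EXPIRED"  # hypothetical status value
            reclaimed.append(run.batch_id)
    return reclaimed


def renew(run, now, ttl):
    """Extend a live lease; refuse if the run is no longer held or already expired."""
    if run.status != "CLAIMED" or run.lease_expires_at <= now:
        return False
    run.lease_expires_at = now + ttl
    return True
```

A worker that is still alive calls renew before its lease runs out. A worker that stalls past lease_expires_at loses the batch to reclaim_expired, and its later renew call fails, which tells it to stop writing results for that batch.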