Quickwit Internals: A Substrate Decomposition
Quickwit is a cloud-native search engine built for log and trace analytics. It indexes documents into immutable splits stored on object storage (S3, GCS, Azure Blob) and answers queries by scatter-gathering across nodes. Its architecture is unusual in three ways: correctness does not depend on a persistent local disk, splits are never mutated after upload, and cluster membership is handled by a custom gossip protocol (Chitchat) rather than ZooKeeper or etcd.
This series decomposes Quickwit into its constituent substrates, grounded in the source at quickwit/quickwit/.
Substrate Map
| Substrate | Primary Crate / Module | Role |
|---|---|---|
| Actor Framework | quickwit-actors | Supervised async actors: mailboxes, backpressure, health monitoring |
| Ingest | quickwit-ingest / Ingester | Write path: WAL (mrecordlog), shard management, replication |
| Indexing Pipeline | quickwit-indexing / actors | Eight-actor chain: SourceActor → DocProcessor → Indexer → IndexSerializer → Packager → Uploader → Sequencer → Publisher |
| Split Storage | quickwit-storage / BundleStorage | Immutable split bundles on object storage with hotcache |
| Search Root | quickwit-search / root.rs | Scatter-gather: job placement via Rendezvous hashing, merge collection |
| Search Leaf | quickwit-search / leaf.rs | Per-split search: footer cache, Tantivy searcher, warmup |
| Cluster/Membership | quickwit-cluster / Chitchat | Gossip-based membership, failure detection, service discovery |
| Metastore | quickwit-metastore | Split lifecycle (Staged → Published → MarkedForDeletion), checkpoints |
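The Search Root row mentions job placement via Rendezvous hashing. As a rough illustration of the general technique (not Quickwit's actual `SearchJobPlacer` code), the sketch below scores every (node, split) pair and picks the highest-scoring node. The function names `affinity` and `pick_node`, the node labels, and the use of the standard library's `DefaultHasher` are all assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical scoring function: hashes the (node, split) pair together.
/// A higher score means a stronger affinity between the node and the split.
fn affinity(node: &str, split_id: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    node.hash(&mut hasher);
    split_id.hash(&mut hasher);
    hasher.finish()
}

/// Rendezvous (highest-random-weight) selection: every node is scored
/// against the split, and the node with the highest score wins.
/// Returns `None` when the node list is empty.
fn pick_node<'a>(nodes: &[&'a str], split_id: &str) -> Option<&'a str> {
    nodes
        .iter()
        .copied()
        .max_by_key(|node| affinity(node, split_id))
}
```

The property that makes this attractive for cache affinity is that removing a node only moves the splits that node was serving; every other split keeps its previous winner, so footer and hotcache entries on the surviving nodes stay useful.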
Chapter List
- The Actor Framework Substrate: quickwit-actors (Actor and Handler traits, Mailbox priority channels, ActorContext, KillSwitch, supervision and health monitoring).
- Ingest Substrate: Ingester, mrecordlog WAL, shard lifecycle, IngesterState, replication factor, persist request flow.
- Indexing Pipeline Substrate: the eight-actor chain, IndexingPipeline supervision loop, CommitTrigger, PublishLock, Sequencer ordering guarantee.
- Split Storage Substrate: SplitPayloadBuilder, BundleStorageFileOffsets, split bundle format, hotcache layout, upload semaphore.
- Search Root Substrate: SearchJob, SearchJobPlacer Rendezvous hashing, assign_jobs LPT algorithm, make_merge_collector, scatter-gather fan-out.
- Search Leaf Substrate: open_split_bundle, get_split_footer_from_cache_or_fetch, MemorySizedCache, HotDirectory, Tantivy warmup phase.
- Cluster/Membership Substrate: Chitchat gossip protocol, ChitchatConfig, FailureDetectorConfig Phi accrual, ClusterMember key-value node state, gRPC catchup.
- Metastore Substrate: MetastoreService trait, split lifecycle state machine, IndexCheckpointDelta, Janitor service, PostgreSQL and file backends.
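The Cluster/Membership chapter refers to Phi accrual failure detection. As a minimal sketch of the idea, assuming heartbeat inter-arrival times follow an exponential distribution (a common simplification, not necessarily what Chitchat's `FailureDetectorConfig` implements), the suspicion level phi can be computed as below. The `HeartbeatWindow` type and its fields are hypothetical names introduced only for this example.

```rust
/// Hypothetical record of recent heartbeat inter-arrival times, in seconds.
struct HeartbeatWindow {
    intervals: Vec<f64>,
}

impl HeartbeatWindow {
    /// Mean inter-arrival time, or `None` if no heartbeat has been seen yet.
    fn mean(&self) -> Option<f64> {
        if self.intervals.is_empty() {
            return None;
        }
        Some(self.intervals.iter().sum::<f64>() / self.intervals.len() as f64)
    }

    /// phi = -log10(P(next heartbeat arrives later than `elapsed`)).
    /// Under the exponential assumption, P = exp(-elapsed / mean),
    /// so phi simplifies to elapsed / (mean * ln 10).
    fn phi(&self, elapsed: f64) -> Option<f64> {
        let mean = self.mean()?;
        if mean <= 0.0 {
            return None;
        }
        Some(elapsed / (mean * std::f64::consts::LN_10))
    }
}
```

A node is suspected dead once phi exceeds a configured threshold. Because phi grows continuously with the silence duration relative to the observed heartbeat rhythm, the detector adapts to slow or jittery links instead of relying on a single fixed timeout.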