Sal is a Lean 4 framework for verifying state-based CRDTs and Mergeable Replicated Data Types (MRDTs) under replication-aware (RA) linearizability. Rather than discharging every verification condition with a single backend, Sal stages automation by trustworthiness: it first attempts kernel-verified proof reconstruction, falls back to an SMT backend only when that fails, and finally hands remaining obligations to an AI-assisted interactive theorem prover — all while keeping the trusted computing base (TCB) as small as the VC allows. When a VC is in fact invalid, Sal uses Plausible (property-based testing) together with a ProofWidgets-based visualizer to turn the failure into an inspectable counterexample execution trace.
The approach builds on the F★-based Neem framework of Soundarapandian, Nagar, Rastogi, and Sivaramakrishnan (OOPSLA 2025), which reduces RA-linearizability for a data type to a fixed set of VCs over do, merge, and rc. Sal is the Lean port of that reduction plus a multi-modal tactic, counterexample pipeline, and custom decidable set/map interfaces that make grind effective on RDT goals.
The suite currently contains 28 RDTs — 17 CRDTs and 11 MRDTs — with all 24 RA-linearizability VCs closed on every properly-verified one (one MRDT, Enable_Wins_Flag_known_broken, is an intentionally-buggy demo fixture and is excluded from the "all VCs closed" count). That's 648 VCs for state convergence (27 × 24), of which the vast majority are kernel-checked (no TCB-enlarging admits) and a small residue of stage-2 Blaster-admits remain in a few files (OR_Set_MRDT, OR_Set_Efficient_MRDT, Add_Win_Priority_Queue_MRDT, Multi_Valued_Register_MRDT) — validated by Z3 via the sal tactic's SMT stage but not yet kernel-reconstructed.
Several Tier-C RDTs — those whose 24 VCs prove union-commutativity on grow-only state but leave the user-visible semantics to the read-side projection — additionally carry read-side definitions and intent-preservation theorems: Peritext, RGA, Add_Win_Priority_Queue, OR_Set (plain and efficient), and Multi_Valued_Register. Each comes with a *_ReadSide.lean companion alongside its *_CRDT.lean / *_MRDT.lean, and a per-RDT crosswalk to the source paper:
docs/peritext-vs-paper.md— Litt et al. CSCW 2022, Ex 1 / 2 / 3 / 5 / 7 / 8 intent-preservation.docs/rga-vs-paper.md— Roh et al. JPDC 2011, causal-order preservation, tombstone monotonicity, deterministic concurrent-insert tiebreak.docs/aw-crpq-vs-paper.md— Zhang et al. Internetware 2023, Add-wins headline + LWW innate + acquired-Σ +get_max.docs/or-set-vs-paper.md— Shapiro et al. INRIA RR-7506, Add-Wins on lookup with sequential-Add-then-Remove extinguishment.docs/mvr-vs-paper.md— Shapiro et al. INRIA RR-7506 §3.2.2, classical replace-on-write semantics with concurrent-writes-both-survive + sequential-writes-supersede.
Everything is checked on Lean v4.28.0 against the chore-bump-lean-4.28 branch of a Blaster fork.
CRDTs (Sal/CRDTs/)
Registers & counters:
Increment_Only_CounterPN_CounterLWW_RegisterMAX_RegisterMIN_RegisterMulti_Valued_Register— classical replace-on-write MVR via the snapshot-in-op-payload trick (Shapiro et al. INRIA RR-7506 §3.2.2). State(writes, removed); concurrent writes survive via additive merge, sequential writes overwrite via the snapshot. + read-side:is_visible_value,concurrent_writes_both_visible,sequential_write_supersedes. Seedocs/mvr-vs-paper.md.Bounded_Counter— Sypytkowski 2019 / Balegas et al. 2015. PN-counter plus a sparse per-replica-pairtransfersmap for quota redistribution.
Sets & maps:
OR_Set— Shapiro et al. INRIA RR-7506. + read-side:lookup,add_wins_over_concurrent_remove,add_then_remove_extinguishes. Seedocs/or-set-vs-paper.md.Grow_Only_SetGrow_Only_MultisetLWW_Element_SetLWW_MapMAX_Map
Sequences & structured data:
RGA— Replicated Growable Array, the sequence CRDT underlying Automerge / Yjs, in its state-based formulation as a grow-onlyMap OpId (char, afterId, deleted). + read-side:visible_ltfour-rule DFS-traversal predicate,causal_order_visible_lt,tombstone_monotone_under_remove,concurrent_insert_tiebreak_deterministic. Seedocs/rga-vs-paper.md.Peritext— Litt et al. CSCW 2022. Rich text = RGA + formatting marks represented as a flatset AnchorAttachment. + read-side: paper Ex 1 / 2 / 3 / 5 / 7 / 8 intent-preservation theorems. Seedocs/peritext-vs-paper.md.Shopping_CartAdd_Win_Priority_Queue— adapted from Zhang et al. 2023. + read-side:lookup,add_wins_over_concurrent_rmv, LWW innate, MCW-collapsed-to-Σ acquired,get_max,inc_increases_acquired. Seedocs/aw-crpq-vs-paper.md.
MRDTs (Sal/MRDTs/)
Increment_Only_CounterPN_CounterMulti_Valued_Register— classical MVR with the same(writes, removed)shape as the CRDT but standard three-way merge per component. + read-side (mirrors the CRDT side). Seedocs/mvr-vs-paper.md.Grow_Only_SetGrow_Only_MapOR_Set— + read-side (mirrors the CRDT side via three-way merge). Seedocs/or-set-vs-paper.md.OR_Set_Efficient— compressed variant with(rid, ts, elem)triples and per-replica deduplication. + read-side with the same headline theorems on the triple representation.Replicated_Growable_Array— + read-side with relationalreadSeq_visibleand the three RGA intent theorems. Seedocs/rga-vs-paper.md.Add_Win_Priority_Queue— adapted from Zhang et al. 2023. Drops the CRDT's tombstone set since the LCA handles Add-Wins directly, leavingset (add_ts, elem, value) × set (inc_ts, elem, amount). + read-side. Seedocs/aw-crpq-vs-paper.md.Peritext— Litt et al. CSCW 2022. RGA substrate plusset AnchorAttachment. RGA's tombstones are structurally load-bearing (later inserts reference earlier char ids), so this MRDT keeps all components grow-only and uses pointwise-union merge, mirroringRGA_MRDT. + read-side. Seedocs/peritext-vs-paper.md.Enable_Wins_Flag— enable-wins boolean flag, per-replica map of(counter, flag).Enable_Wins_Flag_known_broken— intentionally buggy variant preserved as a demo fixture. Drives the Plausible counterexample demo; the bug manifests oninter_right_1op.
The tactic lives in Sal/Tactic/Sal.lean and tries three strategies in order, stopping at the first one that succeeds:
dsimp+grind— lightweight SMT-style automation with proof reconstruction; the result is a kernel-checkable proof term. This stage is preferred because it does not enlarge the TCB. We deliberately skipaesopat this stage because its verification times on these RDT goals were prohibitive.lean-blaster— encodes the goal to Z3. More powerful (especially for higher-order functions and lambdas) but sacrifices proof reconstruction, so the TCB grows to include the SMT solver. Invoked with a wall-clock timeout (default 30 s) so a stuck goal cannot hang.dsimp+aesop+all_goals (try grind)— a broader proof-search fallback. Remaining goals are then typically closed interactively with tactics produced by Harmonic's Aristotle, whose outputs are kernel-checked so the TCB is still not enlarged.
Stages 1 and 3 are guarded against sorryAx in the resulting proof term: aesop's default mode can otherwise close a goal with a silent sorry placeholder. Stage 2 is intentionally not guarded, because Blaster trusts Z3's "valid" verdict via MVarId.admit — that is Blaster's TCB-enlarging mechanism.
The tactic takes a heartbeat budget (default 400 000) that caps Lean-side elaboration across all three stages, and a separate smtTimeoutSec budget (default 30 s) for the SMT stage. See the docstring in Sal.lean for how to tune them.
Minimal example (see Sal/Tactic/SalExample.lean for more):
import Sal.Tactic.Sal
example (a b : Nat) : a + b = b + a := by salMany post-paper CRDTs in the suite are proved with a uniform kernel-verifiable pattern — rcases over the operation family, refine ⟨?_, …, ?_⟩ to split the state's components, then simp +decide [*] and grind — and avoid Blaster entirely. A few stubborn VCs still need Blaster or an Aristotle-assisted intermediate lemma (e.g. LWW_Map_CRDT uses merge_do_lex_max); those calls stay inside the three-stage pipeline. For the full recipe of translating an op-based CRDT into Sal's state-based signature and closing the 24 VCs, see docs/porting-op-based-crdts.md.
Lean's standard Set α := α → Prop is convenient for hand-written proofs but fights automation (membership is a proposition, not a Boolean). Sal introduces decidable versions where grind can compute:
abbrev set (a : Type) [DecidableEq a] := a → Bool
structure map (key : Type) [DecidableEq key] (value : Type) where
mappings : key → value
domain : set keyEvery lemma on these types is annotated with @[simp, grind] and (where useful) a grind_pattern, building a domain-specific rewrite database that lets grind discharge set/map goals without SMT assistance. Files live in Sal/Interfaces/.
The Boolean-predicate representation is grind-friendly only as a top-level state component. Nesting a set inside a map value — e.g. map K (set V) — forces grind to prove function equality via funext and typically defeats it. Where possible, flatten: represent map K (set V) as set (K × V). The Peritext CRDT is the clearest example in the suite: its formatting marks were originally a map (OpId × Bool) (set MarkOp), which left 5 of 24 VCs stuck; flattening to a top-level set AnchorAttachment closes all 24.
When automation fails because the implementation is actually wrong, Sal makes the failure inspectable rather than opaque:
- Plausible generates concrete counterexamples for decidable VCs. The canonical demo is the Enable-Wins Flag MRDT, which contains a known bug from prior work (the
inter_right_1opVC fails); Plausible rediscovers a minimal failing execution automatically. SeeSal/Counterexample_Visualization/WriterMonad_Enable_Wins_Flag.lean. - ProofWidgets trace visualizer. A logging-style writer monad instruments
doandmergeto record intermediate states, and a ProofWidgets component renders the LCA, left branch, right branch, and merge result as a vertical diagram. SeeSal/Counterexample_Visualization/. - Universe tracking for functional sets. Since Sal's
settype is aα → Boolpredicate and may be infinite, the visualizer augments abstract sets with a concreteHashSetof elements added or removed during execution, so the state can be displayed concretely.
Clone this repository, then install elan (the Lean toolchain manager). elan will read lean-toolchain and install Lean v4.28.0 on first use. From the repo root, run lake exe cache get to download the prebuilt Mathlib oleans — this takes a few minutes and is required before any file will type-check in a reasonable time.
lake update is safe to run — lakefile.toml pins mathlib to v4.28.0 and pins Blaster to the chore-bump-lean-4.28 branch of kayceesrk/Lean-blaster, a fork whose upstream (input-output-hk/Lean-blaster) does not yet have a v4.28-compatible branch. Once the upstream catches up we will switch back.
Open each Lean file in VS Code to run the verification conditions interactively, or run lake lean <path-to-file.lean> from the command line. The run_files.sh script checks every .lean file under a given directory.
Browser demos for every RDT in the suite live in demos/ (Vite + React + TypeScript, one hand-ported module per RDT). All 29 RDTs have a playground — 18 CRDTs + 10 MRDTs.
- CRDT playgrounds spin up three replicas, let you apply local ops per replica, and merge any pair directionally (source → target, like
git merge). There's also a "Merge all" button that folds every replica's state into a single join and assigns it back, so you can watch replicas snap to the same value in one click. Toggle Show concrete state to expose the lattice representation. - MRDT playgrounds organise history like git. Every local op creates a 1-parent commit; every merge creates a 2-parent commit with the LCA computed from the DAG (BFS on ancestors). A toggleable SVG history graph shows per-replica lanes with colour-coded commits (op commits solid, merge commits dashed, HEADs ringed thicker); click any past commit to inspect its full state.
- Lattice-law invariants are property-checked with fast-check per RDT: CRDTs get idempotence / commutativity / associativity / strong convergence; MRDTs get left identity / right identity / commutativity (the MRDT
merge(l,a,a) ~ aanalog is NOT a law — the closed-form counter MRDT violates it by design; MRDTs only promise convergence given a coherent history DAG, which the playground supplies at runtime).
cd demos
npm install
npm run dev # http://localhost:5173
npm run test # 28 fast-check suites, ~1.5 s
npm run build # TS + Vite production bundleSee demos/README.md for the CRDTSpec / MRDTSpec interfaces, file layout, and deployment. .github/workflows/demos-deploy.yml publishes to GitHub Pages on every push to main that touches demos/**.
Sal/Interfaces/— Sal's decidablesetandmapinterfaces (Set_Extended,Map_Extended,Map_Extended_With_Lean_Set).Sal/Tactic/— thesaltactic (Sal.lean) and usage examples (SalExample.lean).Sal/CRDTs/— 18 state-based CRDTs in the⟨Σ, σ₀, do, merge, rc⟩signature.Sal/MRDTs/— 11 state-based MRDTs.Sal/Counterexample_Visualization/— theWriterMonad_*.leanlogging-monad traces that feed the ProofWidgets visualizer.demos/— Vite + React + TypeScript playgrounds, one per RDT. CRDT demos do two-way merge; MRDT demos maintain a git-style commit DAG with LCA-driven three-way merge and a toggleable history visualisation.docs/porting-op-based-crdts.md— recipe for porting a new op-based CRDT into Sal's state-based signature.
Pranav Ramesh, Vimala Soundarapandian, KC Sivaramakrishnan. Sal: Multi-modal Verification of Replicated Data Types.
The paper evaluates Sal on 13 RDTs (4 CRDTs + 9 MRDTs), 312 VCs total (24 per RDT). The breakdown across the three stages (DG = dsimp + grind; LB = lean-blaster; ITP = Aristotle-assisted interactive proof):
- 215 VCs (68.9%) via DG — TCB not enlarged.
- 87 VCs (27.9%) via LB — TCB enlarged to include Z3.
- 9 VCs (3.0%) via ITP, all closed using Aristotle, whose outputs are kernel-checked — TCB not enlarged.
| RDT | DG | LB | ITP |
|---|---|---|---|
| Increment-only counter MRDT | 24 | 0 | 0 |
| PN-counter MRDT | 24 | 0 | 0 |
| OR-set MRDT | 3 | 21 | 0 |
| Enable-Wins Flag MRDT | 9 | 14 | 0 |
| Efficient OR-Set MRDT | 2 | 22 | 0 |
| Grows-only set MRDT | 24 | 0 | 0 |
| Grows-only map MRDT | 22 | 0 | 2 |
| Replicated Growable Array MRDT | 15 | 9 | 0 |
| Multi-valued Register MRDT | 24 | 0 | 0 |
| Increment-only counter CRDT | 24 | 0 | 0 |
| PN-counter CRDT | 16 | 2 | 6 |
| Multi-Valued Register CRDT | 24 | 0 | 0 |
| OR-set CRDT | 4 | 19 | 1 |
Two patterns from the paper: MRDTs generally need less SMT than CRDTs because three-way merges express causality directly, and map-based reasoning stresses current Lean automation more than set-based reasoning (the 8 remaining ITP goals are concentrated in map-heavy RDTs).
Since publication the suite has grown from 13 to 29 RDTs, and most of the added CRDTs were proved with the uniform kernel-verifiable pattern described under The sal tactic, without invoking Blaster for the majority of VCs. A refreshed DG/LB/ITP breakdown across all 29 RDTs is future work. The papoc2026 branch snapshots the repo at the paper-artifact state.
MIT; see LICENSE.