Add opt-in Thompson Sampling for relay scoring by alltheseas · Pull Request #53 · coracle-social/welshman

alltheseas · 2026-03-05T16:30:10Z

Summary

Add sampleBeta(alpha, beta, rng?) to @welshman/lib — Beta-distributed sampling for Thompson bandit relay selection
Add getRelayPrior option to RouterOptions — when provided, scoreRelay uses Beta sampling instead of Math.random(), biasing toward relays with better delivery history
Extend RelayStats with optional alpha/beta/last_delivery_update fields (backwards-compatible with existing IndexedDB data)
Add getRelayPrior(url) and recordRelayDelivery(url, delivered, expected) to @welshman/app relay stats
Wire getRelayPrior into routerContext alongside getRelayQuality

Motivation

Welshman's Router scores relays with quality * (1 + log(weight)) * Math.random(). The Math.random() factor is stateless — it never learns which relays actually deliver events.

Benchmarks in nostrability/outbox show that replacing Math.random() with sampleBeta(alpha, beta) (Thompson Sampling) improves 1-year event recall by 9 percentage points (~30% → ~39%, 6-profile mean, 10-run validated) after 3–5 learning sessions. The structure of the scoring formula is unchanged; only the random factor is replaced. See the benchmark results and methodology for details.

Design decisions

Opt-in: the Beta path is used only when getRelayPrior is provided; when no priors exist, scoreRelay behaves identically to the current Math.random() scoring
Per-relay, not per-pubkey-per-relay: scoreRelay doesn't know pubkey context, so priors are global ("is this relay reliable?"). This is simpler and captures the dominant signal.
Push API for delivery feedback: recordRelayDelivery is intentionally a push API — callers (e.g. Coracle) must invoke it after observing delivery outcomes (e.g. after EOSE). The Router shouldn't be opinionated about when "delivery" is measured.
Time-based decay: exponential decay (0.95/hour) on stored priors prevents ossification without requiring a "session" concept. Decay is applied on both read and write to avoid stale priors snapping back after idle periods. (The decay rate is a design choice, not a benchmarked parameter — the benchmarks used discrete sessions without decay.)
Backwards-compatible: alpha, beta, last_delivery_update are optional fields. Existing serialized RelayStats deserialize fine without them.
Corrupted data self-heals: NaN/Infinity/negative stored values are sanitized to 1 (uniform) so relays re-enter Thompson learning after upgrade or data corruption.
No latency discount: kept out to reduce scope; can follow up.
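To make the decay and self-healing decisions above concrete, here is a minimal sketch of how a stored prior might be sanitized and decayed. It is not the PR's code. The names StoredPrior, sanitize, and the nowSec parameter are hypothetical, timestamps are assumed to be in seconds, and the choice to shrink evidence toward the uniform prior Beta(1, 1) is an assumption; only the 0.95/hour rate and the reset-to-1 behavior come from the description.

```typescript
// Illustrative sketch only; names, units, and the exact decay form are assumptions.
const DECAY_PER_HOUR = 0.95 // rate stated in the PR description
const SECONDS_PER_HOUR = 3600

type StoredPrior = {alpha?: number; beta?: number; last_delivery_update?: number}
type Prior = {alpha: number; beta: number}

// NaN, Infinity, non-positive, or missing values reset to 1 (the uniform prior).
function sanitize(x: number | undefined): number {
  return typeof x === "number" && Number.isFinite(x) && x > 0 ? x : 1
}

function decayPrior(stored: StoredPrior, nowSec: number): Prior {
  const alpha = sanitize(stored.alpha)
  const beta = sanitize(stored.beta)
  const last = stored.last_delivery_update
  // A missing or invalid timestamp is treated as "no time elapsed".
  const elapsedSec =
    typeof last === "number" && Number.isFinite(last) ? Math.max(0, nowSec - last) : 0
  const keep = Math.pow(DECAY_PER_HOUR, elapsedSec / SECONDS_PER_HOUR)
  // Assumption: evidence above Beta(1, 1) shrinks geometrically with elapsed hours.
  return {alpha: 1 + (alpha - 1) * keep, beta: 1 + (beta - 1) * keep}
}
```

Under these assumptions a relay that has been idle for a long time drifts back toward the uniform prior, so the router explores it again instead of trusting stale evidence.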

Test plan

pnpm vitest run packages/lib/__tests__/Beta.test.ts — 14 tests covering statistical properties, edge cases, deterministic seeding, uniform fast path, decay-on-write invariant, and router defensive fallback
pnpm build — no type errors in lib, router, or app packages
Manual review: confirm that, when no priors exist, scoreRelay behaves identically to the current Math.random() scoring
Review that getRelayPrior returns undefined for relays with no delivery history (no Beta overhead)
Review that time-based decay prevents prior ossification (alpha+beta bounded)

🤖 Generated with Claude Code

Port sampleBeta(alpha, beta, rng?) from nostrability benchmarks. Uses Jöhnk's algorithm for small params, Marsaglia-Tsang gamma sampling for larger values. Fast path: sampleBeta(1, 1) returns rng() directly (zero overhead on cold start / uniform prior). Includes comprehensive tests for statistical properties, edge cases, deterministic seeding, and the uniform fast path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
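For readers unfamiliar with the sampling approach this commit describes, the following is a rough sketch of a Beta sampler that combines the same ingredients: a uniform fast path, Jöhnk's algorithm when both parameters are below 1, and a ratio of Marsaglia-Tsang gamma variates otherwise. It is an independent illustration, not the code in @welshman/lib. The helper names, the Box-Muller normal generator, the shape-boosting step for shapes below 1, and the decision to throw on invalid parameters are all assumptions.

```typescript
// Illustrative sketch only; not the implementation shipped in this PR.
type Rng = () => number // uniform on [0, 1)

// Standard normal via Box-Muller.
function sampleNormal(rng: Rng): number {
  const u = 1 - rng() // shift to (0, 1] so Math.log(u) is finite
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * rng())
}

// Gamma(shape, 1) via Marsaglia-Tsang, with the usual boost for shape < 1.
function sampleGamma(shape: number, rng: Rng): number {
  if (shape < 1) {
    // Gamma(a) is distributed as Gamma(a + 1) * U^(1/a)
    return sampleGamma(shape + 1, rng) * Math.pow(1 - rng(), 1 / shape)
  }
  const d = shape - 1 / 3
  const c = 1 / Math.sqrt(9 * d)
  for (;;) {
    const x = sampleNormal(rng)
    const v = Math.pow(1 + c * x, 3)
    if (v <= 0) continue
    const u = 1 - rng()
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v
  }
}

// Jöhnk's rejection method, efficient when both parameters are below 1.
function sampleBetaJohnk(alpha: number, beta: number, rng: Rng): number {
  for (;;) {
    const x = Math.pow(rng(), 1 / alpha)
    const y = Math.pow(rng(), 1 / beta)
    const sum = x + y
    if (sum > 0 && sum <= 1) return x / sum
  }
}

function sampleBeta(alpha: number, beta: number, rng: Rng = Math.random): number {
  if (!Number.isFinite(alpha) || !Number.isFinite(beta) || alpha <= 0 || beta <= 0) {
    // Assumption: invalid parameters throw so the caller can fall back.
    throw new RangeError("alpha and beta must be finite and positive")
  }
  if (alpha === 1 && beta === 1) return rng() // uniform fast path
  if (alpha < 1 && beta < 1) return sampleBetaJohnk(alpha, beta, rng)
  const x = sampleGamma(alpha, rng)
  const y = sampleGamma(beta, rng)
  return x / (x + y)
}
```

Passing a seeded generator as rng makes the output reproducible, which is what allows deterministic tests of a sampler like this.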

Add getRelayPrior to RouterOptions. When provided, scoreRelay uses sampleBeta(alpha, beta) instead of Math.random(), biasing toward relays with better delivery history. Falls back to uniform random when no priors exist (identical to current behavior).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
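As a rough illustration of the change this commit describes, the sketch below applies the scoring formula from the Motivation section, quality * (1 + log(weight)) * random factor, and swaps the random factor for a Beta sample when a prior exists. The ScoreOptions shape, the {alpha, beta} return type of getRelayPrior, and the injected sampleBeta and random functions are assumptions made to keep the example self-contained; the real Router signature may differ.

```typescript
// Illustrative sketch only; option names and shapes are assumptions.
type RelayPrior = {alpha: number; beta: number}

type ScoreOptions = {
  getRelayQuality: (url: string) => number
  getRelayPrior?: (url: string) => RelayPrior | undefined
  sampleBeta: (alpha: number, beta: number) => number // injected Beta sampler
  random?: () => number // defaults to Math.random
}

function scoreRelay(url: string, weight: number, opts: ScoreOptions): number {
  const random = opts.random ?? Math.random
  const prior = opts.getRelayPrior?.(url)
  // With a prior, draw from Beta(alpha, beta); otherwise keep the uniform draw.
  const factor = prior ? opts.sampleBeta(prior.alpha, prior.beta) : random()
  return opts.getRelayQuality(url) * (1 + Math.log(weight)) * factor
}
```

Because only the last factor changes, a relay with no delivery history is scored exactly as before, which is what makes the feature opt-in.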

Extend RelayStats with optional alpha/beta/last_delivery_update fields (backwards-compatible with existing IndexedDB data). Add getRelayPrior(url) with exponential time-decay (0.95/hour) to prevent prior ossification. Add recordRelayDelivery(url, delivered, expected) for callers to report relay delivery outcomes. Wire getRelayPrior into routerContext alongside getRelayQuality.
Note: recordRelayDelivery is a push API: callers (e.g. Coracle) must invoke it after observing delivery outcomes (e.g. after EOSE).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Apply time-decay to stored alpha/beta before adding new observations in recordRelayDelivery. Previously, decay was only applied on read (getRelayPrior), so after long idle periods a single new observation would snap stale priors back to full confidence.
Add input validation:
- recordRelayDelivery rejects non-finite and negative delivered
- getRelayPrior validates alpha/beta are finite and positive
- Router scoreRelay catches sampleBeta exceptions, falls back to Math.random()
Add integration tests covering decay semantics, the decay-then-update invariant, and the router's defensive fallback on invalid priors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
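The decay-then-update invariant from this commit can be sketched as follows. This is a guess at the shape of the logic, not the PR's implementation. The in-memory relayStats map, the extra nowSec parameter (added so the example is deterministic), the second-based timestamps, the update rule (alpha gains delivered, beta gains expected minus delivered), and the choice to silently ignore invalid input are all assumptions.

```typescript
// Illustrative sketch only; the store, update rule, and units are assumptions.
type RelayPriorStats = {alpha: number; beta: number; last_delivery_update: number}

const DECAY_PER_HOUR = 0.95 // rate stated in the PR description
const relayStats = new Map<string, RelayPriorStats>() // stand-in for the real stats store

// Shrink evidence toward the uniform prior Beta(1, 1) based on elapsed time.
function decayed(s: RelayPriorStats, nowSec: number): {alpha: number; beta: number} {
  const hours = Math.max(0, nowSec - s.last_delivery_update) / 3600
  const keep = Math.pow(DECAY_PER_HOUR, hours)
  return {alpha: 1 + (s.alpha - 1) * keep, beta: 1 + (s.beta - 1) * keep}
}

function recordRelayDelivery(
  url: string,
  delivered: number,
  expected: number,
  nowSec: number,
): void {
  // Ignore invalid observations rather than corrupting the stored prior.
  if (!Number.isFinite(delivered) || !Number.isFinite(expected)) return
  if (delivered < 0 || expected < delivered) return

  const prev = relayStats.get(url) ?? {alpha: 1, beta: 1, last_delivery_update: nowSec}
  const {alpha, beta} = decayed(prev, nowSec) // decay first, then add new evidence

  relayStats.set(url, {
    alpha: alpha + delivered, // events the relay delivered
    beta: beta + (expected - delivered), // events it was expected to deliver but did not
    last_delivery_update: nowSec,
  })
}
```

If the decay step were skipped on write, stamping a fresh last_delivery_update onto the old alpha and beta would make days-old evidence look current, which is the snap-back the commit message describes.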

decayPrior now sanitizes stored alpha/beta: NaN, Infinity, negative, or undefined values reset to 1 (uniform). This lets relays with corrupted legacy data self-heal on the next delivery observation instead of staying permanently stuck.
Move getRelayPrior call inside the router's try/catch so that a throwing provider implementation falls back to Math.random() instead of aborting relay selection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
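The defensive fallback described here might look roughly like the sketch below, where both the prior lookup and the Beta draw sit inside one try/catch. The function name randomFactor and its parameter list are hypothetical; the real router code is not shown in this thread.

```typescript
// Illustrative sketch only; function name and parameters are assumptions.
type PriorParams = {alpha: number; beta: number}

function randomFactor(
  url: string,
  getRelayPrior: ((url: string) => PriorParams | undefined) | undefined,
  sampleBeta: (alpha: number, beta: number) => number,
  random: () => number = Math.random,
): number {
  try {
    // Both calls are inside the try so that a throwing provider or a
    // throwing sampler cannot abort relay selection.
    const prior = getRelayPrior?.(url)
    if (prior) return sampleBeta(prior.alpha, prior.beta)
  } catch {
    // Fall through to the uniform draw below.
  }
  return random()
}
```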

alltheseas · 2026-03-07T16:21:26Z

Numbers update from nostrability/outbox benchmarks

The motivation section cites:

replacing Math.random() with sampleBeta(alpha, beta) (Thompson Sampling) improves 1-year event recall from 24% → 89% after 2–3 sessions

The 89% figure was inflated by a phase2 cache bug in the benchmark framework (lossy serialization stored the union of event IDs across all relays, inflating S2+ verification). This was fixed in nostrability/outbox#34.

Corrected numbers (from 793 benchmark runs, 10-run variance study, --no-phase2-cache):

Window	Stochastic baseline	Welshman+Thompson (5 sessions)	Absolute gain	Relative gain
7d	79-90%	84-92%	+4-7pp	+5-8%
1yr	30%	39% ± 2.7 SE	+9pp	+30%
3yr	19%	26%	+7pp	+37%
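The last two columns of the table follow from the first two: the absolute gain is the difference in recall in percentage points, and the relative gain is that difference divided by the baseline. A small helper (hypothetical, for illustration only) reproduces the 1yr and 3yr rows:

```typescript
// Illustrative helper; not part of the benchmark framework.
function recallGains(baselinePct: number, treatedPct: number) {
  const absolutePp = treatedPct - baselinePct // percentage points
  const relativePct = (absolutePp / baselinePct) * 100 // percent of baseline
  return {absolutePp, relativePct}
}

const oneYear = recallGains(30, 39) // 9pp absolute, about 30% relative
const threeYear = recallGains(19, 26) // 7pp absolute, about 37% relative
console.log(oneYear.absolutePp, Math.round(oneYear.relativePct))
console.log(threeYear.absolutePp, Math.round(threeYear.relativePct))
```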

Suggested replacement for the motivation section:

Benchmarks in nostrability/outbox (793 runs across 6 profiles, 3 time windows) show that replacing Math.random() with sampleBeta(alpha, beta) finds 30% more events at 1 year (30% → 39% recall, +9pp, 10-run validated) and 37% more at 3 years (19% → 26%). At 7 days the baseline is already strong (79-90%), so gains are modest (+5-8%). Per-profile gains range from 0pp to +15pp depending on relay graph diversity. See the corrected benchmark data.

The PR code itself is correct — this is just a docs fix for the motivation section.

alltheseas and others added 5 commits March 5, 2026 09:37

alltheseas mentioned this pull request Mar 18, 2026

Add Newsletter #14 (2026-03-18) and new topic pages andotherstuff/nostr-compass#69

Merged

alltheseas wants to merge 5 commits into coracle-social:master from alltheseas:feat/thompson-sampling
