Add opt-in Thompson Sampling for relay scoring #53
alltheseas wants to merge 5 commits into coracle-social:master
Port sampleBeta(alpha, beta, rng?) from nostrability benchmarks. Uses Jöhnk's algorithm for small params, Marsaglia-Tsang gamma sampling for larger values. Fast path: sampleBeta(1, 1) returns rng() directly (zero overhead on cold start / uniform prior). Includes comprehensive tests for statistical properties, edge cases, deterministic seeding, and the uniform fast path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
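For readers unfamiliar with the two algorithms named in this commit, here is a minimal sketch of how such a sampler could be put together. It is not the code from the PR: the helper names `sampleGammaMT` and `sampleGamma`, the rule of using Jöhnk only when both parameters are below 1, and the `RangeError` on invalid input are all assumptions made for illustration.

```typescript
type Rng = () => number

// Marsaglia-Tsang sampler for Gamma(shape, 1); valid for shape >= 1.
function sampleGammaMT(shape: number, rng: Rng): number {
  const d = shape - 1 / 3
  const c = 1 / Math.sqrt(9 * d)
  for (;;) {
    let x = 0
    let v = 0
    do {
      // Box-Muller standard normal; 1 - rng() keeps the log argument in (0, 1].
      x = Math.sqrt(-2 * Math.log(1 - rng())) * Math.cos(2 * Math.PI * rng())
      v = 1 + c * x
    } while (v <= 0)
    v = v * v * v
    const u = rng()
    if (u < 1 - 0.0331 * x * x * x * x) return d * v
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v
  }
}

// Gamma for any positive shape; shapes below 1 use the boost
// Gamma(a) = Gamma(a + 1) * U^(1/a).
function sampleGamma(shape: number, rng: Rng): number {
  if (shape >= 1) return sampleGammaMT(shape, rng)
  return sampleGammaMT(shape + 1, rng) * Math.pow(rng(), 1 / shape)
}

function sampleBeta(alpha: number, beta: number, rng: Rng = Math.random): number {
  if (!(Number.isFinite(alpha) && Number.isFinite(beta) && alpha > 0 && beta > 0)) {
    throw new RangeError("alpha and beta must be finite and positive")
  }
  // Uniform prior: Beta(1, 1) is Uniform(0, 1), so a single rng() call suffices.
  if (alpha === 1 && beta === 1) return rng()
  if (alpha < 1 && beta < 1) {
    // Jöhnk: accept (x, y) = (u^(1/alpha), v^(1/beta)) when x + y <= 1.
    for (;;) {
      const x = Math.pow(rng(), 1 / alpha)
      const y = Math.pow(rng(), 1 / beta)
      if (x + y > 0 && x + y <= 1) return x / (x + y)
    }
  }
  // Ratio of gammas: X / (X + Y) is Beta(alpha, beta).
  const x = sampleGamma(alpha, rng)
  const y = sampleGamma(beta, rng)
  return x / (x + y)
}
```

Passing a seeded generator as `rng` makes the draws reproducible, which is presumably what the deterministic-seeding tests rely on.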
Add getRelayPrior to RouterOptions. When provided, scoreRelay uses sampleBeta(alpha, beta) instead of Math.random(), biasing toward relays with better delivery history. Falls back to uniform random when no priors exist (identical to current behavior). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
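The change described in this commit can be sketched roughly as follows. The scoring formula is the one quoted in the PR description; everything else is hypothetical: the `RelayPrior` and `ScoreOptions` shapes, the standalone `scoreRelay` signature, and the sampler and `random` being injected as parameters (done here only to keep the example self-contained).

```typescript
type RelayPrior = {alpha: number; beta: number}

type ScoreOptions = {
  // Optional provider; when absent, scoring stays uniform-random.
  getRelayPrior?: (url: string) => RelayPrior | undefined
  // Injected Beta sampler (assumption: stands in for sampleBeta from @welshman/lib).
  sampleBeta: (alpha: number, beta: number) => number
  // Injected uniform source, defaults to Math.random.
  random?: () => number
}

function scoreRelay(url: string, quality: number, weight: number, opts: ScoreOptions): number {
  const random = opts.random ?? Math.random
  const prior = opts.getRelayPrior?.(url)
  // Only the random factor changes; the rest of the formula is untouched.
  const factor = prior ? opts.sampleBeta(prior.alpha, prior.beta) : random()
  return quality * (1 + Math.log(weight)) * factor
}
```

A relay with many recorded deliveries has a Beta distribution concentrated near 1, so its factor is usually high, while a relay with no prior still gets the same uniform draw as before.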
Extend RelayStats with optional alpha/beta/last_delivery_update fields (backwards-compatible with existing IndexedDB data). Add getRelayPrior(url) with exponential time-decay (0.95/hour) to prevent prior ossification. Add recordRelayDelivery(url, delivered, expected) for callers to report relay delivery outcomes. Wire getRelayPrior into routerContext alongside getRelayQuality. Note: recordRelayDelivery is a push API — callers (e.g. Coracle) must invoke it after observing delivery outcomes (e.g. after EOSE). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
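To make the decay-on-read idea concrete, here is one possible sketch. The commit only states the rate (0.95 per hour); how the decay is applied is not described, so the sketch assumes the excess over the uniform prior (1, 1) shrinks exponentially and that timestamps are in seconds. The `StoredStats` type and the `priorFromStats` function are hypothetical names, not the PR's `getRelayPrior(url)` API.

```typescript
type StoredStats = {
  alpha?: number
  beta?: number
  last_delivery_update?: number // assumed: unix seconds
}

type DecayedPrior = {alpha: number; beta: number}

const DECAY_PER_HOUR = 0.95

function priorFromStats(stats: StoredStats, now: number): DecayedPrior | undefined {
  const {alpha, beta, last_delivery_update} = stats
  // No delivery history: report no prior so the caller can keep using Math.random().
  if (alpha === undefined || beta === undefined || last_delivery_update === undefined) {
    return undefined
  }
  const hours = Math.max(0, (now - last_delivery_update) / 3600)
  const factor = Math.pow(DECAY_PER_HOUR, hours)
  // Assumption: decay pulls both parameters back toward the uniform prior (1, 1).
  return {
    alpha: 1 + (alpha - 1) * factor,
    beta: 1 + (beta - 1) * factor,
  }
}
```

Under this assumption, a relay that has been idle for a long time drifts back toward Beta(1, 1), so old evidence gradually stops dominating selection.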
Apply time-decay to stored alpha/beta before adding new observations in recordRelayDelivery. Previously, decay was only applied on read (getRelayPrior), so after long idle periods a single new observation would snap stale priors back to full confidence.

Add input validation:
- recordRelayDelivery rejects non-finite and negative delivered
- getRelayPrior validates alpha/beta are finite and positive
- Router scoreRelay catches sampleBeta exceptions, falls back to Math.random()

Add integration tests covering decay semantics, the decay-then-update invariant, and the router's defensive fallback on invalid priors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
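The decay-then-update invariant can be illustrated with a small sketch. This is not the PR's implementation: `DeliveryStats`, `decayTowardUniform`, and `recordDelivery` are hypothetical names, the decay form (shrinking toward the uniform prior) and second-based timestamps are assumptions, and "rejects" is modeled here as silently ignoring the observation.

```typescript
type DeliveryStats = {alpha: number; beta: number; last_delivery_update: number}

const HOURLY_DECAY = 0.95

// Assumption: decay shrinks the excess over the uniform prior (1, 1).
function decayTowardUniform(stats: DeliveryStats, now: number): {alpha: number; beta: number} {
  const hours = Math.max(0, (now - stats.last_delivery_update) / 3600)
  const factor = Math.pow(HOURLY_DECAY, hours)
  return {alpha: 1 + (stats.alpha - 1) * factor, beta: 1 + (stats.beta - 1) * factor}
}

function recordDelivery(
  stats: DeliveryStats,
  delivered: number,
  expected: number,
  now: number,
): DeliveryStats {
  // Ignore malformed observations instead of corrupting the stored prior.
  if (!Number.isFinite(delivered) || delivered < 0) return stats
  if (!Number.isFinite(expected) || expected <= 0) return stats
  const hits = Math.min(delivered, expected)
  // Decay first, then add the new evidence, then reset the clock.
  const {alpha, beta} = decayTowardUniform(stats, now)
  return {alpha: alpha + hits, beta: beta + (expected - hits), last_delivery_update: now}
}
```

With stored `alpha = 101`, 100 idle hours, and one new successful delivery, this yields an alpha of roughly 2.6. Adding the observation to the undecayed value would give 102 with a fresh timestamp, which is the "snap back to full confidence" problem the commit describes.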
decayPrior now sanitizes stored alpha/beta: NaN, Infinity, negative, or undefined values reset to 1 (uniform). This lets relays with corrupted legacy data self-heal on the next delivery observation instead of staying permanently stuck. Move getRelayPrior call inside the router's try/catch so that a throwing provider implementation falls back to Math.random() instead of aborting relay selection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
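The two defensive measures in this commit might look something like the sketch below. The names `sanitizeParam` and `randomFactor`, the `MaybePrior` type, and the injected sampler are hypothetical. Treating zero as invalid is also an assumption, chosen because the earlier commit requires alpha/beta to be positive.

```typescript
type MaybePrior = {alpha: number; beta: number} | undefined

// Reset NaN, Infinity, non-positive, or missing values to 1 (uniform).
function sanitizeParam(value: unknown): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0 ? value : 1
}

function randomFactor(
  url: string,
  getRelayPrior: ((url: string) => MaybePrior) | undefined,
  sampleBeta: (alpha: number, beta: number) => number,
  random: () => number = Math.random,
): number {
  try {
    // The provider call sits inside the try block, so a throwing provider
    // degrades to uniform scoring instead of aborting relay selection.
    const prior = getRelayPrior ? getRelayPrior(url) : undefined
    if (!prior) return random()
    return sampleBeta(prior.alpha, prior.beta)
  } catch {
    return random()
  }
}
```

Sanitizing on the write path means a corrupted record is replaced by a valid one the next time a delivery is recorded, while the try/catch covers anything that still slips through at read time.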
Numbers update from nostrability/outbox benchmarks
The motivation section cites:
The 89% figure was inflated by a phase2 cache bug in the benchmark framework (lossy serialization stored the union of event IDs across all relays, inflating S2+ verification). This was fixed in nostrability/outbox#34. Corrected numbers (from 793 benchmark runs, 10-run variance study,
Suggested replacement for the motivation section:
The PR code itself is correct; this is just a docs fix for the motivation section.
Summary
- Add `sampleBeta(alpha, beta, rng?)` to `@welshman/lib`: Beta-distributed sampling for Thompson bandit relay selection
- Add `getRelayPrior` option to `RouterOptions`: when provided, `scoreRelay` uses Beta sampling instead of `Math.random()`, biasing toward relays with better delivery history
- Extend `RelayStats` with optional `alpha`/`beta`/`last_delivery_update` fields (backwards-compatible with existing IndexedDB data)
- Add `getRelayPrior(url)` and `recordRelayDelivery(url, delivered, expected)` to `@welshman/app` relay stats
- Wire `getRelayPrior` into `routerContext` alongside `getRelayQuality`

Motivation
Welshman's Router scores relays with `quality * (1 + log(weight)) * Math.random()`. The `Math.random()` factor is stateless: it never learns which relays actually deliver events.

Benchmarks in nostrability/outbox show that replacing `Math.random()` with `sampleBeta(alpha, beta)` (Thompson Sampling) improves 1-year event recall by +9pp (~30% → ~39%, 6-profile mean, 10-run validated) after 3–5 learning sessions. The scoring formula structure is unchanged; only the random factor is replaced. See the benchmark results and methodology for details.

Design decisions
- Without `getRelayPrior`, `scoreRelay` produces identical behavior to the current `Math.random()` scoring.
- `scoreRelay` doesn't know pubkey context, so priors are global ("is this relay reliable?"). This is simpler and captures the dominant signal.
- `recordRelayDelivery` is intentionally a push API: callers (e.g. Coracle) must invoke it after observing delivery outcomes (e.g. after EOSE). The Router shouldn't be opinionated about when "delivery" is measured.
- `alpha`, `beta`, `last_delivery_update` are optional fields. Existing serialized `RelayStats` deserialize fine without them.

Test plan
- `pnpm vitest run packages/lib/__tests__/Beta.test.ts`: 14 tests covering statistical properties, edge cases, deterministic seeding, uniform fast path, decay-on-write invariant, and router defensive fallback
- `pnpm build`: no type errors in lib, router, or app packages
- Without `getRelayPrior`, `scoreRelay` produces identical behavior to the current `Math.random()` scoring
- `getRelayPrior` returns `undefined` for relays with no delivery history (no Beta overhead)

🤖 Generated with Claude Code