Certified Latent Bottleneck Communication (CLBC)

Public runtime repository for reproducing CLBC experiments.

This repository intentionally excludes private drafting and internal report files (for example docs/, plans/, paper/) via .gitignore. Every path referenced in this README exists in the tracked Git repo.

Prerequisites

  • macOS or Linux
  • Python 3.11+
  • bash, curl, rg (ripgrep)
  • ollama for non-mock empirical evaluation lanes
  • CUDA GPU recommended for full 8.3/8.5/9.5/11.2/11.4 runs
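
Before a run, it can help to confirm the command-line tools above are on PATH. A small convenience sketch (not part of the repo; tool names taken from the prerequisites list):

```python
import shutil

def missing_tools(tools):
    """Return the subset of tool names not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

# Tool names from the prerequisites list above; `ollama` is only
# needed for the non-mock empirical evaluation lanes.
required = ["bash", "curl", "rg"]
optional = ["ollama"]

for tool in missing_tools(required + optional):
    print(f"warning: {tool} not found on PATH")
```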

Optional for ZK-heavy paths:

cargo install rzup
rzup install rust
rzup install r0vm

Environment Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install pytest matplotlib

Public Prereg Inputs

Tracked prereg/evidence files used by scripts live in prereg/:

  • prereg/9.5_preregistered_analysis_plan.md
  • prereg/11.2_preregistered_analysis_plan.md
  • prereg/11.3_preregistered_analysis_plan.md
  • prereg/9.5_*.md — supporting files for the 9.5 gate completeness checks

Fast Smoke (CPU)

bash scripts/run_9_5_smoke_cpu.sh
bash scripts/run_11_2_smoke_cpu.sh
bash scripts/run_11_4_smoke_cpu.sh

Stage Entry Points

Use these for full runs:

  • bash scripts/run_8_3_gpu.sh
  • bash scripts/run_8_5_gpu.sh
  • bash scripts/run_9_5_gpu.sh
  • bash scripts/run_11_2_gpu.sh
  • bash scripts/run_11_4_gpu.sh
  • bash scripts/run_11_3_prereqs.sh
  • python scripts/run_12_experiment_protocol.py --out experiments/results

Lower-level theorem/metric experiments are available via:

  • python bench/metrics/run_metrics.py --t2/--t3/--t4/--t5/--t81/--t83/--t84/--t85/--t94/--t101/--t102/--t103/--t104/--t105/--t111 ...
  • python bench/semantic_slack_gate/run_teps_baselines.py ...
  • python scripts/run_10_5_env_residuals.py ...
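
The theorem-selection flags passed to run_metrics.py are boolean switches. A minimal sketch of how such a flag set can be wired with argparse (an illustration only, not the repo's actual parser):

```python
import argparse

# Flag names taken from the run_metrics.py invocation above.
THEOREMS = ["t2", "t3", "t4", "t5", "t81", "t83", "t84", "t85",
            "t94", "t101", "t102", "t103", "t104", "t105", "t111"]

def build_parser():
    parser = argparse.ArgumentParser(
        description="select theorem/metric experiments to run")
    for name in THEOREMS:
        parser.add_argument(f"--{name}", action="store_true",
                            help=f"run the {name} experiment")
    return parser

def selected(args):
    """Theorems enabled on the command line, in canonical order."""
    return [name for name in THEOREMS if getattr(args, name)]
```

For example, `selected(build_parser().parse_args(["--t2", "--t94"]))` yields `["t2", "t94"]`.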

Full Reproduction Sequence

# 8.x
bash scripts/run_8_3_gpu.sh
bash scripts/run_8_5_gpu.sh

# 9.5 strong-accept battery + gate
bash scripts/run_9_5_gpu.sh

# 11.2 adaptive attacker
bash scripts/run_11_2_gpu.sh

# 11.4 baseline suite
bash scripts/run_11_4_gpu.sh

# 11.3 prerequisite artifacts (10.1-10.4 probes, 11.1 slack metrics, perf log)
bash scripts/run_11_3_prereqs.sh

# 11.3 report card
mkdir -p artifacts/11.3
python scripts/hash_11_3_prereg.py \
  --prereg prereg/11.3_preregistered_analysis_plan.md \
  --thresholds spec/report_card_thresholds.json \
  --out artifacts/11.3/prereg_hash.txt \
  --manifest-out artifacts/11.3/prereg_manifest.json
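
The hashing step pins the analysis plan before the report card is computed. Assuming it reduces to a SHA-256 digest of the file bytes (an assumption about the script's internals; the real script may also fold in the thresholds file), the core operation looks like:

```python
import hashlib

def file_sha256(path):
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The report card step can then compare the pinned digest in artifacts/11.3/prereg_hash.txt against the plan it reads at runtime, making any post-hoc edit to the prereg detectable.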

python scripts/run_11_3_report_card.py \
  --prereg prereg/11.3_preregistered_analysis_plan.md \
  --thresholds spec/report_card_thresholds.json \
  --prereg-hash artifacts/11.3/prereg_hash.txt \
  --manifest-out artifacts/11.3/runtime_manifest.json \
  --out artifacts/11.3/t113_report_card.json | tee artifacts/11.3/metrics.log

# 12.x tables/plots/report
python scripts/run_12_experiment_protocol.py --out experiments/results

Primary Outputs

  • artifacts/neurips_strong_accept/
  • artifacts/11.2/t112_attacker_metrics.json
  • artifacts/11.3/t113_report_card.json
  • artifacts/11.4/t114_empirical_baselines_summary.json
  • experiments/results/tables/
  • experiments/results/plots/
  • experiments/results/report.md

Notes

  • Most long-running scripts support resume behavior.
  • Generated artifacts are intentionally Git-ignored.
  • scripts/run_1_12_paper_gate.py expects a private docs pack and is not part of the public-only reproduction path.
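
Resume behavior of the kind noted above usually means a stage skips work whose artifact already exists and publishes new artifacts atomically. A minimal sketch of that pattern (an illustration of the idea, not the repo's implementation):

```python
import json
import os

def run_stage(out_path, compute):
    """Skip `compute` if the artifact already exists; otherwise write it atomically."""
    if os.path.exists(out_path):
        with open(out_path) as f:
            return json.load(f)      # resume: reuse the prior result
    result = compute()
    tmp_path = out_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(result, f)
    os.replace(tmp_path, out_path)   # atomic publish: no torn artifacts
    return result
```

Re-running a sequence built from such stages is then safe: completed stages return their cached artifact and only unfinished work is recomputed.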
