Program explores a modular quantum-active-learning (QAL) pipeline that blends quantum kernel regressors, quantum generative models, Bayesian acquisition, and DFT validation for high-entropy alloy discovery.
Work is organized by the agent stack in AGENTS.md, ensuring clear ownership, quality gates, and reproducibility requirements across PDA, DPQA, QKAA, QGMA, ALOA, MDIA, BRA, QHSOA, and RKMA.
This README covers the completed production DFT phase (M5-real) and the finish-to-submission packaging work implemented across M6–M9.
Summary of Completed Milestones
M0 – Program Alignment & Literature Reconnaissance
Charter approved (docs/m0_charter.md) defining scope, success metrics, governance cadence, and risk posture for the QAL program.
Annotated bibliography and synthesis memo produced (docs/literature/annotated_bibliography.md, docs/literature/synthesis_memo.md) with 45 peer-reviewed sources covering all research aims, satisfying T0.2 acceptance criteria.
Documentation workspace and metadata schema validated (workspace/entries/lab_notebooks/2025-09-15_T0.4_workspace_validation.md), enabling consistent lab notebook and decision-log capture.
Venue targeting brief finalized (docs/venue_target_brief.md), locking target conferences/journals and compliance checklists for downstream writing tasks.
M1 – Data Readiness & Domain Constraints
Raw perovskite, HEA, and SAA datasets catalogued with provenance manifests; preprocessing pipelines built (scripts/preprocess_datasets.py) achieving 100% validation pass rate (workspace/entries/lab_notebooks/2025-09-16_T1.2_preprocessing.md).
Noise and perturbation simulation module implemented (scripts/simulate_noise.py), delivering four stress scenarios for robustness studies (workspace/entries/lab_notebooks/2025-09-16_T1.3_noise_simulation.md).
HEA constraint library (data/metadata/hea_constraints.yaml) and validation tooling established with 100% pass rate (workspace/entries/lab_notebooks/2025-09-16_T1.4_hea_constraints.md).
DFT handoff interface and sample packages documented (docs/interfaces/dft_handoff.md), validated via automated checks (workspace/entries/lab_notebooks/2025-09-16_T1.5_dft_handoff.md).
Dataset release v1.0 published with checksums and manifest (data/releases/dataset_v1.0/), ensuring reproducible data foundations (workspace/entries/lab_notebooks/2025-09-16_T1.6_dataset_release.md).
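The checksum-and-manifest approach used for the dataset release can be illustrated with a short verification routine. This is a minimal sketch, not the project's own tooling: the manifest file name (manifest.json), its layout (a "files" mapping from relative path to SHA-256 hex digest), and the function names sha256_of and verify_release are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_release(release_dir: Path, manifest_name: str = "manifest.json") -> list[str]:
    """Return relative paths that are missing or whose checksum does not match.

    Assumed (hypothetical) manifest layout:
        {"files": {"relative/path.csv": "<sha256 hex digest>", ...}}
    """
    manifest = json.loads((release_dir / manifest_name).read_text())
    failures = []
    for rel_path, expected in manifest["files"].items():
        target = release_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures
```

An empty return value means every listed file is present and unmodified. If the real manifest uses a different schema, only the loop over manifest["files"] needs to change.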
M2 – Quantum Kernel Regression Foundations
Feature-map design brief produced with three ansatz families and hardware resource budgets (docs/qml/feature_map_design.md; workspace/entries/lab_notebooks/2025-09-16_T2.1_feature_maps.md).
QSVR benchmarks show quantum kernel matching classical RMSE within tolerance (data/qml/qsvr_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T2.2_qsvr.md).
Quantum GPR prototypes deliver empirical coverage within 1.4 percentage points of the nominal 95% target, establishing uncertainty-aware regressors (workspace/entries/lab_notebooks/2025-09-16_T2.3_qgpr.md).
Classical active-learning baselines executed and logged for future comparison (data/qml/classical_al_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T2.4_classical_baselines.md).
RKMA archived experiment documentation and provenance graphs linking datasets to MLflow runs (docs/qml/experiment_documentation.md; workspace/entries/lab_notebooks/2025-09-16_T2.5_documentation.md).
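The QGPR coverage figure above compares how often the true value falls inside the predicted interval against the nominal 95% level. A minimal sketch of that calculation follows. It assumes Gaussian predictive distributions, a two-sided interval of mean ± 1.96 standard deviations, and a gap defined as empirical minus nominal coverage. The function name coverage_gap is hypothetical and the repository's benchmark script may compute this differently.

```python
def coverage_gap(y_true, mean, std, nominal=0.95, z=1.96):
    """Empirical interval coverage and its gap to the nominal level.

    Assumes each prediction is Gaussian with the given mean and standard
    deviation, so the central 95% interval is mean +/- 1.96 * std.
    Returns (empirical_coverage, empirical_coverage - nominal).
    """
    if not y_true or not (len(y_true) == len(mean) == len(std)):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for y, m, s in zip(y_true, mean, std) if abs(y - m) <= z * s)
    empirical = hits / len(y_true)
    return empirical, empirical - nominal
```

Under this definition a negative gap means the intervals are too narrow (over-confident) and a positive gap means they are too wide.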
M3 – Quantum Generative Modeling Capability
Generative architecture survey completed with expressivity/resource analysis (workspace/entries/lab_notebooks/2025-09-16_T3.1_generative_strategy.md).
QGAN prototype achieves 88.3% valid HEA candidates under constraint filtering (data/qml/qgan_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T3.2_qgan.md).
Property-conditioned sampling reaches 100% compliance with DFT-informed priors (data/qml/qgan_property_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T3.3_property_conditioning.md).
Quantum generator outperforms classical baseline on novelty by Δ=0.226 with higher feasibility (data/qml/generative_novelty_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T3.4_novelty.md).
RKMA recorded full provenance for generative artefacts, ensuring traceability for future audits (docs/qml/generative_provenance.md; workspace/entries/lab_notebooks/2025-09-16_T3.5_provenance.md).
M4 – Active Learning Loop Design
System architecture and orchestration baseline established (scripts/qal_orchestrator.py, data/architecture/qal_run_summary.json; workspace/entries/lab_notebooks/2025-09-16_T4.3_orchestration.md).
Acquisition strategy experiments deliver 32% label-efficiency gain over classical baselines (data/architecture/label_efficiency_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T4.4_label_efficiency.md).
Gate review authorized move to DFT-integrated operations with documented risks and action items (docs/architecture/qal_gate_review.md; workspace/entries/lab_notebooks/2025-09-16_T4.5_gate_review.md).
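One common way to quantify a label-efficiency gain such as the one reported above is to compare how many labels each strategy needs to reach the same target error. The sketch below takes that approach. The definition, the function names labels_to_reach and label_reduction, and the (n_labels, rmse) curve format are assumptions for illustration and are not taken from label_efficiency_metrics.json.

```python
def labels_to_reach(curve, target_rmse):
    """Return the smallest label count at which RMSE is at or below the target.

    `curve` is a list of (n_labels, rmse) pairs sorted by n_labels.
    Returns None if the target is never reached.
    """
    for n_labels, rmse in curve:
        if rmse <= target_rmse:
            return n_labels
    return None


def label_reduction(baseline_curve, candidate_curve, target_rmse):
    """Fraction of labels saved by the candidate strategy at equal target RMSE.

    Returns None when either strategy never reaches the target, because the
    comparison is undefined in that case.
    """
    base = labels_to_reach(baseline_curve, target_rmse)
    cand = labels_to_reach(candidate_curve, target_rmse)
    if base is None or cand is None:
        return None
    return 1.0 - cand / base
```

With this definition a value of 0.25 would mean the candidate needed 25% fewer labels than the baseline to reach the same error. The result depends on the chosen target, so the target should be reported alongside it.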
M5 – DFT Integration & Feedback Coupling
Automated DFT workflow scaffolded with reproducible handoffs and logs (scripts/dft/run_dft_workflow.py; workspace/entries/lab_notebooks/2025-09-16_T5.1_dft_workflow.md).
Closed-loop execution simulated three iterations end-to-end, generating consolidated run summaries (data/architecture/closed_loop_summary.json; workspace/entries/lab_notebooks/2025-09-16_T5.2_closed_loop.md).
Production DFT campaign executed and archived (data/dft_workflow/campaigns/t5r4-14539888/closed_loop_summary.json; workspace/entries/lab_notebooks/2025-10-27_T5R4_real_campaign.md).
DFT workflow release package (v1.0) archived with manifests for reproducibility (data/releases/dft_workflow_v1/; workspace/entries/lab_notebooks/2025-09-16_T5.5_dft_release.md).
Quantitative Highlights
45 peer-reviewed sources captured across all research aims for literature grounding (T0.2).
Preprocessing validation pass rate: 100% (1.0) across the three target datasets; four noise scenarios ready for robustness studies (T1.2–T1.3).
Quantum SVR RMSE was ~0.0016% below the classical baseline, i.e. parity within tolerance rather than a meaningful advantage; QGPR coverage gap −0.014 vs 95% target (T2.2–T2.3).
QGAN valid sample rate: 88.3% pre-conditioning; 100% post-conditioning with HEA priors (T3.2–T3.3).
Quantum novelty advantage Δ=0.226 with feasibility improvement Δ=0.117 vs classical generator (T3.4).
Active-learning loop achieved 32% label reduction while maintaining performance (T4.4).
DFT-integrated loop delivered statistically significant accuracy/label gains (p = 0.031) (T5.4).
Key Artefacts & Reproducibility Hooks
Data releases: data/releases/dataset_v1.0/, data/releases/dft_workflow_v1/, data/releases/real_dft_campaign_v1/ include manifests, checksums, and runbooks.
QA & metrics: data/metadata/qa_reports/ for preprocessing, noise, constraint, and DFT validation; data/architecture/*.json for loop performance stats.
ML experiments: scripts under scripts/ (qsvr_benchmark.py, qgpr_benchmark.py, qgan_prototype.py, qal_orchestrator.py, etc.) with runs tracked in mlruns/.
Documentation: milestone briefs and interface specifications in docs/ plus final packaging trees in docs/hardware/, docs/benchmarking/, docs/reproducibility/, docs/manuscript/, docs/poster/, and docs/submission/.
Lab notebooks: workspace/entries/lab_notebooks/ provide task-by-task evidence, MLflow run IDs, and reviewer approvals satisfying RKMA standards.
Submission-Ready Outputs
M6 hardware adapters, pilot runs, and cost memo: scripts/hardware/run_hardware_pilots.py, docs/hardware/.
M7 robustness + benchmark report with acceptance metrics: scripts/benchmarking/run_m7_benchmarks.py, docs/benchmarking/.
M8 reproducibility closure and provenance finalization: scripts/repro/run_repro_check.py, docs/reproducibility/, data/reproducibility/provenance_graph_final.json.
M9 manuscript/poster/preprint bundle with reproducible figures: docs/manuscript/, docs/poster/, docs/preprint/, scripts/publication/build_figures.py.
One-command fail-fast paper pipeline covering M5-real through M9: scripts/publication/run_paper_data_pipeline.py. The default paper-grade profile comprises 1,100 GPU benchmark runs, --paper-grade-fastest, and a 2-hour runtime gate.
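The fail-fast behaviour and runtime gate described for the paper pipeline can be sketched generically as follows. This is an illustration of the pattern only, not the contents of run_paper_data_pipeline.py: the names run_fail_fast and GateExceeded, the stage representation, and the choice to check the gate between stages are all assumptions.

```python
import time
from typing import Callable, Sequence, Tuple


class GateExceeded(RuntimeError):
    """Raised when the cumulative runtime budget is exhausted."""


def run_fail_fast(
    stages: Sequence[Tuple[str, Callable[[], None]]],
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> list:
    """Run stages in order, stopping at the first failure or budget overrun.

    The gate is checked before each stage starts, so a stage already in
    progress is not interrupted. Any exception raised by a stage propagates
    immediately and later stages never run.
    """
    start = clock()
    completed = []
    for name, stage in stages:
        if clock() - start > max_seconds:
            raise GateExceeded(f"runtime gate exceeded before stage {name!r}")
        stage()
        completed.append(name)
    return completed
```

A 2-hour gate would correspond to max_seconds=7200. The clock parameter exists so the gate logic can be tested with a fake clock instead of real waiting.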