Ostailor/Generative_QML

Project Progress Overview (Through Milestone M9 Packaging)

Context & Scope

  • Program explores a modular quantum-active-learning (QAL) pipeline that blends quantum kernel regressors, quantum generative models, Bayesian acquisition, and DFT validation for high-entropy alloy discovery.
  • Work is organized by the agent stack in AGENTS.md, ensuring clear ownership, quality gates, and reproducibility requirements across PDA, DPQA, QKAA, QGMA, ALOA, MDIA, BRA, QHSOA, and RKMA.
  • This README covers milestones M0 through M5, including the completed production DFT phase (M5-real), and the packaging work implemented across M6–M9 to take the project through to submission.

Summary of Completed Milestones

M0 – Program Alignment & Literature Reconnaissance

  • Charter approved (docs/m0_charter.md) defining scope, success metrics, governance cadence, and risk posture for the QAL program.
  • Annotated bibliography and synthesis memo produced (docs/literature/annotated_bibliography.md, docs/literature/synthesis_memo.md) with 45 peer-reviewed sources covering all research aims, satisfying T0.2 acceptance criteria.
  • Documentation workspace and metadata schema validated (workspace/entries/lab_notebooks/2025-09-15_T0.4_workspace_validation.md), enabling consistent lab notebook and decision-log capture.
  • Venue targeting brief finalized (docs/venue_target_brief.md), locking target conferences/journals and compliance checklists for downstream writing tasks.

M1 – Data Readiness & Domain Constraints

  • Raw perovskite, HEA, and SAA datasets catalogued with provenance manifests; preprocessing pipelines built (scripts/preprocess_datasets.py) achieving 100% validation pass rate (workspace/entries/lab_notebooks/2025-09-16_T1.2_preprocessing.md).
  • Noise and perturbation simulation module implemented (scripts/simulate_noise.py), delivering four stress scenarios for robustness studies (workspace/entries/lab_notebooks/2025-09-16_T1.3_noise_simulation.md).
  • HEA constraint library (data/metadata/hea_constraints.yaml) and validation tooling established with 100% pass rate (workspace/entries/lab_notebooks/2025-09-16_T1.4_hea_constraints.md).
  • DFT handoff interface and sample packages documented (docs/interfaces/dft_handoff.md), validated via automated checks (workspace/entries/lab_notebooks/2025-09-16_T1.5_dft_handoff.md).
  • Dataset release v1.0 published with checksums and manifest (data/releases/dataset_v1.0/), ensuring reproducible data foundations (workspace/entries/lab_notebooks/2025-09-16_T1.6_dataset_release.md).
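
As an illustration of the kind of checksum check a release manifest enables, here is a minimal sketch. It assumes a manifest that maps relative file paths to SHA-256 hex digests; the actual manifest format under data/releases/dataset_v1.0/ is not shown in this README, and the helper names (sha256_of, verify_manifest) and the demo file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths whose file is missing or whose digest differs."""
    failures = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures


# Demo on a throwaway directory; file names are illustrative, not project files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "hea.csv").write_bytes(b"composition,energy\n")
    manifest = {"hea.csv": sha256_of(root / "hea.csv"), "missing.csv": "0" * 64}
    demo_failures = verify_manifest(root, manifest)

print(demo_failures)  # ['missing.csv']
```

An empty result means every listed file is present and unmodified, which is the condition a reproducible release would be expected to meet.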

M2 – Quantum Kernel Regression Foundations

  • Feature-map design brief produced with three ansatz families and hardware resource budgets (docs/qml/feature_map_design.md; workspace/entries/lab_notebooks/2025-09-16_T2.1_feature_maps.md).
  • QSVR benchmarks show quantum kernel matching classical RMSE within tolerance (data/qml/qsvr_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T2.2_qsvr.md).
  • Quantum GPR prototypes deliver coverage within ±1.4% of nominal 95% target, establishing uncertainty-aware regressors (workspace/entries/lab_notebooks/2025-09-16_T2.3_qgpr.md).
  • Classical active-learning baselines executed and logged for future comparison (data/qml/classical_al_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T2.4_classical_baselines.md).
  • RKMA archived experiment documentation and provenance graphs linking datasets to MLflow runs (docs/qml/experiment_documentation.md; workspace/entries/lab_notebooks/2025-09-16_T2.5_documentation.md).
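
The QGPR coverage figure above compares empirical interval coverage with the nominal 95% target. A minimal sketch of that calculation follows, assuming Gaussian predictive intervals built from a per-point mean and standard deviation. The function name and the toy numbers are illustrative and are not taken from the project's benchmark code or data.

```python
from statistics import NormalDist


def empirical_coverage(y_true, mean, std, nominal=0.95):
    """Fraction of targets inside the central `nominal` Gaussian interval."""
    z = NormalDist().inv_cdf(0.5 + nominal / 2.0)  # about 1.96 for 95%
    hits = sum(abs(y - m) <= z * s for y, m, s in zip(y_true, mean, std))
    return hits / len(y_true)


# Toy values, not project data.
y_true = [0.10, 0.40, 0.35, 0.90, 0.55]
mean = [0.12, 0.38, 0.50, 0.60, 0.50]
std = [0.05, 0.05, 0.05, 0.10, 0.05]

coverage = empirical_coverage(y_true, mean, std)
gap = coverage - 0.95  # negative means the intervals under-cover

print(coverage)  # 0.6
```

With this sign convention a gap close to zero indicates well-calibrated uncertainty; whether the project reports the gap with the same convention is an assumption.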

M3 – Quantum Generative Modeling Capability

  • Generative architecture survey completed with expressivity/resource analysis (workspace/entries/lab_notebooks/2025-09-16_T3.1_generative_strategy.md).
  • QGAN prototype achieves 88.3% valid HEA candidates under constraint filtering (data/qml/qgan_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T3.2_qgan.md).
  • Property-conditioned sampling reaches 100% compliance with DFT-informed priors (data/qml/qgan_property_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T3.3_property_conditioning.md).
  • Quantum generator outperforms classical baseline on novelty by Δ=0.226 and on feasibility by Δ=0.117 (data/qml/generative_novelty_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T3.4_novelty.md).
  • RKMA recorded full provenance for generative artefacts, ensuring traceability for future audits (docs/qml/generative_provenance.md; workspace/entries/lab_notebooks/2025-09-16_T3.5_provenance.md).
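
To make the "valid candidate rate" concrete, the sketch below filters generated compositions and reports the fraction that pass. The rules used here (at least five elements, each atomic fraction between 0.05 and 0.35, fractions summing to 1) follow a common textbook definition of high-entropy alloys and are an assumption; the project's actual rules live in data/metadata/hea_constraints.yaml and are not reproduced in this README. All names and candidates are illustrative.

```python
def is_valid_hea(fractions, lo=0.05, hi=0.35, min_elements=5, tol=1e-6):
    """Check one candidate against illustrative composition rules."""
    if len(fractions) < min_elements:
        return False
    if abs(sum(fractions.values()) - 1.0) > tol:
        return False
    return all(lo <= x <= hi for x in fractions.values())


def valid_rate(candidates):
    """Share of candidates that satisfy every rule."""
    return sum(is_valid_hea(c) for c in candidates) / len(candidates)


# Toy candidates, not generator output.
candidates = [
    {"Co": 0.2, "Cr": 0.2, "Fe": 0.2, "Mn": 0.2, "Ni": 0.2},    # passes
    {"Co": 0.5, "Cr": 0.2, "Fe": 0.1, "Mn": 0.1, "Ni": 0.1},    # Co above upper bound
    {"Co": 0.25, "Cr": 0.25, "Fe": 0.25, "Ni": 0.25},           # only four elements
    {"Al": 0.1, "Co": 0.25, "Cr": 0.25, "Fe": 0.2, "Ni": 0.2},  # passes
]
rate = valid_rate(candidates)

print(rate)  # 0.5
```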

M4 – Active Learning Loop Design

  • System architecture and orchestration baseline established (scripts/qal_orchestrator.py, data/architecture/qal_run_summary.json; workspace/entries/lab_notebooks/2025-09-16_T4.3_orchestration.md).
  • Acquisition strategy experiments deliver 32% label-efficiency gain over classical baselines (data/architecture/label_efficiency_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T4.4_label_efficiency.md).
  • Gate review authorized move to DFT-integrated operations with documented risks and action items (docs/architecture/qal_gate_review.md; workspace/entries/lab_notebooks/2025-09-16_T4.5_gate_review.md).
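
One way to express a label-efficiency gain is the relative reduction in labels needed to reach a fixed error target. The sketch below uses that definition as an assumption; the README does not state how the 32% figure was computed, and the learning curves here are toy values chosen only to exercise the arithmetic.

```python
def labels_to_target(curve, target_rmse):
    """First label count at which the learning curve reaches the target, else None."""
    for n_labels, rmse in curve:
        if rmse <= target_rmse:
            return n_labels
    return None


def label_efficiency_gain(baseline_curve, candidate_curve, target_rmse):
    """Relative label saving of the candidate strategy over the baseline."""
    n_base = labels_to_target(baseline_curve, target_rmse)
    n_cand = labels_to_target(candidate_curve, target_rmse)
    if n_base is None or n_cand is None:
        raise ValueError("a curve never reaches the target RMSE")
    return 1.0 - n_cand / n_base


# (labels used, RMSE) pairs; toy values, not project results.
baseline = [(20, 0.30), (40, 0.22), (60, 0.17), (80, 0.12), (100, 0.10)]
candidate = [(20, 0.26), (40, 0.16), (60, 0.11), (80, 0.09)]
gain = label_efficiency_gain(baseline, candidate, target_rmse=0.12)

print(gain)  # 0.25
```

In this toy case the candidate reaches the target with 60 labels instead of 80, a 25% saving.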

M5 – DFT Integration & Feedback Coupling

  • Automated DFT workflow scaffolded with reproducible handoffs and logs (scripts/dft/run_dft_workflow.py; workspace/entries/lab_notebooks/2025-09-16_T5.1_dft_workflow.md).
  • Closed-loop execution simulated three iterations end-to-end, generating consolidated run summaries (data/architecture/closed_loop_summary.json; workspace/entries/lab_notebooks/2025-09-16_T5.2_closed_loop.md).
  • Production DFT campaign executed and archived (data/dft_workflow/campaigns/t5r4-14539888/closed_loop_summary.json; workspace/entries/lab_notebooks/2025-10-27_T5R4_real_campaign.md).
  • Performance analysis confirms statistically significant gains (p=0.031) when incorporating DFT feedback (data/architecture/performance_metrics.json; workspace/entries/lab_notebooks/2025-09-16_T5.4_performance.md).
  • DFT workflow release package (v1.0) archived with manifests for reproducibility (data/releases/dft_workflow_v1/; workspace/entries/lab_notebooks/2025-09-16_T5.5_dft_release.md).

Quantitative Highlights

  • 45 peer-reviewed sources captured across all research aims for literature grounding (T0.2).
  • Preprocessing validation pass rate: 100% across the three target datasets (perovskite, HEA, SAA); four noise scenarios ready for robustness studies (T1.2–T1.3).
  • Quantum SVR edged out the classical baseline by only ~0.0016% on RMSE, i.e. parity within tolerance; QGPR coverage gap of −0.014 (1.4 percentage points) against the 95% target (T2.2–T2.3).
  • QGAN valid sample rate: 88.3% pre-conditioning; 100% post-conditioning with HEA priors (T3.2–T3.3).
  • Quantum novelty advantage Δ=0.226 with feasibility improvement Δ=0.117 vs classical generator (T3.4).
  • Active-learning loop achieved 32% label reduction while maintaining performance (T4.4).
  • DFT-integrated loop delivered significant accuracy/label gains with p=0.031 (T5.4).

Key Artefacts & Reproducibility Hooks

  • Data releases: data/releases/dataset_v1.0/, data/releases/dft_workflow_v1/, data/releases/real_dft_campaign_v1/ include manifests, checksums, and runbooks.
  • QA & metrics: data/metadata/qa_reports/ for preprocessing, noise, constraint, and DFT validation; data/architecture/*.json for loop performance stats.
  • ML experiments: scripts under scripts/ (qsvr_benchmark.py, qgpr_benchmark.py, qgan_prototype.py, qal_orchestrator.py, etc.) with runs tracked in mlruns/.
  • Documentation: milestone briefs and interface specifications in docs/ plus final packaging trees in docs/hardware/, docs/benchmarking/, docs/reproducibility/, docs/manuscript/, docs/poster/, and docs/submission/.
  • Lab notebooks: workspace/entries/lab_notebooks/ provide task-by-task evidence, MLflow run IDs, and reviewer approvals satisfying RKMA standards.

Submission-Ready Outputs

  • M6 hardware adapters, pilot runs, and cost memo: scripts/hardware/run_hardware_pilots.py, docs/hardware/.
  • M7 robustness + benchmark report with acceptance metrics: scripts/benchmarking/run_m7_benchmarks.py, docs/benchmarking/.
  • M8 reproducibility closure and provenance finalization: scripts/repro/run_repro_check.py, docs/reproducibility/, data/reproducibility/provenance_graph_final.json.
  • M9 manuscript/poster/preprint bundle with reproducible figures: docs/manuscript/, docs/poster/, docs/preprint/, scripts/publication/build_figures.py.
  • One-command fail-fast paper pipeline covering M5-real through M9: scripts/publication/run_paper_data_pipeline.py. The default paper-grade profile includes 1,100 GPU benchmark runs, the --paper-grade-fastest option, and a 2-hour runtime gate.
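
The fail-fast behaviour and runtime gate mentioned for the paper pipeline can be pictured with the sketch below. It is not the implementation in scripts/publication/run_paper_data_pipeline.py, which is not shown here. It assumes the pipeline is a sequence of subprocess steps sharing one wall-clock budget, and the function name run_fail_fast and the stand-in steps are hypothetical.

```python
import subprocess
import sys
import time


def run_fail_fast(steps, budget_seconds):
    """Run commands in order; stop at the first failure or when the budget is spent."""
    deadline = time.monotonic() + budget_seconds
    completed = []
    for name, command in steps:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(f"runtime gate exceeded before step {name!r}")
        try:
            # check=True raises CalledProcessError on a non-zero exit code;
            # timeout raises TimeoutExpired if the step overruns the remaining budget.
            subprocess.run(command, check=True, timeout=remaining)
        except subprocess.TimeoutExpired as exc:
            raise TimeoutError(f"runtime gate exceeded during step {name!r}") from exc
        completed.append(name)
    return completed


# Stand-in steps; the third never runs because the second exits with code 3.
steps = [
    ("ok", [sys.executable, "-c", "pass"]),
    ("boom", [sys.executable, "-c", "raise SystemExit(3)"]),
    ("never", [sys.executable, "-c", "pass"]),
]
try:
    run_fail_fast(steps, budget_seconds=2 * 60 * 60)
    failed_code = None
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode

print(failed_code)  # 3
```

Sharing a single deadline across steps, rather than giving each step its own timeout, keeps the total run inside the gate regardless of how time is distributed between stages.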

Server Run Command

cd /path/to/Quanutum_MS_Pipeline
source .venv/bin/activate
python scripts/publication/run_paper_data_pipeline.py \
  --profile paper-grade \
  --paper-grade-fastest \
  --campaign-id t5r4-20260211-fasttrack-221-mw4 \
  --tracking-uri "file://$(pwd)/mlruns"

About

Milestone-driven roadmap for Generative QML HEA discovery—kernels, quantum generators, QAL+DFT loop, QPU pilots, and reproducible artifacts.
