Standardize catalog paths and add validation guardrails #165
base: develop
hi all, adding some thoughts to the Claude-generated summary above:
I think old versions that are not used can safely be deleted. The logic is that if someone wants to rerun something, they can just add the catalog they want. I believe the only modifications I made to cat_config were adding catalogs and, in the most recent update, adding a path to a mask file. My management of the paths is also very ugly and could be improved if you have something better.
Thank you for this! Indeed, the cat_config file was getting rather long and cumbersome... I'm for deleting legacy stuff, and perhaps just keeping the versions of the catalogue that everyone is sharing. This info could also be reflected in the wiki (with consistent values) for completeness!
Eliminate redundant redshift_file parameter and load n(z) directly from catalog configuration. Added get_redshift() method as single source of truth.
- Renamed shear.redshift_distr → shear.redshift_path in cat_config.yaml
- Added get_redshift(version) method for catalog-aware n(z) loading
- Updated calculate_pure_eb(), plot_pure_eb(), calculate_pseudo_cl_eb_cov() to use get_redshift()
- Removed redshift_file parameter from __init__, calculate_pure_eb(), plot_pure_eb() signatures and docstrings
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
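For readers who have not seen the code, the "single source of truth" idea in this commit can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `CatalogConfig` is hypothetical, the config is a plain dict standing in for the parsed cat_config.yaml, and the n(z) file is assumed to be a two-column whitespace-separated text file, which the thread does not confirm.

```python
from pathlib import Path


class CatalogConfig:
    """Hypothetical stand-in for the validation class; only n(z) lookup is shown."""

    def __init__(self, cat_config):
        # cat_config: dict as it might look after parsing cat_config.yaml
        self.cat_config = cat_config

    def get_redshift(self, version):
        """Return (z, nz) lists read from the version's shear.redshift_path."""
        path = Path(self.cat_config[version]["shear"]["redshift_path"])
        z, nz = [], []
        for line in path.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            cols = line.split()
            # ASSUMPTION: column 0 is z, column 1 is n(z)
            z.append(float(cols[0]))
            nz.append(float(cols[1]))
        return z, nz
```

With a lookup like this, callers pass only a catalog version and never a separate redshift file, so the path cannot drift from the catalog entry.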
Update CosmologyValidation call to match current API. The data_base_dir parameter was removed in the catalog config refactor; all paths are now resolved from cat_config.yaml.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add support for seed-specific mock catalog variants (e.g., SP_v1.4.5_glass_mock_seed1) by extracting and substituting seed tokens in shear paths. Enables exploring multiple random realizations of the same mock survey.
- Add SP_v1.4.6_glass_mock catalog entry with v1.4.6 survey specs
- Refactor version processing in __init__ to use recursive ensure_version_exists()
- Support _seed<N> variants that deep-copy base config and substitute seed token
- Handle _seed<N>_leak_corr combinations by materializing seed config first
- Add explicit error checking for missing seed tokens in paths
- Add regression tests for seed variant creation and error cases
Seed variant examples:
- SP_v1.4.5_glass_mock_seed1 → unions_glass_sim_00001_4096.fits
- SP_v1.4.6_glass_mock_seed12 → unions_glass_sim_00012_4096.fits
- SP_v1.4.5_glass_mock_seed1_leak_corr → combines both transforms
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
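A rough sketch of how the recursive resolution described in this commit might work is given below. It is not the repository's implementation: the config layout (`entry["shear"]["path"]`), the five-digit seed-token regex, and the `leak_corrected` flag (a placeholder for whatever the real leakage-correction transform does) are all assumptions made for illustration.

```python
import copy
import re

LEAK_SUFFIX = "_leak_corr"
# ASSUMPTION: the seed token is a zero-padded 5-digit field, as in
# "unions_glass_sim_00001_4096.fits".
SEED_TOKEN = re.compile(r"_(\d{5})_")


def ensure_version_exists(config, version):
    """Return config[version], creating derived variants on demand."""
    if version in config:
        return config[version]

    if version.endswith(LEAK_SUFFIX):
        # Materialize the underlying (possibly seed-specific) entry first.
        base = ensure_version_exists(config, version[: -len(LEAK_SUFFIX)])
        entry = copy.deepcopy(base)
        entry["leak_corrected"] = True  # placeholder for the real transform
    else:
        base_name, sep, seed = version.rpartition("_seed")
        if not sep or not (seed.isascii() and seed.isdigit()):
            raise KeyError(f"Unknown catalog version: {version}")
        base = ensure_version_exists(config, base_name)
        entry = copy.deepcopy(base)  # never mutate the base entry
        path = entry["shear"]["path"]
        new_path, n_subs = SEED_TOKEN.subn(f"_{int(seed):05d}_", path, count=1)
        if n_subs == 0:
            raise ValueError(f"No seed token found in shear path: {path}")
        entry["shear"]["path"] = new_path

    config[version] = entry
    return entry
```

Because the `_leak_corr` branch recurses before copying, a name such as SP_v1.4.5_glass_mock_seed1_leak_corr first creates the seed variant and then derives the leak-corrected entry from it.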
- Add SP_v1.4.6_glass_mock to parametrized additive bias tests
- Update SP_v1.4.6_glass_mock shear path to glass_mock_v1.4.6 directory
- Add test_v1_4_6_glass_mock_seed_variant to verify seed9 variant loads correctly
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Replace brittle regex-based seed token detection with config-driven
path templates using Python string formatting. Each catalog specifies
a path_template with {seed:05d} or {seed} placeholders.
- Add path_template fields to SP_v1.4.5_glass_mock and SP_v1.4.6_glass_mock
- Simplify _split_seed_variant to pure string operations (no regex)
- Add _materialize_seed_path using .format() for clean templating
- Fallback to legacy seed-token extraction if no template provided
- All 15 tests pass including new v1.4.6 glass mock seed9 test
Benefits:
- Per-catalog control of path formatting without code changes
- No complex regex fragility
- Clear, self-documenting config entries
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
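The template-based approach from this commit could look something like the sketch below. The function names echo _split_seed_variant and _materialize_seed_path, but the bodies are assumptions rather than the repository's code, and the directory in the example path is made up.

```python
def split_seed_variant(version):
    """Split 'NAME_seed12' into ('NAME', 12) using plain string operations.

    Returns (version, None) when there is no trailing _seed<N> suffix.
    """
    base, sep, tail = version.rpartition("_seed")
    if sep and tail.isascii() and tail.isdigit():
        return base, int(tail)
    return version, None


def materialize_seed_path(path_template, seed):
    """Fill a config-provided template such as '..._{seed:05d}_4096.fits'."""
    if "{seed" not in path_template:
        raise ValueError(f"path_template has no seed placeholder: {path_template!r}")
    try:
        return path_template.format(seed=seed)
    except (KeyError, IndexError) as exc:
        # Raised when the template contains placeholders other than {seed}.
        raise ValueError(f"Bad path_template {path_template!r}: {exc}") from exc


base, seed = split_seed_variant("SP_v1.4.6_glass_mock_seed12")
# Hypothetical directory; only the file-name pattern comes from the thread.
path = materialize_seed_path(
    "/hypothetical/dir/unions_glass_sim_{seed:05d}_4096.fits", seed
)
```

Since the zero-padding lives in the template (`{seed:05d}` versus `{seed}`), each catalog can choose its own file-name convention without any code change.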
Verify that SP_v1.4.6_glass_mock without a seed suffix correctly uses the default seed 00001 from the configured path field.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
thanks for the comments! i removed old catalog versions. some more changes:
should be ready to merge
also updated the wiki with these values as Lisa suggested
just to tabulate the versions being removed/retained:
Remove 14 non-functional catalog versions from cat_config.yaml that have missing or misconfigured file paths:
- LF_matched_SP_v1.0, LF_v1.0, LF_v2.0
- SP_matched_LF_v1.0, SP_v1.0_LFmask_4k, SP_v1.0_LFmask_8k
- SP_v1.3_LFmask variants (4k, 8k, F2, SN7, SN8, li_2024, no_alpha)
- SP_v1.4-P3_LFmask
Retain 2 working LFmask versions: SP_v1.4_LFmask_8k and SP_v1.4_LFmask_8k_noalpha
Consolidate test_catalog_paths.py functionality into test_cosmo_val.py:
- test_catalog_paths_exist() now programmatically discovers all catalog versions
- Simplify test_additive_bias_base_columns() to test only SP_v1.4.5
- Update test_additive_bias_leak_corrected_columns() to test SP_v1.4.6_leak_corr
- Result: roughly 25x faster test suite (614s → 25s)
All tests pass. Remaining: 21 working catalog versions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
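As an illustration of what "programmatically discovers all catalog versions" might mean in practice, here is a small sketch of a path check that walks every entry of the config. It assumes, hypothetically, that path-like fields are strings stored under keys ending in "path" (for example shear.path or shear.redshift_path); the actual test in test_cosmo_val.py may select fields differently.

```python
from pathlib import Path


def iter_paths(node, prefix=""):
    """Yield (dotted_key, value) for every string stored under a key ending in 'path'."""
    if isinstance(node, dict):
        for key, value in node.items():
            dotted = f"{prefix}.{key}" if prefix else key
            if isinstance(value, str) and key.endswith("path"):
                yield dotted, value
            else:
                yield from iter_paths(value, dotted)


def find_missing_paths(cat_config):
    """Map each catalog version to its list of configured paths that do not exist."""
    missing = {}
    for version, entry in cat_config.items():
        bad = [f"{key} -> {path}" for key, path in iter_paths(entry)
               if not Path(path).exists()]
        if bad:
            missing[version] = bad
    return missing
```

A test can then simply assert that `find_missing_paths(config)` is empty, so a newly added catalog version is covered without editing the test.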
[pipeline]
values = cosmosis_config/values_psf.ini
priors = cosmosis_config/priors_psf.ini
I will have to modify the prior file with another template.
i copied priors_psf.ini to priors_mock.ini and committed it, if you want to modify that
cosmo_inference/cosmosis_config/cosmosis_pipeline_glass_mock_00001_cell.ini
also confirmed with the data fits file what keys to use for QUANT (not P+P...)
It should be
    return cl_ee_hdu, cl_bb_hdu

def cov_cl_to_fits(cov_file, nbins):
This will save only the Gaussian part of the iNKA covariance. It should use the output of calculate_pseudo_cl_g_ng_cov, which contains HDUs named:
COVAR_GAUSSIAN, COVAR_NON_GAUSSIAN, COVAR_FULL
The COVAR_FULL HDU should be used.
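To make the suggestion concrete, a minimal sketch of selecting the full covariance by HDU name is shown here. The helper name `select_full_covariance` is hypothetical, and it is written against any iterable of objects with `.name` and `.data` attributes so that it does not depend on how cov_cl_to_fits is structured.

```python
def select_full_covariance(hdul):
    """Return the data of the HDU named COVAR_FULL from an HDU-list-like iterable."""
    by_name = {hdu.name: hdu for hdu in hdul}
    if "COVAR_FULL" not in by_name:
        raise KeyError(
            f"COVAR_FULL not found; available HDUs: {sorted(by_name)}"
        )
    return by_name["COVAR_FULL"].data


# With astropy (assumed to be installed), usage would be roughly:
#   from astropy.io import fits
#   with fits.open(cov_file) as hdul:
#       cov_full = select_full_covariance(hdul).copy()
```

Failing loudly when COVAR_FULL is absent avoids silently falling back to the Gaussian-only covariance.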
It seems ready to merge to me @cailmdaley
…rence run (Cosmosis). Added a prior file for harmonic space mocks.
Summary
This PR addresses catalog configuration maintenance issues by standardizing all catalog paths to absolute references and adding automated validation. The changes eliminate catalog path drift between related variants and make the configuration explicit about filesystem locations.
Problem
- Dual maintenance burden: Each catalog had both a base entry and a _leak_corr variant, causing configuration drift when one was updated but not the other. This drift was invisible until regression tests failed.
- Implicit path resolution: Catalogs used relative subdir paths combined with a data_dir fallback, making it difficult to determine actual filesystem locations as data spread across multiple mounts.
Changes
Catalog configuration normalization (notebooks/cosmo_val/cat_config.yaml):
- … subdir entries to explicit absolute paths (/n17data/...)
- … data_dir fallback logic
Pseudo-C_ℓ code path updates (src/sp_validation/cosmo_val.py):
- … data_base_dir assumption
New validation test (src/sp_validation/tests/test_catalog_paths.py):
Regression test updates:
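The path normalization described under Changes can be pictured with a short sketch. It assumes, purely for illustration, that each catalog entry held a top-level subdir string and that relative values were resolved against a single data_dir; the real layout of cat_config.yaml may differ, and the example directories are invented.

```python
from pathlib import PurePosixPath


def to_absolute(subdir, data_dir):
    """Resolve a legacy relative subdir against data_dir; absolute inputs pass through."""
    path = PurePosixPath(subdir)
    if not path.is_absolute():
        path = PurePosixPath(data_dir) / path
    return str(path)


def normalize_config(cat_config, data_dir):
    """Return a copy of the config in which every 'subdir' value is absolute."""
    normalized = {}
    for version, entry in cat_config.items():
        entry = dict(entry)  # shallow copy so the input is left untouched
        if "subdir" in entry:
            entry["subdir"] = to_absolute(entry["subdir"], data_dir)
        normalized[version] = entry
    return normalized
```

Once every entry is absolute, the data_dir argument is no longer needed at load time, which is what makes the filesystem location explicit in the config itself.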
Known Limitations
The following catalogs are missing source files and remain in configuration for provenance only:
- SP_v1.0, SP_v1.1, SP_matched_MP_v1.0
- SP_v1.4, SP_v1.4_conv, SP_v1.4_noalpha
- SP_v1.4-P1+3* variants
These are listed in the validator allow-list and excluded from regression tests.
Follow-up Work
- … /n17data/UNIONS/WL/v1.4.x/<variant>/ directories when data becomes available
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com