Ensure canonical key order and schema validity after merge/overlay #44
Merged
candleindark merged 2 commits into dandi:main on Mar 26, 2026
After `-M` deep merge, output keys could appear in non-canonical order because `deepmerge` preserves dict insertion order. After `-O` overlay, key ordering relied on a manual `SchemaDefinition` field list that also silently dropped unknown keys.

Both the merge and overlay functions now call a new `canonicalize_schema_yml` helper that round-trips the YAML through `SchemaDefinition` via linkml-runtime's `yaml_loader`/`yaml_dumper`. This produces canonical key ordering and raises `InvalidLinkMLSchemaError` (new exception) for any field name unknown to `SchemaDefinition` or its nested objects, which the CLI converts to a `BadParameter` error.

`remove_schema_key_duplication` now runs after both merge/overlay steps in the CLI pipeline (so it strips the `name`/`text`/`prefix_prefix` fields re-introduced by the round-trip), and is extended to also strip the redundant `prefix_prefix` key from prefix entries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
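The merge, canonicalize, deduplicate sequence described above can be illustrated with a small stdlib-only sketch. It is not the PR's implementation, which round-trips through `SchemaDefinition`. The names `deep_merge`, `reorder_strict`, `strip_redundant_keys`, `CANONICAL_ORDER`, and `REDUNDANT_FIELD` are hypothetical, and the field order shown is an assumed subset chosen for the example. Handling of the nested `text` field is omitted for brevity.

```python
import copy
from collections.abc import Mapping

# Hypothetical subset of a canonical top-level field order (assumption).
CANONICAL_ORDER = ["id", "name", "description", "prefixes", "classes"]
# Assumed mapping: keyed section -> field that merely repeats the dict key.
REDUNDANT_FIELD = {"prefixes": "prefix_prefix", "classes": "name"}


def deep_merge(base: Mapping, patch: Mapping) -> dict:
    """Recursively merge patch into a copy of base, keeping insertion order."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value  # keys new to base are appended at the end
    return merged


def reorder_strict(schema: Mapping) -> dict:
    """Reorder top-level keys canonically; reject unknown keys, never drop them."""
    unknown = [k for k in schema if k not in CANONICAL_ORDER]
    if unknown:
        raise ValueError(f"unknown keys: {unknown}")
    return {k: schema[k] for k in CANONICAL_ORDER if k in schema}


def strip_redundant_keys(schema: Mapping) -> dict:
    """Return a copy without fields that only repeat their own dict key."""
    result = copy.deepcopy(dict(schema))
    for section, redundant in REDUNDANT_FIELD.items():
        for key, entry in result.get(section, {}).items():
            # Only drop the field when it really matches the dict key.
            if isinstance(entry, dict) and entry.get(redundant) == key:
                del entry[redundant]
    return result


base = {
    "id": "https://example.org/s",
    "name": "s",
    "prefixes": {
        "ex": {"prefix_prefix": "ex", "prefix_reference": "https://example.org/"}
    },
}
patch = {"description": "demo", "classes": {"A": {"name": "A"}}}

merged = deep_merge(base, patch)
cleaned = strip_redundant_keys(reorder_strict(merged))

print(list(merged))         # ['id', 'name', 'prefixes', 'description', 'classes']
print(list(cleaned))        # ['id', 'name', 'description', 'prefixes', 'classes']
print(cleaned["prefixes"])  # {'ex': {'prefix_reference': 'https://example.org/'}}
```

Running deduplication last matters in this sketch for the same reason given above: any step that re-adds the redundant fields has to come before the step that removes them.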
Add `_get_meta_schema_validator()` (lazily initialized, cached via `functools.cache`) and extend `canonicalize_schema_yml` to validate the canonical output against the LinkML meta schema using `linkml.validator.Validator` with `JsonschemaValidationPlugin(closed=True)`. This catches unknown field names and wrong-type values that the `yaml_loader` round-trip alone does not detect.

The two detection paths now produce distinct `InvalidLinkMLSchemaError` messages: "Unknown field in schema:" for `TypeError` from `yaml_loader`, and "Schema validation failed:" for violations found by the meta-schema validator. CLI `BadParameter` messages and all documentation are updated accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes #43.
Summary
- After `-M` (deep merge) or `-O` (overlay), the output YAML is round-tripped through `SchemaDefinition` via `canonicalize_schema_yml`, so keys always appear in the same order as a freshly serialized `SchemaDefinition`.
- The canonical output is validated against the LinkML meta schema using `linkml.validator.Validator` with `JsonschemaValidationPlugin(closed=True)`. Unknown field names and wrong-type values raise `InvalidLinkMLSchemaError`, which the CLI surfaces as a `BadParameter` error.
- `prefix_prefix` deduplication: `remove_schema_key_duplication` now also strips the redundant `prefix_prefix` key from each prefix entry (the dict key already identifies the prefix).
- The two detection paths use distinct error-message prefixes (`"Unknown field in schema:"` from the `yaml_loader` `TypeError` vs. `"Schema validation failed:"` from the meta-schema validator), making the error origin unambiguous in tests and in the field.
Test plan
- `hatch run test.py3.10:pytest tests/`: all 146 tests pass
- `ruff check . && ruff format --check .`: clean
- `TestCanonicalizeSchemaYml.test_wrong_type_raises_invalid_schema_error`: mocks `yaml_dumper.dumps` to inject a wrong-type canonical YAML and asserts `"Schema validation failed:"` in the error
- `TestApplySchemaOverlay.test_unknown_field_raises_invalid_schema_error`: asserts the `"Unknown field in schema:"` prefix
- `TestApplyYamlDeepMerge.test_unknown_field_raises_invalid_schema_error`: asserts the `"Unknown field in schema:"` prefix
- `pydantic2linkml dandischema.models | head -20`
- `echo "not_a_real_field: foo" > /tmp/bad.yaml && pydantic2linkml -M /tmp/bad.yaml dandischema.models`

🤖 Generated with Claude Code