make_column_metadata should prioritize spec-defined lengths over auto-detected lengths #1

@trinathpanda

Problem

The function make_column_metadata infers each variable's length from the maximum length observed in the row-level dataframe. This is convenient when no length is prescribed, but it conflicts with a specification (e.g., SDTM/ADaM spec or Define-XML) that already prescribes a fixed length for certain variables. The inferred length may not match the required length, leaving the generated metadata inconsistent with the specification.

Expected Behavior

  • If a variable length is explicitly defined in the specification, the function should honor the spec-defined length.
  • Auto-detection of maximum length should only apply when no specification length is provided.

Actual Behavior

  • The function always auto-detects the maximum length, even if the specification provides a fixed length.
  • This can result in mismatched metadata compared to the standard specification.

Example

  • Specification: --TERM length = 200
  • Row-level data max length detected = 87
  • Generated metadata sets length = 87 (instead of 200 from the spec).

Impact

  • This causes downstream misalignment with Define-XML and potential compliance issues in regulatory submissions.

Suggested Fix

Update make_column_metadata to:

  • Check for a specification-defined length first.
  • Fall back to auto-detection only if no length is defined in the spec.
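
The spec-first lookup described above could be sketched roughly as follows. This is only an illustration, not the actual make_column_metadata implementation: the helper names detect_max_length and resolve_length, the spec_lengths mapping, and the use of a list of dicts in place of the row-level dataframe are all assumptions made for the example.

```python
from typing import Mapping, Optional, Sequence

Row = Mapping[str, object]


def detect_max_length(rows: Sequence[Row], column: str) -> int:
    """Longest string value observed for `column`; missing values are skipped."""
    return max(
        (len(str(row[column])) for row in rows if row.get(column) is not None),
        default=0,
    )


def resolve_length(
    rows: Sequence[Row],
    column: str,
    spec_lengths: Optional[Mapping[str, int]] = None,
) -> int:
    """Prefer the spec-defined length; fall back to auto-detection otherwise."""
    # spec_lengths is a hypothetical {variable name: length} mapping taken from the spec.
    spec_length = (spec_lengths or {}).get(column)
    if spec_length is not None:
        return spec_length
    return detect_max_length(rows, column)


# Mirrors the example in this issue: spec length 200, longest observed value 87.
rows = [
    {"AETERM": "HEADACHE", "AESEV": "MILD"},
    {"AETERM": "X" * 87, "AESEV": None},
]
spec_lengths = {"AETERM": 200}

print(resolve_length(rows, "AETERM", spec_lengths))  # 200 (from the spec)
print(resolve_length(rows, "AESEV", spec_lengths))   # 4 (auto-detected, no spec entry)
```

With this ordering, variables without a spec entry behave as they do today, so the change would only affect columns whose length the specification fixes.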

Metadata

Labels

enhancement (New feature or request)
