Support for JSON Table Schema files? #11

@harrisj

I know you wrote this to solve a simple problem of validating CSV files, but you have inadvertently created an alternative to the two existing approaches for specifying schemas for CSV files:

Remember that a schema is useful both to describe and to validate data (you have the latter part). The first of these schema approaches seems to have the most momentum, with apparently several tools for validating it. Most of these, however, validate the metadata specification itself rather than use the schema to validate the CSV. There is a tool for that, but its functionality is still pretty limited. Still, it's the most established schema out there for CSV. I just wish I found its data validation tool as reliable as this one. What if I could have my cake and eat it too?!

At first glance, the JSON Table Schema seems to be mostly a subset of this one: it has fewer built-in formats to validate against and only simple constraints (no external functions). So in theory you could also load JSON files in this format and give the world of CSV another tool for validating another schema. A few issues to figure out, though:

  • I assume this is just a matter of mapping the records in the JSON to whatever structure you load the YAML into
  • How to handle primary and foreign key references
  • Can we specify a format string that isn't in the JSON Table Schema spec? This could be useful for specifying formats like MongoDB IDs or such if possible (we can, and it seems to be considered a valid spec)
  • Would we want to provide a means of augmenting the base JSON with additional directives in a supplementary YAML or something?
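
To make the first and third bullets concrete, here is a minimal sketch of what the mapping could look like. It assumes a JSON Table Schema descriptor with a `fields` array (each entry with `name`, `type`, and optional `format` and `constraints`) plus an optional `primaryKey`. The function name `table_schema_to_rules`, the output layout, and the `mongodb-id` format name are all made up for illustration; this is not csv-test's actual internal structure.

```python
import json


def table_schema_to_rules(descriptor):
    """Flatten a JSON Table Schema descriptor into per-column rule dicts.

    The returned layout is hypothetical, not csv-test's real structure.
    """
    primary_key = descriptor.get("primaryKey", [])
    if isinstance(primary_key, str):
        primary_key = [primary_key]  # a key may be a single name or a list

    columns = {}
    for field in descriptor.get("fields", []):
        constraints = field.get("constraints", {})
        columns[field["name"]] = {
            # Assumption: a missing type is treated as "string".
            "type": field.get("type", "string"),
            # Formats outside the spec are passed through untouched.
            "format": field.get("format"),
            "required": bool(constraints.get("required", False)),
            "unique": bool(constraints.get("unique", False)),
            "pattern": constraints.get("pattern"),
        }
    return {"columns": columns, "primary_key": primary_key}


descriptor = json.loads("""
{
  "fields": [
    {"name": "id", "type": "string", "format": "mongodb-id"},
    {"name": "email", "type": "string", "format": "email",
     "constraints": {"required": true}}
  ],
  "primaryKey": "id"
}
""")

rules = table_schema_to_rules(descriptor)
```

Foreign keys are left out on purpose, since checking them needs access to a second file and is one of the open questions above.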

Anyhow, I just wanted to get the discussion going. Let me know if this seems like a good idea. The one issue about using the format field for specifying more advanced formatting rules only supported by csv-test is that the schema will vary in its rigor depending on which checking program you use. Not sure if that's considered awful or not.
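
A small sketch of that rigor concern, under assumptions: two checkers read the same schema, but only one knows an extra format. The format name `mongodb-id`, the simplified email pattern, and the `check_value` helper are all hypothetical, and treating unknown formats as "accept" is just one possible policy.

```python
import re

# Formats every checker is assumed to understand (illustrative subset only).
BASE_FORMATS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+"),
}

# A richer checker adds its own formats; "mongodb-id" is a made-up name
# for a 24-character hexadecimal ObjectId string.
EXTENDED_FORMATS = dict(BASE_FORMATS)
EXTENDED_FORMATS["mongodb-id"] = re.compile(r"[0-9a-fA-F]{24}")


def check_value(value, fmt, known_formats):
    """Return True if value passes; unknown formats are silently accepted."""
    pattern = known_formats.get(fmt)
    if pattern is None:
        return True
    return pattern.fullmatch(value) is not None


bad_id = "not-an-object-id"
lenient = check_value(bad_id, "mongodb-id", BASE_FORMATS)      # format unknown
strict = check_value(bad_id, "mongodb-id", EXTENDED_FORMATS)   # format enforced
```

The same value passes the first checker and fails the second, which is exactly the inconsistency in question.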
