I know you wrote this to solve a simple problem of validating CSV files, but you have inadvertently created an alternative to the two existing approaches for specifying schemas for CSV files:
Remember that a schema is useful both to describe and to validate data (you have the latter part covered). The first of these schema approaches seems to have the most momentum, and apparently several tools exist for validating it. Most of these, however, validate the metadata specification itself rather than using the schema to validate the CSV data. There is a tool for that, but its functionality is still pretty limited. Still, it's the most established schema out there for CSV. I just wish the data validation tool were as reliable as this one. What if I could have my cake and eat it too?!
At first glance, the JSON Table Schema seems to be mostly a subset of this one: it has fewer built-in formats to validate against and only simple constraints (no external functions). So in theory, you could also load JSON files in that format and give the CSV world another tool for validating against another schema. A few issues to figure out though:
Anyhow, I just wanted to get the discussion going. Let me know if this seems like a good idea. The one issue with using the format field to specify more advanced formatting rules that only csv-test supports is that the schema's rigor will vary depending on which checking program you use. I'm not sure whether that's considered awful or not.
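
To make the rigor concern concrete, here is a rough sketch in Python of what loading a JSON Table Schema style descriptor and checking a CSV against it might look like. This is not how csv-test works internally; `validate`, `check_value`, and `FORMAT_CHECKS` are names I made up, and `us-zip` is a hypothetical format standing in for any rule that only one checker understands. The descriptor uses only the `fields`, `name`, `type`, `format`, and `constraints` properties from JSON Table Schema.

```python
import csv
import io
import json
import re

# A JSON Table Schema style descriptor. "us-zip" is a made-up format name,
# standing in for an advanced rule that only one checker supports.
SCHEMA_JSON = """
{
  "fields": [
    {"name": "id", "type": "integer",
     "constraints": {"required": true, "minimum": 1}},
    {"name": "email", "type": "string", "format": "email"},
    {"name": "zip", "type": "string", "format": "us-zip"}
  ]
}
"""

CSV_TEXT = """id,email,zip
1,a@example.com,12345
0,not-an-email,abcde
,b@example.com,99999
"""

# Formats this particular checker understands (deliberately incomplete).
FORMAT_CHECKS = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
}


def check_value(field, value, skipped):
    """Return error messages for one cell; record formats we cannot check."""
    constraints = field.get("constraints", {})
    if value == "":
        return ["required value is missing"] if constraints.get("required") else []
    errors = []
    if field.get("type") == "integer":
        try:
            number = int(value)
        except ValueError:
            return ["not an integer"]
        minimum = constraints.get("minimum")
        if minimum is not None and number < minimum:
            errors.append(f"below minimum {minimum}")
    fmt = field.get("format", "default")
    if fmt != "default":
        checker = FORMAT_CHECKS.get(fmt)
        if checker is None:
            skipped.add(fmt)  # unknown format: silently less rigorous
        elif not checker(value):
            errors.append(f"does not match format '{fmt}'")
    return errors


def validate(schema, csv_text):
    """Check every row; return (errors, formats that were skipped)."""
    errors, skipped = [], set()
    reader = csv.DictReader(io.StringIO(csv_text))
    # start=2 so the reported number matches the line in the CSV file
    for line_no, row in enumerate(reader, start=2):
        for field in schema["fields"]:
            value = (row.get(field["name"]) or "").strip()
            for message in check_value(field, value, skipped):
                errors.append((line_no, field["name"], message))
    return errors, skipped


errors, skipped = validate(json.loads(SCHEMA_JSON), CSV_TEXT)
for line_no, name, message in errors:
    print(f"line {line_no}, field {name}: {message}")
print("formats not checked:", sorted(skipped))
```

In this sketch the bad zip value `abcde` passes without complaint, because the checker does not know `us-zip` and just records it as skipped. A checker that did know the format would reject the same file, which is exactly the "rigor depends on the tool" situation described above. Reporting skipped formats, rather than ignoring them, at least makes the gap visible.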