We need several pieces of metadata for each dataset: most critically the sampling rate of the recordings or processed data, but also the data types.
One option would be to generate and use info.json files, similar to those in the tridesclous_datasets repository (https://github.com/tridesclous/tridesclous_datasets).
An example of an info.json file might look like this:
{
    "dtype": "float32",
    "sample_rate": 15000.0,
    "shape": [-1, 4]
}
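
As a rough illustration of how such a file could be consumed, here is a minimal Python sketch. It assumes the recording is stored as a flat raw binary file next to its info.json, which is an assumption and not something stated above. The function name load_recording and the file names are hypothetical.

```python
import json
from pathlib import Path

import numpy as np


def load_recording(data_path, info_path):
    """Load a raw binary recording using the metadata in an info.json file.

    Hypothetical helper: assumes the data is a flat binary dump whose
    layout is fully described by the "dtype" and "shape" keys.
    """
    info = json.loads(Path(info_path).read_text())
    dtype = np.dtype(info["dtype"])
    sample_rate = float(info["sample_rate"])
    # A -1 in "shape" lets NumPy infer that dimension (here, the number
    # of samples) from the file size.
    data = np.fromfile(data_path, dtype=dtype).reshape(info["shape"])
    return data, sample_rate
```

With the example file above, a call such as load_recording("recording.raw", "info.json") would return an array with 4 columns (presumably one per channel) together with the sampling rate of 15000.0.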