Centralized repository for datasets used by the GRC organization at IIT. Contains scientific simulation datasets and documentation for accessing petabyte-scale public datasets.

| Category | Datasets | Formats | Total Size | Description | 
|---|---|---|---|---|
| ADIOS | 23 | BP5 | 755 MB | CFD, MD, Weather simulations | 
| Oceanography | 2 | NetCDF | 2.4 MB | CTD profiles, surface analysis | 
| Genomics | 7 | FASTA, HDF5, SAM, VCF, FASTQ | 8.5 MB | Genomes, variants, RNA-seq | 
| Astronomy | 4 | FITS, HDF5 | 2.7 MB | Images, spectra, light curves | 
| Seismology | 3 | HDF5 | 16 MB | Earthquake data, noise, RFs | 
| Parquet | 2 | Parquet | 48 MB | NYC taxi, analytics samples | 
| NetCDF | 2 | NetCDF | 6.7 MB | NOAA climate data | 
| HDF5 | 5 | HDF5, PDB | 766 KB | OpenPMD, protein structures | 
| ROOT | 2 | ROOT | 20.5 MB | Higgs analysis, tutorials | 
| FITS | 2 | FITS | 4.8 MB | Hubble observations | 
| CIF | 3 | CIF | 85 KB | Crystal structures | 
| Darshan | Examples | LOG | 36 MB | I/O characterization traces | 
| Shadow | 50+ | Various | PB-scale | Documentation for public data | 

Total local datasets: ~840 MB across 50+ datasets.

Total accessible (shadow): petabytes of public scientific data.

- Adios - ADIOS2 I/O framework datasets
- HDF5 - Hierarchical Data Format files
- NetCDF - Network Common Data Form files
- Parquet - Columnar format files
- ROOT - Particle physics data from CERN
- FITS - Astronomy image and data files
- CIF - Crystallographic Information Files (crystal structures)
- Oceanography - Ocean and marine data (NetCDF)
- Astronomy - Astronomical observations (FITS, HDF5)
- Seismology - Earthquake and seismic data (HDF5)
- Genomics - Genomics and bioinformatics data (FASTA, HDF5, SAM, VCF, FASTQ)
- Darshan-Traces - HPC I/O characterization traces
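
Since the categories above mix several file formats, a small helper that guesses a file's format from its extension can be handy when browsing the repository. The following is only a minimal sketch: the function name `guess_format` is hypothetical, and the extension mapping reflects common conventions for these formats rather than anything confirmed by the repository layout.

```python
from pathlib import PurePosixPath

# Conventional extensions for the formats named in this README.
# ASSUMPTION: these extensions are common conventions, not taken from
# the repository itself; adjust them to match the actual files.
EXTENSION_TO_FORMAT = {
    ".bp": "ADIOS2 BP",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
    ".nc": "NetCDF",
    ".parquet": "Parquet",
    ".root": "ROOT",
    ".fits": "FITS",
    ".cif": "CIF",
    ".fasta": "FASTA",
    ".fastq": "FASTQ",
    ".sam": "SAM",
    ".vcf": "VCF",
    ".pdb": "PDB",
}


def guess_format(path: str) -> str:
    """Return a format name based on the final file extension, or 'unknown'."""
    # Only the last suffix is inspected, and the comparison is case-insensitive.
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_TO_FORMAT.get(suffix, "unknown")


if __name__ == "__main__":
    # Hypothetical paths, used purely for illustration.
    for example in ("oceanography/profile.nc", "astronomy/image.FITS", "notes.txt"):
        print(example, "->", guess_format(example))
```

Because only the last suffix is checked, compressed files such as `reads.fastq.gz` would be reported as `unknown`; handling those would need an extra step to strip the compression suffix.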
Documentation for petabyte-scale public datasets: