This Node.js-based research tool analyzes the metadata quality of German municipal Open Data portals. Specifically, it evaluates how well datasets can be assigned to regions depending on whether a portal uses standardized metadata models such as DCAT-AP and GeoDCAT-AP (typically served via CKAN and, as a best practice, exposed through a SPARQL endpoint) or non-standard formats (e.g. ArcGIS JSON).
Developed as part of the research poster submission for DCMI 2025 by Florian Hahn, TU Chemnitz (SODIC Research Group)
Does the use of DCAT and CKAN in municipal German Open Data portals improve the assignability of datasets to regional categories compared to non-standardized alternatives?
- Automated metadata harvesting via:
  - ArcGIS REST API
  - CKAN API
  - CKAN SPARQL
  - DCAT RDF
  - DCAT RDF SPARQL
- Regional categorization logic using place/entity keywords
- CSV export for per-portal analysis
- Colored console output via chalk
- Easily extendable keyword classification logic
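
Because the harvesting sources listed above return differently shaped records, a common intermediate shape makes the later classification step source-independent. The following is a minimal sketch of that idea, not the tool's actual code. The function names `fromCkanPackage` and `fromArcGisItem` are hypothetical, and the field choices are assumptions: CKAN datasets carry `title`, `notes` and a list of `extras`, and the sketch assumes spatial information sits in an extra with the key `spatial`, which may differ between portals. ArcGIS item JSON is assumed to offer `title`, `description` and `snippet`, but no dedicated regional field.

```javascript
// Hypothetical helpers: map source-specific records onto one common shape
// { title, description, spatial } so later steps can treat all portals alike.

// CKAN dataset as returned by the CKAN action API.
// Assumption: spatial information, if present, is stored as an extra
// with the key "spatial"; individual portals may use a different key.
function fromCkanPackage(pkg) {
  const extras = Array.isArray(pkg.extras) ? pkg.extras : [];
  const spatialExtra = extras.find((e) => e.key === "spatial");
  return {
    title: pkg.title ?? "",
    description: pkg.notes ?? "",
    spatial: spatialExtra ? String(spatialExtra.value) : "",
  };
}

// ArcGIS item JSON.
// Assumption: there is no dedicated regional field, so spatial stays empty
// and any regional hint has to come from the title or description.
function fromArcGisItem(item) {
  return {
    title: item.title ?? "",
    description: item.description ?? item.snippet ?? "",
    spatial: "",
  };
}
```

Both helpers return the same three fields, so a classifier only needs to know about `title`, `description` and `spatial`.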
git clone https://github.com/SODIC-research/SODRAM.git
cd SODRAM
npm install

Run the analysis:

npm run start

This will:
- Fetch metadata from German city portals (e.g. Chemnitz, Dresden, Leipzig)
- Apply classification logic
- Export results to /export/*.csv
- Export summary to /export/summary.json
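
The CSV export step can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's implementation: the helper names `quote`, `toCsvRow` and `toCsv` are hypothetical, and the column order and quoting (text fields quoted, the boolean unquoted) are inferred from the per-portal CSV example in this README.

```javascript
// Assumed column order, taken from the per-portal CSV example.
const COLUMNS = ["Title", "Description", "Spatial", "Assignable"];

// Quote a text field and double any embedded quotes (RFC 4180 style).
function quote(value) {
  return `"${String(value ?? "").replace(/"/g, '""')}"`;
}

// Turn one classified record into a single CSV line.
function toCsvRow(record) {
  return [
    quote(record.title),
    quote(record.description),
    quote(record.spatial),
    String(Boolean(record.assignable)), // booleans are written unquoted
  ].join(",");
}

// Header line followed by one line per record.
function toCsv(records) {
  return [COLUMNS.join(","), ...records.map(toCsvRow)].join("\n");
}
```

The resulting string could then be written to a file under /export/ with Node's `fs.writeFileSync`.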
Each analyzed portal produces a .csv file with the following structure:
Title,Description,Spatial,Assignable
"Population by District","...","Leipzig",true
"Verkehrsdaten 2022","...","",false
- All portals are evaluated using a fixed list of regional keywords and spatial metadata fields (dct:spatial, title/description).
- The code is designed to replicate the methodology described in the poster: "Regional Analysis of Topic-Specific Open Datasets Through Metadata: Evaluating the Analytical Usability of DCAT vs. Non-DCAT Metadata in a Municipal Portal"
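
One possible reading of the keyword-based evaluation described above is a case-insensitive search for any regional keyword across the spatial field, title and description. The sketch below illustrates that reading only. `REGIONAL_KEYWORDS` and `isAssignable` are hypothetical names, the three city names are placeholders borrowed from the example portals, and the tool's actual keyword list and matching rules may differ.

```javascript
// Placeholder keyword list (assumption); the tool's real list is not reproduced here.
const REGIONAL_KEYWORDS = ["Chemnitz", "Dresden", "Leipzig"];

// Returns true if any keyword occurs in the spatial field, title or description.
// Matching is a case-insensitive substring search (an assumption of this sketch).
function isAssignable(record, keywords = REGIONAL_KEYWORDS) {
  const haystack = [record.spatial, record.title, record.description]
    .filter(Boolean) // skip empty or missing fields
    .join(" ")
    .toLowerCase();
  return keywords.some((kw) => haystack.includes(kw.toLowerCase()));
}
```

With this sketch, a record whose spatial field is "Leipzig" is classified as assignable, while a record titled "Verkehrsdaten 2022" with an empty spatial field is not. Plain substring matching can also hit keywords embedded in longer words, so a stricter variant might match on word boundaries instead.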
MIT License © 2025 Florian Hahn, TU Chemnitz, SODIC