- Relational Database Management System (RDBMS): Software for creating, managing, and interacting with relational databases
- AI-driven database management: Using AI to automate and optimize the management of databases, enhancing performance, security, and data access
- Vector Databases for AI Workloads: Databases that store and manage data as high-dimensional vectors, enabling the efficient similarity searches that many AI workloads rely on
- Information Retrieval: Discovery and extraction of relevant information from vast collections of data in response to a user’s specific query
- Geographical Information Systems: Computer systems that analyze and display geographically referenced information
- Cloud-native data formats: Data formats designed for storing and accessing large datasets efficiently and directly in the cloud, e.g. Cloud Optimized GeoTIFFs (COGs) for raster data and Zarr for multi-dimensional arrays
- Linked Data / Ontologies: Linked Data is structured data that is interlinked with other data, suitable for semantic queries and automatic retrieval. Ontologies are formal descriptions of data relationships for organizing and linking data effectively.
- NoSQL: Data organization using various flexible data models like key-value pairs, documents, graphs, and wide-column stores
- Non-relational databases: Data organization using various flexible data models like key-value pairs, documents, graphs, and wide-column stores
- Handling Sensor Data: Collecting, processing, and analyzing the information generated by sensors that monitor physical conditions or activities
- Information Integration: Merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations
- Data Assimilation: Methods that update information from numerical computer models with information from observations
- Stream Processing: Analyzing and processing large amounts of real-time data as it flows in from various sources
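
To make the similarity search mentioned under vector databases more concrete, here is a minimal sketch in plain Python. It is a brute-force illustration, not how a real vector database is implemented (those typically rely on approximate nearest-neighbour indexes). The function names and the toy three-dimensional "embeddings" are hypothetical and chosen only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=1):
    """Return the k keys whose vectors are most similar to the query.

    Brute force: every stored vector is compared with the query.
    """
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings; real ones have hundreds of dimensions.
embeddings = {
    "rainfall": [0.9, 0.1, 0.0],
    "temperature": [0.7, 0.6, 0.1],
    "population": [0.0, 0.2, 0.9],
}

print(nearest([0.8, 0.2, 0.0], embeddings, k=2))
```

With the toy data above, the query vector points in almost the same direction as "rainfall", so that key ranks first, followed by "temperature".
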
We welcome topics to be discussed during the SIG. To propose a topic, open an issue with a brief description of the topic and label it topic.
Do you have a data handling issue? Bring it to the SIG: we can look at it together and do our best to help you find the most suitable solution.
- Open an issue on GitHub and label it help wanted.
- Describe the type of issue you want to address. Make sure to include:
  - What is your final goal?
  - What is the challenge?
  - A sample of your data (if possible).
  - Which technologies you are using to store and access the data.
We will discuss your issue during the next SIG meeting.
Did you do something really cool with your data? Share your experiences with the SIG! Open an issue proposing it as a topic and label it topic.
Possible things you might like to share:
- Tools & Methodologies for storage
- Tools & Methodologies for access
- Data FAIRness
- Data handling
| Date | Topic | Presenter |
|---|---|---|
| 2026-03-12 | Using DuckDB to blur the boundary between storage, compute, and the user | Suvayu |
| 2026-04-09 | Cloud native data formats | Francesco |
| 2026-05-07 | | |
| 2026-06-04 | | |
| 2026-07-02 | | |
| 2026-09-24 | | |
| 2026-11-19 | | |
| 2026-12-17 | | |
Past meetings: