10:30am - 11:00am Invited Session Keynote
Topics: 06.01 Data Management, Research Data Infrastructures, AI-Applications and 3D Visualization Techniques: Meeting Today’s and Future Needs in Geosciences
EarthChem & Astromat: Aligning Research Data Infrastructure for Samples from the Deep Earth to Outer Space
Kerstin Lehnert, Lucy Profeta, Jennifer Mays, Peng Ji, Griffin Danninger, Mollie Celnick, Annika Johansson, Juan David Figueroa
Columbia University, United States of America
The data facilities EarthChem and Astromat (Astromaterials Data System) deliver similar but distinct services for the curation and management of laboratory analytical data generated by the study of material samples. EarthChem, developed and operated since 2006 with funding from the US NSF, focuses on geochemical data of terrestrial samples, while Astromat, started in 2019 with NASA funding, is NASA’s primary archive for astromaterials sample data. Both facilities operate data repositories that curate, publish, and preserve sample-based data and maintain synthesis databases that aggregate and harmonize published data into ‘science-ready’ collections that enable large-scale data analytics.
While the two data facilities differ in scope and funding level, there is wide overlap in the requirements for and design of system architecture, metadata schemas, vocabularies, data curation workflows, policies, and functionality of software applications, leading to substantial synergies and economies of scale in the operation of the data systems. The development of the Astromat Data Archive and the Astromat Synthesis has provided an opportunity to modernize EarthChem’s system architecture and software ecosystem. Astromat’s new cloud-based repository platform, the new architecture of the synthesis database, and the design of user interfaces were adopted and adapted for EarthChem. New curation workflows and tools are also shared by both systems. Metadata schemas and vocabularies are kept in sync to the degree possible, advancing consistent data best practices and standards in geochemistry more broadly. Common data curation procedures and policies make it easier to maintain compliance with repository standards and obtain certification by CoreTrustSeal.
11:00am - 11:15am
Topics: 06.01 Data Management, Research Data Infrastructures, AI-Applications and 3D Visualization Techniques: Meeting Today’s and Future Needs in Geosciences
One platform cannot solve everything: FID GEO’s Collaborative Approach to Open Science
Melanie Lorenz1, Kirsten Elger1, Inke Achterberg2, Malte Semmler2
1GFZ Helmholtz Centre for Geosciences, Germany; 2Goettingen State and University Library, Germany
The Fachinformationsdienst für Geowissenschaften (FID GEO) is a DFG-funded initiative that has been supporting the geoscience community for nearly a decade. It provides publication services through its partner domain repositories GFZ Data Services (for research data and software) and GEO-LEO e-docs (for text publications). FID GEO promotes digital transformation and Open Science practices through workshops, publications, conference contributions and engagement in thematic events. Its activities are aligned with international developments, striving to synchronize national progress with global standards and best practices for data management and distribution. Collaboration is a cornerstone of FID GEO's work.
For many years, researchers have expressed a strong desire for a ‘single source’ platform to manage growing datasets, publications and projects. And yet, the complexity of existing infrastructures is often overwhelming. A one-size-fits-all solution is neither technically feasible nor sustainably maintainable. Instead, widespread implementations of machine-readable (meta)data standards offer a path forward, enabling links between distributed data systems and the persistent identification of authors and institutions. Competing infrastructures, limited funding, and overlapping goals further complicate the landscape. FID GEO addresses these challenges through collaboration and guidance, helping researchers navigate this complex landscape and demonstrating practical ways to make scientific outputs visible, reusable and aligned with the FAIR and Open Science principles.
This presentation will share best practices, lessons learned, and future directions for fostering a collaborative and open research environment. FID GEO envisions a geoscience community empowered by shared data and collaborative infrastructures, better equipped to address pressing global challenges.
11:15am - 11:30am
Topics: 06.01 Data Management, Research Data Infrastructures, AI-Applications and 3D Visualization Techniques: Meeting Today’s and Future Needs in Geosciences
Advancing Earth System Sciences to FAIRness and Openness: the NFDI4Earth Commitment
Daniel Nüst1, Andreas Hübner2, Jörg Seegert1, Melanie Lorenz3, Kirsten Elger3
1TUD Dresden University of Technology; 2Freie Universität Berlin; 3Specialised Information Service for Geosciences | GFZ Helmholtz Centre for Geosciences
In the evolving landscape of geosciences, the need for findable, accessible, interoperable, and reusable as well as open (FAIR and Open) data has never been more important. The NFDI4Earth FAIRness and Openness Commitment (the NFDI4Earth Commitment, https://doi.org/10.5281/zenodo.10123880) invites the Earth System Sciences (ESS) community to take a step towards a more collaborative and impactful scientific future through their endorsement.
The principles of FAIR and Open data advance research data management (RDM) practices, e.g., using domain-specific research data infrastructures. Good RDM practices are crucial to ensure that data-driven methods are reproducible, robust, and transparent. The NFDI4Earth Commitment was published in 2024 as the common vision of NFDI4Earth (https://nfdi4earth.de). It provides values and practical guidance to start a conversation on advancing FAIR and Open data practices in ESS. Signatories demonstrate that they strive to adhere to the commitment's values and to implement better RDM practices. ESS institutions and organisations as well as individual researchers can endorse and sign the NFDI4Earth Commitment.
We present how endorsing the NFDI4Earth Commitment can attract the attention of individuals and initiate action in organisations. For example, academic societies may take an endorsement as an opportunity to evaluate how they can support FAIR and Open data practices in the ESS in their role as platforms for discourse and policy setting, and as journal publishers. Through the Commitment, NFDI4Earth aims to drive progress in the Earth sciences and address global challenges. Join us in shaping the future of our field by signing today.
11:30am - 11:45am
Topics: 06.01 Data Management, Research Data Infrastructures, AI-Applications and 3D Visualization Techniques: Meeting Today’s and Future Needs in Geosciences
The rocky foundation of the GSEU project: a pan-European machine-readable hierarchical vocabulary to describe the lithology on geological maps and of boreholes.
Kristine Asch1, Stefan Bergman2, Paul Heckmann1, Hans-Georg Krenmayr3, Matevz Novak4, Marco Pantaloni5, Robert Schäfer1, Urszula Stepien6
1Bundesanstalt für Geowissenschaften und Rohstoffe, Germany; 2Sveriges geologiska undersökelse, Sweden; 3GeoSphere, Austria; 4Geoloski Zavod Slovenije, Slovenia; 5ISPRAmbiente, Italy; 6Panstwowy instytut geologiczny, Poland
To understand geological information across political boundaries, semantic and geometric harmonisation is crucial. The GSEU project (Geological Service for Europe), within the EU Horizon Europe programme, addresses this challenge by building a geological framework that encompasses a pan-European data model, a metadata system, methods to visualize 3-D models, and hierarchical, machine-readable vocabularies based on existing terminologies.
The building of spatial geological databases for the European continent began with the International Geological Map of Europe and Adjacent Areas (IGME 5000) project, which ran from 1995 to 2005. Later, based on the OneGeology-Europe project vocabularies (2008-2010), the geology data specification of the European INSPIRE Directive became a European standard in 2013.
While these past vocabularies are comprehensive, they lack terms to describe large-scale geological map information and specific thematic properties. GSEU fills that gap by developing hierarchical scientific vocabularies for lithology, anthropogenic deposits and lithotectonic units. These vocabularies define the concepts to which geometrical descriptions (lines, polygons, and volumes) can be linked. Custom scripts written in Python and JavaScript help to automate the data handling and the visualisation of the hierarchical relations.
The endeavour faces considerable challenges, such as:
- setting up vocabularies that take into account differing classifications that describe the same concept (term),
- coping with obsolete and/or strictly regional terms,
- taking into account multiple hierarchies, and
- including genetically related terms, qualifiers and compound names.
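The "multiple hierarchies" challenge listed above can be illustrated with a minimal Python sketch. It assumes a vocabulary stored as a mapping from each term to its broader (parent) terms, so that a term may sit in more than one hierarchy. The terms, relations and function names below are hypothetical and are not taken from the GSEU vocabularies or scripts.

```python
from collections import deque

# Hypothetical mini-vocabulary: each term maps to its broader (parent) terms.
# Terms and relations are illustrative only, not the GSEU vocabulary.
BROADER = {
    "rock": [],
    "sedimentary rock": ["rock"],
    "igneous rock": ["rock"],
    "clastic sedimentary rock": ["sedimentary rock"],
    # A term with two parents, i.e. a member of two hierarchies.
    "pyroclastic rock": ["igneous rock", "clastic sedimentary rock"],
    "sandstone": ["clastic sedimentary rock"],
    "tuff": ["pyroclastic rock"],
}


def ancestors(term, broader=BROADER):
    """Return every broader term reachable from `term`, across all hierarchies."""
    seen = set()
    queue = deque(broader[term])
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(broader[parent])
    return seen


def narrower(term, broader=BROADER):
    """Return the direct children of `term` by inverting the broader relation."""
    return sorted(t for t, parents in broader.items() if term in parents)


print(sorted(ancestors("tuff")))
print(narrower("rock"))
```

Storing only the broader relation and deriving the narrower one on demand keeps the two directions from drifting apart, at the cost of a full scan per lookup.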
The presentation demonstrates the project’s approach to building pan-European lithological vocabularies, discusses its challenges, and provides an outlook on future development.
11:45am - 12:00pm
Topics: 06.01 Data Management, Research Data Infrastructures, AI-Applications and 3D Visualization Techniques: Meeting Today’s and Future Needs in Geosciences
A Data-Driven Approach to Identifying Optimal Direct Air Capture Deployment Sites in Germany
Yifan Xu1,2, Mrityunjay Singh1, Cornelia Schmidt-Hattenberger1, Marton Pal Farkas1
1GFZ Helmholtz-Zentrum für Geowissenschaften; 2Technische Universität Berlin
Direct Air Capture (DAC) is a carbon dioxide removal (CDR) technology that extracts atmospheric CO₂ for storage or utilization. This study presents a data-driven approach to identify optimal DAC deployment sites in Germany, focusing on the North German Basin (NGB), a region with promising geological CO₂ storage capacity. Building on 92 identified storage traps, we apply the K-means clustering algorithm, a machine learning method, to evaluate surface conditions using both environmental and infrastructural datasets.
Criteria were selected from the literature and divided into onshore and offshore constraints, including factors such as population density, protected areas, pipeline access, seismic activity, and shipping routes. Spatial data processing involved aggregating raster inputs into 100 m² grid cells and calculating distance maps from key infrastructure and risk zones. All features were homogenized into the same format so that they could be processed by the K-means algorithm.
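One common way to bring features with different units onto a comparable footing before clustering is min-max scaling. The abstract does not say which homogenization was used, so the following Python sketch is an assumption. The feature names and values are hypothetical and are not data from the study.

```python
def min_max_scale(columns):
    """Rescale each named feature column to the range [0, 1]."""
    scaled = {}
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant column carries no information; map it to 0.0
        # instead of dividing by zero.
        scaled[name] = [0.0 if span == 0 else (v - lo) / span for v in values]
    return scaled


def to_feature_vectors(scaled):
    """Turn per-feature columns into one feature vector per grid cell."""
    names = sorted(scaled)  # fixed column order: alphabetical by feature name
    return [list(row) for row in zip(*(scaled[n] for n in names))]


# Hypothetical values for four grid cells (assumed, not from the study).
cells = {
    "population_density": [10.0, 250.0, 40.0, 130.0],  # inhabitants per km^2
    "pipeline_distance_km": [2.0, 12.0, 7.0, 2.0],
}

vectors = to_feature_vectors(min_max_scale(cells))
for vector in vectors:
    print(vector)
```

Without such scaling, a distance-based method like K-means would be dominated by whichever feature has the largest numeric range.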
K-means clustering, supported by the Elbow method, grouped potential DAC sites based on geospatial similarities. Cluster results were compared using a 5-level suitability ranking system (J-scores), which assigned scores based on site characteristics and normalized them for comparability. The resulting surface suitability map highlights high-priority regions. The developed methodology can then be applied to larger geological regions for which a 5-level suitability ranking system is not feasible. This methodology offers a scalable, machine learning-supported framework for DAC site selection that integrates geospatial, environmental, and infrastructure criteria. It serves as a valuable tool for planners, industrial operators and stakeholders aiming to strategically implement DAC. The method is transferable to other regions with appropriate data availability.
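To make the clustering step concrete, here is a minimal pure-Python sketch of K-means (Lloyd's algorithm) together with the inertia curve that the Elbow method inspects. It is not the authors' implementation. The sample points, the number of restarts and all function names are assumptions chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(dim) / len(members) for dim in zip(*members)]
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


def elbow_curve(points, k_values, restarts=5):
    """Inertia for each k, keeping the best of several seeded restarts."""
    return {
        k: min(kmeans(points, k, seed=s)[2] for s in range(restarts))
        for k in k_values
    }


# Hypothetical, already-scaled site features (assumed, not from the study).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0), (10.0, 1.0)]
curve = elbow_curve(points, range(1, len(points) + 1))
for k, inertia in curve.items():
    print(k, round(inertia, 3))
```

The Elbow method then picks the k beyond which the printed inertia stops decreasing substantially. Because K-means can settle in local optima, the sketch keeps the best of a few seeded restarts for each k.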