Data Management in Research
Laboratory automation and growing computing power make it possible to run an ever larger number of experiments and simulations, which generate large volumes of data through high resolutions, new analytical methods and imaging techniques. To make this data comprehensible, analysable and reusable, it must be captured and documented in a structured way as soon as it is generated. This requires developing and using standardised terminologies, standardising data formats, and developing workflows that automate data collection, data analysis and the publication of data sets.
- Standardisation of Terminologies and Data Formats
Through their participation in NFDI4Ing, MatWerk and MaRDI, members of the University of Stuttgart play a significant role in developing ontologies for describing engineering research processes (m4i, NFDI4Ing), various types of microstructures and composite materials (MatWerk) and mathematical models (MaRDI). Within NFDI4Chem and NFDI4Cat, ontologies and standardised data formats are being developed for chemistry and catalysis science.
- Integration of data management into the research process
To integrate the documentation of research data with structured metadata into the research process, data stewards and research software engineers at the Clusters of Excellence SimTech and IntCDC and at the Collaborative Research Centres 1313 and 1333 are developing tools and standards for collecting, analysing and publishing data generated by experiments or simulations.
The RDM team at SimTech, for example, has set itself the task of creating a supportive environment that promotes efficient, effective and sustainable management of the research data generated in simulation research and projects. Various tools are used to improve and optimise processes across the life cycle of research data. The team's goal is to provide researchers with the resources they need to manage their data easily and to ensure that it is stored securely and in compliance with relevant regulations.
Teaching
Research data and code management for reproducible research is an integral part of teaching and continuing education at the university. Courses on data management, open science and its implementation in the research process introduce students to research data management at an early stage. Hands-on workshops, lectures and courses support collaborative projects, working groups and young researchers in integrating research data management into everyday research.
The Seminar of the Special Interest Group Data Infrastructure (SIGDIUS) provides a forum for interested working groups that want to establish or further develop an RDM infrastructure at the working group or institute level. To this end, we invite internal and external experts to a monthly SIGDIUS seminar for a lecture and discussion. The primary aim, however, is to give SIGDIUS members the opportunity to share their experiences with concrete RDM infrastructures.
Tools and Services
pyDaRUS
Python library for programmatic interaction with the Dataverse installation DaRUS
DaRUS
EasyDataverse
Python library as an interface to Dataverse installations such as the data repository DaRUS.
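To illustrate what programmatic interaction with a Dataverse installation can look like, the following is a minimal sketch that uses only the Python standard library and the Search API exposed by Dataverse installations. It deliberately does not use pyDaRUS or EasyDataverse, whose interfaces are not described here. The base URL of DaRUS and the function names build_search_url, extract_titles and search_titles are assumptions made for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: public base URL of the DaRUS Dataverse installation.
DARUS_URL = "https://darus.uni-stuttgart.de"


def build_search_url(query, base_url=DARUS_URL, item_type="dataset", per_page=5):
    """Return a Dataverse Search API URL for the given query string."""
    params = urlencode({"q": query, "type": item_type, "per_page": per_page})
    return f"{base_url.rstrip('/')}/api/search?{params}"


def extract_titles(response_body):
    """Collect the 'name' field of every item in a Search API JSON response."""
    payload = json.loads(response_body)
    return [item["name"] for item in payload.get("data", {}).get("items", [])]


def search_titles(query, timeout=10):
    """Send the search request over HTTP and return the matching titles."""
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return extract_titles(response.read().decode("utf-8"))
```

With network access, a call such as search_titles("simulation") should return the titles of up to five matching published data sets. Uploading or editing data sets additionally requires an API token and is better handled through one of the libraries listed above.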
Infrastructure
The DaRUS data repository is actively used by members and partners of the University of Stuttgart to share and publish quality-assured data. Various workflows, tools and interfaces make it easier to annotate data with metadata during the research process.
TIK storage services currently provide the institutes and research associations with network drives for the storage of actively processed research data, where access control for different user groups can be regulated on a fine-grained basis. With bwSFS2, a system is being planned that directly integrates research data management with the storage of currently processed data.