Topic: Quality Assurance for Data in Open Science

Personal details

Title Quality Assurance for Data in Open Science
Description

The Open Science movement is gaining momentum and is transforming the way research is conducted.
By making every step of the scientific process more accessible, researchers can
accelerate progress and make new discoveries. This can be seen, in part, in initiatives
such as the NFDI initiative, which is funded through the DFG and is leading the charge towards a more
transparent and collaborative research landscape. One way to facilitate open science is described in the
FAIR (Findability, Accessibility, Interoperability and Reusability) guidelines introduced by Wilkinson
et al., which describe how to share scientific data properly. By following these guidelines,
researchers can ensure that their data is a valuable resource for the scientific community.
As Open Science becomes the new norm, the practice of sharing data is becoming more widespread,
and public data repositories are becoming more common as a result.
With the increasing amount of data available, it is important that the quality control measures of
public data repositories keep up, as the quality of the data is directly tied to the credibility of
the platform. Since it is not possible to have paid moderators check every dataset manually, other
quality assurance measures have to be found.
In this work, we want to explore the most effective solutions for ensuring the quality of online data
in order to gain a deeper understanding of the online content portals that are shaping the future of
Open Science.

Possible Language: The thesis can be written and supervised in English or German.
Supervisor: Alexandro Steinert, M. Sc. alexandro.steinert@offis.de
Evaluator: Prof. Dr.-Ing. Astrid Nieße astrid.niesse@uni-oldenburg.de

If you are interested in this thesis, please contact Alexandro Steinert.

Home institution Department of Computing Science
Associated institutions
Type of work conceptual / theoretical
Type of thesis Bachelor's or Master's degree
Author M. Sc. Alexandro Steinert
Status available
Problem statement

The goal of this thesis is to identify best practices for assuring the quality of content in data
repositories that hold openly accessible data.
For this, a comprehensive review of repositories such as Zenodo, GitHub and Hugging Face is
needed.
Afterwards, the identified repositories have to be examined with regard to their quality assurance
processes. The collected processes have to be categorized in a thesaurus, and recommendations should be
given as to which of them are most suitable in an open science context. This can be divided into the
following tasks:
1. Repository Identification: Identify data repositories that allow open data to be uploaded
2. Quality Assurance Analysis: Examine the quality assurance processes of the collected
repositories
3. Thesaurus Creation: Create a thesaurus from the collected quality assurance processes
4. Process Recommendations: Give recommendations for a quality assurance process based on the
thesaurus
5. Optional: Quality Assurance Recommendation System: A recommendation system for
the right quality assurance process
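
To make the thesaurus task more concrete, the following is a minimal sketch of how collected quality assurance processes could be stored with broader/narrower relations and synonyms. It is only an illustration under stated assumptions: the class names (Concept, Thesaurus), the concept labels and the repository names (repo_a, repo_b) are hypothetical placeholders, not findings about any real repository and not a prescribed design for the thesis.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """One thesaurus entry: a preferred label plus its relations."""
    label: str
    broader: Optional[str] = None                        # parent concept, if any
    alt_labels: set = field(default_factory=set)         # synonyms for the label
    observed_in: set = field(default_factory=set)        # repositories using it


class Thesaurus:
    """Hypothetical container for quality assurance process concepts."""

    def __init__(self):
        self.concepts = {}

    def add(self, label, broader=None, alt_labels=(), observed_in=()):
        # A broader concept must already exist so the hierarchy stays consistent.
        if broader is not None and broader not in self.concepts:
            raise KeyError(f"unknown broader concept: {broader}")
        self.concepts[label] = Concept(
            label, broader, set(alt_labels), set(observed_in)
        )

    def narrower(self, label):
        """Return the labels of all direct child concepts, sorted."""
        return sorted(
            c.label for c in self.concepts.values() if c.broader == label
        )

    def coverage(self):
        """Count, per concept, in how many repositories it was observed."""
        return {label: len(c.observed_in) for label, c in self.concepts.items()}


# Placeholder data only; labels and repository names are made up.
thesaurus = Thesaurus()
thesaurus.add("quality assurance process")
thesaurus.add("automated check", broader="quality assurance process")
thesaurus.add("community review", broader="quality assurance process",
              observed_in={"repo_a"})
thesaurus.add("metadata validation", broader="automated check",
              alt_labels={"schema check"}, observed_in={"repo_a", "repo_b"})

print(thesaurus.narrower("quality assurance process"))
print(thesaurus.coverage())
```

A structure like this would let the recommendation task rank processes by how widely they are observed, although the actual criteria for suitability in an open science context are left to the thesis.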

Requirement

Preliminary knowledge of open data and quality measures is good to have.

Created 13/12/24

Study data

Departments
  • Digitalisierte Energiesysteme
Degree programmes
  • Bachelor's Programme Business Informatics
  • Master's programme Digitalised Energy Systems
  • Master's Programme Computing Science
  • Dual-Subject Bachelor's Programme Computing Science
  • Bachelor's Programme Computing Science
Assigned courses
Contact person