Update 02-rdm.md
jvddorpe authored Dec 12, 2024
1 parent 93868b8 commit de8ab13
Showing 1 changed file with 43 additions and 3 deletions.
46 changes: 43 additions & 3 deletions docs/_Research-Data-Management/02-rdm.md
@@ -22,16 +22,56 @@ Research data are valuable {% cite pauls_2023 %} and therefore need to be manage

## Advantages and drawbacks of RDM
---
As noted above, there are many benefits to incorporating robust RDM practices from the outset of a research project. For researchers, good RDM enhances visibility, reputation (by ensuring the quality of research) and data ownership (i.e. "the possession of and responsibility for information", [NCATS Toolkit](https://toolkit.ncats.nih.gov/)) {% cite bres_2022 jacob_2022 %}, and helps them to meet formal requirements from third parties (e.g. research funders, institutions and publishers). For the project, good RDM brings clarity and findability, supports coordination, data security and good storage practices, helps to keep track of the project and deal with legal aspects, and increases eligibility for funding {% cite assmann_2022 bres_2022 bres_2023 %}. For the research group, good RDM enables knowledge management, transfer and preservation, while improving teamwork and saving time, money and resources {% cite assmann_2022 bobrov_2021 bres_2022 %}. For third parties, good RDM practices increase transparency, make data FAIR (i.e. findable, accessible, interoperable and reusable, which avoids unnecessary duplication) and increase collaboration {% cite assmann_2022 bobrov_2021 bres_2022 jacob_2022 voigt_2022 assmann:2022-08 %}. Last but not least, good RDM practices help to address societal challenges by ensuring reproducibility, availability and verifiability, preventing data loss and preserving the scientific record, ensuring good research practice (GRP) and supporting open science (i.e. open transfer of research knowledge, open access to research data) {% cite assmann_2022 bobrov_2021 engelhardt_2022 jacob_2022 lindstädt_2019 voigt_2022 bres_2023 %}.

There are also consequences of poor RDM practices, such as retractions of papers. For example, Dan Ariely, a professor of psychology and behavioural economics at Duke University, had one of his papers on dishonesty retracted. He could not remember in what year and in what form he had received the data from the company he was working with. Nor did he check the data for irregularities. The company could not find the data either {% cite bartlett:2021 %}.

## Problems in RDM
---
Current issues and challenges in RDM can be classified by stakeholder, as individual researchers, research funders, research organisations, librarians and reviewers have different needs {% cite science_europe:2024 %}.

For individual researchers, the different organisational requirements can be confusing, especially if they work with different organisations, change their home institution or collaborate with researchers from other organisations where different rules apply {% cite science_europe:2024 sheikh:2023 %}. The lack of connectivity between tools used at different stages of the research data lifecycle can also be a barrier to the proper management of their data.

For research funders, the development of technological infrastructure can be difficult {% cite sheikh:2023 %}.

For research organisations, the institutional commitment and academic engagement required can be overwhelming. The lack of policy, funding and storage also hinders progress in RDM {% cite sheikh:2023 %}.

For librarians and RDM staff, raising awareness among researchers of the benefits of data sharing remains a challenge. In addition, librarians need (discipline-specific) skills and competencies to provide RDM-based services {% cite sheikh:2023 %}.

There are also consequences of poor RDM practices, such as the retraction of papers. For example, Amorós and Puit 2015 had their paper retracted due to inconsistent and non-reproducible values and loss of raw data.

## Research data life cycle
---
The research data life cycle is a model that illustrates the steps of RDM and describes how data should ideally flow through a research project to ensure successful data curation and preservation {% cite NTU_LibGuides_RD_life_cycle princeton:2024 %}. It is intended to help researchers understand the scope and importance of data management {% cite sheikh:2023 %}. The research data life cycle can be illustrated as follows {% cite RDMkit:2021 %}:

![Research data life cycle]({{ '/assets/img/research_data_life_cycle_elixir.png' | relative_url }})

NFDI4Microbiota offers dedicated services and tools along the research data life cycle:
* **Plan:** a [DMP template](https://doi.org/10.5281/zenodo.13628589).
* **Collect:**
  * Protocols on [protocols.io](https://www.protocols.io/researchers/sarah-schulz).
  * 2- to 3-hour workshops on ELNs (see example slides [here](https://doi.org/10.5281/zenodo.11578583)).
  * Training with eLabFTW (see example demo [here](https://doi.org/10.5446/68306)).
  * Annual seminar on ELNs.
* **Process:** metadata (standards):
  * On this [Knowledge Base](https://nfdi4microbiota.github.io/nfdi4microbiota-knowledge-base/Research-Data-Management/03-md.html)
  * On [GitHub](https://github.com/NFDI4Microbiota/MetadataStandards)
* **Analyse:** the Cloud-based Workflow Manager ([CloWM](https://clowm.bi.denbi.de/login?next=/dashboard)) (15 Nextflow workflows in production, more than 20 available soon).
* **Preserve:** the [ARUNA](https://aruna-storage.org/) data orchestration engine, an open-source data management platform that allows scientists and industry partners to store, annotate and share their data according to the FAIR data principles.
* **Reuse:**
  * [StrainInfo](https://straininfo.dsmz.de/), a service developed to provide a resolution of microbial strain identifiers by storing culture collection numbers, their relations, and culture-associated data.
  * [VirJenDB](https://www.virjendb.org/), a central hub connecting virus researchers to publicly available virus resources, metadata and sequences.

If the steps of the research data life cycle are not completed, data and results may be lost, or they may be preserved but without the necessary metadata to reuse them or make the research process reproducible (see Lost Data Map below). It is important to note that the research data life cycle is a model whose steps do not necessarily have to be followed in strict order, but they should all be completed.

![Lost Data Map]({{ '/assets/img/lost_data_map_rfii_Mau_CC-BY.png' | relative_url }})
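
The point that the steps may be completed in any order, as long as none is skipped, can be sketched in a few lines of Python. This is only an illustrative sketch: the stage names are taken from the service list in this section (the illustrated life cycle may name further stages), and `Stage`, `missing_stages` and `project_log` are hypothetical names that do not belong to any NFDI4Microbiota tool.

```python
from enum import Enum


class Stage(Enum):
    """Life cycle stages as named in the service list of this section (assumption)."""
    PLAN = "Plan"
    COLLECT = "Collect"
    PROCESS = "Process"
    ANALYSE = "Analyse"
    PRESERVE = "Preserve"
    REUSE = "Reuse"


def missing_stages(completed):
    """Return the stages not yet completed, in life cycle order.

    `completed` may list stages in any order, because the model does not
    require a strict sequence, only that every stage is eventually done.
    """
    done = set(completed)
    return [stage for stage in Stage if stage not in done]


# Hypothetical project: stages were completed out of order and two are still open.
project_log = [Stage.COLLECT, Stage.PLAN, Stage.ANALYSE, Stage.PROCESS]
print([stage.value for stage in missing_stages(project_log)])
# prints ['Preserve', 'Reuse']
```

A check of this kind could, for example, be run before a project is closed, to flag data that have been analysed but never preserved or prepared for reuse.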

## Current developments and initiatives
---
Internationally, the increasingly frequent requirement to produce a DMP has stimulated interest in RDM [Yamaji 2024] and encouraged libraries to take an active role in RDM through advocacy, policy development, and advisory and consultancy services [Cox et al. 2017]. Some institutions, such as KU Leuven, have also developed a dashboard to review datasets to meet funder requirements [Yamaji 2024].

In Germany, the National Research Data Infrastructure (NFDI) funds nearly 30 discipline-specific consortia to help researchers make their data reusable in the long term.


## Further resources
---
* General resources: