Commit

Merge branch 'main' of github.com:NFDI4Microbiota/nfdi4microbiota-knowledge-base
konrad committed Feb 16, 2024
2 parents 01ccd5f + d7f3510 commit 35622ac
Showing 10 changed files with 105 additions and 16 deletions.
17 changes: 11 additions & 6 deletions docs/_Getting-Started/03-contributors.md
@@ -5,14 +5,19 @@ layout: default
docs_css: markdown
---

1. Barbara Götz (ORCID ID: [0000-0002-6382-7211](https://orcid.org/0000-0002-6382-7211))
1. Barbara Götz (ORCID ID: [0000-0002-6382-7211](https://orcid.org/0000-0002-6382-7211), Wikidata: [Q94745883](https://www.wikidata.org/wiki/Q94745883))
2. Ekaterina Smirnova
3. Justine Vandendorpe (ORCID ID: [0000-0002-9421-8582](https://orcid.org/0000-0002-9421-8582))
4. Konrad U. Förstner (ORCID ID: [0000-0002-1481-2996](http://orcid.org/0000-0002-1481-2996))
3. Justine Vandendorpe (ORCID ID: [0000-0002-9421-8582](https://orcid.org/0000-0002-9421-8582), Wikidata: [Q62930742](https://www.wikidata.org/wiki/Q62930742))
4. Konrad U. Förstner (ORCID ID: [0000-0002-1481-2996](http://orcid.org/0000-0002-1481-2996), Wikidata: [Q18744528](https://www.wikidata.org/wiki/Q18744528))
5. Paul M. J. Klemm (ORCID ID: [0000-0002-3609-5713](https://orcid.org/0000-0002-3609-5713))
6. Uta Parmaksiz (ORCID ID: [0000-0002-0087-5056](https://orcid.org/0000-0002-0087-5056))
7. Charlie Pauvert (ORCID ID: [0000-0001-9832-2507](https://orcid.org/0000-0001-9832-2507))
7. Charlie Pauvert (ORCID ID: [0000-0001-9832-2507](https://orcid.org/0000-0001-9832-2507), Wikidata: [Q103017355](https://www.wikidata.org/wiki/Q103017355))
8. Maja Magel (ORCID ID: [0009-0004-2517-0791](https://orcid.org/0009-0004-2517-0791))
9. Martin Bole (ORCID ID: [0009-0004-9189-8852](https://orcid.org/0009-0004-9189-8852))
10. Frank Förster (ORCID ID: [0000-0003-4166-5423](https://orcid.org/0000-0003-4166-5423))
11. Rabea Müller (ORCID ID: [0000-0002-3096-8237](https://orcid.org/0000-0002-3096-8237))
10. Frank Förster (ORCID ID: [0000-0003-4166-5423](https://orcid.org/0000-0003-4166-5423), Wikidata: [Q42155371](https://www.wikidata.org/wiki/Q42155371))
11. Rabea Müller (ORCID ID: [0000-0002-3096-8237](https://orcid.org/0000-0002-3096-8237), Wikidata: [Q95461538](https://www.wikidata.org/wiki/Q95461538))
12. Jonas Coelho Kasmanas (ORCID ID: [0000-0001-6513-5350](https://orcid.org/0000-0001-6513-5350))
13. Michael Vockenhuber (ORCID ID: [0009-0006-8111-1723](https://orcid.org/0009-0006-8111-1723))
14. Noriko Cassman (ORCID ID: [0000-0003-1655-0931](https://orcid.org/0000-0003-1655-0931))
15. Katharina Markus (ORCID ID: [0000-0002-9316-8982](https://orcid.org/0000-0002-9316-8982))
16. Catherine Gonzalez (ORCID ID: [0000-0002-7585-9990](https://orcid.org/0000-0002-7585-9990))
3 changes: 2 additions & 1 deletion docs/_Research-Data-Management/01-rd.md
@@ -25,7 +25,7 @@ Data types in microbiology include the following:
* Genomic features
* Genomic organization
* RNA sequences
* 16S ribosomal RNA sequences
* 16S, 18S and ITS ribosomal RNA sequences
* Functional genomics / gene expression data (e.g. ribosome profiling)
* RNA-protein interactions
* Small RNA (sRNA)
@@ -34,6 +34,7 @@ Data types in microbiology include the following:
* Protein-protein interactions
* Proteomes
* Quantitative and predictive food microbiology
* Sample and project (meta)data
* Scientific texts
* Semantic data
* Species interaction data (e.g. physical microbial interaction data)
11 changes: 9 additions & 2 deletions docs/_Research-Data-Management/02-rdm.md
@@ -33,15 +33,22 @@ Benefits of RDM are numerous, some of them are listed below {% cite assmann_2022
* Helps keep track of the project
* Helps meet formal and legal requirements
* Enhances teamwork and collaborations
* Guaranteeing transparency, verifiability and reproducibility
* Guarantees transparency, verifiability and reproducibility

## Consequences of poor RDM
Consequences of poor RDM include paper retraction (e.g. [González Amorós & de Puit](https://doi.org/10.1016/j.scijus.2015.04.005)).

## Further resources
* Brief Guide - Research Data Management: [Training Expert Group 2020](https://doi.org/10.5281/zenodo.4000989)
* Essential scientific and technical information about software tools, databases and services for bioinformatics and the life sciences: [bio.tools](https://bio.tools/)
* Research data management platform: [Coscine](https://coscine.de/)

* Research data management platforms:
- [Coscine](https://coscine.de/) by [RWTH Aachen](https://www.rwth-aachen.de)
- [BEXIS2](https://demo.bexis2.uni-jena.de) by [NFDI4Biodiversity](https://www.nfdi4biodiversity.org/en/) at [FSU Jena](https://www.uni-jena.de)
- [GfBio](https://www.gfbio.org) consortium services

* General resources:
- The Research Data Management toolkit for Life Sciences [RDMkit](https://rdmkit.elixir-europe.org) by [ELIXIR](https://elixir-europe.org)

## References
{% bibliography --cited_in_order %}
7 changes: 7 additions & 0 deletions docs/_Research-Data-Management/08-dmp.md
@@ -23,6 +23,13 @@ In a DMP, researchers usually describe the data, their generation and processing
* Legal aspects and anonymisation
* Deletion

## Digital Preservation in DMPs
Data Management Plans usually ask for “long-term archiving” or “long-term preservation” of research data, “data preservation”, “long-term data accessibility” or sometimes “data sharing”. Exact terminology varies according to the different funders and their DMP templates and research data guidelines.
For long-term archiving, preservation and accessibility/sharing, publication of research data in a Trusted Digital Repository (TDR) / trustworthy repository is recommended {% cite OpenAIRE_2024 england_2023_10125224 %}. TDRs usually fall into two categories:
* a repository that has a CoreTrustSeal, nestor seal (DIN 31644) or ISO 16363 certification
* a repository that is commonly used and endorsed by the international research communities
To find a TDR, see the [Data Repository page of the Knowledge Base](https://nfdi4microbiota.github.io/nfdi4microbiota-knowledge-base/Research-Data-Management/22-data-repositories).

# Benefits
When implemented correctly, a DMP can [benefit all stakeholders](https://doi.org/10.1371/journal.pcbi.1006750) of a research project despite the initial overhead of creating the DMP itself:

23 changes: 22 additions & 1 deletion docs/_Research-Data-Management/12-eln.md
@@ -20,6 +20,17 @@ ELNs offer features and functions that can pave the way for significant time sav
ELNs are not data publishing platforms and are not suitable for storing large files. Large files require special technology for secure storage (e.g. Object Store, Nextcloud), but can still be linked in the ELN {% cite rehwald_2022 %}.

# Benefits and drawbacks
## Pros & Cons of Physical Lab Notebooks

Historically, experiments have been documented in physical pen-and-paper notebooks. For some researchers this is still the preferred method of documentation. It is easy and inexpensive to use, as it requires neither a computer nor internet access. However, with technological advances in data collection and processing, more data is produced than ever before, and with it comes the need to digitize data and manage it electronically.

Advances in communication and travel have also made it easier to work collaboratively with researchers at different institutes around the world. Such collaboration would be hindered if scans of physical lab notebooks had to be shared, even before considering whether the experiment notes are legible or can be understood by others at all.

These drawbacks outweigh the benefits of a physical laboratory notebook. Adopting an ELN is one of the essential steps in making research data FAIR. Adhering to the FAIR principles ensures that data can continue to be reused, validated, and expanded by researchers in the future. The data life cycle starts with planning, continues through production and analysis, then storage and access, and ideally ends with data re-use.

For data to be re-used, it needs to meet the criteria of the FAIR principles. This is where the largest drawback of physical laboratory notebooks lies. Data in a physical notebook cannot be found, accessed, or reused by other researchers. Data in an ELN can be extracted, downloaded, shared, and stored in a FAIR capacity. This data can also be described with metadata, which provides the context needed to make sense of the data and ensure it can be reused.

Unfortunately, the major obstacle to the wide use of ELNs across all areas of research appears to be data security risk, particularly in medical research. There is still an ongoing discussion on how best to manage patient research data securely in an ELN. However, according to Guerrero, the best solutions involve using on-site private servers or private institutional cloud services {% cite Guerrero2016 %}.

## Boosting efficiency of everyday tasks
ELNs increase the efficiency of everyday tasks by providing time-saving features and functions such as search and filtering {% cite vandendorpe_nd %}. ELNs also take advantage of standardisation {% cite rathmann_2021 %}: they have the ability to create templates such as protocols, Standard Operating Procedures (SOPs) and workflows. This facilitates data documentation with metadata {% cite vandendorpe_nd %} and supports clarity and organisation of data and protocols {% cite n4m_wc_elns_2023 %}. ELNs also provide ubiquitous access {% cite vandendorpe_nd %}: protocols, observations, notes and other data can be entered using a computer or mobile device {% cite lma_rdmwg %}.
@@ -57,7 +68,6 @@ ELNs also prevent data loss by eliminating problems with data deletion {% cite b
Finally, ELNs contribute to GSP by providing for data security and collaboration (see Data sharing and publishing) {% cite lma_rdmwg %}.

# Criteria to select an ELN

## Basic systems
Basic systems allow for traditional text entry, which can be searched and made available on multiple devices via the cloud. They also allow files (e.g. images, spreadsheets) to be attached to text and the attachments to be viewed, annotated and searched. Such systems include Word, Evernote and Dropbox. Basic systems have the advantage of being inexpensive, easily accessible and already familiar to many researchers. However, considerable effort is required to achieve the functionality of a traditional ELN with such a system {% cite bobrov_2021 vandendorpe_2020 %}.

@@ -70,6 +80,17 @@ High-end systems have all the features of specialised systems and more. High-end
## Electronic Lab Notebooks *vs.* Laboratory Information Management System (LIMS)
ELNs are sometimes confused with Laboratory Information Management Systems (LIMS). They both streamline laboratory workflow and data management and are complementary, but they have different functionalities and features. A LIMS is a comprehensive software for managing and tracking laboratory operations and data. A LIMS covers sample management, workflow management and automation, quality control and sample tracking throughout the laboratory. On the other hand, an ELN focuses on experimental data acquisition, experiment documentation and (real-time) collaboration {% cite eln_lims_linkedin eln_lims_sapio %}.

# Implementing an ELN
## Changing Culture
A cultural change is needed to move researchers not only from physical notebooks to ELNs but also towards adherence to the FAIR principles and thus open science. Nosek's [Strategy for Culture Change](https://www.cos.io/blog/strategy-for-culture-change) at the Center for Open Science describes “five levels of intervention”, starting at the bottom with infrastructure.

Changes to infrastructure would ease the transition to an ELN by making adoption possible. Covering the costs of an ELN at the institutional level, rather than leaving them to individually funded research projects, would make it possible for groups to justify its use. The next step would be to ensure a good experience with ELNs through a user-friendly interface, training for researchers, and integration into existing workflows.

A community of researchers who use ELNs as common practice will then begin to form. Incentives help ensure continued use of the ELN. According to the Center for Open Science, over 100 journals offer badges that indicate when data or materials are available to the reader. These badges incentivize researchers to share data, as doing so adds credibility to their findings. After moving through the bottom four levels (infrastructure, experience, community, and incentives), the top level, policy change, becomes possible. At the policy level, the institution can make the transition to an ELN a requirement for its affiliated researchers.

While some researchers may still be apprehensive about sharing their data, this will change as research culture moves towards greater transparency.


# Further resources
* [ELN Finder - Demo](https://eln-finder.ulb.tu-darmstadt.de/home) - Tool that helps researchers search for and select a suitable ELN using more than 40 filter criteria.
* ELN Filter - Selection of ELNs that are suitable for the life sciences and that can be filtered by criteria ([English](https://www.publisso.de/fileadmin/user_upload/PUBLISSO/PUBLISSO_ELN-Filter_2021-06_english.xlsx), [German](https://www.publisso.de/fileadmin/user_upload/PUBLISSO/PUBLISSO_ELN-Filter_2020-12-01.xlsx)).
10 changes: 8 additions & 2 deletions docs/_Research-Data-Management/22-data-repositories.md
@@ -33,8 +33,8 @@ Below are listed criteria you might want to consider when selecting a repository
4. A cost-free interdisciplinary repository (e.g. [Figshare](https://figshare.com/), [Zenodo](https://zenodo.org/)).
5. Another repository that you can search for using the above-mentioned criteria in a repository finder.

## Well-established repositories in microbiology
Below are liste well-established repositories in microbiology. For each repository, the FAIRsharing and re3data pages are linked. On the FAIRsharing page, you will find information such as which journals endorse the repository (under "Collections & Recommendations" and then "In Policies"). On the re3data page, you will find information such as the above-mentioned criteria to select a trusted repoository.
## Well-established repositories for data deposition in microbiology
Below are listed well-established repositories in microbiology. For each repository, the FAIRsharing and re3data pages are linked. On the FAIRsharing page, you will find information such as which journals endorse the repository (under "Collections & Recommendations" and then "In Policies"). On the re3data page, you will find information such as the above-mentioned criteria to select a trusted repository.

| Data type | Data repository | FAIRsharing | re3data |
|--- |--- |--- |--- |
@@ -45,6 +45,9 @@ Below are liste well-established repositories in microbiology. For each reposito
| | [Cell Image Library](http://www.cellimagelibrary.org/home) | [FAIRsharing](https://fairsharing.org/FAIRsharing.8t18te) | [re3data](https://www.re3data.org/repository/r3d100000023) |
| **Linked genotype and phenotype data** | European Genome-phenome Archive ([EGA](https://ega-archive.org/)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.mya1ff) | [re3data](https://www.re3data.org/repository/r3d100011242) |
| **Macromolecular structures** | Worldwide Protein Data Bank ([wwPDB](http://www.wwpdb.org/)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.mckkb4) | [re3data](https://www.re3data.org/repository/r3d100011104) |
| | RCSB Protein Data Bank ([RCSB PDB](https://www.rcsb.org)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.2t35ja) | [re3data](https://www.re3data.org/repository/r3d100010327) |
| | Protein Data Bank of Japan ([PDBj](https://pdbj.org)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.rs2815) | [re3data](https://www.re3data.org/repository/r3d100010910) |
| | Protein Data Bank in Europe ([PDBe](https://www.ebi.ac.uk/pdbe/)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.26ek1v) | [re3data](https://www.re3data.org/repository/r3d100010538) |
| | Biological Magnetic Resonance Data Bank ([BMRB](https://bmrb.io/)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.p06nme) | [re3data](https://www.re3data.org/repository/r3d100010191) |
| **Electron microscopy data** | Electron Microscopy Data Bank ([EMDB](https://www.ebi.ac.uk/emdb/)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.651n9j) | [re3data](https://www.re3data.org/repository/r3d100010562) |
| | Electron Microscopy Public Image Archive ([EMPIAR](https://www.ebi.ac.uk/empiar/)) | [FAIRsharing](https://fairsharing.org/FAIRsharing.dff3ef) | [re3data](https://www.re3data.org/repository/r3d100012356) |
@@ -155,6 +158,9 @@ For more details, see this [guide](https://www.openaire.eu/zenodo-guide).
* To find a suitable interdisciplinary repository: [Generalist Repository Comparison Chart](https://doi.org/10.5281/zenodo.3946720)
* To find Open Access repositories: [OpenDOAR](https://v2.sherpa.ac.uk/opendoar/): Directory of Open Access Repositories

## See Also
* [Data Deposition and Standardization](https://academic.oup.com/nar/pages/data_deposition_and_standardization) help page of the [Oxford Academic](https://academic.oup.com) Nucleic Acids Research ([NAR Journal](https://academic.oup.com/nar)).

## References
* Engelhardt, C., Biernacka, K., Coffey, A., Cornet, R., Danciu, A., Demchenko, Y., Downes, S., Erdmann, C., Garbuglia, F., Germer, K., Helbig, K., Hellström, M., Hettne, K., Hibbert, D., Jetten, M., Karimova, Y., Kryger Hansen, K., Kuusniemi, M. E., Letizia, V., … Zhou, B. (2022). D7.4 How to be FAIR with your data. A teaching and training handbook for higher education institutions (V1.2.1). Zenodo. [https://doi.org/10.5281/ZENODO.6674301](https://doi.org/10.5281/ZENODO.6674301)
* Lindlar, M., Rudnik, P., Horton, L., & Jones, S. (2020). “You say potato, I say potato” - Mapping Digital Preservation and Research Data Management Concepts towards Collective Curation and Preservation Strategies. [https://doi.org/10.5281/ZENODO.3672773](https://doi.org/10.5281/ZENODO.3672773)
8 changes: 6 additions & 2 deletions docs/_Research-Data-Management/24-aruna-object-storage.md
@@ -50,9 +50,13 @@ The main component of AOS is a distributed database system. It synchronizes all
## AOS data structure
In version 1.x, AOS organizes data into Projects, Collections, Object Groups, and Objects. Starting with version 2.x, data are organized into Projects, Collections, Datasets, and Objects with a more flexible relation model.

![Aruna Object Storage Structure V1](/nfdi4microbiota-knowledge-base/assets/img/aruna-1-structure.png "Aruna Object Storage Structure V1"){:width="40%"}
|![Aruna Object Storage Structure V1](/nfdi4microbiota-knowledge-base/assets/img/aruna-1-structure.png "Aruna Object Storage Structure V1"){:width="50%"} |
|-|
| UML diagram of the Aruna Object Storage data structure in Version v1.0.x |

![Aruna Object Storage Structure V2](/nfdi4microbiota-knowledge-base/assets/img/aruna-2-structure.png "Aruna Object Storage Structure V2"){:width="40%"}
| ![Aruna Object Storage Structure V2](/nfdi4microbiota-knowledge-base/assets/img/aruna-2-structure.png "Aruna Object Storage Structure V2"){:width="50%"} |
|-|
| UML diagram of the Aruna Object Storage data structure starting in version v2.0. All resources form a directed acyclic graph of belongs-to relationships (blue) with Projects as roots and Objects as leaves. Resources can also describe horizontal version relationships (orange), data/metadata relationships (yellow) or even custom user-defined relationships (green). |
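
The version 2.x structure described above can be illustrated with a small in-memory model. The following Python sketch is not the Aruna API: the names `Kind`, `Relation`, `Resource`, `relate` and `roots` are hypothetical and exist only for this illustration. It also assumes that a belongs-to edge must point from a lower level to a strictly higher level (for example from an Object to a Dataset), which is one simple way to keep the graph acyclic with Projects as roots; the rules actually enforced by AOS may differ.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    # Hierarchy levels named in the text; a lower value is closer to the root.
    PROJECT = 0
    COLLECTION = 1
    DATASET = 2
    OBJECT = 3


class Relation(Enum):
    # Relationship types named in the figure caption (hypothetical identifiers).
    BELONGS_TO = "belongs_to"
    VERSION = "version"
    METADATA = "metadata"
    CUSTOM = "custom"


@dataclass
class Resource:
    name: str
    kind: Kind
    # Outgoing edges stored as (relation, target resource) pairs.
    edges: list = field(default_factory=list)

    def relate(self, relation, target):
        """Add an edge from this resource to `target`."""
        if relation is Relation.BELONGS_TO and self.kind.value <= target.kind.value:
            # Assumption of this sketch: children sit strictly below their
            # parents, so belongs-to edges can never form a cycle.
            raise ValueError(
                f"{self.kind.name} cannot belong to {target.kind.name}"
            )
        self.edges.append((relation, target))

    def roots(self):
        """Follow belongs-to edges upward and return the names of the roots reached."""
        parents = [t for rel, t in self.edges if rel is Relation.BELONGS_TO]
        if not parents:
            return {self.name}
        found = set()
        for parent in parents:
            found |= parent.roots()
        return found


# Usage with made-up resource names.
project = Resource("soil-study", Kind.PROJECT)
dataset = Resource("16S-run-1", Kind.DATASET)
reads_v1 = Resource("reads.fastq (v1)", Kind.OBJECT)
reads_v2 = Resource("reads.fastq (v2)", Kind.OBJECT)

dataset.relate(Relation.BELONGS_TO, project)
reads_v1.relate(Relation.BELONGS_TO, dataset)
reads_v2.relate(Relation.BELONGS_TO, dataset)
reads_v2.relate(Relation.VERSION, reads_v1)  # horizontal version link

print(reads_v2.roots())
```

Keeping all relationship types in one edge list mirrors the idea of a single graph with differently coloured edges. Only the belongs-to edges are checked for hierarchy, so version, metadata and custom links can connect resources on the same level.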

# References
* Documentation and Aruna start page: [https://aruna-storage.org](https://aruna-storage.org)
2 changes: 1 addition & 1 deletion docs/_Research-Data-Management/25-digital-preservation.md
@@ -4,7 +4,7 @@ category: Research-Data-Management
layout: default
docs_css: markdown
---
# Definition of digital preservation
# Definition
Digital preservation is the act of ensuring continued findability and access to digital material and maintaining it independently understandable and reusable by a designated community, and with evidence supporting its authenticity, for as long as necessary. Preservation actions include:
* Data cleaning
* Data validation