Merge branch 'main' into GuilhemSempere-patch-1
BrapiCoordinatorSelby authored Apr 11, 2024
2 parents ff5fe50 + a07f8fd commit 00609df
Showing 8 changed files with 100 additions and 8 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -43,4 +43,5 @@ Thumbs.db
.vscode
.markdownlint.json

external_contributions/
external_contributions/
/.project
23 changes: 23 additions & 0 deletions content/03.00.success.md
@@ -3,3 +3,26 @@
<!-- success stories highlighting BrAPI usefulness in breeding cycle. Perhaps reference the original BrAPI paper where possible use cases were proposed. -->

Below are a number of short success stories from the BrAPI community. These tools, applications, and infrastructure projects serve as another indicator of community growth and success over the past 5-10 years. These stories clearly illustrate all the different ways the BrAPI Standard can be used productively and in practice.

<!-- Contribution BrAPI 2.0 paper
Suggested Authors: Matthias Lange, Patrick König, Stephan Weise, Gouripriya Davuluri, Suman Kumar, Joseph Ruff, Paul Kersey, Cyril Pommier, Michael Alaux, Erwan Le-Floch -->

### Success story: activating the data stock in European ex situ genebanks of plant genetic resources {#sec:Success_Story_AGENT}

In the global system for ex situ conservation of plant genetic resources (PGR) [1], a total of ~5.8 million accessions are conserved in 1,750 ex situ genebanks [2]. Unique and permanent identifiers in the form of DOIs are available for more than 1.7 million accessions [3]. Each DOI is linked to basic descriptive data that facilitate the use of these resources, and many DOIs are, or will be, linked to additional data from other domains. Basic information alone is not enough, however, to answer questions about the global biological diversity of a plant species, to detect duplicates, to track provenance in order to verify genetic integrity, to select the most suitable material for purposes such as breeding and research, or to support further applications in data mining or AI. These tasks require a data space that also includes genotypic and phenotypic data. In this context, the AGENT project (https://www.agent-project.eu/), funded by the European Commission, aims to develop a concept for the digital exploitation and activation of this GenRes data space across European ex situ genebanks according to the FAIR criteria [4], and to test it in practice on two important crops, barley and wheat. In two work packages, standards and technology for data interoperability will be developed to establish a genetic resources infrastructure that regulates the acquisition of genotypic and phenotypic data, integrates and archives them, and makes them accessible according to the FAIR principles. To this end, 13 European genebanks and 5 bioinformatics centers are cooperating and have agreed on standards and protocols for the data flow (see Figure {@fig:AGENT_Genotyping_Data_Flow}) and on data formats [5] for the central archiving of genotypic and phenotypic data.

![Data flow of genotypic data from AGENT partner databases](images/AGENT_Genotyping_Data_Flow.png){#fig:AGENT_Genotyping_Data_Flow}

The AGENT portal, described in more detail in section {@sec:AGENT_BrAPI_Backend}, unlocks the full potential of the biological material stored in genebanks around the globe by using FAIR international data standards and an open digital infrastructure for the management of plant genetic resources. The implemented BrAPI interface makes it possible to mine current and historic genotypic and phenotypic information, both to drive the discovery of genes, traits and knowledge for future missions and to complement existing information for wheat and barley. The new data standards and infrastructure are also intended to foster improved management of PGR for other crop species across European genebanks.


#### References
1. Engels JMM, Ebert AW (2021) A Critical Review of the Current Global Ex Situ Conservation System for Plant Agrobiodiversity. I. History of the Development of the Global System in the Context of the Political/Legal Framework and Its Major Conservation Components. Plants 10:1557. https://doi.org/10.3390/plants10081557
2. Fu Y (2017) The Vulnerability of Plant Genetic Resources Conserved Ex Situ. Crop Sci 57:2314–2328. https://doi.org/10.2135/cropsci2017.01.0014
3. Food and Agriculture Organization (FAO) The Global Information System for PGRFA
4. Wilkinson MD, Dumontier M, Aalbersberg IjJ, et al (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3:160018. https://doi.org/10.1038/sdata.2016.18
5. Beier S, Fiebig A, Pommier C, et al (2022) Recommendations for the formatting of Variant Call Format (VCF) files to make plant genotyping data FAIR. F1000Research 11:231. https://doi.org/10.12688/f1000research.109080.2
6. König P, Beier S, Mascher M, et al (2022) DivBrowse—interactive visualization and exploratory data analysis of variant call matrices. GigaScience 12:giad025. https://doi.org/10.1093/gigascience/giad025
7. Street K, Street K (2017) Genebank mining with FIGS, the Focused Identification of Germplasm Strategy. https://doi.org/10.22004/AG.ECON.266624
8. Kotni P, van Hintum T, Maggioni L, et al (2023) EURISCO update 2023: the European Search Catalogue for Plant Genetic Resources, a pillar for documentation of genebank material. Nucleic Acids Res 51:D1465–D1469. https://doi.org/10.1093/nar/gkac852

23 changes: 18 additions & 5 deletions content/03.03.federation-infrastructure.md
@@ -7,10 +7,23 @@
* Alternate solutions/ why is it better with BrAPI - Schema.ORG lightweight meta data harvesting, ARCs as collaborative data decoration, API and publication pipeline
* future related use cases, areas to improve - LIMS to BrAPI proxies -->

#### AGENT
#### BrAPI endpoints for AGENT Portal {#sec:AGENT_BrAPI_Backend}
The AGENT portal ({@fig:AGENT_WebFrontend}) is being created as the joint research data infrastructure that federates collections of genotypic and phenotypic data from European genebanks and bioinformatics institutes. It serves as a database infrastructure for integrated plant genetic resources held in ex situ genebanks. The portal provides manual data exploration and machine-readable access via BrAPI, and it supplies data to the core data deposition resources at the European Bioinformatics Institute (EBI).

![AGENT Portal](images/AGENT_WebFrontend.png){#fig:AGENT_WebFrontend}

<!-- Peter S: Stub paragraph to stimulate the writing process. Please edit, rewrite, or delete as needed. -->
BraPI endpoints for AGENT
The AGENT database backend aggregates curated passport, phenotypic and genotypic data about wheat and barley accessions from 18 project partners. These data are harmonized and integrated via BrAPI endpoints (https://github.com/AGENTproject/BrAPI) and can be explored in a web portal (https://agent.ipk-gatersleben.de). The BrAPI endpoints are provided by separate implementations: genotyping data are served through the DivBrowse [6] storage engine and its BrAPI interface, while endpoints for sample data are implemented by a broker service that translates between AGENT database SQL and BrAPI.
To expose these BrAPI endpoint providers under a single service and URL scheme, we are working on integrating them into a BrAPI proxy service. As next steps, we will expand the BrAPI implementation to enable the integration of analysis pipelines in the AGENT portal, e.g. genebank mining tools such as the FIGS+ pipeline developed by AGENT partner ICARDA [7]. A further goal is to integrate the data collected in the AGENT project into the European Search Catalogue for Plant Genetic Resources (EURISCO) [8] and to implement BrAPI endpoints that make data on PGR collections in European genebanks programmatically accessible.
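
As a rough illustration of the machine-readable access mentioned in this section, the following Python sketch pages through a BrAPI v2 list endpoint. It relies only on the standard BrAPI response envelope (`metadata.pagination` and `result.data`). The base URL `https://example.org/agent` is a placeholder and not the actual address of the AGENT service, and the helper names `build_url`, `fetch_json` and `iter_records` are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url: str, endpoint: str, page: int, page_size: int) -> str:
    """Build the URL for one page of a BrAPI v2 list endpoint."""
    query = urlencode({"page": page, "pageSize": page_size})
    return f"{base_url.rstrip('/')}/brapi/v2/{endpoint}?{query}"


def fetch_json(url: str) -> Dict[str, Any]:
    """Download and decode one JSON response (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


def iter_records(
    base_url: str,
    endpoint: str,
    page_size: int = 100,
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
) -> Iterator[Dict[str, Any]]:
    """Yield every record of a paginated BrAPI list endpoint, page by page."""
    page = 0  # BrAPI pages are zero-indexed
    while True:
        payload = fetch(build_url(base_url, endpoint, page, page_size))
        yield from payload["result"]["data"]
        total_pages = payload["metadata"]["pagination"]["totalPages"]
        page += 1
        if page >= total_pages:
            break
```

With a reachable server, `list(iter_records("https://example.org/agent", "samples"))` would collect all sample records. Passing the `fetch` function as a parameter keeps the paging logic independent of the HTTP client, so it can be exercised with canned responses.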

#### References
1. Engels JMM, Ebert AW (2021) A Critical Review of the Current Global Ex Situ Conservation System for Plant Agrobiodiversity. I. History of the Development of the Global System in the Context of the Political/Legal Framework and Its Major Conservation Components. Plants 10:1557. https://doi.org/10.3390/plants10081557
2. Fu Y (2017) The Vulnerability of Plant Genetic Resources Conserved Ex Situ. Crop Sci 57:2314–2328. https://doi.org/10.2135/cropsci2017.01.0014
3. Food and Agriculture Organization (FAO) The Global Information System for PGRFA
4. Wilkinson MD, Dumontier M, Aalbersberg IjJ, et al (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3:160018. https://doi.org/10.1038/sdata.2016.18
5. Beier S, Fiebig A, Pommier C, et al (2022) Recommendations for the formatting of Variant Call Format (VCF) files to make plant genotyping data FAIR. F1000Research 11:231. https://doi.org/10.12688/f1000research.109080.2
6. König P, Beier S, Mascher M, et al (2022) DivBrowse—interactive visualization and exploratory data analysis of variant call matrices. GigaScience 12:giad025. https://doi.org/10.1093/gigascience/giad025
7. Street K, Street K (2017) Genebank mining with FIGS, the Focused Identification of Germplasm Strategy. https://doi.org/10.22004/AG.ECON.266624
8. Kotni P, van Hintum T, Maggioni L, et al (2023) EURISCO update 2023: the European Search Catalogue for Plant Genetic Resources, a pillar for documentation of genebank material. Nucleic Acids Res 51:D1465–D1469. https://doi.org/10.1093/nar/gkac852

#### IPK-Genebank

@@ -19,5 +32,5 @@ Agrosystem Integration of germplasm collections in context of data trustee model

#### MIAPPE "BrAPI to ISA" service

<!-- Peter S: Stub paragraph to stimulate the writing process. Please edit, rewrite, or delete as needed. -->
MIAPPE and BrAPI are designed to be inter-compatible. ISA-TAB is a file based implementation of MIAPPE. The "BrAPI to ISA" service is a converter between the ISA-TAB files and the BrAPI RESTful endpoints.

Since the release of BrAPI 1.3, efforts have been made to incorporate support for the Minimum Information About Plant Phenotyping Experiments (MIAPPE) standard into the specification [@doi:10.1111/nph.16544]. This integration was finalized in BrAPI 2.0, resulting in full compatibility between the two standards. Consequently, BrAPI now encompasses all attributes necessary for MIAPPE compliance, adhering to standardized descriptions in accordance with MIAPPE guidelines. Leveraging BrAPI as a standardized RESTful web service API specification, we employ the ISA standard for storing metadata and phenotyping data in a standardized manner. This data is structured in the ISA-TAB file format and subjected to validation using the [MIAPPE ISA configuration](https://github.com/ELIXIR-Belgium/isatab-validation). The "BrAPI to ISA" service functions as a converter between BrAPI RESTful endpoints and ISA-TAB, facilitating the archiving of metadata and data and thereby enhancing data preservation and accessibility. The [BrAPI2ISA](https://github.com/elixir-europe/plant-brapi-to-isa) tool is designed to be compatible with BrAPI 1.3, and we invite contributions from the community to extend support for the latest versions of BrAPI.
7 changes: 5 additions & 2 deletions content/03.06.samples-and-genotypes.md
@@ -17,8 +17,11 @@ MGIS has germplasm and genotype data stored for many musa accessions. Through Br

#### GIGWA

<!-- Peter S: Stub paragraph to stimulate the writing process. Please edit, rewrite, or delete as needed. -->
GIGWA is an efficient storage system for genotype variant data. GIGWA uses BrAPI to query specific variant data out of the database. This allows for more efficient data transfer and analysis. Instead of transferring whole massive files, specific pieces, samples, markers, or chunks of data can be retrieved.
Gigwa is a JEE web application for centralizing, sharing, finely filtering, and visualizing high-throughput genotyping data [@doi:10.1093/gigascience/giz051]. Built on top of MongoDB, it is scalable and works smoothly with datasets containing billions of genotypes. It can be installed from Docker images or all-in-one bundle archives, which makes it straightforward to deploy on servers or local computers, and it has been adopted by numerous research institutes around the world. Notably, Gigwa serves as a collaborative management tool and/or a data portal for genebanks and breeding programs at some CGIAR centers [@doi:10.1002/ppp3.10187]. As a result, the amount of data hosted and made widely accessible through this system has kept growing over the last few years.

Gigwa developers have been involved in the BrAPI community since 2016 and took part in designing the genotype-related part of the API specification. Gigwa's first BrAPI-compliant features were designed for compatibility with the Flapjack visualization tool [@doi:10.1093/bioinformatics/btq580] and thus primarily turned it into a BrAPI data source. Because Gigwa was the first and most reliable application implementing the BrAPI-Genotyping server calls, local collaborators and external partners came to use it as a reference solution when designing tools that take advantage of those features (e.g., [BeegMac](https://webtools.southgreen.fr/BrAPI/Beegmac/), [SnpClust](https://github.com/jframi/snpclust), [QBMS](https://github.com/icarda-git/QBMS)). Further use cases also required Gigwa to consume data from other BrAPI servers, which led to the implementation of BrAPI client features in the system. Through this work, a close collaboration was progressively established with the Integrated Breeding Platform team, which develops the widely used Breeding Management System (BMS). As a result, both applications are now frequently deployed together, with Gigwa pulling germplasm or sample metadata from BMS and BMS displaying Gigwa-hosted genotypes within its own UI.

Because BrAPI client libraries are available for R, community members typically write ad hoc scripts that combine data from multiple BrAPI sources (for instance, phenotypes from one data source and genotypes from another) in order to run analyses such as GWAS, genomic selection or phylogenetic investigations. Looking ahead, we expect the most generic and widely used of these pipelines to be publicly distributed, and possibly given web interfaces using solutions like R-Shiny, to provide new online services based on Gigwa-hosted data.
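
The kind of cross-source script described above can be sketched as follows. The community scripts referred to in the text are written in R; this is a Python illustration of the same idea, not a reproduction of any existing pipeline. It assumes that observation records (from a phenotyping server) and call records (from a genotyping server such as Gigwa) have already been downloaded, that calls use the BrAPI 2.1 `genotypeValue` field, and that a mapping from `callSetDbId` to `germplasmDbId` is available. The function name `merge_by_germplasm` and that mapping are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def merge_by_germplasm(
    observations: Iterable[Mapping[str, Any]],
    calls: Iterable[Mapping[str, Any]],
    callset_to_germplasm: Mapping[str, str],
) -> Dict[str, Dict[str, Dict[str, Any]]]:
    """Combine phenotypes and genotypes per germplasm.

    Assumption: both sources identify germplasm with the same germplasmDbId,
    which in practice usually requires an explicit identifier mapping.
    """
    merged = defaultdict(lambda: {"phenotypes": {}, "genotypes": {}})

    # Phenotypes: one value per observation variable and germplasm.
    for obs in observations:
        record = merged[obs["germplasmDbId"]]
        record["phenotypes"][obs["observationVariableName"]] = obs["value"]

    # Genotypes: one call per variant, resolved to germplasm via the call set.
    for call in calls:
        germplasm_id = callset_to_germplasm.get(call["callSetDbId"])
        if germplasm_id is None:
            continue  # call set with no known germplasm, skip it
        merged[germplasm_id]["genotypes"][call["variantDbId"]] = call["genotypeValue"]

    # Keep only germplasm that has both kinds of data, as needed for e.g. GWAS.
    return {
        gid: rec
        for gid, rec in merged.items()
        if rec["phenotypes"] and rec["genotypes"]
    }
```

The result is a plain dictionary keyed by germplasm identifier, which can then be converted into whatever matrix or table format the downstream analysis expects.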

#### PHG

Binary file added content/images/AGENT_Genotyping_Data_Flow.png
Binary file added content/images/AGENT_WebFrontend.png
Binary file added content/images/AGENT_WebFrontend.pptx
Binary file not shown.
52 changes: 52 additions & 0 deletions content/metadata.yaml
@@ -162,4 +162,56 @@ authors:
    affiliations:
      - CIRAD (french agricultural research and international cooperation organization)
      - South Green Platform
  - name: Stephan Weise
    email: [email protected]
    initials: SW
    affiliations:
      - Leibniz Institute of Plant Genetics and Crop Plant Research
  - name: Patrick König
    email: [email protected]
    initials: PK
    affiliations:
      - Leibniz Institute of Plant Genetics and Crop Plant Research
  - name: Gouripriya Davuluri
    email: [email protected]
    initials: GD
    affiliations:
      - Leibniz Institute of Plant Genetics and Crop Plant Research
  - name: Paul Kersey
    email: [email protected]
    initials: PK
    affiliations:
      - Royal Botanic Gardens, Kew
  - name: Erwan Le-Floch
    email: [email protected]
    initials: ELF
    affiliations:
      - URGI PlantBioinfoPF, INRAE France
  - name: Joseph Ruff
    email: [email protected]
    initials: JR
    affiliations:
      - Royal Botanic Gardens, Kew
  - name: Michael Alaux
    email: [email protected]
    initials: MA
    affiliations:
      - URGI PlantBioinfoPF, INRAE France
  - name: Suman Kumar
    email: [email protected]
    initials: SK
    affiliations:
      - Leibniz Institute of Plant Genetics and Crop Plant Research
  - name: Matthijs Brouwer
    email: [email protected]
    initials: MB
    affiliations:
      - Wageningen University and Research
  - name: Bert Droesbeke
    initials: BD
    github: bedroesb
    orcid: 0000-0003-0522-5674
    email: [email protected]
    affiliations:
      - VIB Data Core
    corresponding: false
