From 9ef1aed4ec47a7af9a993d806ba55488523928b3 Mon Sep 17 00:00:00 2001 From: Peter Selby <32845555+BrapiCoordinatorSelby@users.noreply.github.com> Date: Mon, 19 Aug 2024 20:11:16 +0000 Subject: [PATCH] Merge pull request #77 from plantbreeding/edits_by_patrick [ci skip] This build is based on https://github.com/plantbreeding/BrAPI-Manuscript2/commit/1d3e3a1526b0f6215dabf65bfeb018efbb9db827. This commit was created by the following CI build and job: https://github.com/plantbreeding/BrAPI-Manuscript2/commit/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/checks https://github.com/plantbreeding/BrAPI-Manuscript2/actions/runs/10460189828 --- README.md | 4 +- index.html | 158 +- manuscript.pdf | Bin 2044973 -> 2046722 bytes .../images/AGENT_Genotyping_Data_Flow.png | Bin 0 -> 177493 bytes .../images/AGENT_WebFrontend.pptx | Bin 0 -> 1447179 bytes .../images/AGENT_Web_Frontend.png | Bin 0 -> 268377 bytes .../images/BrAPI_Application_Chart.pdf | Bin 0 -> 41589 bytes .../images/BrAPI_Application_Chart.xlsx | Bin 0 -> 13588 bytes .../images/BrAPI_Domains_v2-1_vertical.png | Bin 0 -> 495793 bytes .../images/BrAPI_Paper_Applications_Chart.png | Bin 0 -> 1987579 bytes .../images/BrAPI_org_structure.jpg | Bin 0 -> 41681 bytes .../images/Schema_FAIDARE.png | Bin 0 -> 142775 bytes .../images/Schema_Florilege.jpg | Bin 0 -> 72787 bytes .../images/github.svg | 4 + .../images/mastodon.svg | 4 + .../images/orcid.svg | 4 + .../images/twitter.svg | 4 + .../index.html | 4855 +++++++++++++++++ .../index.html.ots | Bin 0 -> 503 bytes .../manuscript.pdf | Bin 0 -> 2046722 bytes .../manuscript.pdf.ots | Bin 0 -> 503 bytes v/freeze/index.html | 6 +- v/latest/index.html | 158 +- v/latest/index.html.ots | Bin 503 -> 503 bytes v/latest/manuscript.pdf | Bin 2044973 -> 2046722 bytes v/latest/manuscript.pdf.ots | Bin 573 -> 503 bytes 26 files changed, 5040 insertions(+), 157 deletions(-) create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/AGENT_Genotyping_Data_Flow.png create mode 100644 
v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/AGENT_WebFrontend.pptx create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/AGENT_Web_Frontend.png create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/BrAPI_Application_Chart.pdf create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/BrAPI_Application_Chart.xlsx create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/BrAPI_Domains_v2-1_vertical.png create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/BrAPI_Paper_Applications_Chart.png create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/BrAPI_org_structure.jpg create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/Schema_FAIDARE.png create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/Schema_Florilege.jpg create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/github.svg create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/mastodon.svg create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/orcid.svg create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/images/twitter.svg create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/index.html create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/index.html.ots create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/manuscript.pdf create mode 100644 v/1d3e3a1526b0f6215dabf65bfeb018efbb9db827/manuscript.pdf.ots diff --git a/README.md b/README.md index 57669b32..92af29ca 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@ # Output directory containing the formatted manuscript The [`gh-pages`](https://github.com/plantbreeding/BrAPI-Manuscript2/tree/gh-pages) branch hosts the contents of this directory at . -The permalink for this webpage version is . +The permalink for this webpage version is . To redirect to the permalink for the latest manuscript version at anytime, use the link . 
## Files @@ -35,4 +35,4 @@ Verifying timestamps with the `ots verify` command requires running a local bitc ## Source The manuscripts in this directory were built from -[`dab19d16cde75b4702f4d6bcc35796de67310c51`](https://github.com/plantbreeding/BrAPI-Manuscript2/commit/dab19d16cde75b4702f4d6bcc35796de67310c51). +[`1d3e3a1526b0f6215dabf65bfeb018efbb9db827`](https://github.com/plantbreeding/BrAPI-Manuscript2/commit/1d3e3a1526b0f6215dabf65bfeb018efbb9db827). diff --git a/index.html b/index.html index 9100ce17..205e1321 100644 --- a/index.html +++ b/index.html @@ -129,8 +129,8 @@ - - + + @@ -368,9 +368,9 @@ - - - + + + @@ -387,9 +387,9 @@

BrAPI v2: An application showcase of a unified framework for d

This manuscript -(permalink) +(permalink) was automatically generated -from plantbreeding/BrAPI-Manuscript2@dab19d1 +from plantbreeding/BrAPI-Manuscript2@1d3e3a1 on August 19, 2024.

Authors

@@ -723,7 +723,7 @@

Authors


Leibniz Institute of Plant Genetics and Crop Plant Research -· Funded by The AGENT project is funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 862613. +· Funded by The SHAPE3 project is funded by a grant from the German Ministry of Research and Education (BMBF, FKZ 031B1302A).

  • Suman Kumar
    @@ -1215,32 +1215,31 @@

    DArTView

    DArTView is a desktop application for marker data curation via metadata filtering. It visualizes genotype variant data so that users can easily identify trends or correlations within their data. The primary goal of the tool is to replace the tedious manual calculations on marker data that are otherwise done in common spreadsheet applications like Excel. Users can import marker data from CSV files, and DArTView has recently been enhanced to be BrAPI compatible. BrAPI provides a consistent data standard across databases and data resources, which allows DArTView to use any BrAPI-compatible server as an input data source. DArTView’s compatibility with BrAPI also ensures easy integration with other tools and pipelines that would use DArTView for marker filtering and exploration.

    Initially developed by DArT, the tool is gaining popularity within the breeding community, especially in Africa. Future releases will focus on enhancing BrAPI compatibility, making the tool accessible to more breeders and researchers. A web-enabled version of DArTView is in development. This new version will allow for further collaboration opportunities with other interested partners who would like to integrate it into their pipelines.

    DivBrowse

    - +

    DivBrowse17 is a web platform for exploratory data analysis of large genotyping studies. The software can be run standalone or integrated as a plugin into existing web portals. At its core, DivBrowse combines the convenience of a genome browser with features tailored to germplasm diversity analysis. DivBrowse provides visual access to VCF files obtained through genotyping experiments and can handle hundreds of millions of variants across thousands of samples. It is able to display genomic features such as nucleotide sequence, associated gene models, and short genomic variants. DivBrowse also calculates and displays variant statistics such as minor allele frequencies, the proportion of heterozygous calls, and the proportion of missing variant calls. -Dynamic principal component analyses can be performed on a user-specified genomic area to provide information on local genomic diversity.

    -

    DivBrowse employs the BrAPI-Genotyping module to access genotypic data from external BrAPI endpoints. -DivBrowse also has an interface to BLASTdoi:10.1016/S0022-2836?, which can be used to directly access genes or other genomic features. -The modular structure of DivBrowse allows developers to configure and easily embed links to other external information systems.

    +Dynamic principal component analyses can be performed on a user-specified genomic area to provide information on local genomic diversity. +DivBrowse also has an interface to BLAST+ tools18 installed on Galaxy servers19, which can be used to directly access genes or other genomic features from the results of a custom BLAST query. +DivBrowse employs the BrAPI-Genotyping module to serve genotypic data as a BrAPI endpoint and to get genotypic data from other BrAPI endpoints.
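
    The variant statistics named in the DivBrowse description (minor allele frequency, proportion of heterozygous calls, proportion of missing calls) can be illustrated with a minimal Python sketch. It assumes biallelic variants and diploid genotype calls encoded as VCF-style strings; the function name `variant_stats` and this encoding are illustrative assumptions and are not taken from the DivBrowse code base.

```python
from collections import Counter


def variant_stats(calls):
    """Summarize one biallelic variant across samples.

    calls: list of VCF-style diploid genotype strings such as "0/0", "0/1",
    "1|1", or "./." (missing). Hypothetical helper, not part of DivBrowse.
    """
    allele_counts = Counter()
    n_het = 0
    n_missing = 0
    for call in calls:
        # Treat phased ("|") and unphased ("/") calls the same way.
        alleles = call.replace("|", "/").split("/")
        if "." in alleles:
            # Any missing allele makes the whole call count as missing.
            n_missing += 1
            continue
        allele_counts.update(alleles)
        if len(set(alleles)) > 1:
            n_het += 1

    n_called = len(calls) - n_missing
    total_alleles = sum(allele_counts.values())
    if total_alleles == 0:
        maf = None  # no called genotypes, frequency undefined
    elif len(allele_counts) < 2:
        maf = 0.0  # monomorphic among the called samples
    else:
        maf = min(allele_counts.values()) / total_alleles

    return {
        "maf": maf,
        "het_rate": n_het / n_called if n_called else None,
        "missing_rate": n_missing / len(calls) if calls else None,
    }


stats = variant_stats(["0/0", "0/1", "1/1", "./.", "0/0"])
```

    In this sketch the heterozygosity rate is computed over called genotypes only, while the missing rate is computed over all samples; a real tool may choose different denominators.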

    Flapjack

    -

    Flapjack18 is a multi-platform desktop application for data visualization and breeding analysis (e.g., pedigree verification, marker-assisted backcrossing and forward breeding) using high-throughput genotype data. +

    Flapjack20 is a multi-platform desktop application for data visualization and breeding analysis (e.g., pedigree verification, marker-assisted backcrossing and forward breeding) using high-throughput genotype data. Data can be imported into Flapjack from any BrAPI-compatible data source with genotype data available. Flapjack Bytes is a smaller, lightweight, and fully web-based counterpart to Flapjack that can be easily embedded into a database website to provide similar visualizations online. Traditionally supporting its own text-based data formats, Flapjack’s use of BrAPI has streamlined the end-user experience for data import. Work is underway to determine the best methods to exchange analysis results using future versions of the API.

    Gigwa

    -

    Gigwa is a Java EE web application providing a means to centralize, share, finely filter, and visualize high-throughput genotyping data19. Built on top of MongoDB, it is scalable and can support working smoothly with datasets containing billions of genotypes. It is installable as a Docker image or as an all-in-one bundle archive. It is straightforward to deploy on servers or local computers and has thus been adopted by numerous research institutes from around the world. Notably, Gigwa serves as a collaborative management tool and a portal for exploring public data for genebanks and breeding programs at some CGIAR centers20. The total amount of data hosted and made widely accessible using this system has continued to grow over the last few years.

    -

    The Gigwa development team has been involved in the BrAPI community since 2016 and took part in designing the genotype-related section of the BrAPI standard. Gigwa’s first BrAPI-compliant features were designed for compatibility with the Flapjack visualization tool18. Over time, Gigwa has established itself as the first and most reliable implementation of the BrAPI-Genotyping module. Local collaborators and external partners used it as a reference solution to design a number of tools taking advantage of the BrAPI-Genotyping features (e.g., BeegMac, SnpClust, QBMS).

    +

    Gigwa is a Java EE web application providing a means to centralize, share, finely filter, and visualize high-throughput genotyping data21. Built on top of MongoDB, it is scalable and can support working smoothly with datasets containing billions of genotypes. It is installable as a Docker image or as an all-in-one bundle archive. It is straightforward to deploy on servers or local computers and has thus been adopted by numerous research institutes from around the world. Notably, Gigwa serves as a collaborative management tool and a portal for exploring public data for genebanks and breeding programs at some CGIAR centers22. The total amount of data hosted and made widely accessible using this system has continued to grow over the last few years.

    +

    The Gigwa development team has been involved in the BrAPI community since 2016 and took part in designing the genotype-related section of the BrAPI standard. Gigwa’s first BrAPI-compliant features were designed for compatibility with the Flapjack visualization tool20. Over time, Gigwa has established itself as the first and most reliable implementation of the BrAPI-Genotyping module. Local collaborators and external partners used it as a reference solution to design a number of tools taking advantage of the BrAPI-Genotyping features (e.g., BeegMac, SnpClust, QBMS).

    Some use-cases require Gigwa to also consume data from other BrAPI servers. This requirement led to the implementation of BrAPI client features within Gigwa. A close collaboration was established with the Integrated Breeding Platform team and their widely used Breeding Management System (BMS). This collaboration means both applications are now frequently deployed together; Gigwa pulling germplasm or sample metadata from BMS, and BMS displaying Gigwa-hosted genotypes within its own UI.

    PHG

    -

    The Practical Haplotype Graph (PHG) is a graph-based computational framework that represents large-scale genetic variation and is optimized for plant breeding and genetics21. Using a pangenome approach, each PHG stores haplotypes (the sequence of part of an individual chromosome) to represent the collective genes of a species. This allows for a simplified approach for dealing with large scale variation in plant genomes. The PHG pipeline provides support for a range of genomic analyses and allows for the use of graph data to impute complete genomes from low density sequence or variant data.

    +

    The Practical Haplotype Graph (PHG) is a graph-based computational framework that represents large-scale genetic variation and is optimized for plant breeding and genetics23. Using a pangenome approach, each PHG stores haplotypes (the sequence of part of an individual chromosome) to represent the collective genes of a species. This allows for a simplified approach for dealing with large scale variation in plant genomes. The PHG pipeline provides support for a range of genomic analyses and allows for the use of graph data to impute complete genomes from low density sequence or variant data.

    Users can access the haplotype data either with direct calls to the PHG embedded server or indirectly using the rPHG library from an R environment. The PHG server accepts BrAPI queries to return information on sample lists and the variants used to define the graph’s haplotypes. In addition, PHG users can query the BrAPI variant sets endpoint to retrieve links to VCF files containing haplotype data. Work on the PHG is ongoing, and it is expected to support additional BrAPI endpoints that allow fine-tuned slicing of genotypic data in the near future.
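
    As a rough illustration of the variant sets query described for the PHG, the Python sketch below extracts VCF file links from a BrAPI v2 `/variantsets` response. The payload is a hand-written example that follows the general BrAPI response layout (`result.data` entries with `availableFormats`); the identifiers, the `example.org` URLs, and the helper name `vcf_links` are assumptions for illustration, not output from a real PHG server.

```python
def vcf_links(response):
    """Return (variantSetDbId, fileURL) pairs for formats marked as VCF.

    response: a parsed JSON body from a BrAPI v2 `/variantsets` call.
    Hypothetical helper for illustration only.
    """
    links = []
    for variant_set in response.get("result", {}).get("data", []):
        # `availableFormats` may be absent or null on some servers.
        for fmt in variant_set.get("availableFormats") or []:
            if fmt.get("dataFormat") == "VCF" and fmt.get("fileURL"):
                links.append((variant_set.get("variantSetDbId"), fmt["fileURL"]))
    return links


# Hand-written sample payload; all values are made up.
sample_response = {
    "metadata": {"pagination": {"currentPage": 0, "pageSize": 2}},
    "result": {
        "data": [
            {
                "variantSetDbId": "vs1",
                "availableFormats": [
                    {
                        "dataFormat": "VCF",
                        "fileFormat": "text/tsv",
                        "fileURL": "https://example.org/files/vs1.vcf",
                    }
                ],
            },
            {"variantSetDbId": "vs2", "availableFormats": None},
        ]
    },
}

links = vcf_links(sample_response)
```

    Against a live server, the same function could be applied to the JSON body returned by a GET request to the server's `/variantsets` endpoint; authentication and pagination are left out of this sketch.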

    Germplasm Management

    @@ -1249,38 +1248,38 @@

    Germplasm Management

    Suggested Authors: Matthias Lange, Patrick König, Stephan Weise, Gouripriya Davuluri, Suman Kumar, Joseph Ruff, Paul Kersey, Cyril Pommier, Michael Alaux, Erwan Le Floch -->

    AGENT

    -

    The aim of the AGENT project, funded by the European Commission, is to develop a concept for the digital exploitation and activation of plant genetic resources (PGRs) throughout Europe22. -In the global system for ex situ conservation of PGRs, material is being conserved in about 1750 collections totalling ~5.8 million accessions23. Unique and permanent identifiers in the form of DOIs are available for more than 1.7 million accessions via the Global Information System24 of the International Treaty on Plant Genetic Resources for Food and Agriculture. Each DOI is linked to some basic descriptive data that facilitates the use of these resources, mainly passport data. However, a data space beyond the most basic information is needed that includes genotypic and phenotypic data. This space will help to answer questions about the global biological diversity of plant species, the detection of duplicates, the tracking of provenance for the identification of genetic integrity, the selection of the most suitable material for different purposes, and to support further applications in the field of data mining or AI. In this context, the AGENT project will activate and utilize the PGRs from European ex situ genebanks according to the FAIR principles, and test the resources in practice using two important crops, barley and wheat25. Thirteen European genebanks and five bioinformatics centers are working together and have agreed on standards and protocols for data flow and data formats for the collection, integration, and archiving of genotypic and phenotypic data26.

    -

    The BrAPI specification is one of the agreed standards that are detailed in the AGENT guidelines for dataflow27. The implemented BrAPI interface enables the analysis of current and historic genotypic and phenotypic information. This will drive the discovery of genes, traits, and knowledge for future missions, complement existing information for wheat and barley, and use the new data standards and infrastructure to promote better access and use of PGR for other crops in European genebanks. The AGENT database backend aggregates curated passport data, phenotypic data, and genotypic data on wheat and barley accessions of 18 project partners. This data is accessible via BrAPI endpoints and explorable in a web portal. Genotyping data uses the DivBrowse17 storage engine and its BrAPI interface. Soon, the BrAPI implementation will be expanded to enable the integration of analysis pipelines in the AGENT portal, such as the FIGS+ pipeline developed by ICARDA28. There is also a plan to integrate the data collected by the AGENT project into the European Search Catalogue for Plant Genetic Resources (EURISCO)29. +

    The aim of the AGENT project, funded by the European Commission, is to develop a concept for the digital exploitation and activation of plant genetic resources (PGRs) throughout Europe24. +In the global system for ex situ conservation of PGRs, material is being conserved in about 1750 collections totalling ~5.8 million accessions25. Unique and permanent identifiers in the form of DOIs are available for more than 1.7 million accessions via the Global Information System26 of the International Treaty on Plant Genetic Resources for Food and Agriculture. Each DOI is linked to some basic descriptive data that facilitates the use of these resources, mainly passport data. However, a data space beyond the most basic information is needed that includes genotypic and phenotypic data. This space will help to answer questions about the global biological diversity of plant species, the detection of duplicates, the tracking of provenance for the identification of genetic integrity, the selection of the most suitable material for different purposes, and to support further applications in the field of data mining or AI. In this context, the AGENT project will activate and utilize the PGRs from European ex situ genebanks according to the FAIR principles, and test the resources in practice using two important crops, barley and wheat27. Thirteen European genebanks and five bioinformatics centers are working together and have agreed on standards and protocols for data flow and data formats for the collection, integration, and archiving of genotypic and phenotypic data28.

    +

    The BrAPI specification is one of the agreed standards that are detailed in the AGENT guidelines for dataflow29. The implemented BrAPI interface enables the analysis of current and historic genotypic and phenotypic information. This will drive the discovery of genes, traits, and knowledge for future missions, complement existing information for wheat and barley, and use the new data standards and infrastructure to promote better access and use of PGR for other crops in European genebanks. The AGENT database backend aggregates curated passport data, phenotypic data, and genotypic data on wheat and barley accessions of 18 project partners. This data is accessible via BrAPI endpoints and explorable in a web portal. Genotyping data uses the DivBrowse17 storage engine and its BrAPI interface. Soon, the BrAPI implementation will be expanded to enable the integration of analysis pipelines in the AGENT portal, such as the FIGS+ pipeline developed by ICARDA30. There is also a plan to integrate the data collected by the AGENT project into the European Search Catalogue for Plant Genetic Resources (EURISCO)31.

    Florilège

    Florilège is a web portal designed primarily for the general public to access public plant genetic resources held by the Biological Resource Centers across France, as part of France’s National Research Institute for Agriculture, Food and Environment (INRAE). Through this portal, users can browse accessions from over 50 plant genera, spread across 19 genebanks. It allows users to view available seeds and plant material, including options for ordering material. Florilège provides centralized access to the various French collections of plant genetic resources available to the public.

    -

    Florilège retrieves accession information from several BrAPI-compliant systems. Key among these are OLGA, a genebank accessions management system, and GnpIS, an INRAE data repository for plant genetic resources, phenomics, and genetics30,31. Using BrAPI to gather data from these systems reduced development efforts and enabled standardized data retrieval. As a result, BrAPI has become the de facto standard within the French plant genetic resources community for exchanging information. During development, the Florilège team also proposed several enhancements to the BrAPI specifications themselves, such as additional support for Collection objects or improved reference linking, to better accommodate their specific use case.

    +

    Florilège retrieves accession information from several BrAPI-compliant systems. Key among these are OLGA, a genebank accessions management system, and GnpIS, an INRAE data repository for plant genetic resources, phenomics, and genetics32,33. Using BrAPI to gather data from these systems reduced development efforts and enabled standardized data retrieval. As a result, BrAPI has become the de facto standard within the French plant genetic resources community for exchanging information. During development, the Florilège team also proposed several enhancements to the BrAPI specifications themselves, such as additional support for Collection objects or improved reference linking, to better accommodate their specific use case.

    GLIS

    -

    The Global Information System (GLIS) on Plant Genetic Resources for Food and Agriculture (PGRFA) of the International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA) is a web-based, BrAPI-compliant global entry point for PGRFA data24. It allows users and third-party systems to access information and knowledge on scientific, technical, and environmental matters to strengthen PGRFA conservation, management, and utilization activities. The system and its portal also enable recipients of PGRFA to make available all non-confidential information on germplasm according to the provisions of the Treaty and facilitates access to the results of their research and development.

    +

    The Global Information System (GLIS) on Plant Genetic Resources for Food and Agriculture (PGRFA) of the International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA) is a web-based, BrAPI-compliant global entry point for PGRFA data26. It allows users and third-party systems to access information and knowledge on scientific, technical, and environmental matters to strengthen PGRFA conservation, management, and utilization activities. The system and its portal also enable recipients of PGRFA to make available all non-confidential information on germplasm according to the provisions of the Treaty and facilitate access to the results of their research and development.

    Thanks to the adoption of Digital Object Identifiers (DOIs) for Multi-Crop Passport Descriptors (MCPD) of PGRFA accessions, the GLIS Portal provides access to 1.7 million PGRFA in collections conserved worldwide. Of these, over 1.5 million are accessible for research, training and plant breeding in the food and agriculture domain.

    The Scientific Advisory Committee of the ITPGRFA has repeatedly welcomed efforts on interoperability among germplasm information systems. In this context, the GLIS Portal adopted BrAPI v1.3 in 2022. Integrating BrAPI among the GLIS content negotiators facilitates queries and the exchange of content for data management in plant breeding. The Portal also offers other protocols (XML, DarwinCore, JSON and JSON-LD) to increase data and metadata connectivity. An upgrade to BrAPI v2 is planned for the near future, depending on the availability of resources.

    Helium

    -

    Helium32 is a plant pedigree visualization platform designed to account for the specific problems that are unique to plant pedigrees. A pedigree is a representation of how genetically discrete individuals are related to one another and is therefore a representation of the genetic relationship between individual plant lines, their parents and progeny. Plant pedigrees are often used to check for potential genotyping or phenotyping errors, since these errors, by the very nature of Mendelian inheritance, are constrained by the pedigree structure in which they exist33. The accurate representation of plant pedigrees, and the ability to pull pedigree data from different data sources is important in plant breeding and genetics. Therefore, ways to visualize and interact this complex data in meaningful ways is critical.

    -

    From its original desktop interface, Helium has developed into a web-based visualization platform implementing BrAPI calls to allow users to import data from other BrAPI-compliant databases. The ability to pull data from BrAPI-compliant data sources has significantly expanded Helium’s capability and utility within the community. Helium is used in projects ranging in size from tens to tens of thousands of lines and across a wide variety of crops and species. While originally designed for plant data34 it has also found utility in other non-plant projects35 highlighting its broad utility. BrAPI also allows Helium to provide direct dataset links to collaborators, allowing the original data to be held with the data provider and utilizing Helium for its visualization functionality. Our current Helium deployment includes example BrAPI calls to a barley dataset at the James Hutton Institute to allow users to test the system and features it offers.

    +

    Helium34 is a plant pedigree visualization platform designed to account for the specific problems that are unique to plant pedigrees. A pedigree is a representation of how genetically discrete individuals are related to one another and is therefore a representation of the genetic relationship between individual plant lines, their parents and progeny. Plant pedigrees are often used to check for potential genotyping or phenotyping errors, since these errors, by the very nature of Mendelian inheritance, are constrained by the pedigree structure in which they exist35. The accurate representation of plant pedigrees, and the ability to pull pedigree data from different data sources, is important in plant breeding and genetics. Therefore, ways to visualize and interact with this complex data in meaningful ways are critical.

    +

    From its original desktop interface, Helium has developed into a web-based visualization platform implementing BrAPI calls to allow users to import data from other BrAPI-compliant databases. The ability to pull data from BrAPI-compliant data sources has significantly expanded Helium’s capability and utility within the community. Helium is used in projects ranging in size from tens to tens of thousands of lines and across a wide variety of crops and species. While originally designed for plant data36, it has also been used in non-plant projects37, highlighting its broad applicability. BrAPI also allows Helium to provide direct dataset links to collaborators, so the original data stays with the data provider while Helium is used for its visualization functionality. Our current Helium deployment includes example BrAPI calls to a barley dataset at the James Hutton Institute to allow users to test the system and the features it offers.

    MGIS

    -

    The Musa Germplasm Information System (MGIS) serves as a comprehensive community portal dedicated to banana diversity, a crop critical to global food security14. MGIS offers detailed information on banana germplasm, focusing on the collections held by the CGIAR International Banana Genebank (ITC)36. It is built on the Drupal/Tripal technology, like BIMS37 and Florilège.

    -

    Since its inception, MGIS developers have actively participated in the BrAPI community. The MGIS team pushed for the integration of the Multi-Crop Passport Data (MCPD) standard into the Germplasm module of the API. MCPD support was added in BrAPI v1.3, and MGIS now provides passport data information on ITC banana genebank accessions (with GLIS DOI), synchronized with Genesys. MGIS also enriches the passport data by incorporating additional information from other germplasm collections worldwide. All the germplasm data is available through the BrAPI-Germplasm module implementation. For genotyping data, MGIS integrates with Gigwa19, which provides a tailored implementation of the BrAPI genotyping module. Furthermore, MGIS supports a set of BrAPI-Phenotyping module endpoints, facilitating the exposure of morphological descriptors and trait information supported by ontologies like the Crop Ontology38. MGIS has integrated the Trait Selector BrAPP, and there are use cases implemented to interlink genebank and breeding data between MGIS and the breeding database MusaBase.

    +

    The Musa Germplasm Information System (MGIS) serves as a comprehensive community portal dedicated to banana diversity, a crop critical to global food security14. MGIS offers detailed information on banana germplasm, focusing on the collections held by the CGIAR International Banana Genebank (ITC)38. It is built on the Drupal/Tripal technology, like BIMS39 and Florilège.

    +

    Since its inception, MGIS developers have actively participated in the BrAPI community. The MGIS team pushed for the integration of the Multi-Crop Passport Data (MCPD) standard into the Germplasm module of the API. MCPD support was added in BrAPI v1.3, and MGIS now provides passport data information on ITC banana genebank accessions (with GLIS DOI), synchronized with Genesys. MGIS also enriches the passport data by incorporating additional information from other germplasm collections worldwide. All the germplasm data is available through the BrAPI-Germplasm module implementation. For genotyping data, MGIS integrates with Gigwa21, which provides a tailored implementation of the BrAPI genotyping module. Furthermore, MGIS supports a set of BrAPI-Phenotyping module endpoints, facilitating the exposure of morphological descriptors and trait information supported by ontologies like the Crop Ontology40. MGIS has integrated the Trait Selector BrAPP, and there are use cases implemented to interlink genebank and breeding data between MGIS and the breeding database MusaBase.

    Breeding and Genetics Data Management

    While specialty data management is important for some use cases, often breeders want a central repository or access point of critical data. General breeding and genetics data management systems and web portals support some level of phenotypic, genotypic, and germplasm data, as well as trial, equipment, and people management. By enabling BrAPI support, these larger systems can connect with smaller tools and specialty systems to provide more functionality under the same user interface. There are several breeding data management systems developed in the BrAPI community, each with their own strengths.

    BIMS

    -

    The Breeding Information Management System (BIMS)37 is a free, secure, online breeding management system which allows breeders to store, manage, archive, and analyze their private breeding program data. +

    The Breeding Information Management System (BIMS)39 is a free, secure, online breeding management system which allows breeders to store, manage, archive, and analyze their private breeding program data. BIMS enables individual breeders to have complete control of their own breeding data along with access to tools such as data import, export, analysis, and archiving for their germplasm, phenotype, genotype, and image data. -BIMS is currently implemented in five community databases, the Genome Database for Rosaceae39, CottonGEN40, the Citrus Genome Database, the Pulse Crop Database, and the Genome Database for Vaccinium, where it enables individual breeders to import publicly available data. +BIMS is currently implemented in five community databases, the Genome Database for Rosaceae41, CottonGEN42, the Citrus Genome Database, the Pulse Crop Database, and the Genome Database for Vaccinium, where it enables individual breeders to import publicly available data. BIMS is also implemented in the public database breedwithbims.org that any breeder can use.

    -

    BIMS primarily utilizes BrAPI to connect with Field Book4, enabling seamless data transfer between data collection and subsequent management in BIMS. Data transfer through BrAPI between BIMS and other resources such as Breedbase13, Gigwa19, and the Breeder Genomics Hub41 is under development.

    +

    BIMS primarily utilizes BrAPI to connect with Field Book4, enabling seamless data transfer between data collection and subsequent management in BIMS. Data transfer through BrAPI between BIMS and other resources such as Breedbase13, Gigwa21, and the Breeder Genomics Hub43 is under development.

    BMS

    The Breeding Management System (BMS), developed by the Integrated Breeding Platform (IBP), is a suite of tools designed to enhance the efficiency and effectiveness of plant breeding. BMS covers all stages of the breeding process, with the emphasis on germplasm management and ontology-harmonized phenotyping (i.e. with the Crop Ontology). It also features analytics and decision-support tools. With its focus on interoperability, BMS integrates smoothly with BrAPI, facilitating easy connections with a broad array of complementary tools and databases. Notably, the BMS is often deployed together with Gigwa to fulfill the genotyping data management needs of BMS users.

    @@ -1288,24 +1287,24 @@

    BMS

    Additionally, brapi-sync improves data management by maintaining links to the original source of each entity it transmits. This retains the original context of the data and establishes a traceability mechanism for accurate data source attribution and verification. Such practices are crucial for maintaining data integrity and fostering trust among collaborative partners, ensuring access to accurate, reliable, and current information.

    Breedbase

    -

    Breedbase is a comprehensive, open-source breeding data management system13,42 that implements a digital ecosystem for all breeding data, including trial data, phenotypic data, and genotypic data. Data acquisition is supported through data collection apps such as Field Book4, Coordinate, and InterCross, as well as through drone imagery, Near Infra-Red Spectroscopy (NIRS), and other technologies. Search functions, such as the Search Wizard interface, provide powerful query capabilities. Various breeding-centric analysis tools are available, including mixed models, heritability, stability, principal component analysis (PCA), and various clustering algorithms. The original impetus for creating Breedbase was the advent of new breeding paradigms based on genomic information, such as genomic prediction algorithms43, and the accompanying data management challenges. Thus, a complete genomic prediction workflow is integrated into the system.

    -

    The BrAPI interface is crucial for Breedbase. Breedbase uses BrAPI to connect with the data collection apps, other projects such as CLIMMOB6, and native BrAPPs built into the Breedbase webpage. Users also appreciate the ability to connect to Breedbase instances using packages such as QBMS44 for data import into R for custom analyses. The Breedbase team has been part of the BrAPI community since its inception and has continuously adopted and contributed to the BrAPI standard.

    +

    Breedbase is a comprehensive, open-source breeding data management system13,44 that implements a digital ecosystem for all breeding data, including trial data, phenotypic data, and genotypic data. Data acquisition is supported through data collection apps such as Field Book4, Coordinate, and InterCross, as well as through drone imagery, Near Infra-Red Spectroscopy (NIRS), and other technologies. Search functions, such as the Search Wizard interface, provide powerful query capabilities. Various breeding-centric analysis tools are available, including mixed models, heritability, stability, principal component analysis (PCA), and various clustering algorithms. The original impetus for creating Breedbase was the advent of new breeding paradigms based on genomic information, such as genomic prediction algorithms45, and the accompanying data management challenges. Thus, a complete genomic prediction workflow is integrated into the system.

    +

    The BrAPI interface is crucial for Breedbase. Breedbase uses BrAPI to connect with the data collection apps, other projects such as CLIMMOB6, and native BrAPPs built into the Breedbase webpage. Users also appreciate the ability to connect to Breedbase instances using packages such as QBMS46 for data import into R for custom analyses. The Breedbase team has been part of the BrAPI community since its inception and has continuously adopted and contributed to the BrAPI standard.

    DeltaBreed

    DeltaBreed is an open-source breeding data management system designed and developed by Breeding Insight to support U.S. Department of Agriculture - Agricultural Research Service (USDA-ARS) specialty crop and animal breeders. DeltaBreed differs from other related systems in that it is customizable to small breeding teams and generalized enough to support the workflows of diverse species. DeltaBreed is a unified system that connects a variety of BrAPI applications. BrAPI integration allows the complexity underlying interoperability to be hidden, shielding users from multifactorial differences between various applications. DeltaBreed, adhering to the BrAPI model, establishes data standards and validations for users and provides a singular framework for data management and user training. BrAPI-enabled connections are in use with the following tools: BrAPI Java Test Server, Breedbase, Field Book, Gigwa, QBMS, Mr.Bean, Helium, and the Pedigree Viewer BrAPP.

    DeltaBreed users may not be aware of BrAPI or the specifics of underlying tools, but will notice that BrAPI interoperability reduces the need for human-mediated file transfers and data manipulation. Field Book users, for example, can connect to their DeltaBreed program, authenticate, and pull studies and observation variables directly from DeltaBreed to Field Book on their data collection device. The subsequent step of pushing observations from Field Book to DeltaBreed is straightforward via BrAPI, but is pending implementation until data quality validations are put in place; these include improved data transaction handling and differentiation of intentional and inadvertent repeated measures.
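    The pull step described above can be sketched as a pair of authenticated GET requests. This is only an illustration, not Field Book or DeltaBreed code: the host name, token, and program identifier are placeholders, the `/brapi/v2/` base path is assumed to sit directly under the server root (a real deployment may use a different prefix), and the helper `brapi_get_request` is hypothetical. The requests are built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def brapi_get_request(base_url, endpoint, token=None, **params):
    """Build (but do not send) a GET request for a BrAPI v2 endpoint.

    base_url, token, and the filter values are placeholders chosen for
    illustration; the '/brapi/v2/' prefix is an assumption about the server.
    """
    url = f"{base_url.rstrip('/')}/brapi/v2/{endpoint}"
    query = urlencode(params)
    if query:
        url = f"{url}?{query}"
    headers = {"Accept": "application/json"}
    if token:
        # BrAPI servers that require authentication expect a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, headers=headers, method="GET")


# Hypothetical server, program, and study identifiers.
studies_request = brapi_get_request(
    "https://deltabreed.example.org/", "studies",
    token="abc", programDbId="prog-1", page=0, pageSize=100,
)
variables_request = brapi_get_request(
    "https://deltabreed.example.org/", "variables",
    token="abc", studyDbId="study-7",
)
```

    Passing either request to `urllib.request.urlopen` would perform the call against a real server. Keeping construction separate from sending makes the URL and headers easy to inspect.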

    FAIDARE

    -

    FAIDARE45 is a data discovery portal providing a biologist-friendly search system over a global federation of 40 plant research databases. +

    FAIDARE47 is a data discovery portal providing a biologist-friendly search system over a global federation of 40 plant research databases. It allows users to identify data resources using a full text search approach combined with domain specific filters. Each search result contains a link back to the original database for visualization, analysis, and download. The indexed data types are broad and include genomic features, selected bibliography, QTL, markers, genetic variation studies, phenomic studies, and plant genetic resources. This inclusiveness is achieved thanks to a two stage indexation data model. The first index, more generic, provides basic search functionalities and relies on five fields: name, link back URL, data type, species, and exhaustive description. To provide more advanced filtering, the second stage indexation mechanism takes advantage of BrAPI endpoints to get more detailed metadata on germplasm, genotyping studies and phenotyping studies.
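    As a rough illustration of the first, generic index, the five fields named above can be modelled as a small record with a text search and optional filters. This is a sketch under stated assumptions, not FAIDARE's implementation: the class and function names are hypothetical, the sample entries and URLs are invented, and a plain substring match stands in for a real full-text engine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IndexEntry:
    """One first-stage index record with the five generic fields."""
    name: str
    url: str          # link back to the original database
    data_type: str
    species: str
    description: str


def search(entries: List[IndexEntry], text: str,
           data_type: Optional[str] = None,
           species: Optional[str] = None) -> List[IndexEntry]:
    """Case-insensitive substring search with optional exact-match filters."""
    needle = text.lower()
    hits = []
    for entry in entries:
        haystack = f"{entry.name} {entry.description}".lower()
        if needle not in haystack:
            continue
        if data_type is not None and entry.data_type != data_type:
            continue
        if species is not None and entry.species != species:
            continue
        hits.append(entry)
    return hits


# Invented sample records for illustration only.
entries = [
    IndexEntry("Accession A1", "https://db-one.example.org/a1",
               "Germplasm", "Triticum aestivum", "Winter wheat landrace"),
    IndexEntry("Trial T9", "https://db-two.example.org/t9",
               "Phenotyping study", "Triticum aestivum", "Wheat drought trial"),
    IndexEntry("Accession B2", "https://db-one.example.org/b2",
               "Germplasm", "Zea mays", "Maize inbred line"),
]

wheat_germplasm = search(entries, "wheat", data_type="Germplasm")
```

    Each hit carries its link-back URL, so a result page only needs these five fields. The richer BrAPI metadata is needed only for the advanced filters of the second stage.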

    -

    The FAIDARE indexation mechanism relies on a public software package46 that allows data resource managers to request the indexation of their database. +

    The FAIDARE indexation mechanism relies on a public software package48 that allows data resource managers to request the indexation of their database. This BrAPI client is currently able to extract data from any BrAPI v1.3 and v1.2 endpoint, and the development of BrAPI v2.x indexation will be initiated in 2025. Since not all databases are willing to implement BrAPI endpoints, it is possible to generate metadata as static BrAPI-compliant JSON files, using the BrAPI standard as a file exchange format.

    -

    The FAIDARE architecture has been designed by elaborating on the BrAPI data model in combination with the GnpIS Software Architecture30. +

    The FAIDARE architecture has been designed by elaborating on the BrAPI data model in combination with the GnpIS Software Architecture32. It uses an Elasticsearch NoSQL engine that searches and serves enriched versions of the BrAPI JSON data model. FAIDARE also includes a BrAPI endpoint using all indexed metadata. It has been adopted by several communities including the ELIXIR and EMPHASIS European infrastructures, and the WheatIS of the Wheat-Initiative. @@ -1313,7 +1312,7 @@

    FAIDARE

    Germinate

    -

    Germinate47,48 is an open-source plant genetic resources database that combines and integrates various types of plant breeding data including genotypic, phenotypic, passport, image, geographic, and climate data into a single repository. +

    Germinate49,50 is an open-source plant genetic resources database that combines and integrates various types of plant breeding data including genotypic, phenotypic, passport, image, geographic, and climate data into a single repository. Germinate is tightly linked to the BrAPI specification and supports the majority of BrAPI endpoints for querying, filtering, and submission.

    Germinate connects with other BrAPI-enabled tools such as GridScore for phenotypic data collection, Flapjack for genotypic data visualization, and Helium for pedigree visualization. Additionally, due to the nature of BrAPI, Germinate can act as a data repository for any BrAPI-compatible tool. @@ -1329,13 +1328,13 @@

    QBMS

    Many plant breeders and geneticists analyze their datasets using the R statistical programming language, but this requires the import of data into an R environment. BrAPI enables access to pull datasets into R from compatible databases, but API backend processes, such as authentication, tokens, TCP/IP protocol, JSON format, pagination, stateless calls, asynchronous communication, and database IDs are complex for users to navigate. -The QBMS R package eliminates technical barriers scientists experience when using the BrAPI specification in their analysis scripts and pipelines by providing breeders with stateful functions familiar to them when navigating their GUI systems44. +The QBMS R package eliminates technical barriers scientists experience when using the BrAPI specification in their analysis scripts and pipelines by providing breeders with stateful functions familiar to them when navigating their GUI systems46. QBMS enables users to query and extract data into a dataframe, a common structure in the R language, providing an intuitive connection with breeding data management systems.
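    To make one of these hurdles concrete, the sketch below walks every page of a paginated BrAPI-style response and collects the rows into one list. It is written in Python for illustration and is not QBMS code: `fetch_all` and `make_fake_server` are hypothetical helpers, and the in-memory fake server stands in for a real HTTP call. The response layout (`metadata.pagination.totalPages`, `result.data`, zero-based page numbers) follows the BrAPI response envelope.

```python
from typing import Callable, Dict, List


def fetch_all(fetch_page: Callable[[int, int], Dict],
              page_size: int = 1000) -> List[Dict]:
    """Collect the rows from every page of a paginated BrAPI-style response.

    fetch_page(page, page_size) is any callable that returns one response
    as a dict; page numbers start at zero.
    """
    records: List[Dict] = []
    page = 0
    while True:
        response = fetch_page(page, page_size)
        records.extend(response["result"]["data"])
        total_pages = response["metadata"]["pagination"]["totalPages"]
        page += 1
        if page >= total_pages:
            break
    return records


def make_fake_server(rows: List[Dict]) -> Callable[[int, int], Dict]:
    """Return a fetch_page function that serves rows from memory (test stand-in)."""
    def fetch_page(page: int, page_size: int) -> Dict:
        start = page * page_size
        total_pages = (len(rows) + page_size - 1) // page_size
        return {
            "metadata": {"pagination": {
                "currentPage": page,
                "pageSize": page_size,
                "totalCount": len(rows),
                "totalPages": total_pages,
            }},
            "result": {"data": rows[start:start + page_size]},
        }
    return fetch_page


# Seven invented study records served three per page.
rows = [{"studyDbId": str(i)} for i in range(7)]
all_rows = fetch_all(make_fake_server(rows), page_size=3)
```

    A wrapper of this kind lets an analysis script receive one complete table rather than having to track page counters itself.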

    -

    The community has built extended solutions on top of QBMS, incorporating the package into R-Shiny BrAPPs such as Mr.Bean49 (described below). +

    The community has built extended solutions on top of QBMS, incorporating the package into R-Shiny BrAPPs such as Mr.Bean51 (described below). QBMS is open-source and available on the official CRAN repository, where it has garnered over 9400 downloads.

    Mr.Bean

    -

    Mr.Bean49 is a graphical user interface (GUI) designed to assist breeders, statisticians, and individuals involved in plant breeding programs with the analysis of field trials. By utilizing innovative methodologies such as SpATS for modeling spatial trends and autocorrelation models to address spatial variability, Mr.Bean proves highly practical and powerful in facilitating faster and more effective decision-making. Modeling genotype-by-environment interaction poses its challenges, but Mr.Bean offers the capability to explore various variance-covariance matrices, including factor analytic, compound symmetry, and heterogeneous variances. This aids in the assessment of genotype performance across diverse environments.

    +

    Mr.Bean51 is a graphical user interface (GUI) designed to assist breeders, statisticians, and individuals involved in plant breeding programs with the analysis of field trials. By utilizing innovative methodologies such as SpATS for modeling spatial trends and autocorrelation models to address spatial variability, Mr.Bean proves highly practical and powerful in facilitating faster and more effective decision-making. Modeling genotype-by-environment interaction poses its challenges, but Mr.Bean offers the capability to explore various variance-covariance matrices, including factor analytic, compound symmetry, and heterogeneous variances. This aids in the assessment of genotype performance across diverse environments.

    Mr.Bean boasts flexibility in importing different file types, yet for users managing their data within data management systems, the process of downloading from their systems and importing it into Mr.Bean can be cumbersome. To address this issue, QBMS was integrated into the back end. This feature prompts users to input the URL of a BrAPI compatible server, enter their credentials (if necessary), and select the specific trial they wish to analyze. Subsequently, users can seamlessly access their dataset through BrAPI and utilize it across the entire Mr.Bean interface.

    SCT

    @@ -1353,7 +1352,7 @@

    BrAPIMapper

    BrAPIMapper is a full BrAPI implementation designed to be a convenient wrapper for any breeding related data source. BrAPIMapper is provided as a Docker application that can connect to a variety of external data sources including mySQL or PostgreSQL databases, generic REST services, flat files (XML, JSON, CSV/TSV/GFF3/VCF, YAML), or any combination of these. It provides an administration user interface to map BrAPI data models to external data sources. The interface allows administrators to select the BrAPI specification versions to use and which endpoints to enable. Data mapping configuration import and export features simplify upgrades to future BrAPI versions; administrators only have to map missing fields or make minor adjustments. BrAPIMapper supports the primary BrAPI features including paging, deferred search results, user lists, and authentication. Access restrictions to specific endpoints can be managed through the administration interface as well. This tool aims to accelerate BrAPI services deployment while ensuring specification compliance.

    MIRA and BrAPI2ISA

    -

    Since the release of BrAPI 1.3, efforts have been made to incorporate support for the MIAPPE (Minimal Information About a Plant Phenotyping Experiment)10 standard into the specification, achieving full compatibility in BrAPI 2.0. Consequently, BrAPI now includes all attributes necessary for MIAPPE compliance, adhering to standardized descriptions in accordance with MIAPPE guidelines. In some communities and projects, phenotyping data and metadata are archived and published as structured ISA-Tab files, validated using the MIAPPE ISA configuration50. Although ISA-Tab is easy to read for non-technical experts due to its file-based approach, it lacks programmatic accessibility, particularly for web applications.

    +

    Since the release of BrAPI 1.3, efforts have been made to incorporate support for the MIAPPE (Minimal Information About a Plant Phenotyping Experiment)10 standard into the specification, achieving full compatibility in BrAPI 2.0. Consequently, BrAPI now includes all attributes necessary for MIAPPE compliance, adhering to standardized descriptions in accordance with MIAPPE guidelines. In some communities and projects, phenotyping data and metadata are archived and published as structured ISA-Tab files, validated using the MIAPPE ISA configuration52. Although ISA-Tab is easy to read for non-technical experts due to its file-based approach, it lacks programmatic accessibility, particularly for web applications.

    MIRA enables the automatic deployment of a BrAPI server on a MIAPPE-compliant dataset in ISA-Tab format, facilitating programmatic access to these datasets. It is deployable from a Docker image with the dataset mounted. The tool leverages the mapping between MIAPPE, ISA-Tab, and BrAPI, eliminating the need for parsing or manual mapping of datasets compliant with (meta-)data standards. By providing programmatic access through BrAPI, MIRA facilitates the integration of phenotyping datasets into web applications.

    The BrAPI2ISA service functions as a converter between a BrAPI-compatible server and the ISA-Tab format. The tool simplifies, automates, and facilitates the archiving of data, thereby enhancing data preservation and accessibility. The BrAPI2ISA tool is compatible with BrAPI 1.3 and welcomes community contributions to support the latest versions of BrAPI.

    GraphQL Data-warehouse

    @@ -1361,7 +1360,7 @@

    GraphQL Data-warehouse

    Using the Zendro set of automatic software code generators, a fully functional, efficient, and cloud-capable BrAPI data-warehouse has been created for the current version of the BrAPI data models. Unlike most BrAPI-compliant data sources, this data-warehouse supports a GraphQL API rather than a RESTful API. This API provides secure access to data read and write functions for all BrAPI data models. It provides create, read, update, and delete (CRUD) functions that are standardized and accept the same parameters for all data models. Zendro supports a large number of underlying database systems, allowing flexibility during installation and integration.

    The GraphQL server is particularly rich in features. Logical filters allow for exhaustive search queries, whose structure is highly intuitive and based around logical triplets. A large collection of operators is available and triplets can be combined to logical search trees using “and” or “or” operators. Searches can be extended over relationships between data models, thus enabling a user to query the warehouse for exactly the required data. Authorization is based on user roles and can be configured differently for each single data model read or write function. The generated graphical interface allows for the integration of interactive scientific plots and analysis tools written in JavaScript or WebAssembly.
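    The idea of combining logical triplets into a search tree can be illustrated with a small local evaluator. This is a sketch under stated assumptions, not Zendro code: the node layout (`field`, `value`, `operator`, and a nested `search` list for "and"/"or" nodes), the operator names, and the sample records are chosen for illustration, and only a handful of operators are shown.

```python
from typing import Any, Callable, Dict, List

# A small, illustrative subset of comparison operators.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": lambda actual, expected: actual == expected,
    "gt": lambda actual, expected: actual > expected,
    "lt": lambda actual, expected: actual < expected,
    "contains": lambda actual, expected: expected in str(actual),
}


def matches(record: Dict[str, Any], node: Dict[str, Any]) -> bool:
    """Evaluate a search tree against one record.

    Leaf nodes are triplets (field, operator, value); inner nodes combine
    their children with "and" or "or".
    """
    operator = node["operator"]
    if operator in ("and", "or"):
        results = (matches(record, child) for child in node["search"])
        return all(results) if operator == "and" else any(results)
    return OPERATORS[operator](record[node["field"]], node["value"])


# Invented observation records for illustration only.
records: List[Dict[str, Any]] = [
    {"studyName": "Trial 2020", "trait": "plant height", "value": 182.0},
    {"studyName": "Trial 2021", "trait": "plant height", "value": 240.5},
    {"studyName": "Trial 2021", "trait": "fresh root yield", "value": 31.2},
]

# trait == "plant height" AND (value > 200 OR studyName contains "2020")
query = {
    "operator": "and",
    "search": [
        {"field": "trait", "operator": "eq", "value": "plant height"},
        {"operator": "or", "search": [
            {"field": "value", "operator": "gt", "value": 200},
            {"field": "studyName", "operator": "contains", "value": "2020"},
        ]},
    ],
}

selected = [record for record in records if matches(record, query)]
```

    Because every node is either a triplet or a combination of nodes, arbitrarily deep filters can be expressed with the same three keys.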

    -

    An example data warehouse is publicly available and offers full read access in the graphical user interface and through the GraphQL API. The example warehouse is populated with public CassavaBase data51 to create a fully BrAPI-compliant example based on Zendro. Three interactive scientific example plots are available to explore the data. The first is a boxplot comparing Cassava harvest indices measured for four different experiments. Next, an interactive raincloud plot provides an alternative visualization of the same data. Finally, a scatterplot shows how Cassava fresh root yield and plant height are correlated based on data from a single study.

    +

    An example data warehouse is publicly available and offers full read access in the graphical user interface and through the GraphQL API. The example warehouse is populated with public CassavaBase data53 to create a fully BrAPI-compliant example based on Zendro. Three interactive scientific example plots are available to explore the data. The first is a boxplot comparing Cassava harvest indices measured for four different experiments. Next, an interactive raincloud plot provides an alternative visualization of the same data. Finally, a scatterplot shows how Cassava fresh root yield and plant height are correlated based on data from a single study.

    Discussion

    BrAPI for Breeders

    While the BrAPI technical specification is designed to be read and used by software developers, its underlying purpose is to support the work of breeders and other scientists by making routine processes faster, easier, and cheaper. BrAPI offers a convenient path to automation, interoperability, and data integration for software tools in breeding, genetics, phenomics, and other related agricultural domains. By integrating the tools described above, breeders and scientists can spend less time on data management and more time focusing on science. For many, the ultimate goal is the development of a digital data ecosystem: a collection of software tools and applications that can all work together seamlessly. In this scenario, data is digitally collected, automatically sent to quality control systems, batch analyzed to provide actionable insights, and finally stored in accessible databases for long-term applications. As tools continue to adopt the BrAPI standard, this vision is beginning to approach reality.

    @@ -1396,6 +1395,7 @@

    Acknowledgements

  • The French Networks of Biological Resource Centres for Agricultural, Environmental and Life Sciences, doi: 10.15454/b4ec-tf49
  • The work was supported by the German Research Foundation DFG under the grant agreement number 442032008 (NFDI4Biodiversity). NFDI4Biodiversity is part of NFDI, the National Research Data Infrastructure in Germany (www.nfdi.de).
  • The Bill and Melinda Gates Foundation in cooperation with the Excellence in Breeding Platform of the CGIAR
  • +
  • The development of DivBrowse is funded by a grant from the German Ministry of Research and Education (BMBF, FKZ 031B1302A).

    Author Contributions