From 031169392c7b086850740141be23886c2e5a63cb Mon Sep 17 00:00:00 2001 From: Peter Selby Date: Tue, 7 May 2024 16:02:14 -0400 Subject: [PATCH] break up success stories into files --- ....00.success.md => 03.00.HEADER.Success.md} | 0 content/03.01.--.HEADER.Data_Collection.md | 7 ++ content/03.01.01.Field_Book.md | 6 ++ content/03.01.02.ClimMob.md | 5 ++ content/03.01.03.Image_Breed.md | 5 ++ content/03.01.04.GridScore.md | 5 ++ content/03.02.--.HEADER.Data_Management.md | 6 ++ content/03.02.01.PHIS.md | 9 +++ content/03.02.02.DeltaBreed.md | 19 +++++ content/03.02.03.BMS.md | 7 ++ content/03.02.04.Breedbase.md | 3 + content/03.02.05.BIMS.md | 3 + content/03.02.06.Germinate.md | 5 ++ content/03.02.07.PIPPA.md | 13 +++ content/03.02.08.MGIS.md | 4 + ....03.--.HEADER.Federation_Infrastructure.md | 8 ++ content/03.03.01.AGENT_Portal.md | 16 ++++ content/03.03.02.MIAPPE_MIRA.md | 4 + content/03.03.03.MIAPPE_BrAPI2ISA.md | 3 + content/03.03.04.BrAPIMapper.md | 4 + content/03.04.--.HEADER.Data_Visualization.md | 6 ++ content/03.04.01.Flapjack.md | 3 + content/03.04.02.Helium.md | 5 ++ content/03.04.03.Trait_Selector_BrAPP.md | 6 ++ content/03.04.04.DArTView.md | 4 + content/03.04.05.DivBrowse.md | 5 ++ content/03.05.--.HEADER.Analytics.md | 6 ++ content/03.05.01.QBMS.md | 7 ++ content/03.05.02.Mr_Bean.md | 5 ++ content/03.05.03.G-Crunch.md | 4 + .../03.06.--.HEADER.Samples_and_Genotypes.md | 6 ++ ...les-and-genotypes.md => 03.06.01.GIGWA.md} | 19 ----- content/03.06.02.PHG.md | 6 ++ content/03.07.00.HEADER.Data_Portals.md | 6 ++ content/03.07.01.FAIDARE.md | 8 ++ content/03.07.02.GLIS.md | 7 ++ content/03.07.03.FLORILEGE.md | 8 ++ content/03.07.04.Zendro.md | 15 ++++ .../03.01.data-collection.md | 34 -------- .../03.02.data-management.md | 79 ------------------- .../03.03.federation-infrastructure.md | 44 ----------- .../03.success_stories/03.04.visualization.md | 45 ----------- content/03.success_stories/03.05.analytics.md | 30 ------- 
.../03.success_stories/03.07.data-portals.md | 53 ------------- 44 files changed, 239 insertions(+), 304 deletions(-) rename content/{03.success_stories/03.00.success.md => 03.00.HEADER.Success.md} (100%) create mode 100644 content/03.01.--.HEADER.Data_Collection.md create mode 100644 content/03.01.01.Field_Book.md create mode 100644 content/03.01.02.ClimMob.md create mode 100644 content/03.01.03.Image_Breed.md create mode 100644 content/03.01.04.GridScore.md create mode 100644 content/03.02.--.HEADER.Data_Management.md create mode 100644 content/03.02.01.PHIS.md create mode 100644 content/03.02.02.DeltaBreed.md create mode 100644 content/03.02.03.BMS.md create mode 100644 content/03.02.04.Breedbase.md create mode 100644 content/03.02.05.BIMS.md create mode 100644 content/03.02.06.Germinate.md create mode 100644 content/03.02.07.PIPPA.md create mode 100644 content/03.02.08.MGIS.md create mode 100644 content/03.03.--.HEADER.Federation_Infrastructure.md create mode 100644 content/03.03.01.AGENT_Portal.md create mode 100644 content/03.03.02.MIAPPE_MIRA.md create mode 100644 content/03.03.03.MIAPPE_BrAPI2ISA.md create mode 100644 content/03.03.04.BrAPIMapper.md create mode 100644 content/03.04.--.HEADER.Data_Visualization.md create mode 100644 content/03.04.01.Flapjack.md create mode 100644 content/03.04.02.Helium.md create mode 100644 content/03.04.03.Trait_Selector_BrAPP.md create mode 100644 content/03.04.04.DArTView.md create mode 100644 content/03.04.05.DivBrowse.md create mode 100644 content/03.05.--.HEADER.Analytics.md create mode 100644 content/03.05.01.QBMS.md create mode 100644 content/03.05.02.Mr_Bean.md create mode 100644 content/03.05.03.G-Crunch.md create mode 100644 content/03.06.--.HEADER.Samples_and_Genotypes.md rename content/{03.success_stories/03.06.samples-and-genotypes.md => 03.06.01.GIGWA.md} (59%) create mode 100644 content/03.06.02.PHG.md create mode 100644 content/03.07.00.HEADER.Data_Portals.md create mode 100644 content/03.07.01.FAIDARE.md 
create mode 100644 content/03.07.02.GLIS.md create mode 100644 content/03.07.03.FLORILEGE.md create mode 100644 content/03.07.04.Zendro.md delete mode 100644 content/03.success_stories/03.01.data-collection.md delete mode 100644 content/03.success_stories/03.02.data-management.md delete mode 100644 content/03.success_stories/03.03.federation-infrastructure.md delete mode 100644 content/03.success_stories/03.04.visualization.md delete mode 100644 content/03.success_stories/03.05.analytics.md delete mode 100644 content/03.success_stories/03.07.data-portals.md diff --git a/content/03.success_stories/03.00.success.md b/content/03.00.HEADER.Success.md similarity index 100% rename from content/03.success_stories/03.00.success.md rename to content/03.00.HEADER.Success.md diff --git a/content/03.01.--.HEADER.Data_Collection.md b/content/03.01.--.HEADER.Data_Collection.md new file mode 100644 index 0000000..ab6db92 --- /dev/null +++ b/content/03.01.--.HEADER.Data_Collection.md @@ -0,0 +1,7 @@ +### Data Collection + + + diff --git a/content/03.01.01.Field_Book.md b/content/03.01.01.Field_Book.md new file mode 100644 index 0000000..53ac086 --- /dev/null +++ b/content/03.01.01.Field_Book.md @@ -0,0 +1,6 @@ +#### Field Book + + +Phenotypic data collection is an essential part of the breeding process. Historically, gathering data in the field was done with pen and paper, or perhaps some version of a digital spreadsheet. The abundance and prevalence of smart phones has allowed the Field Book mobile app to enhance data collection. Field Book can create well-formed digital observation records from the moment they are taken. This can improve the efficiency of data collection and reduce human error. + +In 2018, BrAPI was introduced into Field Book; specifically, the Core and Phenotyping modules. BrAPI was able to take things a step further by automating the flow of data from the Field Book mobile app to a central database server. 
This workflow allows data collection and storage to be expedited, removing the need for the user to transfer export files manually. Since Field Book’s adoption of BrAPI, many community servers have been integrated to simplify data storage. In this workflow, data is collected and stored completely digitally with little-to-no human involvement. diff --git a/content/03.01.02.ClimMob.md b/content/03.01.02.ClimMob.md new file mode 100644 index 0000000..a2d981a --- /dev/null +++ b/content/03.01.02.ClimMob.md @@ -0,0 +1,5 @@ +#### ClimMob + +Not all data can be collected by a single person, or even by a single organization. ClimMob is a tool to easily allow citizen scientists to assist in the data collection process. Although this data may not be as detailed as a focused scientific program, it can be very useful to collect simple data from a wide range of locations and environments. + +When it comes to BrAPI compatibility, ClimMob follows the same patterns established by Field Book. During a survey, all the farmer-collected data is stored in a central ClimMob node. When the survey is complete, all the data is uploaded automatically via BrAPI to a central breeding database for long-term storage and analysis. diff --git a/content/03.01.03.Image_Breed.md b/content/03.01.03.Image_Breed.md new file mode 100644 index 0000000..f77ea3e --- /dev/null +++ b/content/03.01.03.Image_Breed.md @@ -0,0 +1,5 @@ +#### ImageBreed + +High-throughput phenotyping has been gaining significant traction lately as a way to collect lots of data very quickly. Image collection from unmanned aerial and ground vehicles (UAVs and UGVs) is a great way to collect a lot of raw data all at once, then analyze it later. ImageBreed is an image collection pipeline tool to support regular use of UAVs and UGVs. + +When the raw images have been processed through the standardization pipelines in ImageBreed, useful phenotypes can be extracted from the images.
The BrAPI standard is used to push these phenotypes back to a central breeding database where they can be analyzed with other data. In addition, ImageBreed has the option to use BrAPI to upload the raw images to the central breeding database, or any other BrAPI-compatible long-term storage service. The BrAPI models in the current version of the standard (V2.1) are rudimentary, but effective. The ImageBreed team has put in some work to enhance the BrAPI image data standards. diff --git a/content/03.01.04.GridScore.md b/content/03.01.04.GridScore.md new file mode 100644 index 0000000..3febbe7 --- /dev/null +++ b/content/03.01.04.GridScore.md @@ -0,0 +1,5 @@ +#### GridScore + +Phenotypic data collection underpins scientific crop research and plant breeding. Knowledge gained from collected data and its analysis, alongside data visualizations, informs further phenotypic trials and ideally supports research hypotheses. The importance of accuracy and efficiency in the collection of this data, as well as of the infrastructure to facilitate the flow of data from the field to a knowledge base, cannot be overstated. [GridScore](https://ics.hutton.ac.uk/get-gridscore/) [@doi:10.1186/s12859-022-04755-2] is a modern mobile application for phenotypic observations that harnesses technological advancements in the area of mobile devices to enrich the data collection process. + +BrAPI has further increased the value of GridScore by integrating it into the overarching workflow from trial creation, through data collection, to final data storage for further processing. Specifically, trial designs as well as trait definitions can be imported into GridScore using BrAPI, and a finalized trial can ultimately be exported via BrAPI to any compatible database. 
diff --git a/content/03.02.--.HEADER.Data_Management.md b/content/03.02.--.HEADER.Data_Management.md new file mode 100644 index 0000000..de65d87 --- /dev/null +++ b/content/03.02.--.HEADER.Data_Management.md @@ -0,0 +1,6 @@ +### Data Management + + diff --git a/content/03.02.01.PHIS.md b/content/03.02.01.PHIS.md new file mode 100644 index 0000000..8a9d6a9 --- /dev/null +++ b/content/03.02.01.PHIS.md @@ -0,0 +1,9 @@ +#### PHIS + +The Hybrid Phenotyping Information System ([PHIS](http://www.phis.inrae.fr/) [@doi:https://doi.org/10.1111/nph.15385]), based on the [OpenSILEX](https://github.com/OpenSILEX/) framework, is an ontology-driven information system based on semantic web technologies. PHIS is deployed in several field and greenhouse platforms of the national [PHENOME](https://www.phenome-emphasis.fr/) and European [EMPHASIS](https://emphasis.plant-phenotyping.eu/) infrastructure. It manages and collects data from Phenotyping and High Throughput Phenotyping experiments on a day to day basis. PHIS unambiguously identifies all the objects and traits in an experiment, and establishes their types and relationships via ontologies and semantics. + +PHIS has been designed to be BrAPI-compliant. PHIS adheres to the standards and protocols specified by BrAPI and implements various services aligning with the BrAPI standards, encompassing the Core, Phenotyping, and Germplasm modules. This enables integration and compatibility with BrAPI-compliant systems and platforms. This prerequisite served as the basis for formalizing the data model, while also facilitating compatibility with other standards, such as the Minimal Information About a Plant Phenotyping Experiment ([MIAPPE](https://www.miappe.org/) [@doi:https://doi.org/10.1111/nph.16544]). 
By integrating BrAPI requirements into its structure, PHIS not only meets the standards of the phenotyping field, but also strengthens its capacity for interoperability and effective collaboration in the wider context of plant breeding and related fields. + +The fact that data within a PHIS instance can be queried through BrAPI services makes the indexing of PHIS in [FAIDARE](https://urgi.versailles.inra.fr/faidare/) very easy to implement. + +Furthermore, as PHIS offers BrAPI-compliant Web Services, it simplifies the integration and data exchange with other European information systems that handle phenotyping data. The adherence to BrAPI standards ensures a common interface and compatibility, facilitating communication and collaboration between PHIS and other systems in the European context. This interoperability not only eases data sharing, but also promotes a more coherent and efficient approach to the management and use of phenotyping data on various platforms and research initiatives within the European scientific community. diff --git a/content/03.02.02.DeltaBreed.md b/content/03.02.02.DeltaBreed.md new file mode 100644 index 0000000..247457b --- /dev/null +++ b/content/03.02.02.DeltaBreed.md @@ -0,0 +1,19 @@ +#### DeltaBreed + + +DeltaBreed is an open-source data management system designed and developed by Breeding Insight to support USDA-ARS specialty crop and animal breeders. DeltaBreed is a unified system for managing breeding data that connects a variety of BrAPI applications (see list below). BrAPI integration allows the complexity underlying interoperability to be hidden, shielding users from multifactorial differences between diverse applications. DeltaBreed, adhering to the BrAPI model, establishes data standards and validations for users and provides a singular framework for data management and user training. 
+ +DeltaBreed users need not be aware of BrAPI or the specifics of underlying applications but will notice that BrAPI interoperability reduces the need for human-mediated file transfers and data manipulation. Field Book users, for example, can connect to their DeltaBreed program, authenticate, and pull studies and traits directly from DeltaBreed to Field Book on their data collection device. The subsequent step of pushing observations from Field Book to DeltaBreed is straightforward via BrAPI, but will not be implemented until repeated observation handling workflows are established to differentiate and validate repeated observations, such as accidental repeats, overwrite requests, time-series observations, and repeated sub-entity measures. Users can expect DeltaBreed observation handling to become more seamless with future development. + +**DeltaBreed Connected Applications** +<< Submission is expected April 2024. We may need to trim this aspirational list down to reality in final edits.>> + ++ BIMS ++ BrAPI Java Server ++ BrAPI Sync ++ BreedBase ++ Diversity Arrays Technologies (DArT) genotyping services ++ Field Book ++ Gigwa ++ Mr Bean ++ Pedigree Viewer diff --git a/content/03.02.03.BMS.md b/content/03.02.03.BMS.md new file mode 100644 index 0000000..61662cb --- /dev/null +++ b/content/03.02.03.BMS.md @@ -0,0 +1,7 @@ +#### BMS + +The [Breeding Management System (BMS)](https://bmspro.io), developed by the [Integrated Breeding Platform (IBP)](https://integratedbreeding.net/), is a suite of tools designed to enhance the efficiency and effectiveness of plant breeding. BMS covers all stages of the breeding process, with the emphasis on germplasm management and [ontology](https://cropontology.org)-harmonized phenotyping. It also features analytics and decision-support tools. 
With its focus on interoperability, BMS integrates smoothly with BrAPI, facilitating easy connections with a broad array of complementary tools and databases, notably [Gigwa](https://southgreen.fr/content/gigwa) which is deployed together with the BMS to fulfill the genotyping data management needs of BMS users. + +The [brapi-sync](https://github.com/IntegratedBreedingPlatform/brapi-sync) tool, a significant component of BMS’s BrAPI capabilities, was developed by the IBP and released as a BrAPP for community use. Brapi-sync is designed to enhance collaboration among partner institutes within a network such as Innovation and Plant Breeding in West Africa ([IAVAO](https://www.iavao.org/en)), by enabling the sharing of germplasm and trials across BrAPI-enabled systems. This tool helps overcome traditional barriers to collaboration, ensuring data that was once isolated within specific programs or platforms can now be easily shared, integrated, and synchronized. + +Additionally, brapi-sync improves data management by utilizing the externalReferences field to maintain links to the origin IDs of each entity it transmits. This not only retains the original context of the data but also establishes a traceability mechanism for accurate data source attribution and verification. Such practices are crucial for maintaining data integrity and fostering trust among collaborative partners, ensuring access to accurate, reliable, and current information. diff --git a/content/03.02.04.Breedbase.md b/content/03.02.04.Breedbase.md new file mode 100644 index 0000000..bd823f8 --- /dev/null +++ b/content/03.02.04.Breedbase.md @@ -0,0 +1,3 @@ +#### Breedbase + +Breedbase is a comprehensive breeding data management system [@doi:10.1093/g3journal/jkac078] [@doi:10.1371/journal.pone.0240059] that implements a digital ecosystem for all breeding data, including trial data, phenotypic data, and genotypic data. 
Data acquisition is through tablet-based apps such as Field Book [@doi:10.2135/cropsci2013.08.0579] and related apps, such as Coordinate and InterCross, as well as through drone imagery, Near Infra-Red Spectroscopy (NIRS), and other technologies. Search functions such as the Search Wizard interface provide powerful query capabilities, and various breeding-centric analysis tools are available, including mixed models, heritability, stability, PCA, and various clustering algorithms. The original impetus for creating Breedbase was the advent of new breeding paradigms based on genomic information such as genomic prediction algorithms [@doi:10.1093/genetics/157.4.1819] and the accompanying data management challenges, and a complete genomic prediction workflow is integrated in the system. The first instance was created for the NextGen Cassava project in 2012 as the Cassavabase () database. Databases for other CGIAR root, tuber and banana (RTB) crops followed, with databases for yam (), sweet potato (), banana () as well as instances in labs and companies. The BrAPI interface [@doi:10.1093/bioinformatics/btz190] is crucial for Breedbase: Breedbase communicates via BrAPI with the data collection tablets and connects to other projects such as ClimMob [@doi:10.1016/j.compag.2023.108539], and many native tools use the BrAPI interface for accessing data. Users also appreciate the ability to connect to Breedbase instances using packages such as QBMS for data import into R for custom analyses. Breedbase has been an early and continuous adopter of, and contributor to, the BrAPI standard. 
diff --git a/content/03.02.05.BIMS.md b/content/03.02.05.BIMS.md new file mode 100644 index 0000000..d201fd7 --- /dev/null +++ b/content/03.02.05.BIMS.md @@ -0,0 +1,3 @@ +#### BIMS + +BIMS (Breeding Information Management System) [@doi:10.1093/database/baab054] is a free, secure, and online breeding management system which allows breeders to store, manage, archive, and analyze their private breeding program data. BIMS enables individual breeders to have complete control of their own breeding data along with access to tools such as data import/export, data analysis and data archiving for their germplasm, phenotype, genotype, and image data. BIMS is currently implemented in five community databases, the Genome Database for Rosaceae [@doi:10.1093/nar/gky1000], CottonGEN [@doi:10.3390/plants10122805], the Citrus Genome Database, the Pulse Crop Database, and the Genome Database for Vaccinium, as well as a crop-independent website, . BIMS in these five community databases enables individual breeders to import publicly available data so that they can utilize public data in their breeding program. BIMS utilizes the Android App Field Book, enabling seamless data transfer between BIMS and the Field Book App through either files or BrAPI. Data transfer through BrAPI between BIMS and other resources such as BreedBase, GIGWA, and Breeder Genomics Hub is also on the way. diff --git a/content/03.02.06.Germinate.md b/content/03.02.06.Germinate.md new file mode 100644 index 0000000..72d44bf --- /dev/null +++ b/content/03.02.06.Germinate.md @@ -0,0 +1,5 @@ +#### Germinate + +[Germinate](https://ics.hutton.ac.uk/get-germinate/) [@doi:10.1002/csc2.20248] is an open-source plant genetic resources database that combines and integrates various kinds of plant breeding data including genotypic data, phenotypic trials data, passport data, images, geographic information and climate data into a single repository. 
Germinate is tightly linked to the BrAPI specification and supports a majority of BrAPI endpoints for querying, filtering and submission. + +Germinate integrates and connects with other BrAPI-enabled tools such as GridScore for phenotypic data collection, Flapjack for genotypic data visualization and Helium for pedigree visualization, but, due to the nature of BrAPI, Germinate can act as a data repository for any BrAPI-compatible tool. Thanks to the interoperability provided by BrAPI the need for manual data handling becomes a rarity with the direct benefit of faster data processing, fewer to no human errors, data security and integrity. diff --git a/content/03.02.07.PIPPA.md b/content/03.02.07.PIPPA.md new file mode 100644 index 0000000..4e8f2d6 --- /dev/null +++ b/content/03.02.07.PIPPA.md @@ -0,0 +1,13 @@ +#### PIPPA + +[PIPPA](https://pippa.psb.ugent.be) is a data management system used for collecting data from the [WIWAM](https://www.wiwam.be/) range of automated high throughput phenotyping platforms. These platforms have been deployed at different research institutes and commercial breeders across Europe in a variety of configurations with different types of equipment such as weighing scales, cameras and environment sensors. Examples are: + ++ [Umea Plant Science Centre](https://www.upsc.se/plant-growth-facilities-at-upsc-and-slu-umea/325-upsc-tree-phenotyping-platform.html) ++ [Fondazione Edmund Mach](https://cri.fmach.it/en/Facilities/Technological-Facilities/Plant-Phenotyping#application_fields) ++ [Phenovision](https://www.psb.ugent.be/phenotyping/phenovision) + +Developed from 2016 onwards, the software features a web interface with functionality for setting up new experiments for the platform(s), planning imaging and irrigation treatments, linking metadata to pots (genotype, growth media, manual treatments), exporting data, importing data and visualizing data as charts. 
It also supports the integration of image analysis scripts and connections to a compute cluster for job submission. + +To share the phenotype data of the experiments linked to publications, an implementation of BrAPI 1.3 was developed on a separate public PIPPA server, which allowed read-only access to the data in a standardized format. This endpoint was registered on [FAIDARE](https://urgi.versailles.inra.fr/faidare/) and allows the data to be found alongside data from other BrAPI endpoints. + +As the BrAPI ecosystem has matured, it has created a clear path for PIPPA development toward sharing data according to the FAIR principles, which are becoming standard in plant research data management best practices. In combination with the support for [MIAPPE](https://www.miappe.org/), these have served as guidelines in the current development, which is focussed on delivering a public BrAPI 2.1 endpoint and making more high-throughput datasets publicly available via BrAPI. diff --git a/content/03.02.08.MGIS.md b/content/03.02.08.MGIS.md new file mode 100644 index 0000000..ddaf3ff --- /dev/null +++ b/content/03.02.08.MGIS.md @@ -0,0 +1,4 @@ +#### MGIS + + +The Musa Germplasm Information System, [MGIS](https://www.crop-diversity.org/mgis/), serves as a comprehensive community portal dedicated to banana diversity, a crop critical to global food security [@doi:10.1093/database/bax046]. MGIS offers detailed information on banana germplasm, focusing on the collections held by the CGIAR International Banana Genebank (ITC) [@doi:10.1186/s43170-020-00015-6]. It is built on the Drupal/Tripal technology, like BIMS and Florilège. Since its inception, MGIS developers have actively participated in the Breeding API (BrAPI) community, pushing for the integration of Multi-Crop Passport Descriptors (MCPD) into the Germplasm module calls of the API.
MGIS thus provides passport data on ITC banana genebank accessions (with GLIS DOI), synchronized with [Genesys](https://www.genesys-pgr.org/a/overview/v2YdWZGrZjD), but also enriches it by incorporating additional data from other germplasm collections worldwide. All of these germplasm data are available through implementations of the BrAPI Germplasm module calls. For genotyping data, MGIS incorporates GIGWA [@doi:10.1093/gigascience/giz051], which provides tailored implementations for BrAPI genotyping module calls. Furthermore, MGIS supports the implementation of a set of BrAPI phenotyping module calls, exposing morphological descriptors and trait information supported by ontologies like the Crop Ontology [@doi:10.1093/aobpla/plq008]. It is integrated with the Trait Selector BrAPP, developed as part of a project involving Breedbase [@doi:10.1093/g3journal/jkac078]. Use cases between MusaBase, the Musa implementation of Breedbase, and MGIS interlink genebank and breeding data. diff --git a/content/03.03.--.HEADER.Federation_Infrastructure.md b/content/03.03.--.HEADER.Federation_Infrastructure.md new file mode 100644 index 0000000..a255818 --- /dev/null +++ b/content/03.03.--.HEADER.Federation_Infrastructure.md @@ -0,0 +1,8 @@ +### Federated Data Management Infrastructures + diff --git a/content/03.03.01.AGENT_Portal.md b/content/03.03.01.AGENT_Portal.md new file mode 100644 index 0000000..f8b87cc --- /dev/null +++ b/content/03.03.01.AGENT_Portal.md @@ -0,0 +1,16 @@ + +#### AGENT Portal + +In the global system for ex situ conservation of plant genetic resources (PGR) [@doi:10.3390/plants10081557], a total of ~5.8 million accessions are conserved in 1750 ex situ genebanks [@doi:10.2135/cropsci2017.01.0014]. Unique and permanent identifiers in the form of DOIs are available for more than 1.7 million accessions [@doi:Food and Agriculture Organization (FAO) The Global Information System for PGRFA].
Each DOI is linked to some basic descriptive data that facilitates the use of these resources. Many DOIs are also linked to additional data from different domains or will be in the future. In order to answer questions on the global biological diversity of a plant species, on duplicate detection, on provenance tracking for the identification of genetic integrity, on the selection of the most suitable material for various purposes, including breeding and research, and to support further applications in data mining or AI, a data space beyond the most basic information is needed that includes genotypic and phenotypic data. In this context, the aim of the AGENT project () funded by the European Commission is to develop a concept for the digital exploitation and activation of this GenRes data space via European ex situ genebanks according to the FAIR criteria [@doi:10.1038/sdata.2016.18] and to test it in practice using two important crops, barley and wheat. In two work packages, standards and technology for data interoperability will be developed to establish a genetic resources infrastructure, which regulates data acquisition of genotypic and phenotypic data, integrates and archives them and makes them accessible according to FAIR principles. To this end, 13 European genebanks and 5 bioinformatics centers are cooperating and have agreed on standards and protocols for (i) the data flow (see figure {@fig:AGENT_Genotyping_Data_Flow}) and data formats [@doi:10.12688/f1000research.109080.2] for central archiving of genotypic and phenotypic data. + +![Figure Data flow of genotypic data from AGENT partner databases](images/AGENT_Genotyping_Data_Flow.png){#fig:AGENT_Genotyping_Data_Flow width="100%"} + +The AGENT portal as described in more detail in section unlock the full potential of the biological material stored in genebanks around the globe by using FAIR international data standards and an open digital infrastructure for the management of plant genetic resources. 
The implemented BrAPI interface makes it possible to mine current and historic genotypic and phenotypic information to drive the discovery of genes, traits and knowledge for future missions, to complement existing information for wheat and barley, and to use the new data standards and infrastructure to foster improved management of PGR for other crop species across European genebanks. + +For the joint research data infrastructure federating collections of genotypic and phenotypic data from European genebanks and bioinformatics institutes, an AGENT portal ({@fig:AGENT_WebFrontend}) is being created as a database infrastructure for integrated plant genetic resources in ex situ genebanks. It provides manual data exploration and machine-readable access via BrAPI, and supplies data to the core data deposition resources at the European Bioinformatics Institute (EBI). + +![Figure AGENT Portal](images/AGENT_WebFrontend.png){#fig:AGENT_WebFrontend width="100%"} + +The AGENT database backend aggregates curated passport, phenotypic and genotypic data about wheat and barley accessions from 18 project partners; these data are harmonized and integrated via BrAPI endpoints () and explorable in a web portal (). The BrAPI endpoints are provided by several separate implementations. Genotyping data use the DivBrowse [@doi:10.1093/gigascience/giad025] storage engine and BrAPI interface. Endpoints for sample data are implemented using an AGENT database SQL-to-BrAPI broker service. +To integrate these BrAPI endpoint providers into a single service and URL scheme, we are working on their integration in a BrAPI proxy service. As next steps, we will expand the BrAPI implementation to enable the integration of analysis pipelines in the AGENT portal, e.g. for genebank mining tools such as the FIGS+ pipeline developed by AGENT partner ICARDA [@doi:10.22004/AG.ECON.266624].
Another perspective is to integrate the data collected in the AGENT project into the European Search Catalogue for Plant Genetic Resources (EURISCO) [@doi:10.1093/nar/gkac852] and to implement BrAPI endpoints to make data on PGR collections in European genebanks programmatically accessible. diff --git a/content/03.03.02.MIAPPE_MIRA.md b/content/03.03.02.MIAPPE_MIRA.md new file mode 100644 index 0000000..a188433 --- /dev/null +++ b/content/03.03.02.MIAPPE_MIRA.md @@ -0,0 +1,4 @@ +#### MIAPPE ISA to BrAPI service + +Phenotyping is crucial in the breeding process as it enables the identification of desirable traits, selection of breeding lines, and evaluation of breeding success. In the plant community, MIAPPE (Minimal Information About a Plant Phenotyping Experiment) [@doi:10.1111/nph.16544] is the established standard for phenotyping experiments and is commonly serialized as ISA Tab [@doi:10.1038/ng.1054]. Although ISA Tab is easy to read for non-technical experts due to its file-based approach, it lacks programmatic access, particularly for web applications. BrAPI, which is aligned with MIAPPE, can help solve this problem. +MIRA is a tool that enables the automatic deployment of a BrAPI server on a MIAPPE-compliant dataset in ISA Tab format. It can be deployed from a Docker image with the dataset mounted. By utilizing the mapping between MIAPPE, ISA, and BrAPI, there is no need for parsing or manual mapping of datasets that are already compliant with (meta-)data standards. By gaining programmatic access through BrAPI to these datasets, it facilitates the integration of phenotyping datasets into web applications. 
diff --git a/content/03.03.03.MIAPPE_BrAPI2ISA.md b/content/03.03.03.MIAPPE_BrAPI2ISA.md new file mode 100644 index 0000000..490fd80 --- /dev/null +++ b/content/03.03.03.MIAPPE_BrAPI2ISA.md @@ -0,0 +1,3 @@ +#### MIAPPE "BrAPI to ISA" service + +Since the release of BrAPI 1.3, efforts have been made to incorporate support for the Minimum Information About Plant Phenotyping Experiments (MIAPPE) standard into the specification [@doi:10.1111/nph.16544]. This integration was finalized in BrAPI 2.0, resulting in full compatibility between the two standards. Consequently, BrAPI now encompasses all attributes necessary for MIAPPE compliance, adhering to standardized descriptions in accordance with MIAPPE guidelines. Leveraging BrAPI as a standardized RESTful web service API specification, we employ the ISA standard for storing metadata and phenotyping data in a standardized manner. This data is structured in the ISA-TAB file format and subjected to validation using the [MIAPPE ISA configuration](https://github.com/ELIXIR-Belgium/isatab-validation). The "BrAPI to ISA" service functions as a converter between BrAPI RESTful endpoints and ISA-TAB, facilitating the archiving of metadata and data and thereby enhancing data preservation and accessibility. The [BrAPI2ISA](https://github.com/elixir-europe/plant-brapi-to-isa) tool is designed to be compatible with BrAPI 1.3, and we invite contributions from the community to extend support for the latest versions of BrAPI. diff --git a/content/03.03.04.BrAPIMapper.md b/content/03.03.04.BrAPIMapper.md new file mode 100644 index 0000000..d8e7b5a --- /dev/null +++ b/content/03.03.04.BrAPIMapper.md @@ -0,0 +1,4 @@ +#### BrAPIMapper + + +BrAPIMapper is a full BrAPI implementation of all calls for any data source missing BrAPI implementation or compliance with some BrAPI versions. 
BrAPIMapper is provided as a Docker application that can get its external data sources from MySQL or PostgreSQL databases (with a dedicated interface for the Chado database schema), generic REST services (with a dedicated interface for BrAPI endpoints), flat files (XML, JSON, CSV/TSV/GFF3/VCF, YAML) or any combination of those. It provides an administration interface to map BrAPI data models to external data sources. The interface allows administrators to select the BrAPI specification versions to use and the calls to enable. Data mapping configuration export and import features simplify upgrades when the BrAPI specification changes, as administrators only have to map missing fields or make minor adjustments. Among other features, it supports paging, search calls (either returning direct results or using deferred results with a search identifier), lists, and authentication, and it manages access restrictions to calls, which can also be set up through the administration interface. This tool aims to accelerate BrAPI services deployment while ensuring specification compliance. diff --git a/content/03.04.--.HEADER.Data_Visualization.md b/content/03.04.--.HEADER.Data_Visualization.md new file mode 100644 index 0000000..64a4888 --- /dev/null +++ b/content/03.04.--.HEADER.Data_Visualization.md @@ -0,0 +1,6 @@ +### Data visualization + diff --git a/content/03.04.01.Flapjack.md b/content/03.04.01.Flapjack.md new file mode 100644 index 0000000..d04ae06 --- /dev/null +++ b/content/03.04.01.Flapjack.md @@ -0,0 +1,3 @@ +#### Flapjack + +[Flapjack](https://ics.hutton.ac.uk/flapjack) [@doi:10.1093/bioinformatics/btq580] is a multi-platform desktop application for data visualization and breeding analysis (e.g., pedigree verification, marker-assisted backcrossing and forward breeding) using high-throughput genotype data. Data can be easily imported into Flapjack from any BrAPI-compatible data source with genotype data available.
[Flapjack Bytes](https://github.com/cropgeeks/flapjack-bytes) is a smaller, lightweight and fully web-based counterpart to Flapjack, which can be easily embedded into a database website to provide similar visualizations online. Flapjack traditionally supported only its own text-based data formats; its use of BrAPI has streamlined the end-user experience for data import, and work is underway to determine the best methods to exchange analysis results using future versions of the API. diff --git a/content/03.04.02.Helium.md b/content/03.04.02.Helium.md new file mode 100644 index 0000000..ad04fe8 --- /dev/null +++ b/content/03.04.02.Helium.md @@ -0,0 +1,5 @@ +#### Helium + +Helium () [@doi:10.1186/1471-2105-15-259] is a plant pedigree visualization platform designed to account for the specific problems that are unique to plant pedigrees. A pedigree is a representation of how genetically discrete individuals are related to one another and is therefore a representation of the genetic relationship between individual plant lines, their parents and progeny. Plant pedigrees are often used to check for potential genotyping or phenotyping errors, since these errors, by the very nature of Mendelian inheritance, are constrained by the pedigree structure in which they exist (Paterson 2011). The accurate representation of pedigrees and the ability to pull pedigree data from different data sources are therefore important in plant breeding and genetics, and ways to visualize and interact with this complex data meaningfully are critical. + +From its original desktop interface (), Helium has developed into a web-based visualization platform implementing BrAPI calls to allow users to import data from other BrAPI-compliant databases (). The ability to pull data from BrAPI-compliant data sources has significantly expanded Helium’s capability and utility within the community.
Helium is used in projects ranging in size from tens to tens of thousands of lines and across a wide variety of crops and species. While originally designed for plant data [@doi:10.3389/fpls.2024.1268847], it has also been used in non-plant projects [@doi:10.1007/s10592-024-01611-z], highlighting its broad utility. BrAPI support also allows Helium users to provide direct dataset links to collaborators, so the original data stays with the data provider while Helium is used for its visualization functionality. Our current Helium deployment includes example BrAPI calls to a barley dataset at Hutton to allow users to test the system and the features it offers. diff --git a/content/03.04.03.Trait_Selector_BrAPP.md b/content/03.04.03.Trait_Selector_BrAPP.md new file mode 100644 index 0000000..f4f1df9 --- /dev/null +++ b/content/03.04.03.Trait_Selector_BrAPP.md @@ -0,0 +1,6 @@ +#### Trait Selector BrAPP + + +BrAPPs are simple tools developed by the BrAPI community that are entirely reliant on BrAPI for their data requirements. This means a single BrAPP can be shared and used by many organizations, as long as those organizations have the standard BrAPI endpoints available. + +The Trait Selector BrAPP is used to search and select useful traits, using a visual aid to help the user find exactly what they need. This BrAPP works with both breeding databases and genebanks. Breeding databases need only implement the trait, observation and observation variable calls, while genebanks require the trait, germplasm attribute and germplasm attribute value calls. BrAPI servers compliant with version 2 that implement either of these sets of calls just need to follow the documented steps to create an SVG image of a plant of interest in order to use this BrAPP. CassavaBase and MGIS are two successful examples of the use of this BrAPP.
(example screenshots coming + supplementary data: links to the git and the doc) diff --git a/content/03.04.04.DArTView.md b/content/03.04.04.DArTView.md new file mode 100644 index 0000000..05fbb22 --- /dev/null +++ b/content/03.04.04.DArTView.md @@ -0,0 +1,4 @@ +#### DArTView + + +DArTView is a desktop application for visualizing genotype variant data and looking for trends or correlations. It is newly BrAPI compatible and can use BrAPI as an input data source. diff --git a/content/03.04.05.DivBrowse.md b/content/03.04.05.DivBrowse.md new file mode 100644 index 0000000..24e26fa --- /dev/null +++ b/content/03.04.05.DivBrowse.md @@ -0,0 +1,5 @@ +#### DivBrowse + +DivBrowse [@doi:10.1093/gigascience/giad025] is a web platform for exploratory data analysis of huge genotyping studies. The software can be run standalone or integrated as a plugin into existing data web portals. It provides a powerful interactive visualization of variant call matrices with hundreds of millions of variants and thousands of samples and enables easy data import and export by using standardized and established bioinformatics file formats. +At its core, DivBrowse combines the convenience of a genome browser and adds features tailored to the diversity analysis of germplasm. It is able to display genomic features such as nucleotide sequence, associated gene models and short genomic variants. DivBrowse provides visual access to large VCF files obtained through genotyping experiments. In addition to visualizing variant calls per variant and genotype, DivBrowse also calculates and displays variant statistics such as minor allele frequencies, proportion of heterozygous calls or missing variant calls for each visualized genomic window. In addition, dynamic Principal Component Analyses (PCAs) can be performed on a user specified genomic area to provide information on local genomic diversity. +DivBrowse has a Javascript API to control the tool from a hosting web portal (e.g. 
to control the list of genotypes to be displayed and the reference genome). DivBrowse has an interface to BLAST, which can be used to directly access genes or other genomic features. The modular structure of DivBrowse also allows developers to configure and easily embed links to external information systems. Furthermore, DivBrowse implements parts of BrAPI to provide genotypic data via its server-side component, and it can also consume and visualize genotypic data from an external BrAPI endpoint through the client-side GUI. diff --git a/content/03.05.--.HEADER.Analytics.md b/content/03.05.--.HEADER.Analytics.md new file mode 100644 index 0000000..f2a62cf --- /dev/null +++ b/content/03.05.--.HEADER.Analytics.md @@ -0,0 +1,6 @@ +### Analytics + diff --git a/content/03.05.01.QBMS.md b/content/03.05.01.QBMS.md new file mode 100644 index 0000000..5b746e5 --- /dev/null +++ b/content/03.05.01.QBMS.md @@ -0,0 +1,7 @@ +#### QBMS + +Modern breeding programs can utilize data management systems to maintain both phenotypic and genotypic data. Numerous systems are available for adoption. To fully leverage the benefits of digitalization in this ecosystem, breeders need to utilize data from different sources to make efficient data-driven decisions. With increased computational power at their disposal, scientists can construct more advanced analysis pipelines by combining various data sources. + +The [QBMS](https://icarda-git.github.io/QBMS) [@doi:10.5281/zenodo.10791627] R package eliminates the technical barriers scientists experience when using BrAPI calls in their analysis scripts and pipelines. These barriers arise from the complexity of managing API backend processes, such as authentication, tokens, TCP/IP protocol, JSON format, pagination, stateless calls, asynchronous communication, database IDs, and more. To bridge this gap, we have developed the QBMS R package.
This package abstracts the technical complexities, providing breeders (the targeted end users) with stateful action verbs/functions familiar to them from navigating their GUI systems. It enables them to query and extract data into a standard data frame structure, consistent with their use of the R language, one of the most common statistical tools in the breeding community. + +Since its release on the official CRAN repository in October 2021, the QBMS R package has garnered over 9400 downloads. Several tools, such as Mr.Bean, rely on the QBMS package as their source data adapter. Moreover, the community has started building extended solutions on top of it. QBMS can serve as a cornerstone in the breeding modernization revolution by providing access to actionable data and enabling the creation of dashboards to reduce the time between harvest and decision-making for the next breeding cycle. diff --git a/content/03.05.02.Mr_Bean.md b/content/03.05.02.Mr_Bean.md new file mode 100644 index 0000000..6a5e8d0 --- /dev/null +++ b/content/03.05.02.Mr_Bean.md @@ -0,0 +1,5 @@ +#### Mr. Bean + +Mr.Bean [@doi:10.3389/fpls.2023.1290078] is a graphical user interface designed to assist breeders, statisticians, and individuals involved in plant breeding programs with the analysis of field trials. By utilizing innovative methodologies such as SpATS for modeling spatial trends and autocorrelation models to address spatial variability, Mr.Bean proves highly practical and powerful in facilitating faster and more effective decision-making. Modeling genotype-by-environment interaction poses its challenges, but Mr.Bean offers the capability to explore various variance-covariance matrices, including Factor Analytic, compound symmetry, and heterogeneous variances, among others, aiding in the assessment of genotype performance across diverse environments.
+ +Mr.Bean boasts flexibility in importing different file types, yet for users managing their data within data management systems (DMS), the process of downloading from their DMS and importing it into MrBean can be cumbersome. To address this issue, QBMS operates in the back-end. This feature prompts users to input the URL of the server, their credentials if necessary, and the specific trial they wish to analyze. Subsequently, users can seamlessly access and utilize their dataset within the entire interface. diff --git a/content/03.05.03.G-Crunch.md b/content/03.05.03.G-Crunch.md new file mode 100644 index 0000000..f8a4878 --- /dev/null +++ b/content/03.05.03.G-Crunch.md @@ -0,0 +1,4 @@ +#### G-Crunch + +G-Crunch is an upcoming user-facing analysis tool that attempts to fill the space of simple, user driven analytics requests, with a generic user interface and the ability to swap out data sources and analysis tools. G-Crunch hopes to streamline repeatable, debuggable simple analytic requests and results. +G-Crunch, as a tool, couldn't feasibly exist without BrAPI. The support of BrAPI interfaces allows G-Crunch to use one unified request method, and adapt to the user's (BrAPI-compliant) existing network of tools, which lowers the barrier to entry for adoption. 
diff --git a/content/03.06.--.HEADER.Samples_and_Genotypes.md b/content/03.06.--.HEADER.Samples_and_Genotypes.md new file mode 100644 index 0000000..41f529a --- /dev/null +++ b/content/03.06.--.HEADER.Samples_and_Genotypes.md @@ -0,0 +1,6 @@ +### Samples and Genotypes + + diff --git a/content/03.success_stories/03.06.samples-and-genotypes.md b/content/03.06.01.GIGWA.md similarity index 59% rename from content/03.success_stories/03.06.samples-and-genotypes.md rename to content/03.06.01.GIGWA.md index e4c3f7a..adbd502 100644 --- a/content/03.success_stories/03.06.samples-and-genotypes.md +++ b/content/03.06.01.GIGWA.md @@ -1,15 +1,3 @@ -### Samples and Genotypes - - - -#### DArT Sample Submission - - -The DArT genotyping lab is heavily used world wide when it comes to plant genotyping. Developers at DArT have worked with the BrAPI community to establish a standard API for sending sample metadata to the lab before genotyping. This eliminates much of the human error involved with sending samples to en external lab. - #### GIGWA Gigwa is a JEE web application providing means to centralize, share, finely filter, and visualize high-throughput genotyping data [@doi:10.1093/gigascience/giz051]. Built on top of MongoDB, it is scalable and can support working smoothly with datasets containing billions of genotypes. Installable from docker images or all-in-one bundle archives, it is pretty straightforward to deploy on servers or local computers and has thus been adopted by numerous research institutes from around the world. Notably, Gigwa serves as a collaborative management tool and/or a portal for exposing the data for genebanks and breeding programs for some CGIAR centers [@doi:10.1002/ppp3.10187]. Thus, the amount of data hosted and made widely accessible using this system has kept growing over the last few years. 
@@ -17,10 +5,3 @@ Gigwa is a JEE web application providing means to centralize, share, finely filt Gigwa developers have been involved in the BrAPI community since 2016 and took part in designing the genotype-related part of the API's specifications. Its first BrAPI-compliant features were designed for compatibility with the Flapjack visualization tool [@doi:10.1093/bioinformatics/btq580] and thus primarily turned it into a BrAPI datasource. Consequently, over time, Gigwa being the first and most reliable application implementing BrAPI-Genotyping server calls, local collaborators and even external partners used it as a reference solution to design a number of tools taking advantage of those features (e.g., [BeegMac](https://webtools.southgreen.fr/BrAPI/Beegmac/), [SnpClust](https://github.com/jframi/snpclust), [QBMS](https://github.com/icarda-git/QBMS)). But further use-cases also required Gigwa to be able to consume data from other BrAPI servers, which led to also implement API-client features into the system. Thanks to all this work, a close collaboration was progressively established with the Integrated Breeding Platform team developing the widely used Breeding Management System, that ended up in both applications now being frequently deployed together, Gigwa pulling germplasm or sample metadata from BMS, and BMS displaying Gigwa-hosted genotypes within its own UI. Client BrAPI libraries being available for R, community members typically write ad-hoc scripts syndicating data from multiple BrAPI sources (for instance phenotypes from a datasource and genotypes from another) in order to run various kinds of analyses such as GWAS, genomic selection or phylogenetic investigations. As a perspective, we may expect the most generic and widely-used of those pipelines to be at least publicly distributed, and possibly web-interfaced using solutions like R-Shiny in order to provide new, excitingly useful online services, based on Gigwa-hosted data. 
- -#### PHG - - -The Practical Haplotype Graph (PHG) is a graph-based computational framework that represents large-scale genetic variation and is optimized for plant breeding and genetics. Using a pangenome approach, each PHG stores haplotypes (the sequence of part of an individual chromosome) to represent the collected genes of a species. This allows for a simplified approach for dealing with large scale variation in plant genomes. The PHG pipeline provides support for a range of genomic analyses and allows for the use of graph data to impute complete genomes from low density sequence or variant data. - -Users access the crop databases either with direct calls to the PHG embedded server or indirectly using the rPHG library from an R environment. The PHG server accepts BrAPI endpoint queries to return information on sample lists and the variants used to define the graph's haplotypes. In addition, PHG users utilize the BrAPI variantsets endpoint query to return links to VCF files containing haplotype data. Work on the PHG is ongoing. We expect to support additional BrAPI endpoints that allow for slicing genotypic data based on samples and regions. diff --git a/content/03.06.02.PHG.md b/content/03.06.02.PHG.md new file mode 100644 index 0000000..091c844 --- /dev/null +++ b/content/03.06.02.PHG.md @@ -0,0 +1,6 @@ +#### PHG + + +The Practical Haplotype Graph (PHG) is a graph-based computational framework that represents large-scale genetic variation and is optimized for plant breeding and genetics. Using a pangenome approach, each PHG stores haplotypes (the sequence of part of an individual chromosome) to represent the collected genes of a species. This allows for a simplified approach for dealing with large scale variation in plant genomes. The PHG pipeline provides support for a range of genomic analyses and allows for the use of graph data to impute complete genomes from low density sequence or variant data. 
+ +Users access the crop databases either with direct calls to the PHG embedded server or indirectly using the rPHG library from an R environment. The PHG server accepts BrAPI endpoint queries to return information on sample lists and the variants used to define the graph's haplotypes. In addition, PHG users utilize the BrAPI variantsets endpoint query to return links to VCF files containing haplotype data. Work on the PHG is ongoing. We expect to support additional BrAPI endpoints that allow for slicing genotypic data based on samples and regions. diff --git a/content/03.07.00.HEADER.Data_Portals.md b/content/03.07.00.HEADER.Data_Portals.md new file mode 100644 index 0000000..f656416 --- /dev/null +++ b/content/03.07.00.HEADER.Data_Portals.md @@ -0,0 +1,6 @@ +### Data Portal + + diff --git a/content/03.07.01.FAIDARE.md b/content/03.07.01.FAIDARE.md new file mode 100644 index 0000000..cb8876a --- /dev/null +++ b/content/03.07.01.FAIDARE.md @@ -0,0 +1,8 @@ +#### FAIDARE + + +FAIDARE () is a data discovery portal providing a biologist-friendly search system over a global federation of 33 plant research databases. It allows users to identify data resources using a full-text search complemented by domain-specific filters, and it links back to the original database for visualization, analysis and download. For instance, it is possible to search for "wheat drought" and then refine the search to the "Triticum aestivum" taxon and yield component traits such as "Thousand Grain Weight". The indexed data types are very broad and include genomic features, such as genes or transposable elements, selected bibliography, QTL, markers, genetic variation studies, phenomic studies and plant genetic resources, i.e., germplasm. This inclusiveness is achieved thanks to a two-stage indexation data model. The most generic one provides basic search functionalities and relies on five fields: name, link back URL, data type, species and exhaustive description.
The filtering is directly tied to some of those fields. Therefore, to provide more advanced filtering, FAIDARE also provides a second-stage indexation mechanism that takes advantage of BrAPI endpoints to get more detailed metadata on genotyping and phenotyping studies as well as germplasm. In parallel, FAIDARE provides a pre-visualization of germplasm and studies using dedicated cards. +![Figure FAIDARE Federation](images/Schema_FAIDARE.png){#fig:Schema_FAIDARE width="100%"} +The indexation mechanism relies on dedicated public software () that allows data resource managers to request the indexation of their database using pull requests. This BrAPI client is able to extract data from any BrAPI 1.3 and 1.2 endpoint, and development of BrAPI 2.x indexation will be initiated in 2025. Since not all databases are willing to implement BrAPI endpoints, we also provide the possibility to generate metadata as BrAPI JSON files, hence using the standard as a file exchange format. +The FAIDARE architecture has been designed by elaborating on the GnpIS Software Architecture [@doi:10.34133/2019/1671403]. As a consequence, BrAPI is at the core of its data model, and in particular the JSON data files served by the Elasticsearch NoSQL engine are enriched versions of the BrAPI JSON files. FAIDARE also includes a BrAPI endpoint that serves all indexed metadata. +FAIDARE has been adopted by several communities, in particular by the ELIXIR and EMPHASIS European infrastructures. It is also used by the WheatIS of the Wheat-Initiative. Several databases are added each year to the FAIDARE global federation, increasing adoption of both the portal and BrAPI.
diff --git a/content/03.07.02.GLIS.md b/content/03.07.02.GLIS.md new file mode 100644 index 0000000..b3de5dd --- /dev/null +++ b/content/03.07.02.GLIS.md @@ -0,0 +1,7 @@ +#### GLIS + +The Global Information System (GLIS) on Plant Genetic Resources for Food and Agriculture (PGRFA) of the International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA) is a web-based global entry point for users and third-party systems to access information and knowledge on scientific, technical and environmental matters to strengthen PGRFA conservation, management and utilization activities. The system and its portal also enable recipients of PGRFA to make available all non-confidential information on germplasm according to the provisions of the Treaty and facilitate access to the results of their research and development. + +Thanks to the adoption of Digital Object Identifiers (DOIs) for PGRFA held ex situ and in situ, based on the Multi-Crop Passport Descriptors (MCPD), the Portal provides access to 1.7 million PGRFA in collections conserved worldwide. Of these, over 1.5 million are accessible for research, training and plant breeding in the food and agriculture domain. + +The Scientific Advisory Committee of the International Treaty and the Governing Body have repeatedly welcomed efforts on interoperability among germplasm information systems. In this context, the GLIS Portal adopted the Breeding API (BrAPI v1) in 2022. Integrating BrAPI among the GLIS content negotiators facilitates queries and the exchange of content for data management in plant breeding. The Portal also offers other protocols (XML, DarwinCore, JSON and JSON-LD) to increase data and metadata connectivity. In the near future, depending on the availability of resources, upgrading to BrAPI v2 is planned.
diff --git a/content/03.07.03.FLORILEGE.md b/content/03.07.03.FLORILEGE.md new file mode 100644 index 0000000..f5c71c6 --- /dev/null +++ b/content/03.07.03.FLORILEGE.md @@ -0,0 +1,8 @@ +#### FLORILÈGE (Gateway to French Plant Genetic Resources) + +Designed primarily for the general public, Florilège provides access to public collections of all French plant biological resources centers. This web portal allows users to browse available plant genetic resource accessions and to order seeds or plant material for cultivation. It includes plant genetic resources of around fifty plant genera from 19 genebanks. + +Florilège retrieves accession information from different BrAPI-compliant systems. They include OLGA, a genebank accessions information management system, and GnpIS [@doi:10.34133/2019/1671403] [@doi:10.1007/978-1-4939-6658-5_5], an INRAE data repository for plant genetic resources, phenomics and genetics. Using BrAPI to gather data from these systems reduced the effort required and enabled standardized data retrieval. As a consequence, BrAPI is the de facto standard for exchanging data within the French plant genetic resources community. The Florilège team also requested several updates to the BrAPI specifications to better serve this use case, such as Collection or improved external references. +![Figure Florilege Workflow](images/Schema_Florilege.jpg){#fig:Schema_Florilege width="100%"} + +Florilège is developed in Drupal 10 and uses the xnttbrapi module to easily connect to BrAPI-compliant external databases.
diff --git a/content/03.07.04.Zendro.md b/content/03.07.04.Zendro.md new file mode 100644 index 0000000..655ee1f --- /dev/null +++ b/content/03.07.04.Zendro.md @@ -0,0 +1,15 @@ +#### BrAPI plug and play GraphQL based data-warehouse + +Using the "Zendro" set of automatic software program-code generators (zendro-dev.github.io), a fully functional, efficient, and cloud-capable BrAPI data-warehouse has been created for the current version of the BrAPI data models. The resulting data-warehouse has two interfaces: an application programming interface implemented in the form of a GraphQL web-server, and an intuitive point-and-click graphical user interface in the browser. Both provide secure access to data read and write functions for all BrAPI data models. These data administration methods comprise create, read, update, and delete (CRUD) functions that are standardized and accept the same parameters for all data models. + +Data write access covers persisting single or multiple records, while data read access is particularly rich in features and includes access to single records referred to by their id and access to multiple records selected by logical filters. Multiple records are paginated using the highly efficient cursor-based pagination model as proposed in the GraphQL standard. Logical filters allow for exhaustive search queries, whose structure is highly intuitive and based around logical triplets in which a data model field is validated using an operator and a value, e.g. "Study name equals 'xyz'". A large collection of operators is available, and triplets can be combined into logical search trees using "and" or "or" operators. Searches can be extended over relationships between data models, thus enabling a user to query the warehouse exactly for the data wanted. + +Access security is implemented with the OAuth2 user authentication standard (datatracker.ietf.org/doc/html/rfc6749).
Authorization is based on user roles and can be configured differently for each single data model read or write function. + +The browser-based graphical user interface is implemented in React.js with Next and exposes an intuitive and self-explanatory set of functions for each data model. On the left, a menu allows the user to access all BrAPI data models. Upon clicking on a model, a table is shown which allows the user to paginate through all existing records, sort them by any column, search the records, add new records, or update or delete existing records, if the user role authorizes these functions. Record data can be inspected in a detail view, where relationships to other data records can be reviewed using the very same graphical visual representations. Breadcrumbs allow the user to navigate back and forth in the trail of relationships inspected. Finally, the generated graphical interface allows for the integration of interactive scientific plots and analysis tools written in JavaScript or WebAssembly. + + + +The Zendro-based BrAPI plug and play data-warehouse is capable of forming an efficient cloud of data servers. This is achieved simply by linking, via their URLs, other Zendro-based warehouses that expose the same GraphQL API to the same data models, or a subset of data models. Any network of such Zendro GraphQL servers can be set up using this configuration approach. The generated code then exposes full access to all data records stored on any node of the network, while maintaining full security control at each node. Importantly, the warehouses are programmed in such a way that any number of data servers can be joined without loss of efficiency. Only the network connection speed and size of requested record sets influence the performance. + +As explained, Zendro is a code generator and creates a fully functional data warehouse from input data model definitions, i.e. a schema.
The schema is given in the form of special data model descriptions, in which each model is defined using JavaScript Object Notation (JSON). Each model is defined in its respective JSON file. A translator has been developed to create the Zendro schema from the BrAPI data model definitions. This ensures that Zendro can create plug and play data warehouses for future versions of the BrAPI with great ease, i.e. by translating the BrAPI models to Zendro input and subsequently running Zendro to create the plug and play warehouse. diff --git a/content/03.success_stories/03.01.data-collection.md b/content/03.success_stories/03.01.data-collection.md deleted file mode 100644 index df1169c..0000000 --- a/content/03.success_stories/03.01.data-collection.md +++ /dev/null @@ -1,34 +0,0 @@ -### Data Collection - - - - -#### Field Book - - -Phenotypic data collection is an essential part of the breeding process. Historically, gathering data in the field was done with pen and paper, or perhaps some version of a digital spreadsheet. The abundance and prevalence of smart phones has allowed the Field Book mobile app to enhance data collection. Field Book can create well-formed digital observation records from the moment they are taken. This can improve the efficiency of data collection and reduce human error. - -In 2018, BrAPI was introduced into Field Book; specifically, the Core and Phenotyping modules. BrAPI was able to take things a step further by automating the flow of data from the Field Book mobile app to a central database server. This workflow allows data collection and storage to be expedited, removing the need of the user to transfer export files manually. Since Field Book’s adoption of BrAPI, many community servers have been integrated to simplify data storage. In this work flow, data is collected and stored completely digitally with little-to-no human involvement. - -#### ClimMob - - -Not all data can be collected by a single person, or even by a single organization. 
ClimMob is a tool to easily allow citizen scientists to assist in the data collection process. Although this data may not be as detailed as a focused scientific program, it can be very useful to collect simple data from a wide range of locations and environments. - -When it comes to BrAPI compatibility, ClimMob follows the same patterns established by Field Book. During a survey, all the farmer collected data is stored in a central ClimMob node. When the survey is complete, all the data is uploaded automatically via BrAPI to a central breeding database for long term storage and analysis. - -#### ImageBreed - - -High-throughput phenotyping has been gaining significant traction lately as a way to collect lots of data very quickly. Image collection from unmanned arial and ground vehicles (UAVs and UGVs) are a great way to collect a lot of raw data all at once, then analyze it later. ImageBreed is a image collection pipeline tool to support regular use of UAVs and UGVs. - -When the raw images have been processed through the standardization pipelines in ImageBreed, useful phenotypes can be extracted from the images. The BrAPI standard is used to push these phenotypes back to a central breeding database where they can be analyzed with other data. In addition to this, ImageBreed also has the option to use BrAPI to upload the raw images to the central breeding database, or any other BrAPI compatible long term storage service. The BrAPI models in the current version of the standard (V2.1) are rudimentary, but effective. The ImageBreed team has put in some work to enhance the BrAPI image data standards. - -#### GridScore - -Phenotypic data collection underpins scientific crop research and plant breeding. Knowledge gained from collected data and its analysis alongside data visualizations inform further phenotypic trials and ideally support research hypotheses. 
The importance of accuracy and efficiency in the collection of this data as well as the infrastructure to facilitate the flow of data from the field to a knowledge base cannot be underestimated. [GridScore](https://ics.hutton.ac.uk/get-gridscore/) [@doi:10.1186/s12859-022-04755-2] is a modern mobile application for phenotypic observations that harnesses technological advancements in the area of mobile devices to enrich the data collection process. - -BrAPI has further increased the value of GridScore by integrating it into the overarching workflow from trial creation, data collection, and its ultimate data storage for further processing. Specifically, trial designs as well as trait definitions can be imported into GridScore using BrAPI and a finalized trial can ultimately be exported via BrAPI to any compatible database. diff --git a/content/03.success_stories/03.02.data-management.md b/content/03.success_stories/03.02.data-management.md deleted file mode 100644 index fd7b15a..0000000 --- a/content/03.success_stories/03.02.data-management.md +++ /dev/null @@ -1,79 +0,0 @@ -### Data Management - - - -#### PHIS - -The Hybrid Phenotyping Information System ([PHIS](http://www.phis.inrae.fr/) [@doi:https://doi.org/10.1111/nph.15385]), based on the [OpenSILEX](https://github.com/OpenSILEX/) framework, is an ontology-driven information system based on semantic web technologies. PHIS is deployed in several field and greenhouse platforms of the national [PHENOME](https://www.phenome-emphasis.fr/) and European [EMPHASIS](https://emphasis.plant-phenotyping.eu/) infrastructure. It manages and collects data from Phenotyping and High Throughput Phenotyping experiments on a day to day basis. PHIS unambiguously identifies all the objects and traits in an experiment, and establishes their types and relationships via ontologies and semantics. - -PHIS has been designed to be BrAPI-compliant. 
PHIS adheres to the standards and protocols specified by BrAPI and implements various services aligning with the BrAPI standards, encompassing the Core, Phenotyping, and Germplasm modules. This enables integration and compatibility with BrAPI-compliant systems and platforms. This prerequisite served as the basis for formalizing the data model, while also facilitating compatibility with other standards, such as the Minimal Information About a Plant Phenotyping Experiment ([MIAPPE](https://www.miappe.org/) [@doi:https://doi.org/10.1111/nph.16544]). By integrating BrAPI requirements into its structure, PHIS not only meets the standards of the phenotyping field, but also strengthens its capacity for interoperability and effective collaboration in the wider context of plant breeding and related fields. - -The fact that data within a PHIS instance can be queried through BrAPI services makes the indexing of PHIS in [FAIDARE](https://urgi.versailles.inra.fr/faidare/) very easy to implement. - -Furthermore, as PHIS offers BrAPI-compliant Web Services, it simplifies the integration and data exchange with other European information systems that handle phenotyping data. The adherence to BrAPI standards ensures a common interface and compatibility, facilitating communication and collaboration between PHIS and other systems in the European context. This interoperability not only eases data sharing, but also promotes a more coherent and efficient approach to the management and use of phenotyping data on various platforms and research initiatives within the European scientific community. - -#### DeltaBreed - - -DeltaBreed is an open-source data management system designed and developed by Breeding Insight to support USDA-ARS specialty crop and animal breeders. DeltaBreed is a unified system for managing breeding data that connects a variety of BrAPI applications (see list below). 
BrAPI integration allows the complexity underlying interoperability to be hidden, shielding users from multifactorial differences between diverse applications. DeltaBreed, adhering to the BrAPI model, establishes data standards and validations for users and provides a singular framework for data management and user training. - -DeltaBreed users need not be aware of BrAPI or the specifics of underlying applications but will notice that BrAPI interoperability reduces the need for human-mediated file transfers and data manipulation. Field Book users, for example, can connect to their DeltaBreed program, authenticate, and pull studies and traits directly from DeltaBreed to Field Book on their data collection device. The subsequent step of pushing observations from Field Book to DeltaBreed is straightforward via BrAPI, but will not be implemented until repeated observation handling workflows are established to differentiate and validate repeated observations, such as accidental repeats, overwrite requests, time-series observations, and repeated sub-entity measures. Users can expect DeltaBreed observation handling to become more seamless with future development. - -**DeltaBreed Connected Applications** -<< Submission is expected April 2024. We may need to trim this aspirational list down to reality in final edits.>> - -+ BIMS -+ BrAPI Java Server -+ BrAPI Sync -+ BreedBase -+ Diversity Arrays Technologies (DArT) genotyping services -+ Field Book -+ Gigwa -+ Mr Bean -+ Pedigree Viewer - -#### BMS - -The [Breeding Management System (BMS)](https://bmspro.io), developed by the [Integrated Breeding Platform (IBP)](https://integratedbreeding.net/), is a suite of tools designed to enhance the efficiency and effectiveness of plant breeding. BMS covers all stages of the breeding process, with the emphasis on germplasm management and [ontology](https://cropontology.org)-harmonized phenotyping. It also features analytics and decision-support tools. 
With its focus on interoperability, BMS integrates smoothly with BrAPI, facilitating easy connections with a broad array of complementary tools and databases, notably [Gigwa](https://southgreen.fr/content/gigwa) which is deployed together with the BMS to fulfill the genotyping data management needs of BMS users. - -The [brapi-sync](https://github.com/IntegratedBreedingPlatform/brapi-sync) tool, a significant component of BMS’s BrAPI capabilities, was developed by the IBP and released as a BrAPP for community use. Brapi-sync is designed to enhance collaboration among partner institutes within a network such as Innovation and Plant Breeding in West Africa ([IAVAO](https://www.iavao.org/en)), by enabling the sharing of germplasm and trials across BrAPI-enabled systems. This tool helps overcome traditional barriers to collaboration, ensuring data that was once isolated within specific programs or platforms can now be easily shared, integrated, and synchronized. - -Additionally, brapi-sync improves data management by utilizing the externalReferences field to maintain links to the origin IDs of each entity it transmits. This not only retains the original context of the data but also establishes a traceability mechanism for accurate data source attribution and verification. Such practices are crucial for maintaining data integrity and fostering trust among collaborative partners, ensuring access to accurate, reliable, and current information. - -#### Breedbase - -Breedbase is a comprehensive breeding data management system [@doi:10.1093/g3journal/jkac078] [@doi:10.1371/journal.pone.0240059] that implements a digital ecosystem for all breeding data, including trial data, phenotypic data, and genotypic data. Data acquisition is through tabled-based apps such as Fieldbook [@doi:10.2135/cropsci2013.08.0579] and related apps, such as Coordinate and InterCross apps, through drone imagery, Near Infra-Red Spectroscopy (NIRS), and other technologies. 
Search functions such as the Search Wizard interface provide powerful query capabilities, and various breeding-centric analysis tools are available, including mixed models, heritability, stability, PCA, and various clustering algorithms. The original impetus for creating Breedbase was the advent of new breeding paradigms based on genomic information such as genomic prediction algorithms [@doi:10.1093/genetics/157.4.1819] and the accompanying data management challenges, and complete genomic prediction workflow is integrated in the system. The first instance was created for the NextGen Cassava project in 2012 as the Cassavabase () database. Databases for other CGIAR root, tuber and banana (RTB) crops followed with database for yam (), sweet potato (), banana () as well as instances in labs and companies. The BrAPI interface [@doi:10.1093/bioinformatics/btz190] is crucial for Breedbase: Breedbase communicates via BrAPI with the data collection tablets, connection to other projects such as CLIMMOB [@doi:10.1016/j.compag.2023.108539], and many native tools use the BrAPI interface for accessing data. Users also appreciate the ability to connect to Breedbase instances using packages such as QBMS for data import into R for custom analyses. Breedbase has been an early and continuous adopter of, and contributor to, the BrAPI standard. - -#### BIMS - -BIMS (Breeding Information Management System) [@doi:10.1093/database/baab054] is a free, secure, and online breeding management system which allows breeders to store, manage, archive, and analyze their private breeding program data. BIMS enables individual breeders to have complete control of their own breeding data along with access to tools such as data import/export, data analysis and data archiving for their germplasm, phenotype, genotype, and image data. 
BIMS is currently implemented in five community databases, the Genome Database for Rosaceae [@doi:10.1093/nar/gky1000], CottonGEN [@doi:10.3390/plants10122805], the Citrus Genome Database, the Pulse Crop Database, and the Genome Database for Vaccinium, as well as a crop-independent website, . BIMS in these five community databases enables individual breeders to import publicly available data so that they can utilize public data in their breeding program. BIMS utilizes the Android App Field Book, enabling seamless data transfer between BIMS and the Field Book App through either files or BrAPI. Data transfer through BrAPI between BIMS and other resources such as BreedBase, GIGWA, and Breeder Genomics Hub is also on the way. - -#### Germinate - - - -[Germinate](https://ics.hutton.ac.uk/get-germinate/) [@doi:10.1002/csc2.20248] is an open-source plant genetic resources database that combines and integrates various kinds of plant breeding data including genotypic data, phenotypic trials data, passport data, images, geographic information and climate data into a single repository. Germinate is tightly linked to the BrAPI specification and supports a majority of BrAPI endpoints for querying, filtering and submission. - -Germinate integrates and connects with other BrAPI-enabled tools such as GridScore for phenotypic data collection, Flapjack for genotypic data visualization and Helium for pedigree visualization, but, due to the nature of BrAPI, Germinate can act as a data repository for any BrAPI-compatible tool. Thanks to the interoperability provided by BrAPI the need for manual data handling becomes a rarity with the direct benefit of faster data processing, fewer to no human errors, data security and integrity. - -#### PIPPA - -[PIPPA](https://pippa.psb.ugent.be) is a data management system used for collecting data from the [WIWAM](https://www.wiwam.be/) range of automated high throughput phenotyping platforms. 
These platforms have been deployed at different research institutes and commercial breeders across Europe in a variety of configurations with different types of equipment such as weighing scales, cameras and environment sensors. Examples are: - -+ [Umea Plant Science Centre](https://www.upsc.se/plant-growth-facilities-at-upsc-and-slu-umea/325-upsc-tree-phenotyping-platform.html) -+ [Fondazione Edmund Mach](https://cri.fmach.it/en/Facilities/Technological-Facilities/Plant-Phenotyping#application_fields) -+ [Phenovision](https://www.psb.ugent.be/phenotyping/phenovision) - -Developed from 2016 onwards, the software features a web interface with functionality for setting up new experiments for the platform(s), planning imaging and irrigation treatments, linking metadata to pots (genotype, growth media, manual treatments), exporting data, importing data and visualizing data as charts. It also supports the integration of image analysis scripts and connections to a compute cluster for job submission. - -To share the phenotype data of the experiments linked to publications, an implementation of BrAPI 1.3 was developed on a separate public PIPPA server, which allowed read-only access to the data in a standardized format. This endpoint was registered on [FAIDARE](https://urgi.versailles.inra.fr/faidare/) and allows the data to be found alongside data from other BrAPI endpoints. - -As the BrAPI ecosystem has matured, it has given PIPPA development a clear path for sharing data according to the FAIR principles, which are becoming standard in plant research data management best practices. In combination with the support for [MIAPPE](https://www.miappe.org/), these have served as guidelines in the current development, which is focussed on delivering a public BrAPI 2.1 endpoint and making more high throughput datasets publicly available via BrAPI.
- -#### MGIS - - -The Musa Germplasm Information System, [MGIS](https://www.crop-diversity.org/mgis/), serves as a comprehensive community portal dedicated to banana diversity, a crop critical to global food security [@doi:10.1093/database/bax046]. MGIS offers detailed information on banana germplasm, focusing on the collections held by the CGIAR International Banana Genebank (ITC) [@doi:10.1186/s43170-020-00015-6]. It is built on the Drupal/Tripal technology, like BIMS and Florilège. Since its inception, MGIS developers have actively participated in the Breeding API (BrAPI) community, pushing for the integration of Multicrop Passport Data (MCPD) into the Germplasm module calls of the API. MGIS thus provides passport data information on ITC banana genebank accessions (with GLIS DOI), synchronized with [Genesys](https://www.genesys-pgr.org/a/overview/v2YdWZGrZjD), but also enriches it by incorporating additional data from other germplasm collections worldwide. All of these germplasm data are available through implementations of the BrAPI Germplasm module calls. For genotyping data, MGIS incorporates GIGWA [@doi:10.1093/gigascience/giz051], which provides tailored implementations for BrAPI genotyping module calls. Furthermore, MGIS supports the implementation of a set of BrAPI phenotyping module calls, facilitating the exposing of morphological descriptors and trait information supported by ontologies like the Crop Ontology [@doi:10.1093/aobpla/plq008]. It is integrated with the Trait Selector BrAPP, developed as part of a project involving Breedbase [@doi:10.1093/g3journal/jkac078]. Use cases between MusaBase, the Musa implementation of Breedbase, and MGIS aim to interlink genebank and breeding data.
diff --git a/content/03.success_stories/03.03.federation-infrastructure.md b/content/03.success_stories/03.03.federation-infrastructure.md deleted file mode 100644 index 9f60591..0000000 --- a/content/03.success_stories/03.03.federation-infrastructure.md +++ /dev/null @@ -1,44 +0,0 @@ -### Federated Data Management Infrastructures - - - -#### AGENT Portal - -In the global system for ex situ conservation of plant genetic resources (PGR) [@doi:10.3390/plants10081557], a total of ~5.8 million accessions are conserved in 1750 ex situ genebanks [@doi:10.2135/cropsci2017.01.0014]. Unique and permanent identifiers in the form of DOIs are available for more than 1.7 million accessions [@doi:Food and Agriculture Organization (FAO) The Global Information System for PGRFA]. Each DOI is linked to some basic descriptive data that facilitates the use of these resources. Many DOIs are also linked to additional data from different domains or will be in the future. In order to answer questions on the global biological diversity of a plant species, on duplicate detection, on provenance tracking for the identification of genetic integrity, on the selection of the most suitable material for various purposes, including breeding and research, and to support further applications in data mining or AI, a data space beyond the most basic information is needed that includes genotypic and phenotypic data. In this context, the aim of the AGENT project () funded by the European Commission is to develop a concept for the digital exploitation and activation of this GenRes data space via European ex situ genebanks according to the FAIR criteria [@doi:10.1038/sdata.2016.18] and to test it in practice using two important crops, barley and wheat. 
In two work packages, standards and technology for data interoperability will be developed to establish a genetic resources infrastructure, which regulates data acquisition of genotypic and phenotypic data, integrates and archives them and makes them accessible according to FAIR principles. To this end, 13 European genebanks and 5 bioinformatics centers are cooperating and have agreed on standards and protocols for (i) the data flow (see figure {@fig:AGENT_Genotyping_Data_Flow}) and data formats [@doi:10.12688/f1000research.109080.2] for central archiving of genotypic and phenotypic data. - -![Figure Data flow of genotypic data from AGENT partner databases](images/AGENT_Genotyping_Data_Flow.png){#fig:AGENT_Genotyping_Data_Flow width="100%"} - -The AGENT portal as described in more detail in section unlock the full potential of the biological material stored in genebanks around the globe by using FAIR international data standards and an open digital infrastructure for the management of plant genetic resources. The implemented BrAPI interface enables to mine current and historic genotypic and phenotypic information to drive the discovery of genes, traits and knowledge for future missions, complement existing information for wheat and barley and the new data standards and infrastructure to foster an improved management of PGR for other crop species across European genebanks. - -For the joint research data infrastructure for the federation of collections of genotypic and phenotypic data from European gene banks and bioinformatics institutes, an AGENT portal ({@fig:AGENT_WebFrontend}) as database infrastructure for integrated plant genetic resources on ex-situ genebanks is being created. It provides manual data exploration and machine-readable access via BrAPI, and provides data to the core data deposition resources at the European Bioinformatics Institute (EBI).
- -![Figure AGENT Portal](images/AGENT_WebFrontend.png){#fig:AGENT_WebFrontend width="100%"} - -The AGENT database backend aggregates curated and integrated passport data, phenotypic and genotypic data about wheat and barley accessions of 18 project partners are harmonized and integrated via BrAPI endpoints () and explorable in a web portal (). The BrAPI endpoints were made available by scattered implementation. Genotyping data use DivBrowse [@doi:10.1093/gigascience/giad025] storage engine and BrAPI interface. Endpoints for sample data are implemented using AGENT database SQL to BrAPI broker service. -To integrate those BrAPI endpoint provider into a single service and URL scheme, we work on their integration in a BrAPI proxy service. As next steps, we will expand BrAPI implementation to enable the integration of analysis pipelines in the AGENT portal, e.g. for genebank mining tools such as the FIGS+ pipeline developed by AGENT partner ICARDA [@doi:10.22004/AG.ECON.266624]. Another perspective is to integrate the data collected in the AGENT project into the European Search Catalogue for Plant Genetic Resources (EURISCO) [@doi:10.1093/nar/gkac852] and to implement BrAPI endpoints to make data on PGR collections in European genebanks programmatically accessible. - -#### IPK-Genebank - - -Agrosystem Integration of germplasm collections in context of data trustee models among private economy and public research, integration of ex-situ genebanks (EU H2020 projects AGENT, INCREASING), integrated agrosystems and plant research infrastructure - -#### MIAPPE ISA to BrAPI service - -Phenotyping is crucial in the breeding process as it enables the identification of desirable traits, selection of breeding lines, and evaluation of breeding success. In the plant community, MIAPPE (Minimal Information About a Plant Phenotyping Experiment) [@doi:10.1111/nph.16544] is the established standard for phenotyping experiments and is commonly serialized as ISA Tab [@doi:10.1038/ng.1054]. 
Although ISA Tab is easy to read for non-technical experts due to its file-based approach, it lacks programmatic access, particularly for web applications. BrAPI, which is aligned with MIAPPE, can help solve this problem. -MIRA is a tool that enables the automatic deployment of a BrAPI server on a MIAPPE-compliant dataset in ISA Tab format. It can be deployed from a Docker image with the dataset mounted. By utilizing the mapping between MIAPPE, ISA, and BrAPI, there is no need for parsing or manual mapping of datasets that are already compliant with (meta-)data standards. By gaining programmatic access through BrAPI to these datasets, it facilitates the integration of phenotyping datasets into web applications. - -#### MIAPPE "BrAPI to ISA" service - -Since the release of BrAPI 1.3, efforts have been made to incorporate support for the Minimum Information About Plant Phenotyping Experiments (MIAPPE) standard into the specification [@doi:10.1111/nph.16544]. This integration was finalized in BrAPI 2.0, resulting in full compatibility between the two standards. Consequently, BrAPI now encompasses all attributes necessary for MIAPPE compliance, adhering to standardized descriptions in accordance with MIAPPE guidelines. Leveraging BrAPI as a standardized RESTful web service API specification, we employ the ISA standard for storing metadata and phenotyping data in a standardized manner. This data is structured in the ISA-TAB file format and subjected to validation using the [MIAPPE ISA configuration](https://github.com/ELIXIR-Belgium/isatab-validation). The "BrAPI to ISA" service functions as a converter between BrAPI RESTful endpoints and ISA-TAB, facilitating the archiving of metadata and data and thereby enhancing data preservation and accessibility. The [BrAPI2ISA](https://github.com/elixir-europe/plant-brapi-to-isa) tool is designed to be compatible with BrAPI 1.3, and we invite contributions from the community to extend support for the latest versions of BrAPI. 
- -#### BrAPIMapper - - -BrAPIMapper is a full BrAPI implementation of all calls for any data source missing BrAPI implementation or compliance with some BrAPI versions. BrAPIMapper is provided as a docker application that can get its external data sources from mySQL or PostgreSQL databases (with a dedicated interface for Chado database schema), generic REST services (with a dedicated interface for BrAPI endpoints), flat files (XML, JSON, CSV/TSV/GFF3/VCF, YAML) or any combination of any of those. It provides an administration interface to map BrAPI data models to external data sources. The interface allows administrators to select the BrAPI specification versions to use and the calls to enable. Data mapping configuration export and import features simplify upgrades to future BrAPI specifications changes as administrators would only have to map missing fields or make minor adjustments. Amongst others, it supports paging, search calls, either by providing direct results or using deferred results with a search identifier, lists, authentication and manages access restrictions to calls that can be setup through the administration interface as well. This tool aims to accelerate BrAPI services deployment while ensuring specification compliance. diff --git a/content/03.success_stories/03.04.visualization.md b/content/03.success_stories/03.04.visualization.md deleted file mode 100644 index e255d5b..0000000 --- a/content/03.success_stories/03.04.visualization.md +++ /dev/null @@ -1,45 +0,0 @@ -### Data visualization - - -#### Flapjack - - -[Flapjack](https://ics.hutton.ac.uk/flapjack) [@doi:10.1093/bioinformatics/btq580] is a multi-platform desktop application for data visualization and breeding analysis (eg, pedigree verification, marker-assisted backcrossing and forward breeding) using high-throughput genotype data. Data can be easily imported into Flapjack from any BrAPI compatible data source with genotype data available. 
[Flapjack Bytes](https://github.com/cropgeeks/flapjack-bytes) is a smaller, lightweight and fully web-based counterpart to Flapjack, which can be easily embedded into a database website to provide similar visualizations online. Traditionally supporting its own text-based data formats, Flapjack's use of BrAPI has streamlined the end-user experience for data import and work is underway to determine the best methods to exchange analysis results using future versions of the API. - -#### Helium - - -Helium (https://helium.hutton.ac.uk) [@doi:10.1186/1471-2105-15-259] is a plant pedigree visualization platform designed to account for the specific problems that are unique to plant pedigrees. A pedigree is a representation of how genetically discrete individuals are related to one another and is therefore a representation of the genetic relationship between individual plant lines, their parents and progeny. Plant pedigrees are often used to check for potential genotyping or phenotyping errors, since these errors, by the very nature of Mendelian inheritance, are constrained by the pedigree structure in which they exist (Paterson 2011). The accurate representation of pedigrees, and the ability to pull pedigree data from different data sources is therefore important in plant breeding and genetics and therefore ways to visualize and interact this complex data in meaningful ways is critical. - -From its original desktop interface (https://github.com/cardinalb/helium-docs/wiki), Helium has developed into a web-based visualization platform implementing BrAPI calls to allow users to import data from other BrAPI compliant databases (https://helium.hutton.ac.uk). The ability to pull data from BrAPI compliant data sources has significantly expanded Helium’s capability and utility within the community. Helium is used in projects ranging in size from tens to tens of thousands of lines and across a wide variety of crops and species. 
While originally designed for plant data [@doi:10.3389/fpls.2024.1268847] it has also found utility in other non-plant projects [@doi:10.1007/s10592-024-01611-z] highlighting its broad utility. This also allows Helium users to provide direct dataset links to collaborators allowing the original data to be held with the data provider and utilising Helium for its visualization functionality. Our current Helium deployment includes example BrAPI calls to a barley dataset at Hutton to allow users to test the system and features it offers. - - - - -#### Tassel - - -I don't know much about Tassel or its BrAPI compliance. This is filler text for the layout of the manuscript. - -#### Trait Selector BrAPP - - -BrAPPs are simple tools developed by the BrAPI community that are entirely reliant on BrAPI for their data requirements. This means a single BrAPI can be shared and used by many organizations, as long as those organizations have the standard BrAPI endpoints available. - -The Trait Selector BrAPP is used to search and select useful traits, using a visual aid to help the user find exactly what they need. This BrAPP works with both breeding databases and genebanks. Breeding databases would need to only implement the trait, observation and observation variable calls, while genebanks would require trait, germplasm attribute and germplasm attribute value calls. So, BrAPI servers compliant with version 2 implementing any of these sets of calls would just need to follow the documented steps to create an SVG image of a plant of interest in order to use this BrAPP. CassavaBase and MGIS are two successful examples of the use of this BrAPP. (example screenshots coming + supplementary data: links to the git and the doc) - - -#### DArTView - - -DArTView is a desktop application for visualizing genotype variant data and looking for trends or correlations. It is newly BrAPI compatible and can use BrAPI as an input data source. 
- -#### DivBrowse - -DivBrowse [@doi:10.1093/gigascience/giad025] is a web platform for exploratory data analysis of huge genotyping studies. The software can be run standalone or integrated as a plugin into existing data web portals. It provides a powerful interactive visualization of variant call matrices with hundreds of millions of variants and thousands of samples and enables easy data import and export by using standardized and established bioinformatics file formats. -At its core, DivBrowse combines the convenience of a genome browser and adds features tailored to the diversity analysis of germplasm. It is able to display genomic features such as nucleotide sequence, associated gene models and short genomic variants. DivBrowse provides visual access to large VCF files obtained through genotyping experiments. In addition to visualizing variant calls per variant and genotype, DivBrowse also calculates and displays variant statistics such as minor allele frequencies, proportion of heterozygous calls or missing variant calls for each visualized genomic window. In addition, dynamic Principal Component Analyses (PCAs) can be performed on a user specified genomic area to provide information on local genomic diversity. -DivBrowse has a Javascript API to control the tool from a hosting web portal (e.g. to control the list of genotypes to be displayed and the reference genome). DivBrowse has an interface to BLAST, which can be used to directly access genes or other genomic features. The modular structure of DivBrowse also allows developers to configure and easily embed links to external information systems. Furthermore, parts of BrAPI are implemented to provide genotypic data via its server-side component and is also able to consume and visualize genotypic data via an external BrAPI endpoint through the client-side GUI. 
diff --git a/content/03.success_stories/03.05.analytics.md b/content/03.success_stories/03.05.analytics.md deleted file mode 100644 index ab3d98e..0000000 --- a/content/03.success_stories/03.05.analytics.md +++ /dev/null @@ -1,30 +0,0 @@ -### Analytics - - -#### QBMS - -Modern breeding programs can utilize data management systems to maintain both phenotypic and genotypic data. Numerous systems are available for adoption. To fully leverage the benefits of digitalization in this ecosystem, breeders need to utilize data from different sources to make efficient data-driven decisions. With increased computational power at their disposal, scientists can construct more advanced analysis pipelines by combining various data sources. - -The [QBMS](https://icarda-git.github.io/QBMS) [@doi:10.5281/zenodo.10791627] R package eliminates technical barriers scientists experience when using the BrAPI calls in their analysis scripts and pipelines. These barriers arise from the complexity of managing API backend processes, such as authentication, tokens, TCP/IP protocol, JSON format, pagination, stateless calls, asynchronous communication, database IDs, and more. To bridge this gap, we have developed the QBMS R package. This package abstracts the technical complexities, providing breeders (targeted end users) with stateful action verbs/functions familiar to them when navigating their GUI systems. It enables them to query and extract data into a standard data frame structure, consistent with their use of R language, one of the most common statistical tools in the breeding community. - -Since its release on the official CRAN repository in October 2021, the QBMS R package has garnered over 9400 downloads. Several tools, such as MrBean, rely on the QBMS package as their source data adapter. Moreover, the community has started building extended solutions on top of it.
QBMS can serve as a cornerstone in the breeding modernization revolution by providing access to actionable data and enabling the creation of dashboards to reduce the time between harvest and decision-making for the next breeding cycle. - -#### Mr. Bean - - - -Mr.Bean is a graphical user interface designed to assist breeders, statisticians, and individuals involved in plant breeding programs with the analysis of field trials. By utilizing innovative methodologies such as SpATS for modeling spatial trends and autocorrelation models to address spatial variability, Mr.Bean proves highly practical and powerful in facilitating faster and more effective decision-making. Modeling Genotype-by-environment interaction poses its challenges, but Mr.Bean offers the capability to explore various variance-covariance matrices, including Factor Analytic, compound symmetry, and heterogeneous variances, among others, aiding in the assessment of genotype performance across diverse environments. - -Mr.Bean boasts flexibility in importing different file types, yet for users managing their data within data management systems (DMS), the process of downloading from their DMS and importing it into MrBean can be cumbersome. To address this issue, QBMS operates in the back-end. This feature prompts users to input the URL of the server, their credentials if necessary, and the specific trial they wish to analyze. Subsequently, users can seamlessly access and utilize their dataset within the entire interface. - -Aparicio, J., Gezan, S. A., Ariza-Suarez, D., Raatz, B., Diaz, S., Heilman-Morales, A., & Lobaton, J. (2024). Mr. Bean: a comprehensive statistical and visualization application for modeling agricultural field trial data. Frontiers in Plant Science, 14, 1290078. 
- - -#### G-Crunch - -G-Crunch is an upcoming user-facing analysis tool that attempts to fill the space of simple, user driven analytics requests, with a generic user interface and the ability to swap out data sources and analysis tools. G-Crunch hopes to streamline repeatable, debuggable simple analytic requests and results. -G-Crunch, as a tool, couldn't feasibly exist without BrAPI. The support of BrAPI interfaces allows G-Crunch to use one unified request method, and adapt to the user's (BrAPI-compliant) existing network of tools, which lowers the barrier to entry for adoption. diff --git a/content/03.success_stories/03.07.data-portals.md b/content/03.success_stories/03.07.data-portals.md deleted file mode 100644 index c1115c9..0000000 --- a/content/03.success_stories/03.07.data-portals.md +++ /dev/null @@ -1,53 +0,0 @@ -### Data Portal - - - -#### FAIDARE - - -FAIDARE () is a data discovery portal providing a biologist friendly search system over a global federation of 33 plant research databases. It allows users to identify data resources using a full-text search complemented by domain-specific filters, and to link back to the original database for visualization, analysis and download. For instance, it is possible to search for "wheat drought" then to refine the search to the "Triticum aestivum" taxon and yield component traits such as "Thousand Grain Weight". The indexed data types are very broad and include genomic features, such as genes or transposable elements, selected bibliography, QTL, markers, genetic variation studies, phenomic studies and plant genetic resources, i.e. germplasm. This inclusiveness is achieved thanks to a two-stage indexation data model. The most generic stage provides basic search functionalities and relies on five fields: name, link back URL, data type, species and exhaustive description. The filtering is directly tied to some of those fields.
Therefore, to provide more advanced filtering, FAIDARE also provides a second-stage indexation mechanism that takes advantage of BrAPI endpoints to get more detailed metadata on genotyping and phenotyping studies as well as germplasm. In parallel, FAIDARE provides a pre-visualization of germplasm and studies using dedicated cards. -![Figure FAIDARE Federation](images/Schema_FAIDARE.png){#fig:Schema_FAIDARE width="100%"} -The indexation mechanism relies on a dedicated public software () that allows data resource managers to request the indexation of their database using pull requests. This BrAPI client is able to extract data from any BrAPI 1.3 and 1.2 endpoint and development of BrAPI 2.x indexation will be initiated in 2025. Since not all databases are willing to implement BrAPI endpoints, we also provide the possibility to generate metadata as BrAPI JSON files, hence using the standard as a file exchange format. -FAIDARE architecture has been designed by elaborating on the GnpIS Software Architecture [@doi:10.34133/2019/1671403]. As a consequence, BrAPI is at the core of its data model, and in particular the JSON data files served by the Elasticsearch NoSQL engine are enriched versions of the BrAPI JSON files. FAIDARE also includes a BrAPI endpoint that serves all indexed metadata. -FAIDARE has been adopted by several communities and in particular in the ELIXIR and EMPHASIS European infrastructures. It is also used by the WheatIS of the Wheat-Initiative. Several databases are added each year to the FAIDARE global federation, which increases adoption of both the portal and BrAPI. - -#### Phenospex - HortControl - - -HortControl, developed by Phenospex, is a data repository. HortControl has a BrAPI implementation to be used to automate workflows and analytics software.
- -#### GLIS - -The Global Information System (GLIS) on Plant Genetic Resources for Food and Agriculture (PGRFA) of the International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA) is a web-based global entry point for users and third-party systems to access information and knowledge on scientific, technical and environmental matters to strengthen PGRFA conservation, management and utilization activities. The system and its portal also enable recipients of PGRFA to make available all non-confidential information on germplasm according to the provisions of the Treaty and facilitate access to the results of their research and development. - -Thanks to the adoption of Digital Object Identifiers (DOIs) for PGRFA ex situ and in situ based on the Multi-Crop Passport Descriptors (MCPD), the Portal provides access to 1.7 million PGRFA in collections conserved worldwide. Of these, over 1.5 million are accessible for research, training and plant breeding in the food and agriculture domain. - -The Scientific Advisory Committee of the International Treaty and the Governing Body have repeatedly welcomed efforts on interoperability among germplasm information systems. In this context, the GLIS Portal adopted the Breeding API (BrAPI v1) in 2022. Integrating the BrAPI among the GLIS content negotiators facilitates queries and the exchange of content for data management in plant breeding. The Portal also offers other protocols (XML, DarwinCore, JSON and JSON-LD) to increase data and metadata connectivity. In the near future, depending on the availability of resources, upgrading to BrAPI v2 is planned. - -#### FLORILÈGE (Gateway to French Plant Genetic Resources) - -Designed primarily for the general public, Florilège provides access to public collections of all French plant biological resources centers. This web portal allows users to browse available plant genetic resource accessions and gives the possibility to order some seeds or plant material for cultivation.
It includes plant genetic resources of around fifty plant genera from 19 genebanks. - -Florilège retrieves accession information from different BrAPI-compliant systems. They include OLGA, a genebank accessions information management system, and GnpIS [@doi:10.34133/2019/1671403] [@doi:10.1007/978-1-4939-6658-5_5], an INRAE data repository for plant genetic resources, phenomics and genetics. Using BrAPI to gather data from these systems reduced the effort and enabled standardized data retrieval. As a consequence, BrAPI is the de facto standard for exchanging data within the French plant genetic resources community. The Florilège team also requested several updates to the BrAPI specifications to better serve this use case, such as Collection or improved external references. -![Figure Florilege Workflow](images/Schema_Florilege.jpg){#fig:Schema_Florilege width="100%"} - -Florilège is developed in Drupal 10, and uses the xnttbrapi module to connect easily to BrAPI-compliant external databases. - -#### BrAPI plug and play GraphQL based data-warehouse - -Using the "Zendro" set of automatic software program-code generators (zendro-dev.github.io) a fully functional, efficient, and cloud-capable BrAPI data-warehouse has been created for the current version of the BrAPI data models. The resulting data-warehouse has two interfaces, one application programming interface implemented in the form of a GraphQL web-server and another intuitive point and click graphical user interface in the browser. Both provide secure access to data read and write functions for all BrAPI data models. These data administration methods comprise create, read, update, and delete (CRUD) functions that are standardized and accept the same parameters for all data models.
- -While data write access comprises both persisting single or multiple records, data read access is particularly rich in features and includes access to single records referred to by their id and access to multiple records selected by logical filters. Multiple records are paginated using the highly efficient cursor based pagination model as proposed in the GraphQL standard. Logical filters allow for exhaustive search queries, whose structure is highly intuitive and based around logical triplets in which a data model field is validated using an operator and a value, e.g. "Study name equals 'xyz'". A large collection of operators is available, and triplets can be combined into logical search trees using "and" or "or" operators. Searches can be extended over relationships between data models, thus enabling a user to query the warehouse exactly for the data wanted. - -Access security is implemented with the OAuth2 user authentication standard (datatracker.ietf.org/doc/html/rfc6749). Authorization is based on user roles and can be configured differently for each single data model read or write function. - -The browser based graphical user interface is implemented in React.js with Next and exposes an intuitive and self explanatory set of functions for each data model. On the left, a menu allows the user to access all BrAPI data models. Upon clicking on a model a table is shown which allows the user to paginate through all existing records, sort them by any column, search the records, add new records, or update or delete existing records, if the user role authorizes these functions. Record data can be inspected in a detail view and here relationships to other data records can be reviewed using the very same graphical visual representations. Breadcrumbs allow the user to navigate back and forth in the trail of relationships inspected.
Finally, the generated graphical interface allows for the integration of interactive scientific plots and analysis tools written in JavaScript or WebAssembly. - - - -The Zendro based BrAPI plug and play data-warehouse is capable of forming an efficient cloud of data servers. This is achieved simply by linking (URLs) other Zendro based warehouses that expose the same GraphQL API to the same data models, or a subset of data models. Any network of such Zendro GraphQL servers can be set up using this configuration approach. The code generated then exposes full access to all data records stored on any node of the network, while maintaining full security control at each node. Importantly, the warehouses are programmed in such a way that any number of data servers can be joined without loss of efficiency. Only the network connection speed and size of requested record sets influence the performance. - -As explained, Zendro is a code generator and creates a fully functional data warehouse from input data model definitions, i.e. a schema. The schema is given in the form of special data model descriptions, in which each model is defined using JavaScript Object Notation (JSON). Each model is defined in its respective JSON file. A translator has been developed to create the Zendro schema from the BrAPI data model definitions. This ensures that Zendro can create plug and play data warehouses for future versions of the BrAPI with great ease, i.e. by translating the BrAPI models to Zendro input and subsequently running Zendro to create the plug and play warehouse.
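The logical search trees described for the warehouse's read access (triplets of field, operator and value, combined with "and" or "or") can be illustrated with a minimal sketch. It is not Zendro's actual filter syntax or API: the dictionary layout, the operator names in `OPERATORS`, the functions `matches` and `search`, and the sample `studies` records are all hypothetical and chosen only to show how such a tree could be evaluated.

```python
import operator
from typing import Any, Callable, Dict, List

# Hypothetical operator table; the names are illustrative, not Zendro's operator set.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "contains": lambda field_value, value: value in field_value,
}


def matches(record: Dict[str, Any], node: Dict[str, Any]) -> bool:
    """Evaluate a filter tree against a single record."""
    if "and" in node:  # inner node: every child must match
        return all(matches(record, child) for child in node["and"])
    if "or" in node:  # inner node: at least one child must match
        return any(matches(record, child) for child in node["or"])
    # Leaf node: a (field, operator, value) triplet.
    compare = OPERATORS[node["operator"]]
    return compare(record[node["field"]], node["value"])


def search(records: List[Dict[str, Any]], tree: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the records selected by the filter tree, in input order."""
    return [record for record in records if matches(record, tree)]


# Made-up example records standing in for a "Study" data model.
studies = [
    {"studyName": "xyz", "year": 2021},
    {"studyName": "abc", "year": 2023},
    {"studyName": "xyz-2", "year": 2024},
]

# "Study name equals 'xyz'" OR ("Study name contains 'xyz'" AND "year > 2022")
tree = {
    "or": [
        {"field": "studyName", "operator": "eq", "value": "xyz"},
        {
            "and": [
                {"field": "studyName", "operator": "contains", "value": "xyz"},
                {"field": "year", "operator": "gt", "value": 2022},
            ]
        },
    ]
}

selected = [study["studyName"] for study in search(studies, tree)]
```

In this sketch the tree is evaluated in memory, record by record; a real warehouse would presumably translate such a tree into a query for its storage backend, which the source does not describe.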