Commit

custom dictionary fixed
BrapiCoordinatorSelby committed May 8, 2024
1 parent 564bc52 commit 153d0f9
Showing 5 changed files with 38 additions and 9 deletions.
33 changes: 31 additions & 2 deletions build/assets/custom-dictionary.txt
@@ -23,8 +23,12 @@ unicode
wikidata
BrAPIRelatedTechnicalTermsBrAPIRelatedTechnicalTermsBrAPIRelatedTechnicalTerms
analytics
aspirational
autocorrelation
backcrossing
BRAVA
BreedBase
centric
CGIAR
ClimMob
CottonGEN
@@ -88,6 +92,7 @@ BRAVA
Cassavabase
ClimMob
DArTView
DarwinCore
DeltaBreed
DivBrowse
GIGWA
@@ -134,30 +139,54 @@ Wageningen
WheatIS
Universite
Saclay
Agence
Agence
Nationale
Recherche
programme
Investissements
Investissements
d'avenir
ANR
INBS
Phenome
Bingen
Berlinstrase
Rhein
DILS
Bielefeld
BIBI
Geosciences
IBG
CEPLAS
Forschungszentrum
Julich
GmbH
Wilhelm
Johnen
Strase
Bioeconomy
BioSC
BrAPIRelatedNamesBrAPIRelatedNamesBrAPIRelatedNamesBrAPIRelatedNames
alshamaa
Asis
Abbeloos
Alaux
Alic
Backlund
Batac
Batbaby
bedroesb
Beier
Blondon
BrapiCoordinatorSelby
Brouwer
Casstevens
cardinalb
Celia
chaneylc
Clarysabel
cpommier
Crimi
Davuluri
Feser
feserm
Gouripriya
2 changes: 1 addition & 1 deletion content/03.01.03.Image_Breed.md
@@ -1,5 +1,5 @@
#### ImageBreed

- High-throughput phenotyping has been gaining significant traction lately as a way to collect lots of data very quickly. Image collection from unmanned arial and ground vehicles (UAVs and UGVs) are a great way to collect a lot of raw data all at once, then analyze it later. ImageBreed is a image collection pipeline tool to support regular use of UAVs and UGVs.
+ High-throughput phenotyping has been gaining significant traction lately as a way to collect lots of data very quickly. Image collection from unmanned aerial and ground vehicles (UAVs and UGVs) are a great way to collect a lot of raw data all at once, then analyze it later. ImageBreed is a image collection pipeline tool to support regular use of UAVs and UGVs.

When the raw images have been processed through the standardization pipelines in ImageBreed, useful phenotypes can be extracted from the images. The BrAPI standard is used to push these phenotypes back to a central breeding database where they can be analyzed with other data. In addition to this, ImageBreed also has the option to use BrAPI to upload the raw images to the central breeding database, or any other BrAPI compatible long term storage service. The BrAPI models in the current version of the standard (V2.1) are rudimentary, but effective. The ImageBreed team has put in some work to enhance the BrAPI image data standards.
4 changes: 2 additions & 2 deletions content/03.02.04.GIGWA.md
@@ -2,6 +2,6 @@

Gigwa is a JEE web application providing means to centralize, share, finely filter, and visualize high-throughput genotyping data [@doi:10.1093/gigascience/giz051]. Built on top of MongoDB, it is scalable and can support working smoothly with datasets containing billions of genotypes. Installable from docker images or all-in-one bundle archives, it is pretty straightforward to deploy on servers or local computers and has thus been adopted by numerous research institutes from around the world. Notably, Gigwa serves as a collaborative management tool and/or a portal for exposing the data for genebanks and breeding programs for some CGIAR centers [@doi:10.1002/ppp3.10187]. Thus, the amount of data hosted and made widely accessible using this system has kept growing over the last few years.

- Gigwa developers have been involved in the BrAPI community since 2016 and took part in designing the genotype-related part of the API's specifications. Its first BrAPI-compliant features were designed for compatibility with the Flapjack visualization tool [@doi:10.1093/bioinformatics/btq580] and thus primarily turned it into a BrAPI datasource. Consequently, over time, Gigwa being the first and most reliable application implementing BrAPI-Genotyping server calls, local collaborators and even external partners used it as a reference solution to design a number of tools taking advantage of those features (e.g., [BeegMac](https://webtools.southgreen.fr/BrAPI/Beegmac/), [SnpClust](https://github.com/jframi/snpclust), [QBMS](https://github.com/icarda-git/QBMS)). But further use-cases also required Gigwa to be able to consume data from other BrAPI servers, which led to also implement API-client features into the system. Thanks to all this work, a close collaboration was progressively established with the Integrated Breeding Platform team developing the widely used Breeding Management System, that ended up in both applications now being frequently deployed together, Gigwa pulling germplasm or sample metadata from BMS, and BMS displaying Gigwa-hosted genotypes within its own UI.
+ Gigwa developers have been involved in the BrAPI community since 2016 and took part in designing the genotype-related part of the API's specifications. Its first BrAPI-compliant features were designed for compatibility with the Flapjack visualization tool [@doi:10.1093/bioinformatics/btq580] and thus primarily turned it into a BrAPI data source. Consequently, over time, Gigwa being the first and most reliable application implementing BrAPI-Genotyping server calls, local collaborators and even external partners used it as a reference solution to design a number of tools taking advantage of those features (e.g., [BeegMac](https://webtools.southgreen.fr/BrAPI/Beegmac/), [SnpClust](https://github.com/jframi/snpclust), [QBMS](https://github.com/icarda-git/QBMS)). But further use-cases also required Gigwa to be able to consume data from other BrAPI servers, which led to also implement API-client features into the system. Thanks to all this work, a close collaboration was progressively established with the Integrated Breeding Platform team developing the widely used Breeding Management System, that ended up in both applications now being frequently deployed together, Gigwa pulling germplasm or sample metadata from BMS, and BMS displaying Gigwa-hosted genotypes within its own UI.

- Client BrAPI libraries being available for R, community members typically write ad-hoc scripts syndicating data from multiple BrAPI sources (for instance phenotypes from a datasource and genotypes from another) in order to run various kinds of analyses such as GWAS, genomic selection or phylogenetic investigations. As a perspective, we may expect the most generic and widely-used of those pipelines to be at least publicly distributed, and possibly web-interfaced using solutions like R-Shiny in order to provide new, excitingly useful online services, based on Gigwa-hosted data.
+ Client BrAPI libraries being available for R, community members typically write ad-hoc scripts syndicating data from multiple BrAPI sources (for instance phenotypes from a data source and genotypes from another) in order to run various kinds of analyses such as GWAS, genomic selection or phylogenetic investigations. As a perspective, we may expect the most generic and widely-used of those pipelines to be at least publicly distributed, and possibly web-interfaced using solutions like R-Shiny in order to provide new, excitingly useful online services, based on Gigwa-hosted data.
2 changes: 1 addition & 1 deletion content/03.03.06.FAIDARE.md
@@ -4,5 +4,5 @@
FAIDARE (<https://urgi.versailles.inrae.fr/faidare/>) is a data discovery portal providing a biologist friendly search system over a global federation of 33 plant research databases. It allows to identify data resources using a full text approach completed with domain specific filters and to link back to the original database for visualization, analysis and download. For instance, it is possible to search for "wheat drought" then to refine the search to the "Triticum aestivum" taxon and yield component traits such as "Thousand Grain Weight". The indexed data types are very broad and include genomic features, such as genes or transposable elements, selected bibliography, QTL, markers, genetic variation studies, phenomic studies and plant genetic resources ie germplasm. This inclusiveness is achieved thanks to a two stage indexation data model. The most generic one provides basic search functionalities and relies on five fields : name, link back URL, data type, species and exhaustive description. The filtering is directly tied to some of those fields. Therefore, to provide more advanced filtering, FAIDARE is also providing a second stage indexation mechanism by taking advantage of BrAPi endpoints to get more detailed metadata on genotyping and phenotyping studies as well as germplasm. In parallel, FAIDARE provides a pre-visualization of germplasm and studies using dedicated cards.
![Figure FAIDARE Federation](images/Schema_FAIDARE.png){#fig:Schema_FAIDARE width="100%"}
The indexation mechanism relies on a dedicated public software (<https://github.com/elixir-europe/plant-brapi-etl-faidare>) that allows data resources manager to request the indexation of there database using pull requests. This BrAPI client is able to extract data from any BrAPI 1.3 and 1.2 endpoint and development of BrAPI 2.x indexation will be initiated in 2025. Since not all databases are willing to implement BrAPI endpoints, we also provide the possibility to generate metadata as BrAPI json files, hence using the standard as a file exchange format.
- FAIDARE architecture has been designed by elaborating on the GnpIS Software Architecture [@doi:10.34133/2019/1671403]. As a consequence, BrAPI is at the core of its datamodel, and in particular the JSON data files served by the Elasticsearch NoSQL engine are enriched version of the BrAPI JSON files. FAIDARE also includes a BrAPI endpoint that serves all indexed metadata.
+ FAIDARE architecture has been designed by elaborating on the GnpIS Software Architecture [@doi:10.34133/2019/1671403]. As a consequence, BrAPI is at the core of its data model, and in particular the JSON data files served by the Elasticsearch NoSQL engine are enriched version of the BrAPI JSON files. FAIDARE also includes a BrAPI endpoint that serves all indexed metadata.
FAIDARE has been adopted by several communities and in particular in the ELIXIR and EMPHASIS european infrastructures. It is also used by the WheatIS of the Wheat-Initiative. Several databases are added each year to the FAIDARE global federation, allowing to increase both the portal and the BrAPI adoption.
6 changes: 3 additions & 3 deletions content/metadata.yaml
@@ -296,7 +296,7 @@ authors:
orcid: 0000-0002-2177-8781
email: [email protected]
affiliations:
- - 'Institute of Bio- and Geosciences (IBG-4: Bioinformatics), CEPLAS, Forschungszenturm Jülich GmbH, Wilhelm Johnen Straße, 52428 Jülich, Germany'
+ - 'Institute of Bio- and Geosciences (IBG-4: Bioinformatics), CEPLAS, Forschungszentrum Jülich GmbH, Wilhelm Johnen Straße, 52428 Jülich, Germany'
- Bioeconomy Science Center (BioSC), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
corresponding: false
- name: Valentin Guignon
@@ -329,7 +329,7 @@ authors:
orcid: 0000-0002-0177-3887
email: [email protected]
affiliations:
- - VIB AgroIncubator
+ - VIB Agro-Incubator
corresponding: false
- name: Laszlo Lang
initials: LL
@@ -353,5 +353,5 @@ authors:
orcid: 0000-0002-7759-1617
email: [email protected]
affiliations:
- - Boyrce Thompson Institute
+ - Boyce Thompson Institute
corresponding: false
