Commit

custom dictionary fixed
BrapiCoordinatorSelby committed May 8, 2024
1 parent 153d0f9 commit 31f0880
Showing 8 changed files with 59 additions and 30 deletions.
61 changes: 46 additions & 15 deletions build/assets/custom-dictionary.txt
@@ -2,6 +2,7 @@ personal_ws-1.1 en 22
al
doi
eq
eg
et
github
isbn
@@ -21,22 +22,29 @@ svgs
tbl
unicode
wikidata
BrAPIRelatedTechnicalTermsBrAPIRelatedTechnicalTermsBrAPIRelatedTechnicalTerms
EnglishAndLatinEnglishAndLatinEnglishAndLatinEnglishAndLatinEnglishAndLatin
analytics
aspirational
autocorrelation
backcrossing
BRAVA
BreedBase
centric
CGIAR
ClimMob
CottonGEN
DArTView
FAIDARE
FAO
Fieldbook
backend
centric
dataset
datasets
debuggable
digitalization
explorable
facto
frontend
onwards
scalable
stateful
programmatically
Triticum
aestivum
Vaccinium
BrAPIRelatedTechnicalTermsBrAPIRelatedTechnicalTermsBrAPIRelatedTechnicalTerms
backcrossing
DMS
BrAPI
BrAPP
BrAPPs
@@ -71,12 +79,13 @@ MIAPPE
NoSQL
OAuth
ontologies
pangenome
phenomic
phenomics
phenotypes
Phenotypic
phenotyping
programmatically
phylogenetic
QTL
transcriptomics
transposable
@@ -85,8 +94,6 @@ TSV
UI
VCF
YAML
Triticum
aestivum
BrAPIRelatedToolsBrAPIRelatedToolsBrAPIRelatedToolsBrAPIRelatedTools
BRAVA
Cassavabase
@@ -101,6 +108,7 @@ GmbH
GnpIS
FAIDARE
ImageBreed
ISA
Fieldbook
BeegMac
BIMS
@@ -113,23 +121,31 @@ InterCross
MrBean
MusaBase
OpenSILEX
PHG
PIPPA
QBMS
WIWAM
WebAssembly
Zendro
BrAPIRelatedOrganizationsBrAPIRelatedOrganizationsBrAPIRelatedOrganizations
Agro
BreedBase
Boyce
CGIAR
CIRAD
CottonGEN
EBI
FAO
IBP
ICARDA
Agropolis
Bioversity
BMS
BMS’s
EURISCO
INRAE
Julich
IAVAO
Leafnode
NIFA
Tripal
@@ -165,6 +181,7 @@ Johnen
Strase
Bioeconomy
BioSC
Umea
BrAPIRelatedNamesBrAPIRelatedNamesBrAPIRelatedNamesBrAPIRelatedNames
alshamaa
Asis
@@ -187,6 +204,10 @@ Clarysabel
cpommier
Crimi
Davuluri
Destin
Droesbeke
Erwan
Floch
Feser
feserm
Gouripriya
@@ -198,19 +219,28 @@ Habito
Hallab
Iain
imilne
jlamossweeney
Khaled
koenig
Konig
Lange
langeipk
Laszlo
LzLang
mflores
leetaei
Lopez
Marsella
Mathieu
Matthijs
Mirella
mrouard
Montpellier
Pommier
raabb
Raubach
Rosaceae
Rouard
sebeier
Selby
Sempere
@@ -222,3 +252,4 @@ Tovar
trife
VivianBass
Weise
zrm
2 changes: 1 addition & 1 deletion content/03.01.06.PIPPA.md
@@ -10,4 +10,4 @@ Developed from 2016 onwards, the software features a web interface with function

To share the phenotype data of the experiments linked to publications, an implementation of BrAPI 1.3 was developed on a separate public PIPPA server open to the public, which allowed read only access to the data in a standardized format. This endpoint was registered on [FAIDARE](https://urgi.versailles.inra.fr/faidare/) and allows the data to be found alongside data from other BrAPI endpoints.

As the BrAPI ecosystem has matured, it created a clear path for the development of PIPPA as to how to share data in a manner according to the FAIR principles which are becoming standard in plant research data management best practices. In combination with the support for [MIAPPE](https://www.miappe.org/), these have served as guidelines in the current development, which is focussed on delivering a public BraPI 2.1 endpoint and making more high throughput datasets publicly available via BrAPI.
As the BrAPI ecosystem has matured, it created a clear path for the development of PIPPA as to how to share data in a manner according to the FAIR principles which are becoming standard in plant research data management best practices. In combination with the support for [MIAPPE](https://www.miappe.org/), these have served as guidelines in the current development, which is focused on delivering a public BraPI 2.1 endpoint and making more high throughput datasets publicly available via BrAPI.
2 changes: 1 addition & 1 deletion content/03.02.04.GIGWA.md
@@ -4,4 +4,4 @@ Gigwa is a JEE web application providing means to centralize, share, finely filt

Gigwa developers have been involved in the BrAPI community since 2016 and took part in designing the genotype-related part of the API's specifications. Its first BrAPI-compliant features were designed for compatibility with the Flapjack visualization tool [@doi:10.1093/bioinformatics/btq580] and thus primarily turned it into a BrAPI data source. Consequently, over time, Gigwa being the first and most reliable application implementing BrAPI-Genotyping server calls, local collaborators and even external partners used it as a reference solution to design a number of tools taking advantage of those features (e.g., [BeegMac](https://webtools.southgreen.fr/BrAPI/Beegmac/), [SnpClust](https://github.com/jframi/snpclust), [QBMS](https://github.com/icarda-git/QBMS)). But further use-cases also required Gigwa to be able to consume data from other BrAPI servers, which led to also implement API-client features into the system. Thanks to all this work, a close collaboration was progressively established with the Integrated Breeding Platform team developing the widely used Breeding Management System, that ended up in both applications now being frequently deployed together, Gigwa pulling germplasm or sample metadata from BMS, and BMS displaying Gigwa-hosted genotypes within its own UI.

Client BrAPI libraries being available for R, community members typically write ad-hoc scripts syndicating data from multiple BrAPI sources (for instance phenotypes from a data source and genotypes from another) in order to run various kinds of analyses such as GWAS, genomic selection or phylogenetic investigations. As a perspective, we may expect the most generic and widely-used of those pipelines to be at least publicly distributed, and possibly web-interfaced using solutions like R-Shiny in order to provide new, excitingly useful online services, based on Gigwa-hosted data.
Client BrAPI libraries being available for R, community members typically write adhoc scripts syndicating data from multiple BrAPI sources (for instance phenotypes from a data source and genotypes from another) in order to run various kinds of analyses such as GWAS, genomic selection or phylogenetic investigations. As a perspective, we may expect the most generic and widely-used of those pipelines to be at least publicly distributed, and possibly web-interfaced using solutions like R-Shiny in order to provide new, excitingly useful online services, based on Gigwa-hosted data.
2 changes: 1 addition & 1 deletion content/03.04.02.BMS.md
@@ -4,4 +4,4 @@ The [Breeding Management System (BMS)](https://bmspro.io), developed by the [Int

The [brapi-sync](https://github.com/IntegratedBreedingPlatform/brapi-sync) tool, a significant component of BMS’s BrAPI capabilities, was developed by the IBP and released as a BrAPP for community use. Brapi-sync is designed to enhance collaboration among partner institutes within a network such as Innovation and Plant Breeding in West Africa ([IAVAO](https://www.iavao.org/en)), by enabling the sharing of germplasm and trials across BrAPI-enabled systems. This tool helps overcome traditional barriers to collaboration, ensuring data that was once isolated within specific programs or platforms can now be easily shared, integrated, and synchronized.

Additionally, brapi-sync improves data management by utilizing the externalReferences field to maintain links to the origin IDs of each entity it transmits. This not only retains the original context of the data but also establishes a traceability mechanism for accurate data source attribution and verification. Such practices are crucial for maintaining data integrity and fostering trust among collaborative partners, ensuring access to accurate, reliable, and current information.
Additionally, brapi-sync improves data management by utilizing the External References field to maintain links to the origin IDs of each entity it transmits. This not only retains the original context of the data but also establishes a traceability mechanism for accurate data source attribution and verification. Such practices are crucial for maintaining data integrity and fostering trust among collaborative partners, ensuring access to accurate, reliable, and current information.
2 changes: 1 addition & 1 deletion content/03.05.01.QBMS.md
@@ -2,6 +2,6 @@

Modern breeding programs can utilize data management systems to maintain both phenotypic and genotypic data. Numerous systems are available for adoption. To fully leverage the benefits of digitalization in this ecosystem, breeders need to utilize data from different sources to make efficient data-driven decisions. With increased computational power at their disposal, scientists can construct more advanced analysis pipelines by combining various data sources.

[QBMS](https://icarda-git.github.io/QBMS) [@doi:10.5281/zenodo.10791627] R package eliminates technical barriers scientists experience when using the BrAPI calls in their analysis scripts and pipelines. This barrier arises from the complexity of managing API backend processes, such as authentication, tokens, TCP/IP protocol, JSON format, pagination, stateless calls, asynchronous communication, database IDs, and more. To bridge this gap, we have developed the QBMS R package. This package abstracts the technical complexities, providing breeders (targetted end users) with stateful action verbs/functions familiar to them when navigating their GUI systems. It enables them to query and extract data into a standard data frame structure, consistent with their use of R language, one of the most common statistical tools in the breeding community.
[QBMS](https://icarda-git.github.io/QBMS) [@doi:10.5281/zenodo.10791627] R package eliminates technical barriers scientists experience when using the BrAPI calls in their analysis scripts and pipelines. This barrier arises from the complexity of managing API backend processes, such as authentication, tokens, TCP/IP protocol, JSON format, pagination, stateless calls, asynchronous communication, database IDs, and more. To bridge this gap, we have developed the QBMS R package. This package abstracts the technical complexities, providing breeders (targeted end users) with stateful action verbs/functions familiar to them when navigating their GUI systems. It enables them to query and extract data into a standard data frame structure, consistent with their use of R language, one of the most common statistical tools in the breeding community.

Since its release on the official CRAN repository in October 2021, the QBMS R package has garnered over 9400 downloads. Several tools, such as MrBean, rely on the QBMS package as their source data adapter. Moreover, the community has started building extended solutions on top of it. QBMS can serve as a cornerstone in the breeding modernization revolution by providing access to actionable data and enabling the creation of dashboards to reduce the time between harvest and decision-making for the next breeding cycle.
2 changes: 1 addition & 1 deletion content/03.06.04.Zendro.md
@@ -2,7 +2,7 @@

Using the "Zendro" set of automatic software program-code generators (zendro-dev.github.io) a fully functional, efficient, and cloud-capable BrAPI data-warehouse has been created for the current version of the BrAPI data models. The resulting data-warehouse has two interfaces, one application programming interface implemented in the form of a GraphQL web-server and another intuitive point and click graphical user interface in the browser. Both provide secure access to data read and write functions for all BrAPI data models. These data administration methods comprise create, read, update, and delete (CRUD) functions that are standardized and accept the same parameters for all data models.

While data write access comprises both persisting single or multiple records, data read access is particularly rich in features and includes access to single records referred to by their id and access to multiple records selected by logical filters. In this, multiple records are paginated using the highly efficient cursor based pagination model as proposed in the GraphQL standard. Logical filters allow for exhaustive search queries, whose structure is highly intuitive and based around logical triplets in which a data model field is validated using an operator and a value, e.g. "Study name equals 'xyz'". In this a large collection of operators is available and triplets can be combined to logical search trees using "and" or "or" operators. Searches can be extended over relationships between data models, thus enabling a user to query the warehouse exactly for the data wanted.
While data write access comprises both persisting single or multiple records, data read access is particularly rich in features and includes access to single records referred to by their id and access to multiple records selected by logical filters. In this, multiple records are paginated using the highly efficient cursor based pagination model as proposed in the GraphQL standard. Logical filters allow for exhaustive search queries, whose structure is highly intuitive and based around logical triplets in which a data model field is validated using an operator and a value, e.g. "Study name equals 'my_study'". In this a large collection of operators is available and triplets can be combined to logical search trees using "and" or "or" operators. Searches can be extended over relationships between data models, thus enabling a user to query the warehouse exactly for the data wanted.

Access security is implemented with the OAuth2 user authentication standard (datatracker.ietf.org/doc/html/rfc6749). Authorization is based on user roles and can be configured differently for each single data model read or write function.
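
The logical filter triplets described in the Zendro text above (a field, an operator, and a value, combined into search trees with "and" or "or") can be illustrated with a minimal in-memory sketch. This is not Zendro's GraphQL API: the operator names, the `Triplet` and `Combined` classes, the `matches` and `search` functions, and the sample records are all hypothetical and exist only to show how such a tree could be evaluated.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple, Union

# Hypothetical operator table. The text only says that "a large collection of
# operators is available"; these names are assumptions for illustration.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": lambda field_value, value: field_value == value,
    "ne": lambda field_value, value: field_value != value,
    "gt": lambda field_value, value: field_value > value,
    "lt": lambda field_value, value: field_value < value,
    "contains": lambda field_value, value: value in field_value,
}


@dataclass(frozen=True)
class Triplet:
    """One logical triplet: a data model field checked with an operator and a value."""
    field: str
    operator: str
    value: Any


@dataclass(frozen=True)
class Combined:
    """An inner node of the search tree joining child filters with "and" or "or"."""
    connective: str
    children: Tuple[Any, ...]


Filter = Union[Triplet, Combined]


def matches(record: Dict[str, Any], node: Filter) -> bool:
    """Recursively evaluate a search tree against a single record."""
    if isinstance(node, Triplet):
        return OPERATORS[node.operator](record[node.field], node.value)
    results = (matches(record, child) for child in node.children)
    if node.connective == "and":
        return all(results)
    if node.connective == "or":
        return any(results)
    raise ValueError(f"unknown connective: {node.connective!r}")


def search(records: List[Dict[str, Any]], node: Filter) -> List[Dict[str, Any]]:
    """Return the records selected by the search tree, in their original order."""
    return [record for record in records if matches(record, node)]


# Made-up sample records; field names are illustrative only.
STUDIES = [
    {"studyName": "my_study", "year": 2022},
    {"studyName": "other_study", "year": 2023},
    {"studyName": "third_study", "year": 2019},
]

# "Study name equals 'my_study'" OR (year > 2020 AND study name contains "other")
query = Combined("or", (
    Triplet("studyName", "eq", "my_study"),
    Combined("and", (
        Triplet("year", "gt", 2020),
        Triplet("studyName", "contains", "other"),
    )),
))

selected = search(STUDIES, query)
```

A real warehouse would translate such a tree into a database query and apply pagination on the server rather than filtering a Python list; the sketch only shows the triplet and connective structure.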

4 changes: 1 addition & 3 deletions content/04.discussion.md
@@ -12,10 +12,8 @@

### BrAPI for Breeders and Scientists

The BrAPI technical specification document is meant to be read and used by software developers. However, the purpose of the specification, and the community around it, is to make things faster, easier, and cheaper for the breeders and scientists working to make the world a better place. BrAPI offers a convenient path to automation and data integration for software tools in the breeding domain. All of the example use cases described above can be achieved with manual effort, moving and editing data files by hand. However, when the basic structure and flow of data becomes automated, breeders and scientists can spend less time on data management and more time focussing on the science, doing what they do best. For many, the ultimate goal is the development of a digital ecosystem: a collection of software tools and applications that can all work together seamlessly. In this digital ecosystem, data is collected digitally from the beginning, reducing as much human error as possible. The data is checked by quality control and stored automatically, then can be sent to any internal tool or external lab for further analysis with just the click of a button. This idea might sound too good to be true, but as more tools start sharing a universal data standard, automating data flow becomes easier, and the community gets closer to total interoperability.
The BrAPI technical specification document is meant to be read and used by software developers. However, the purpose of the specification, and the community around it, is to make things faster, easier, and cheaper for the breeders and scientists working to make the world a better place. BrAPI offers a convenient path to automation and data integration for software tools in the breeding domain. All of the example use cases described above can be achieved with manual effort, moving and editing data files by hand. However, when the basic structure and flow of data becomes automated, breeders and scientists can spend less time on data management and more time focusing on the science, doing what they do best. For many, the ultimate goal is the development of a digital ecosystem: a collection of software tools and applications that can all work together seamlessly. In this digital ecosystem, data is collected digitally from the beginning, reducing as much human error as possible. The data is checked by quality control and stored automatically, then can be sent to any internal tool or external lab for further analysis with just the click of a button. This idea might sound too good to be true, but as more tools start sharing a universal data standard, automating data flow becomes easier, and the community gets closer to total interoperability.

### Looking Ahead

The BrAPI specification will continue to grow, enabling more use cases and new types of data. These new use cases might include newer scientific techniques and technologies. Things like drone imaging data, spectroscopy, LIDAR, metabolomics, transcriptomics, high-throughput phenotyping, and machine learning analysis. All of these technologies can open new avenues for research and development of new crop varieties. All of these technologies also generate more data, and require data sharing between different software applications and data repositories. The BrAPI project leadership and community is committed to building the standards to support these new use cases as they arrive and become accepted by the scientific community. In fact, small groups within the BrAPI community have already start building generic data models and communication standards for many of the technologies listed above. These community efforts will eventually become part of the BrAPI standard in a future version of the specification document.


14 changes: 7 additions & 7 deletions content/metadata.yaml
@@ -199,9 +199,9 @@ authors:
email: [email protected]
initials: ELF
affiliations:
- Université Paris-Saclay, INRAE, BioinfOmics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, Bioinformatics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, URGI, Versailles, France
- name: Jospeh Ruff
- name: Joseph Ruff
email: [email protected]
initials: JR
affiliations:
@@ -211,33 +211,33 @@ authors:
initials: MA
orcid: 0000-0001-9356-4072
affiliations:
- Université Paris-Saclay, INRAE, BioinfOmics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, Bioinformatics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, URGI, Versailles, France
- name: Célia Michotey
email: [email protected]
initials: CM
orcid: 0000-0003-1877-1703
affiliations:
- Université Paris-Saclay, INRAE, BioinfOmics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, Bioinformatics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, URGI, Versailles, France
- name: Anne-Francoise Adam-Blondon
email: [email protected]
initials: AFAB
orcid: 0000-0002-3412-9086
affiliations:
- Université Paris-Saclay, INRAE, BioinfOmics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, Bioinformatics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, URGI, Versailles, France
- name: Jeremy Destin
email: [email protected]
initials: JD
affiliations:
- Université Paris-Saclay, INRAE, BioinfOmics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, Bioinformatics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, URGI, Versailles, France
- name: Maud Marty
email: [email protected]
initials: MM
affiliations:
- Université Paris-Saclay, INRAE, BioinfOmics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, Bioinformatics, Plant Bioinformatics Facility, Versailles, France
- Université Paris-Saclay, INRAE, URGI, Versailles, France
- name: Suman Kumar
email: [email protected]
