This repository has been archived by the owner on Jan 25, 2023. It is now read-only.

Clarify callCount and add sampleTotalCount (issue #237) #258

Open
wants to merge 1 commit into
base: develop
from

Conversation

sdelatorrep
Contributor

No description provided.

@mbaudis
Member

mbaudis commented Jan 17, 2019

@sdelatorrep Thanks for starting this - and driving a great topic for standardisation & documentation!

I have problems mostly with the variantCount definition - but this may be me, coming from a "queries match multiple variants" background.

Generally, I do not think that those numbers have to mimic anything VCF, but rather that they can be computed from VCF numbers.

So:

  • variantCount should IMO be the number of distinct variants matched by a query. For most precise SNV queries, this would be just 1. However, for ranges/wildcards, this could be any number.
  • callCount doesn't make sense as the basic no. of alleles; it should (logically) be the number of times the queried variant(s) has/have been observed in the dataset, i.e. the sum 2*n(1/1) + 1*n(1/0) + 1*n(0/1) ... However, this isn't of great interest biologically (IMO), since you cannot decompose it into individual contributions from homozygous and heterozygous ... matches. A decomposition into the different call combinations (1/0, 1/1 ...) would be useful, but I wouldn't know how to do this w/o running into a "if we can do this, we can ..." scenario.
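
To make the two definitions in the list above concrete, here is a minimal Python sketch. It is only an illustration, not part of the Beacon specification: the `matches` structure, the variant identifiers and the function names are hypothetical, and it assumes genotypes are VCF-style strings in which allele `1` is the queried variant.

```python
import re


def alt_allele_count(genotype, alt="1"):
    """Count how often the alt allele appears in one genotype, e.g. '0/1' or '1|1'."""
    return re.split(r"[/|]", genotype).count(alt)


def count_distinct_variants(matches):
    """Number of distinct variants matched by a query (one key per variant)."""
    return len(matches)


def count_allele_observations(matches, alt="1"):
    """Sum of alt-allele observations: 2*n(1/1) + 1*n(1/0) + 1*n(0/1)."""
    return sum(
        alt_allele_count(gt, alt)
        for genotypes in matches.values()
        for gt in genotypes
    )


# Hypothetical result of a range query that matched two variants.
matches = {
    "1-100-A-T": ["1/1", "0/1", "0/0"],
    "1-105-G-C": ["1/0", "1/1"],
}

print(count_distinct_variants(matches))    # 2
print(count_allele_observations(matches))  # 6
```

As noted above, the single summed number hides how many of the observations come from homozygous versus heterozygous samples.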

So maybe we need to get back to the drawing board for "meaningful quantitative Beacon responses"?

@jrambla jrambla modified the milestones: spec 1.1.0, spec 1.1.1 Jan 28, 2019
@jrambla
Collaborator

jrambla commented Mar 21, 2019

Please look at the presentation I've circulated; it is based on the wiki page that Sabela created, but uses a scaled-down example.
For variantCount in wildcard/"multiple fuzzy variants" queries, I would like to expand on that in Beacon 2.0. For now, I guess that if we consider "any match", we should return them as if they were just "one"; if someone is interested in specific variants, they can run that specific query. If someone wants to know "which are the different variants", I believe we should refer them to the "evidence" Beacon, including the Variant SchemaBlock stuff.
I also believe that adding the karyotype details (1/1, 1/0...) could make sense, so we could add counts such as variantHomozygotesCount and variantHeterozygoteCount.
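
A rough sketch of how such per-zygosity counts might be derived, assuming VCF-style genotype strings where allele `1` is the queried variant. The function name `zygosity_counts` is hypothetical; only the two field names come from the comment above, and nothing here is defined by the spec.

```python
import re


def zygosity_counts(genotypes, alt="1"):
    """Split samples carrying the alt allele into homozygous and heterozygous."""
    counts = {"variantHomozygotesCount": 0, "variantHeterozygoteCount": 0}
    for gt in genotypes:
        alleles = re.split(r"[/|]", gt)
        hits = alleles.count(alt)
        if hits == 0:
            continue  # sample does not carry the variant
        if hits == len(alleles):
            # Assumption: single-allele calls such as '1' are counted here too.
            counts["variantHomozygotesCount"] += 1
        else:
            counts["variantHeterozygoteCount"] += 1
    return counts


print(zygosity_counts(["1/1", "0/1", "1/0", "0/0", "1|1"]))
# {'variantHomozygotesCount': 2, 'variantHeterozygoteCount': 2}
```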


3 participants