Follow ups to multi-variant rendering #4712

Open
cmdcolin opened this issue Dec 10, 2024 · 14 comments
Labels
enhancement New feature or request

Comments

@cmdcolin
Collaborator

cmdcolin commented Dec 10, 2024

  • allow deleting rows
  • allow renaming rows
  • add the 'bulk editor' to multi-wiggle
  • add clustering method to multi-wiggle
  • add ability to render large cnv type data or overlapping features
  • add legend
  • add tree rendering from hclust output
  • add ability to customize color
  • add ability to cluster individual haplotype rows from phased VCF
  • configurable colors/color callback
  • display samplesTsv data in the feature details
  • open up feature details from clicking on the multivariant view (?)
  • parse PEDIGREE tags from VCF to generate sample metadata
  • multiple color sidebars a la IGV [1]

[1]

Image

done

  • handle polyploid genotypes (more alleles than the diploid 0/0, 1/0, 1/1) better
  • hide colored boxes
  • add mouseover
  • render phase sets as a random color
  • add ability to configure metadata better for multivariant. currently requires using the bulk editor <-- samplesTsv
@cmdcolin
Collaborator Author

example of showing larger scale SVs in a multi-variant type way, this example with inversion polymorphism (from https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-02919-8/figures/1)

image

@cmdcolin
Collaborator Author

could look into more hprc data from
More here https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/HGSVC3/release/

@taprs

taprs commented Jan 21, 2025

Any workarounds to display polyploid variant calls so far? I picture it to myself as a heatmap with proportions of ALT calls in each cell (e.g. 0/0/0/1 is at 25% on the color scale). Obviously not applicable to multiallelic sites but they are a headache anyway.
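A minimal sketch of the heatmap value described above, assuming the genotype arrives as a plain GT string. `altFraction` is a hypothetical helper, not a JBrowse function, and it counts any non-zero allele as ALT, so multiallelic sites are lumped together:

```typescript
// Hypothetical helper (not part of JBrowse): fraction of called alleles
// in a GT string that are non-reference. Works for any ploidy.
function altFraction(gt: string): number | undefined {
  // Alleles are separated by '/' (unphased) or '|' (phased)
  const alleles = gt.split(/[\/|]/)
  // Drop missing calls ('.') so they do not dilute the proportion
  const called = alleles.filter(a => a !== '.' && a !== '')
  if (called.length === 0) {
    return undefined // nothing called, so there is no meaningful proportion
  }
  // Assumption: every allele index other than '0' counts as ALT
  const altCount = called.filter(a => a !== '0').length
  return altCount / called.length
}

console.log(altFraction('0/0/0/1')) // 0.25
console.log(altFraction('0/1')) // 0.5
console.log(altFraction('./.')) // undefined
```

The returned value could then be mapped onto a color scale, with `undefined` drawn as "no call".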

@cmdcolin
Collaborator Author

cmdcolin commented Jan 21, 2025

> I picture it to myself as a heatmap with proportions of ALT calls in each cell (e.g. 0/0/0/1 is at 25% on the color scale). Obviously not applicable to multiallelic sites but they are a headache anyway.

this is a great idea. I'll see what i could do to try something like that. if you're aware of any good polyploid data to look at, let me know, i think i have some i stumbled on or i can make some synthetic data.

also, indeed, the code has thus far not done any handling of multi-allelic sites either, sticking to just 4 colors: 0/0 (ref, grey), 0/1 (het, teal), 1/1 (hom, blue), and "other" (purple)
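For illustration, the four-color scheme described here could be sketched roughly like this. The names `GenotypeClass`, `genotypeColors` and `classifyGenotype` are made up for this example and the color strings are taken from the comment, not from the JBrowse source:

```typescript
type GenotypeClass = 'ref' | 'het' | 'hom' | 'other'

// Color names follow the comment above; the actual values in JBrowse may differ
const genotypeColors: Record<GenotypeClass, string> = {
  ref: 'grey',
  het: 'teal',
  hom: 'blue',
  other: 'purple',
}

// Hypothetical classifier: only plain biallelic diploid calls get their own
// class, everything else (multi-allelic, polyploid, missing) is 'other'
function classifyGenotype(gt: string): GenotypeClass {
  // Normalize phasing and allele order so '1|0' and '0/1' compare equal
  const key = gt.split(/[\/|]/).sort().join('/')
  switch (key) {
    case '0/0':
      return 'ref'
    case '0/1':
      return 'het'
    case '1/1':
      return 'hom'
    default:
      return 'other'
  }
}

console.log(genotypeColors[classifyGenotype('1|0')]) // teal
console.log(genotypeColors[classifyGenotype('0/0/0/1')]) // purple
```

This makes the limitation visible: a tetraploid call such as 0/0/0/1 falls through to "other" no matter what its dosage is.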

@taprs

taprs commented Jan 22, 2025

Thanks for the quick reply! We have diploid/tetraploid mix data at hand, it's not published yet but we can share it if needed. (Publishing it is exactly why I started looking into the capabilities of Jbrowse2 😉 )

@cmdcolin
Collaborator Author

that's awesome. I believe I saw a talk from your group at PAG just last week! feel free to email if interested, especially if it has mixed ploidy would be curious, but might be able to find synthetic data too

@cmdcolin
Collaborator Author

@taprs I did a little work on trying to more properly handle both phased variants and polyploid in this PR

if you are using the web version, you can get the branch build with e.g. jbrowse create --branch polyploid newinstance

#4795

if you have phased variants, this may be particularly interesting, because it can split each haplotype out into its own row

example with a trio vcf

Screenshot From 2025-01-25 15-03-21
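A rough sketch of the haplotype-splitting idea, assuming each sample's GT string is available. `splitHaplotypes`, the `HaplotypeRow` shape and the `HP0`/`HP1` row suffixes are invented for this example and are not taken from the PR:

```typescript
// Hypothetical row shape: one rendered row per sample or per haplotype
interface HaplotypeRow {
  name: string
  value: string
}

function splitHaplotypes(sample: string, gt: string): HaplotypeRow[] {
  // Only phased genotypes ('|' separator) have a meaningful haplotype order,
  // so unphased calls stay as a single row holding the whole genotype
  if (!gt.includes('|')) {
    return [{ name: sample, value: gt }]
  }
  // One row per haplotype; this also covers phased polyploid calls
  return gt.split('|').map((allele, i) => ({
    name: `${sample} HP${i}`,
    value: allele,
  }))
}

console.log(splitHaplotypes('child', '0|1'))
console.log(splitHaplotypes('child', '0/1'))
```

For a trio VCF with phased diploid calls this would give six rows, two per sample.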

@taprs

taprs commented Jan 27, 2025

Thanks for the news! I set out to test it and caught an error when trying to display a random VCF region in matrix mode (default variant mode is fine, and "regular" multivariant mode does not render tetraploid samples). Here is the stack trace:

TypeError: Cannot read properties of undefined (reading 'includes')
TypeError: Cannot read properties of undefined (reading 'includes')
../../../plugins/variants/src/MultiLinearVariantMatrixRenderer/makeImageData.ts:96:33 (at o ()
../../../plugins/variants/src/MultiLinearVariantMatrixRenderer/LinearVariantMatrixRenderer.ts:23:16 (at)
../../../packages/core/util/offscreenCanvasUtils.tsx:72:26 (at l ()
../../../plugins/variants/src/MultiLinearVariantMatrixRenderer/LinearVariantMatrixRenderer.ts:19:7 (at z.render ()
../../../packages/core/util/index.ts:1011:9 (at async he ()
../../../packages/core/pluggableElementTypes/renderers/ServerSideRendererType.tsx:190:11 (at async z.renderInWorker ()
../../../packages/core/rpc/methods/CoreRender.ts:64:37 (at async A.execute ()
JBrowse 2.18.0

Image

I set up the server like this:

jbrowse create --branch polyploid heatmap_test
jbrowse add-assembly ./NT1_220222.fasta --load symlink -n NT1 --out heatmap_test
jbrowse add-track ./Eur_lyrata_sc1.vcf.gz --load symlink --out heatmap_test
cd heatmap_test
npx serve -S .

I can send you the genome + the VCF file link via email this week. I get no error when I display them using the main branch, but understand that this VCF might be special in a few ways:

  • there are both diploids and tetraploids in the same file
  • all genomic positions are there, including sites with no data or with uniform allelic state
  • some regions might have very little to no data

Cheers,
Nikita

@cmdcolin
Collaborator Author

cmdcolin commented Jan 27, 2025

interesting, i would be curious to see the VCF. that is basically completely failing to parse the genotypes out of the vcf if so.

if you can show just like a single line, i might be able to extrapolate, but potentially might need the whole vcf also

(also big thanks for testing out the pre-release)
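The error message suggests `.includes` was called on a value that turned out to be undefined. As a hedged illustration only (this is not the code in makeImageData.ts, and `getGenotype` and `isPhased` are hypothetical helpers), a genotype lookup that tolerates a missing GT field could look something like:

```typescript
// Hypothetical helper: pull the GT value out of one sample column using the
// FORMAT column's key order. Returns undefined if GT is not present.
function getGenotype(format: string, sampleColumn: string): string | undefined {
  const keys = format.split(':')
  const idx = keys.indexOf('GT')
  if (idx === -1) {
    return undefined
  }
  // The sample column may have fewer fields than FORMAT, which yields undefined
  return sampleColumn.split(':')[idx]
}

// Optional chaining avoids "Cannot read properties of undefined" when the
// genotype could not be parsed
function isPhased(gt: string | undefined): boolean {
  return gt?.includes('|') ?? false
}

console.log(isPhased(getGenotype('GT:DP', '0|1:12'))) // true
console.log(isPhased(getGenotype('DP', '12'))) // false
```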

@taprs

taprs commented Jan 27, 2025

Sure, here are some 20 VCF lines stripped from the region shown in the screenshot.

vcf_sample.txt

@cmdcolin
Collaborator Author

Thanks for posting this @taprs

I couldn't reproduce that error so maybe still gotta look a little more, if you have the full vcf can check that out

Random thing: I noticed that there are a lot of lines with no ALT allele, where the ALT column is just a "."

I have not seen that much before. The VCF spec says that this means the ALT is "MISSING" (section 1.6.1) https://samtools.github.io/hts-specs/VCFv4.5.pdf

ALT — alternate base(s): Comma-separated list of alternate non-reference alleles. These alleles do not have to be called in any of the samples. Each allele in this list must be one of: a non-empty String of bases (A,C,G,T,N; case insensitive); the ‘*’ symbol (allele missing due to overlapping deletion); the MISSING value ‘.’ (no variant); an angle-bracketed ID String (“<ID>”); the unspecified allele “<*>” as described in Section 5.5; or a breakend replacement string as described in Section 5.4. If there are no alternative alleles, then the MISSING value must be used. Tools processing VCF files are not required to preserve case in the allele String, except for IDs, which are case sensitive. (String; no whitespace, commas, or angle-brackets are permitted in the ID String itself)
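In code terms, the quoted rule means a parser should treat "." as an empty allele list rather than as an allele. A minimal sketch, where `parseAlt` is a hypothetical name and not the parser JBrowse actually uses:

```typescript
// Hypothetical helper: turn the raw ALT column into a list of alleles
function parseAlt(altField: string): string[] {
  // '.' is the MISSING value, meaning there are no alternate alleles here
  if (altField === '.') {
    return []
  }
  // Otherwise ALT is a comma-separated list (bases, '*', '<ID>', breakends)
  return altField.split(',')
}

console.log(parseAlt('.')) // []
console.log(parseAlt('A,<DEL>')) // [ 'A', '<DEL>' ]
```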

@cmdcolin
Collaborator Author

@taprs I did an update to the code and it has a number of improvements, potentially a fix for that crash also

@cmdcolin
Collaborator Author

can get it with

jbrowse create --nightly newinstance

... pending a new release soon

@taprs

taprs commented Jan 30, 2025

Yes, the update fixed the VCF display issue I had, thanks a ton! It was weird because the error was gone if I sliced the vcf file...

As for the "missing" ALT values, we use them to have invariant sites present in the VCF and to distinguish them from sites with no data. This is useful for some popgen stuff.
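One way to picture the distinction, as a sketch under assumptions: here a "no data" site is taken to be one where every sample's GT is missing (e.g. `./.`), which is a guess about the encoding and is not stated in this thread. `siteStatus` is a hypothetical helper:

```typescript
type SiteStatus = 'no-data' | 'invariant' | 'variant'

// Hypothetical classifier for one VCF record, given its ALT column and the
// GT strings of all samples
function siteStatus(alt: string, genotypes: string[]): SiteStatus {
  // Assumption: a site has data if at least one allele is called in any sample
  const anyCalled = genotypes.some(gt =>
    gt.split(/[\/|]/).some(allele => allele !== '.'),
  )
  if (!anyCalled) {
    return 'no-data'
  }
  // ALT of '.' with called genotypes means the site was observed but invariant
  return alt === '.' ? 'invariant' : 'variant'
}

console.log(siteStatus('.', ['0/0', '0/0/0/0'])) // invariant
console.log(siteStatus('.', ['./.', './././.'])) // no-data
console.log(siteStatus('T', ['0/1', '0/0/0/1'])) // variant
```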
