Transcript fusion r. description #594

Open
leicray opened this issue Mar 18, 2024 · 3 comments

Comments

@leicray
Contributor

leicray commented Mar 18, 2024

Describe the bug
A user has submitted the variant description NM_001987.5:c.?_33::NM_021947.3:c.-4_? which fails to validate and which triggers an ERROR email to the sysadmins.

Transforming the description to NM_001987.5:c.?_33::NM_021947.3:c.-4_? also fails, this time without triggering the ERROR email but with the warning NM_001987.5:c.?_33::NM_021947.3:c.-4_?: char 14: expected one of (, *, or a digit. This is strange, as character 14 is the dot immediately after the c.
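
One possible reading of the "char 14" report, which is an assumption and not something confirmed in this thread, is that the parser counts characters from zero rather than from one. The sketch below is plain Python and uses no VariantValidator code; `char_at` is a hypothetical helper that shows which character each counting convention selects.

```python
# Hypothetical helper, for illustration only: it is not part of any parser.
description = "NM_001987.5:c.?_33::NM_021947.3:c.-4_?"


def char_at(text, position, one_based):
    """Return the character at `position` under the chosen counting convention."""
    index = position - 1 if one_based else position
    return text[index]


one_based = char_at(description, 14, one_based=True)    # counting from 1
zero_based = char_at(description, 14, one_based=False)  # counting from 0

print(repr(one_based), repr(zero_based))  # prints '.' '?'
```

Counting from zero, position 14 is the first ?, which would be consistent with a message asking for (, * or a digit where a position is expected.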

The user appears to want to express a presumed gene-fusion variant at the RNA level, which might be reasonable if the RNA has been analysed. However, the presence of two instances of ? suggests that RNA sequencing has not been carried out.

Taken at face value, the variant description suggests a fusion such that the 5' end of NM_001987.5 up to nucleotide 33 is fused to the 5' UTR of NM_021947.3 immediately before nucleotide -4.
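
The face-value reading can be sketched in code. This is a minimal illustration in plain Python, not VariantValidator's parser: `split_fusion` and the `PART` pattern are hypothetical names, and the pattern is assumed to cover only simple `start_end` ranges with a `c.` or `r.` prefix and accessions shaped like `NM_001987.5`.

```python
import re

# Hypothetical pattern: accession, reference type (c or r), then start_end.
PART = re.compile(
    r"(?P<accession>[A-Z]{2}_\d+\.\d+):(?P<kind>[cr])\.(?P<start>[^_]+)_(?P<end>[^_]+)"
)


def split_fusion(description):
    """Split a 'five_prime::three_prime' description into two dicts of fields."""
    parts = description.split("::")
    if len(parts) != 2:
        raise ValueError("expected exactly one '::' separator")
    parsed = []
    for part in parts:
        match = PART.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognised fusion partner: {part!r}")
        parsed.append(match.groupdict())
    return parsed[0], parsed[1]


five_prime, three_prime = split_fusion("NM_001987.5:c.?_33::NM_021947.3:c.-4_?")

# The known breakpoints are the end of the first range and the start of the second.
print(five_prime["accession"], five_prime["end"])     # NM_001987.5 33
print(three_prime["accession"], three_prime["start"])  # NM_021947.3 -4
```

The unknown outer boundaries stay as the literal ? strings, so a caller can tell a known breakpoint from one that has not been determined.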

@ifokkema
Collaborator

I don't see the difference between the first and second variant descriptions; according to my diff, they are the same.

The page on RNA fusion shows the same format, by the way;

when only the sequence adjacency and not the entire transcript has been analysed, the format NM_152263.2:r.?_775::NM_002609.3:r.1580_? should be used.

@leicray
Contributor Author

leicray commented Mar 18, 2024

Mea culpa. I muddled the original description during editing and then pasted the c. version into the first instance in the message when trying to fix it.

It looks like the syntax is correct but the variant description is not being parsed correctly by VV.

@ifokkema
Collaborator

No worries! Interestingly, though, it seems the HGVS nomenclature website is missing information on what to do on the protein level. As such, I'm not sure what VV can do beyond checking if the given positions (NM_001987.5:c.33 and NM_021947.3:c.-4) exist. There seems to be no protein-level annotation for this... unless I'm not looking well enough. We can obviously make something up, but I doubt that's the idea here, especially since this notation doesn't even tell us whether the adjacent nucleotides NM_001987.5:c.32 and NM_021947.3:c.-3 have been sequenced. I'm also not sure what to do with frameshifts, etc.
