Parse forward slashes in gwas catalog #1006

Open
kshefchek opened this issue Jan 4, 2021 · 1 comment

@kshefchek
Contributor

For example, the row:

2020-06-25	25555482	Gelernter J	2014-09-16	Biol Psychiatry	www.ncbi.nlm.nih.gov/pubmed/25555482	Genome-wide association study of nicotine dependence in American populations: identification of novel risk loci in both African-Americans and European-Americans.	Nicotine dependence symptom count	3,529 African American individuals, 4,117 European American individuals	NA	6p21.32	6	32383959	intergenic	TSBP1-AS1			ENSG00000225914			rs35794310/rs147955325/rs11415565-TG	rs35794310/rs147955325/rs11415565	...

Per https://www.ncbi.nlm.nih.gov/snp/rs35794310, rs35794310 was merged into rs11415565,
and per https://www.ncbi.nlm.nih.gov/snp/rs147955325, rs147955325 was also merged into rs11415565.

We should model this similarly to how we model deprecated identifiers in ontologies, but it's unclear from this row alone which identifier is the current one (is it always the last in the list?)
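As a starting point, the slash-delimited values in the row above could be split as in the sketch below. This is only an illustration, not dipper's existing parser: the function names `split_snps_field` and `split_risk_allele_field` are hypothetical. It also makes no guess about which identifier is current.

```python
from typing import List, Optional, Tuple


def split_snps_field(snps: str) -> List[str]:
    """Split a slash-delimited SNPS value into individual rs identifiers."""
    return [part.strip() for part in snps.split("/") if part.strip()]


def split_risk_allele_field(value: str) -> Tuple[List[str], Optional[str]]:
    """Split a value like 'rsA/rsB/rsC-TG' into (['rsA', 'rsB', 'rsC'], 'TG').

    Assumption: the risk allele follows the last '-' in the value.
    If there is no '-', the allele is returned as None.
    """
    ids_part, sep, allele = value.rpartition("-")
    if not sep:
        return split_snps_field(value), None
    return split_snps_field(ids_part), allele


# Values copied from the example row in this issue.
ids, allele = split_risk_allele_field("rs35794310/rs147955325/rs11415565-TG")
```

Here `ids` holds the three rs numbers in their original order and `allele` is `TG`; deciding which of the three is current still needs another source of information.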

See monarch-initiative/monarch-ui#383

@kshefchek
Contributor Author

According to the docs, if MERGED == 1, we should be using the SNP_ID_CURRENT column

Looks like we already have some support for this:
https://github.com/monarch-initiative/dipper/blob/254242e2/dipper/sources/GWASCatalog.py#L450

From the gwas catalog docs:

SNPS*: Strongest SNP; if a haplotype it may include more than one rs number (multiple SNPs comprising the haplotype)

MERGED*: denotes whether the SNP has been merged into a subsequent rs record (0 = no; 1 = yes;)

SNP_ID_CURRENT*: current rs number (will differ from strongest SNP when merged = 1)
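A minimal sketch of the rule described in these column definitions, assuming each row is available as a dict keyed by column name. The function name `current_snp_ids` is hypothetical and this is not the code at the dipper link above. The sample MERGED and SNP_ID_CURRENT values are assumed for illustration, since the example row in this issue is truncated before those columns. The sketch also assumes SNP_ID_CURRENT may appear without the `rs` prefix and adds it when missing.

```python
from typing import Dict, List


def current_snp_ids(row: Dict[str, str]) -> List[str]:
    """Return the rs identifiers to use for a GWAS Catalog row.

    If MERGED == 1 and SNP_ID_CURRENT is populated, prefer that value;
    otherwise fall back to splitting the SNPS column on '/'.
    """
    merged = row.get("MERGED", "").strip()
    current = row.get("SNP_ID_CURRENT", "").strip()
    if merged == "1" and current:
        # Assumption: the column may hold a bare number, so add the prefix.
        if not current.startswith("rs"):
            current = "rs" + current
        return [current]
    return [s.strip() for s in row.get("SNPS", "").split("/") if s.strip()]


# Hypothetical row: SNPS is from this issue, the other two values are assumed.
example_row = {
    "SNPS": "rs35794310/rs147955325/rs11415565",
    "MERGED": "1",
    "SNP_ID_CURRENT": "11415565",
}
resolved = current_snp_ids(example_row)
```

With the assumed values, `resolved` contains only rs11415565, and the older identifiers from SNPS could then be recorded as deprecated in the same way as deprecated ontology identifiers.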
