Add TCGA (GDC) / (ICGC)=PCAWG #146

Open
dtitov opened this issue Apr 3, 2020 · 1 comment

dtitov commented Apr 3, 2020

Sumana's use-case (if it's still relevant).

dtitov self-assigned this Apr 3, 2020

dtitov commented Apr 3, 2020

The GDC and ICGC APIs are not suitable for crawling: neither offers a way to download the metadata in bulk, so it can only be queried through their search endpoints.

PCAWG is better: its metadata can be downloaded (http://pancancer.info/gnos_metadata/latest/). However, the metadata is too detailed (https://drive.google.com/file/d/1K-wRabGt4pIBJqZTPhmC0KIDfm7C1frM/view?usp=sharing) and therefore huge: the raw files are 900 MB of JSON, which turn into 6529004 entries in the DB. TrackFind can’t visualize that amount of data.
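
To illustrate why detailed nested metadata inflates so quickly, here is a rough Python sketch that flattens a JSON document into (path, value) leaves and counts them. It rests on assumptions: that the dump can be read as one JSON document per line, and that a DB "entry" roughly corresponds to one leaf value. Neither is confirmed above, the sample record is made up, and `iter_leaves` / `count_leaves` are hypothetical helpers, not TrackFind code.

```python
import json
from typing import Any, Iterable, Iterator, Tuple


def iter_leaves(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) for every scalar leaf of a nested JSON value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaves(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_leaves(value, f"{path}[{index}]")
    else:
        yield path, node


def count_leaves(json_lines: Iterable[str]) -> int:
    """Count leaves across JSON Lines input (assumed: one document per line)."""
    total = 0
    for line in json_lines:
        line = line.strip()
        if line:
            total += sum(1 for _ in iter_leaves(json.loads(line)))
    return total


# Hypothetical record, not taken from the PCAWG dump.
sample = '{"donor": "D1", "specimens": [{"id": "S1", "files": ["a.bam", "b.bam"]}]}'
print(count_leaves([sample]))
```

Passing an open file handle instead of a list keeps memory use flat, since only one line is parsed at a time. Running this on a subset of the files would give an estimate of the entry count before loading anything into the DB.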
