AnnotSV BioConda Environment: Reference Issues For GRCh38 #255

Open
yr542 opened this issue Sep 6, 2024 · 14 comments
Labels
bug, Docker/Singularity/Bioconda, help wanted

Comments

@yr542

yr542 commented Sep 6, 2024

I used the commands to create a reference for AnnotSV:

wget http://ftp.ensembl.org/pub/release-111/gtf/homo_sapiens/Homo_sapiens.GRCh38.111.chr.gtf.gz
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/gtfToGenePred
chmod +x gtfToGenePred
gunzip Homo_sapiens.GRCh38.111.chr.gtf.gz
./gtfToGenePred -genePredExt -geneNameAsName2 -includeVersion \
    Homo_sapiens.GRCh38.111.chr.gtf refGene.txt
for i in 1 10 11 12 13 14 15 16 17 18 19 2 20 21 22 3 4 5 6 7 8 9 M MT X Y; do \
    awk -v chr=$i '$2 == chr {print $2"\t"$4"\t"$5"\t"$3"\t"$12"\t"$1"\t"$6"\t"$7"\t"$9"\t"$10}' \
    refGene.txt | sed 's/^MT/M/' | sort -k1,1 -k2,2n -k3,3n >> refGene.sorted.tmp.tmp.bed; done
grep -v "none" refGene.sorted.tmp.tmp.bed > refGene.sorted.tmp.bed
rm gtfToGenePred Homo_sapiens.GRCh38.111.chr.gtf refGene.txt refGene.sorted.tmp.tmp.bed
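
Before pointing AnnotSV at a hand-built file, it may help to sanity-check it. The following is only a sketch: `check_bed` is a hypothetical helper, not part of AnnotSV, and it tests just two things that follow from the commands above (10 tab-separated columns per line, and no chromosome still named `MT` after the `sed` step). Passing it does not mean AnnotSV will accept the file.

```shell
#!/usr/bin/env bash
# Sketch: basic sanity checks for the BED file produced by the commands above.
# check_bed is a hypothetical helper name, not an AnnotSV tool.

check_bed() {
    # $1: path to the BED file to inspect
    awk -F'\t' '
        NF != 10   { bad++ }   # the awk step above prints 10 columns per line
        $1 == "MT" { mt++ }    # sed should have renamed MT to M
        END { printf "lines=%d bad_columns=%d mt_rows=%d\n", NR, bad, mt }
    ' "$1"
}

# Usage (file name taken from the commands above):
#   check_bed refGene.sorted.tmp.bed
```

Non-zero `bad_columns` or `mt_rows` would suggest the GTF conversion or the loop did not behave as expected.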

but AnnotSV would not accept it saying:

...checking the annotation data sources (September 05 2024 - 13:23)
############################################################################
"" file doesn't exist
Please check your install - Exit with error.
############################################################################

My commands for AnnotSV in both cases:

AnnotSV -SVinputFile "path/to/my/vcf/file" \
        -outputDir "/full/path/to/my/output/directory" \
        -genomeBuild GRCh38 \
        -annotationsDir "/the/full/path/to/my/reference/directory"

I also tried the workaround using INSTALL_annotations.sh, but I do not believe the annotations are for Ensembl?

@lgmgeo
Owner

lgmgeo commented Sep 9, 2024

I'm quite surprised by your request.
Why did you try to update the refGene.sorted.bed file with the same ENSEMBL version (GRCh38.111) as the one distributed?

Can you show me the output of the following command lines:

ls -l $ANNOTSV/share/AnnotSV/Annotations_Human/Genes/GRCh38/
ls -l $ANNOTSV/share/AnnotSV/Annotations_Human/*

@yr542
Author

yr542 commented Sep 10, 2024

Hello @lgmgeo, I was testing this so that I can modify the reference as needed.

I activated the conda environment and then ran the commands you suggested:

Suggested command 1:

ls -l $ANNOTSV/share/AnnotSV/Annotations_Human/Genes/GRCh38/

Result Of Command 1:

ls: cannot access '/share/AnnotSV/Annotations_Human/Genes/GRCh38/': No such file or directory

Suggested Command 2:

ls -l $ANNOTSV/share/AnnotSV/Annotations_Human/*

Result Of Command 2:

ls: cannot access '/share/AnnotSV/Annotations_Human/*': No such file or directory

This is why I tried building the references myself. I did find a workaround using the INSTALL_annotations.sh command in the BioConda environment, but I believe it uses RefSeq annotations, not Ensembl?

@lgmgeo
Owner

lgmgeo commented Sep 12, 2024

The ANNOTSV environment variable is not defined in your conda environment.
That's why it's not working.
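
A minimal sketch of how one might check this from inside the activated environment. It assumes the directory layout used in the `ls` commands above (`$ANNOTSV/share/AnnotSV/Annotations_Human`). `check_annotsv_env` is a hypothetical helper, not something AnnotSV ships.

```shell
#!/usr/bin/env bash
# Sketch: report whether $ANNOTSV is set and whether the annotations
# directory exists beneath it.
# Assumption: layout is $ANNOTSV/share/AnnotSV/Annotations_Human,
# as in the ls commands earlier in this thread.

check_annotsv_env() {
    if [ -z "${ANNOTSV:-}" ]; then
        echo "ANNOTSV is not set"
        return 1
    fi
    local dir="$ANNOTSV/share/AnnotSV/Annotations_Human"
    if [ ! -d "$dir" ]; then
        echo "ANNOTSV is set, but $dir does not exist"
        return 2
    fi
    echo "Annotations found in $dir"
}

check_annotsv_env || true
```

If the variable turns out to be unset, one option in conda is `conda env config vars set ANNOTSV=<install root>` followed by re-activating the environment. Whether `$CONDA_PREFIX` is the right install root for the Bioconda package is an assumption to verify with `ls "$CONDA_PREFIX/share/AnnotSV"` first.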

lgmgeo added the bug, help wanted, and Docker/Singularity/Bioconda labels on Sep 12, 2024

@yr542
Author

yr542 commented Sep 12, 2024

Thank you for your response. Which path would you suggest for defining the $ANNOTSV variable, given that this is a bioconda environment? Is there an example?

@lgmgeo
Owner

lgmgeo commented Sep 13, 2024

I don't fully understand, because this should work with the -annotationsDir option.

see #184:

can you try binding the directory with your input files and annotations directory to the container? This can be done with --bind (see here for more information about this).
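
For illustration, the binding suggestion could be sketched like this. The helper only prints the command so it can be reviewed before running. The image name, the paths, and the assumption that the `AnnotSV` executable is on the container's PATH are all placeholders, and `annotsv_singularity_cmd` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) a singularity command that bind-mounts the
# directories AnnotSV needs. Image name and all paths are placeholders.

annotsv_singularity_cmd() {
    local image="$1" vcf="$2" out_dir="$3" annot_dir="$4"
    local in_dir
    in_dir=$(dirname "$vcf")
    # --bind accepts a comma-separated list of host paths to mount
    echo singularity exec \
        --bind "$in_dir,$out_dir,$annot_dir" \
        "$image" \
        AnnotSV -SVinputFile "$vcf" \
                -outputDir "$out_dir" \
                -genomeBuild GRCh38 \
                -annotationsDir "$annot_dir"
}

# Usage (hypothetical paths):
#   annotsv_singularity_cmd annotsv.sif /data/in/sample.vcf /data/out /data/annot
```

Binding each host path to the same path inside the container keeps the `-SVinputFile`, `-outputDir` and `-annotationsDir` arguments identical on both sides.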

@yr542
Author

yr542 commented Sep 13, 2024

I am not sure what I did wrong. I had to create the references using the INSTALL_annotations.sh command because:

  1. there was no path for $ANNOTSV,
  2. I did try the binding with Singularity, but it did not work,
  3. that rendered AnnotSV unusable via Singularity, so I am now using bioconda.

@lgmgeo
Owner

lgmgeo commented Sep 16, 2024

Sorry, I'm definitely not an expert with singularity or bioconda.

@nvnieuwk, can you help @yr542 ?

@nvnieuwk
Contributor

Have you tried installing annotsv in a fresh conda environment?

@yr542
Copy link
Author

yr542 commented Sep 23, 2024

@nvnieuwk yes I have used a fresh conda environment. The issue still remains.

@nvnieuwk
Contributor

Strange :/ Can you give me some more information on your system? What OS are you using? What conda/mamba version are you using?

@yr542
Author

yr542 commented Sep 24, 2024

@nvnieuwk

### Basic Information:

System version: Debian 10
Conda Version: 23.11.0
Python Version: 3.11.5

@nvnieuwk
Contributor

Hmm, all looks fine :/ It's very weird that you are getting these issues. I usually run the tool on an HPC too, with singularity, and have never encountered any issues. I'm not an expert on these systems either, so I'm not sure my suggestions will help...

Can you try downloading the annotations on another system and rsyncing them to your HPC? Maybe this can help?
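
If the annotations are copied between systems, one way to confirm the copy is complete is to compare checksum manifests from both sides. This is only a sketch: `make_manifest` is a hypothetical helper, and it assumes `sha256sum` is available on both machines.

```shell
#!/usr/bin/env bash
# Sketch: build a sorted checksum manifest of every file under a directory,
# so manifests from two hosts can be compared with diff.
# Assumption: sha256sum (GNU coreutils) is installed on both systems.

make_manifest() {
    # $1: annotations directory. cd into it so the paths in the manifest
    # are relative and therefore comparable across hosts.
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k2 )
}

# Usage (hypothetical paths):
#   make_manifest /path/to/annotations > local.manifest      # on the source system
#   make_manifest /path/to/annotations > hpc.manifest        # on the HPC
#   diff local.manifest hpc.manifest                         # empty diff = identical trees
```

A non-empty diff would point at missing or truncated files rather than at the container setup.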

@yr542
Author

yr542 commented Oct 9, 2024

@nvnieuwk I have tried this before; regrettably, it didn't work.

@nvnieuwk
Contributor

Then I'm not really sure this is something I can help you with :/ It doesn't seem like a singularity error. @lgmgeo, could it be a bug in the INSTALL_annotations.sh script that downloads only part of the annotations, or an issue with the source of the annotations?
