GTF file not found with CIRIquant #155

Open
ZabalaAitor opened this issue Jun 27, 2024 · 4 comments · May be fixed by #158
Labels
bug Something isn't working

Comments

@ZabalaAitor

Description of the bug

Hello,

I recently encountered an error while running CIRIquant when providing a specific GTF file. Other tools, such as circRNA_finder, run without any issues and perform the annotation correctly.

Thank you very much for your time and assistance in resolving this issue.

Best regards,

Aitor Zabala

Command used and terminal output

nextflow pull nf-core/circrna

nextflow run nf-core/circRNA \
	-r dev \
	-profile apptainer \
	--input /data/azabala/validation_GTF/data/samplesheet_eGenomes.csv \
	--phenotype /data/azabala/validation_GTF/data/phenotype_eGenomes.csv \
	--module circrna_discovery \
	--outdir /scratch/azabala/validation_GTF/eGenomes \
	--tool ciriquant,circrna_finder \
	--max_cpus 24 \
	--max_memory 256GB \
	-w /scratch/azabala/validation_GTF/eGenomes/work_eGenomes \
	--genome GRCh38 \
	--gtf /data/azabala/gtf/eGenomes/genes.gtf \
	--save_reference false \
	-resume

Core Nextflow options
  revision       : dev
  runName        : evil_mclean
  containerEngine: apptainer
  launchDir      : /scratch/azabala/validation_GTF
  workDir        : /scratch/azabala/validation_GTF/eGenomes/work_eGenomes
  projectDir     : /home/azabala/.nextflow/assets/nf-core/circRNA
  userName       : azabala
  profile        : apptainer
  configFiles    : 

Input/output options
  input          : /data/azabala/validation_GTF/data/samplesheet_eGenomes.csv
  outdir         : /scratch/azabala/validation_GTF/eGenomes
  phenotype      : /data/azabala/validation_GTF/data/phenotype_eGenomes.csv

Pipeline Options
  tool           : ciriquant,circrna_finder

Reference genome options
  save_reference : false
  genome         : GRCh38
  fasta          : s3://ngi-igenomes/igenomes//Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa
  gtf            : /data/azabala/gtf/eGenomes/genes.gtf
  mature         : s3://ngi-igenomes/igenomes//Homo_sapiens/NCBI/GRCh38/Annotation/SmallRNA/mature.fa

Max job request options
  max_cpus       : 24
  max_memory     : 256GB

[41/575925] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_AGGT_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
[30/e51473] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
[8a/b210b2] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (introns_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
[b2/dfdd4b] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (introns_AGGT_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_AGGT_eGenomes)'

Caused by:
  Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_AGGT_eGenomes)` terminated with an error exit status (1)

Command executed:

  CIRIquant \
      -t 24 \
      -1 exons_AGGT_eGenomes_1_val_1.fq.gz \
      -2 exons_AGGT_eGenomes_2_val_2.fq.gz \
      --config travis.yml \
      --no-gene \
      -o exons_AGGT_eGenomes \
      -p exons_AGGT_eGenomes
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT":
      bwa: $(echo $(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*$//')
      ciriquant : $(echo $(CIRIquant --version 2>&1) | sed 's/CIRIquant //g' )
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
      stringtie: $(stringtie --version 2>&1)
      hisat2: 2.1.0
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/usr/local/bin/CIRIquant", line 10, in <module>
      sys.exit(main())
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
      config = check_config(check_file(args.config_file))
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 91, in check_config
      globals()[i.upper()] = check_file(config['reference'][i])
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
      raise ConfigError('File: {}, not found'.format(file_name))
  CIRIquant.utils.ConfigError: File: /data/azabala/gtf/eGenomes/genes.gtf, not found

Work dir:
  /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/a4/1ad695fb7ab3bc5077c78c8686692a

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

Relevant files

travis.yaml
name: ciriquant
tools:
  bwa: /usr/local/bin/bwa
  hisat2: /usr/local/bin/hisat2
  stringtie: /usr/local/bin/stringtie
  samtools: /usr/local/bin/samtools

reference:
  fasta: /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/stage-5e8255a4-53a0-4b1c-8881-9481ba39060a/1c/522d8e9bfc6e1560dad67f3afd1164/genome.fa
  gtf: /data/azabala/gtf/eGenomes/genes.gtf
  bwa_index: /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/c3/0410f7c6a5f85b8d7b25e17307668d/bwa/genome
  hisat_index: /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/11/08674c9acbe0f7ebc38718c559e64a/hisat2/genome
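One detail in the config stands out: of the four reference entries, only gtf points outside the Nextflow work directory. The Python 3 sketch below makes that comparison explicit. The helper name paths_outside is hypothetical, and the paths are copied from the config and the workDir value reported in this issue. It only compares path strings and does not touch the file system, so it says nothing about what the container can actually see.

```python
from pathlib import PurePosixPath


def paths_outside(reference, work_dir):
    """Return the keys whose path does not lie under work_dir (hypothetical helper)."""
    root = PurePosixPath(work_dir)
    outside = []
    for key, value in reference.items():
        try:
            # relative_to raises ValueError when value is not below root
            PurePosixPath(value).relative_to(root)
        except ValueError:
            outside.append(key)
    return outside


# Values copied from the travis.yaml and workDir shown in this issue
work_dir = "/scratch/azabala/validation_GTF/eGenomes/work_eGenomes"
reference = {
    "fasta": work_dir + "/stage-5e8255a4-53a0-4b1c-8881-9481ba39060a/1c/522d8e9bfc6e1560dad67f3afd1164/genome.fa",
    "gtf": "/data/azabala/gtf/eGenomes/genes.gtf",
    "bwa_index": work_dir + "/c3/0410f7c6a5f85b8d7b25e17307668d/bwa/genome",
    "hisat_index": work_dir + "/11/08674c9acbe0f7ebc38718c559e64a/hisat2/genome",
}

print(paths_outside(reference, work_dir))  # ['gtf']
```

Whether this difference is the cause depends on which host directories the container engine binds at runtime, which is not shown in the output here.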

command.log
Traceback (most recent call last):
  File "/usr/local/bin/CIRIquant", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
    config = check_config(check_file(args.config_file))
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 91, in check_config
    globals()[i.upper()] = check_file(config['reference'][i])
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
    raise ConfigError('File: {}, not found'.format(file_name))
CIRIquant.utils.ConfigError: File: /data/azabala/gtf/eGenomes/genes.gtf, not found

System information

Nextflow: 23.04.2
Hardware: HPC
Executor: slurm
Container: Apptainer
OS: Linux
nf-core/circrna: dev

ZabalaAitor added the bug label Jun 27, 2024
@nictru
Collaborator

nictru commented Jun 27, 2024

Hey, looks weird to me, I will need some help with debugging this.

Could you try switching to /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/a4/1ad695fb7ab3bc5077c78c8686692a and run bash .command.run?

It would be interesting to see whether the error also occurs when the script is executed this way.

@ZabalaAitor
Author

bash .command.run

Traceback (most recent call last):
  File "/usr/local/bin/CIRIquant", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
    config = check_config(check_file(args.config_file))
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 91, in check_config
    globals()[i.upper()] = check_file(config['reference'][i])
  File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
    raise ConfigError('File: {}, not found'.format(file_name))
CIRIquant.utils.ConfigError: File: /data/azabala/gtf/eGenomes/genes.gtf, not found

@nictru
Collaborator

nictru commented Jun 27, 2024

This is really strange. If the file did not exist, the pipeline should already have failed at an earlier step. CIRIquant's internal check for the file's existence is done in a pretty standard way, so I would say a problem within the tool is unlikely. The only remaining explanation I see is that the GTF file is not properly mounted into the container at runtime.
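To illustrate the kind of check being discussed, here is a minimal Python 3 sketch of an existence test. It is an assumption modelled on the function name, exception name, and error message visible in the traceback; it is not CIRIquant's actual source (which runs under Python 2.7 in this container).

```python
import os
import tempfile


class ConfigError(Exception):
    """Stand-in for the exception named in the traceback."""


def check_file(file_name):
    # Assumed logic: a plain existence test on the path as given
    if os.path.exists(file_name):
        return os.path.abspath(file_name)
    raise ConfigError('File: {}, not found'.format(file_name))


# A file that exists in the current environment passes the check
with tempfile.NamedTemporaryFile(suffix=".gtf") as handle:
    print(check_file(handle.name))

# A path that is not visible in the current environment fails it
try:
    check_file("/no/such/dir/genes.gtf")
except ConfigError as error:
    print(error)  # File: /no/such/dir/genes.gtf, not found
```

A check like this only sees the file system of the process that runs it. Executed inside a container, it reports a host file as missing whenever the file's directory is not bound into the container, even though the same path resolves fine on the host.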

Could you try the following:

  1. Copy the GTF file into the working directory /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/a4/1ad695fb7ab3bc5077c78c8686692a
  2. Change the .run.sh, replacing the /data/azabala/gtf/eGenomes/genes.gtf with genes.gtf
  3. Run bash .command.run again

Please let me know what happens.

It would also be nice if you could attach the .command.run file here.
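The first two steps listed above could be sketched in Python 3 roughly as follows. The function name localize_gtf and its parameters are hypothetical. Which file in the work directory holds the absolute GTF path is left as a parameter, since that has to be checked by hand (in this issue the path appears in the CIRIquant config passed via --config).

```python
import shutil
from pathlib import Path


def localize_gtf(work_dir, gtf_path, config_name):
    """Copy the GTF into work_dir and point config_name at the local copy.

    Hypothetical helper: config_name is whichever file in work_dir
    contains the absolute GTF path.
    """
    work = Path(work_dir)
    local_name = Path(gtf_path).name

    # Step 1: copy the GTF next to the task's other inputs
    shutil.copy(str(gtf_path), str(work / local_name))

    # Step 2: replace the absolute path with the bare file name
    config = work / config_name
    text = config.read_text()
    if config.is_symlink():
        # Staged inputs are often symlinks; swap in a real file so the
        # original target is left unmodified
        config.unlink()
    config.write_text(text.replace(str(gtf_path), local_name))
    return local_name


# Example call with the paths from this issue (not executed here):
# localize_gtf(
#     "/scratch/azabala/validation_GTF/eGenomes/work_eGenomes/a4/1ad695fb7ab3bc5077c78c8686692a",
#     "/data/azabala/gtf/eGenomes/genes.gtf",
#     "travis.yml",
# )
```

Step 3 stays manual: run bash .command.run in the work directory afterwards and compare the result with the earlier traceback.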

@ZabalaAitor
Author

ZabalaAitor commented Jul 1, 2024

Now it seems that CIRIquant is able to read the GTF file. I am attaching the following files for further inspection in #155.tar.gz:

  • command.run
  • command.err (the lines obtained in the terminal)
  • exons_AGGT_eGenomes (output)

Please note that .command.run and .command.err may appear as hidden files.

nictru added a commit that referenced this issue Jul 12, 2024
nictru linked a pull request Jul 12, 2024 that will close this issue
nictru added a commit that referenced this issue Jul 15, 2024