Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readBamFile with bigFile=TRUE still returns 0 GAlignmentsList #60

Open
sunta3iouxos opened this issue Sep 27, 2024 · 2 comments
Open

readBamFile with bigFile=TRUE still returns 0 GAlignmentsList #60

sunta3iouxos opened this issue Sep 27, 2024 · 2 comments

Comments

@sunta3iouxos
Copy link

Hi there,
Thank youf or this tool, but I am still facing the same issues as a lot of time ago.
My ATACseqQC version is 1.22.0
I am on windows 11
R=4.2.2
bioconductor=3.16.0

Am I missing something?

When I am using the bigFile = ALSE option the pipeline works:

 gal <- readBamFile(file.path("C:/bam/Bowtie2/",bamfiles[i]), tag=tags, which=which,
+                    asMates=TRUE, bigFile=F)
This is a big BAM file. Do you want to set bigFile=TRUE to save memory? (Y/n)? n
> 
> gal
GAlignmentsList object of length 4349701:
[[1]]
GAlignments object with 2 alignments and 18 metadata columns:
      seqnames strand       cigar    qwidth     start       end     width     njunc |                  qname      flag      mapq
         <Rle>  <Rle> <character> <integer> <integer> <integer> <integer> <integer> |            <character> <integer> <integer>
  [1]     chr1      +  1S48M2D52M       101 186863384 186863485       102         0 | A00620:409:HHJ5GDSXC..        99        44
  [2]     chr1      -        101M       101 186863467 186863567       101         0 | A00620:409:HHJ5GDSXC..       147        44
          isize                     seq                    qual     mrnm        AS          MD                    RG        XG
      <integer>          <DNAStringSet>          <PhredQuality> <factor> <integer> <character>           <character> <integer>
  [1]       185 NTGTGCATAC...AGTCTGTATG #FF::FFFFF...FFFFFFFFFF     chr1       182  48^TG12A39 A006200409_226916_S85         2
  [2]      -185 CTTTGTGTGA...TATGCATGTG :FFFFFFFFF...FFFFFFFFFF     chr1       202         101 A006200409_226916_S85         0
             NM        XM        XN        XO        XS        YS          YT
      <integer> <integer> <integer> <integer> <integer> <integer> <character>
  [1]         3         1         0         1        59       202          CP
  [2]         0         0         0         0        60       182          CP
  -------
  seqinfo: 66 sequences from an unspecified genome

[[2]]
GAlignments object with 2 alignments and 18 metadata columns:
      seqnames strand       cigar    qwidth     start       end     width     njunc |                  qname      flag      mapq
         <Rle>  <Rle> <character> <integer> <integer> <integer> <integer> <integer> |            <character> <integer> <integer>
  [1]     chr1      +        101M       101  24612984  24613084       101         0 | A00620:409:HHJ5GDSXC..      1123         1
  [2]     chr1      -        101M       101  24613014  24613114       101         0 | A00620:409:HHJ5GDSXC..      1171         1
          isize                     seq                    qual     mrnm        AS          MD                    RG        XG
      <integer>          <DNAStringSet>          <PhredQuality> <factor> <integer> <character>           <character> <integer>
  [1]       131 AGCTTGTAGG...TAGGGATAAT FFFFFFFFFF...FF::FFFF:F     chr1       202         101 A006200409_226916_S85         0
  [2]      -131 GATTTGCTTT...AGTGTACAGG :FFFFF:F:F...FFFFFFFFFF     chr1       202         101 A006200409_226916_S85         0
             NM        XM        XN        XO        XS        YS          YT
      <integer> <integer> <integer> <integer> <integer> <integer> <character>
  [1]         0         0         0         0       202       202          CP
  [2]         0         0         0         0       202       202          CP
  -------
  seqinfo: 66 sequences from an unspecified genome

[[3]]
GAlignments object with 2 alignments and 18 metadata columns:
      seqnames strand       cigar    qwidth     start       end     width     njunc |                  qname      flag      mapq
         <Rle>  <Rle> <character> <integer> <integer> <integer> <integer> <integer> |            <character> <integer> <integer>
  [1]     chr1      +        101M       101  72245863  72245963       101         0 | A00620:409:HHJ5GDSXC..        99        44
  [2]     chr1      -        101M       101  72245878  72245978       101         0 | A00620:409:HHJ5GDSXC..       147        44
          isize                     seq                    qual     mrnm        AS          MD                    RG        XG
      <integer>          <DNAStringSet>          <PhredQuality> <factor> <integer> <character>           <character> <integer>
  [1]       116 ATATGCAAGC...TCCCAACCCG FFFFFFFFFF...FFFFFFFFFF     chr1       202         101 A006200409_226916_S85         0
  [2]      -116 ACTCATACAC...TCCCTAGATC FFFFFFFFFF...FFFFFFFFFF     chr1       202         101 A006200409_226916_S85         0
             NM        XM        XN        XO        XS        YS          YT
      <integer> <integer> <integer> <integer> <integer> <integer> <character>
  [1]         0         0         0         0      <NA>       202          CP
  [2]         0         0         0         0      <NA>       202          CP
  -------
  seqinfo: 66 sequences from an unspecified genome

...
<4349698 more elements>

When I am using the bigFile=TRUE to speed the process and use less memory I get the following:

> gal <- readBamFile(file.path("C:/bam/Bowtie2/",bamfiles[i]), tag=tags, which=which,
+                    asMates=TRUE, bigFile=T)
> gal
GAlignmentsList object of length 0:
<0 elements>
@jianhong
Copy link
Owner

Yes, it is not read into memory until it is required.

@sunta3iouxos
Copy link
Author

OK, I get the point.
It is a very slow process though. Is there a way to activate R parallelisation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants