long read QC #57
Comments
What sort of file do you have? Is it a valid SAM file?
…On Feb 22, 2018 10:58 PM, "ehhill" ***@***.***> wrote:
Hi there,
I have long read RNA-seq data from the minion platform. I'm trying to plot
gene body coverage of these reads. Is there a reason this data wouldn't be
compatible with the QoRTs java application? When I try to run it I get this
error message:
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM
validation error: ERROR: Record 1, Read name 1_323911, Zero-length read
without FZ, CS or CQ tag
Cheers,
Erin
I've tried SAM/BAM files generated from both gmap and minimap2 (just the generic SAM file format output by these programs). I get the same error message for both.
Can you send the first read line of one of these files?
The SAM file parser comes from the samtools jdk, and that is what's
throwing an error.
Hi, this is the first entry from a minimap2 file.
b66c0d3f-f3e9-46d4-9556-90424e2ca7cb 4 * 0 0 * * 0 0 TTGTGCTTTTCATTTATTCCTCGGCTGGGTTGTTTAGCATTCCCTGCTCTTTTTAAAACGTCATTGTTTATCCGGCTTGTCCGCAGCGGCTGGCTTTTCATCCATACCCAAACGCCTATCACCGTATGTCATCTCGTCATTTCCTGGTATCATTCTGACGTACTCGGCGTGCCATGCGGCACTTCCAGTGGCGTCATCTCCACAAGGCTCCATTTCCCTTTGCCAGGTAGGCAATCACCGTCGCCATCATCTCATACCCTCGATATGCAAAAGCATCGGGCTCGTTGCATTCGCCTCTGCCCATTCGATCATCAGGTCATGTGTACGCCCACCGGCGCTGGCTGTTCGGCATTCGACCGCTCGGCATATCTCGGCTCGACCAATCGGTGCTGGCCACATCTGCCGCCGCAATTATATGGTGCTGCGGTCTGGCCGCG
It looks like this is an unaligned SAM file. QoRTs only works on aligned
data.
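The aligned-versus-unaligned distinction made here is recorded in the SAM FLAG column: bit 0x4 marks a read as unmapped, which matches the quoted record (FLAG 4, reference name `*`, position 0). The following is a minimal sketch for counting mapped and unmapped records, assuming a plain-text SAM file with tab-separated fields. The function name `count_mapped` and the two example records are made up for illustration and are not part of QoRTs.

```python
import io

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def count_mapped(sam_lines):
    """Return (mapped, unmapped) record counts for an iterable of SAM text lines."""
    mapped = unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and @-prefixed header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 is the bitwise FLAG
        if flag & UNMAPPED_FLAG:
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped


# Hypothetical two-record file: one unmapped read, one reverse-strand alignment.
example = io.StringIO(
    "@HD\tVN:1.6\n"
    "read_a\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\n"
    "read_b\t16\tchr1\t100\t40\t4M\t*\t0\t0\tACGT\t*\n"
)
print(count_mapped(example))
```

To check a real file, pass an open file handle, for example `count_mapped(open("alignments.sam"))`, where the file name is a placeholder. A file from an aligner can legitimately contain some unmapped records, so a single FLAG 4 entry at the top does not by itself show that the whole file is unaligned.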
…On Feb 26, 2018 8:04 PM, "ehhill" ***@***.***> wrote:
Hi, this is the first entry from a minimap2 file.
b66c0d3f-f3e9-46d4-9556-90424e2ca7cb 4 * 0 0 * * 0 0
TTGTGCTTTTCATTTATTCCTCGGCTGGGTTGTTTAGCATTCCCTGCTCTTTTTAAAACG
TCATTGTTTATCCGGCTTGTCCGCAGCGGCTGGCTTTTCATCCATACCCAAACGCCTATC
ACCGTATGTCATCTCGTCATTTCCTGGTATCATTCTGACGTACTCGGCGTGCCATGCGGC
ACTTCCAGTGGCGTCATCTCCACAAGGCTCCATTTCCCTTTGCCAGGTAGGCAATCACCG
TCGCCATCATCTCATACCCTCGATATGCAAAAGCATCGGGCTCGTTGCATTCGCCTCTGC
CCATTCGATCATCAGGTCATGTGTACGCCCACCGGCGCTGGCTGTTCGGCATTCGACCGC
TCGGCATATCTCGGCTCGACCAATCGGTGCTGGCCACATCTGCCGCCGCAATTATATGGTGCTGCGGTCTGGCCGCG
$))($%-130%#2
,%'/-('#$%%%%,,+..,12)'',&3)-)((#%'&(++)'++1'&'()%''%*11-23*(*&*
++,9'#%$%%(((#%%%%&*)''&2.,,)5,+))&'$*'%&*'"(&#$%*&-,'))()&,/'01,1)'.(
*&'&$''&$+$%$&$+,,+-))'&-,'((,+).$(-+:.'.:-/+(-%+$%%('#%'')%&-*/)/8+2-&+
0(%0(&,*)*'%%'/-3(,*&-&%*)%)(-00%$$'#)-)()..+&&$&
*+0-&+-11,,/1.,).)'('/./'55,+*)./-+060)&'&*0-))*)2(*)'&%$($)('$<(*+/,.'
*''&#'')*)('/:/.'00%(&'$//'('(&)+',*/7:;:7-(2**(*&()*''&%'+*('',5*+($%''3(
*'))+-($&&#&)*((%)*(&&%&
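That's odd, it should be aligned to the genome; it was from a minimap2 spliced alignment to the genome. Regardless, I've tried other SAM files from gmap, and these throw an error as well. The error output and an entry from that SAM file are quoted in full in the reply below. Thanks for your help.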
Hmm. The gmap SAM file looks ok.
Can you give me the command line used with that gmap run?
…On Feb 26, 2018 8:17 PM, "ehhill" ***@***.***> wrote:
That's odd, it should be aligned to the genome; it was from a minimap2
spliced alignment to the genome.
Regardless, I've tried other sam files from gmap and these throw this
error:
Error info:
Exception in thread "main" java.lang.UnsupportedOperationException: empty.min
at scala.collection.TraversableOnce.min(TraversableOnce.scala:222)
at scala.collection.TraversableOnce.min$(TraversableOnce.scala:220)
at scala.collection.mutable.ArrayOps$ofByte.min(ArrayOps.scala:203)
at internalUtils.commonSeqUtils$.$anonfun$initSamRecordIterator$3(commonSeqUtils.scala:710)
at internalUtils.commonSeqUtils$.$anonfun$initSamRecordIterator$3$adapted(commonSeqUtils.scala:709)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.collection.TraversableLike.map(TraversableLike.scala:234)
at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
at scala.collection.immutable.List.map(List.scala:295)
at internalUtils.commonSeqUtils$.initSamRecordIterator(commonSeqUtils.scala:709)
at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1020)
at qcUtils.runAllQC$.run(runAllQC.scala:960)
at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:672)
at runner.runner$.main(runner.scala:97)
at runner.runner.main(runner.scala)
Here is an entry in this sam file.
1_377103 16 CS10_Chromosome_05 746131 40 12S5M1I7M4D54M1I7M1I5M2D9M24S * 0 0 CTGATTCGCCGTGACTCTTTGCTTAGTTCATGCAGCCCATGCTCACGTCATATCATGAGCTCTGAATCCAATAAGAGATGGAACCGAGTTGAATGTACTCCGGAACCAGCCACGGCAAGCACATTT * MD:Z:6A5^GTTT20T16C6C1C1GA11A4^GT2A6 NH:i:1 HI:i:1 NM:i:18 SM:i:40 XQ:i:40 X2:i:0 XO:Z:UU XS:A:- XG:Z:M
Thanks for your help.
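The trace above fails in `ArrayOps$ofByte.min` with `empty.min` inside `initSamRecordIterator`, and the quoted gmap record has `*` in the QUAL column, meaning no base qualities are stored. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that a minimum is being taken over an empty base-quality array. The sketch below only tests whether a SAM file contains such records. It assumes plain-text, tab-separated SAM input; the name `missing_quality_reads` and the sample records are hypothetical.

```python
def missing_quality_reads(sam_lines, limit=10):
    """Return up to `limit` read names whose QUAL field (column 11) is '*'."""
    names = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment line has at least 11 mandatory columns; QUAL is the 11th.
        if len(fields) >= 11 and fields[10] == "*":
            names.append(fields[0])
            if len(names) >= limit:
                break
    return names


# Hypothetical records: the first stores no qualities, the second does.
records = [
    "read_x\t16\tchr5\t746131\t40\t4M\t*\t0\t0\tACGT\t*\tNM:i:0\n",
    "read_y\t0\tchr5\t746200\t40\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\n",
]
print(missing_quality_reads(records))
```

If this reports read names for a file that triggers the error, that is consistent with the reading above but does not prove it. Whether supplying base qualities resolves the QoRTs failure is not established in this thread.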
I've generated a new gmap file using
Then used samtools/bamtools to convert to sorted .BAM. This is the .SAM file:
I run QoRTs with:
java -Xmx4G -jar QoRTs.jar QC --singleEnded --generatePlots allreads.gmap.new.sam EVM6.gtf ~/coverage_plot/test
And get the error message:
Checking first 10000 reads. Checking SAM file for formatting errors...
Hi @hartleys, thanks for writing such a useful tool. I'm also trying to apply QoRTs to long-read sequencing data generated by PacBio IsoSeq3 and aligned to the human genome with minimap2.
I've tried running QoRTs on both the SAM file and the coordinate-sorted and indexed BAM file created by samtools. This is the command I've been trying:
I gradually added all the extra flags in a vain attempt to get it to run. This is the error message I got each time:
For good measure, here is a read from my SAM file:
Thanks again!
I'm seeing the same issue on the attached bam: