high coverage fail #4
Comments
I upgraded the isInt to accommodate up to uint64 in my little library libgab. Can you do a:
I hope this will not overflow past 4 billion fragments. And yes, -c is the endogenous coverage. |
that problem is now gone, I think. I now have this error with ART though: terminate called after throwing an instance of 'std::bad_alloc' |
ok that is probably because data/70-5-25-40x_s file is 917 GB for some reason. Am I doing this wrong?: ./gargammel.pl -c 30 --comp 0.7,0.05,0.25 -l 110 -rl 100 -SS HS25 -o data/70-5-25-40x data/ what I want to get is a total of 30X human genome coverage with 100 bp paired end reads (fragment 110). That should translate to 900M reads (450M pairs) of length 100bp. Of this data set, 70% should be bacterial, 25% endogenous, 5% present-day contamination. That's what I'm trying to get anyway, but I guess I misinterpret the -c parameter. |
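A rough back-of-the-envelope check of these numbers, as a sketch rather than gargammel's actual logic. It assumes a human genome size of about 3.0e9 bp (an assumption; the 900M-read figure above implies it), that `-c` is the endogenous coverage (as stated earlier in this thread), and that the `--comp` fractions apply to fragment counts (also an assumption). The function names are hypothetical.

```python
# Back-of-the-envelope fragment counts for the command above.
GENOME_SIZE_BP = 3_000_000_000  # assumption: approximate human genome size


def fragments_for_coverage(coverage, fragment_len, genome_size=GENOME_SIZE_BP):
    """Number of fragments of length fragment_len needed for the given depth."""
    return coverage * genome_size // fragment_len


def total_fragments(endogenous_coverage, endogenous_fraction, fragment_len,
                    genome_size=GENOME_SIZE_BP):
    """Total fragments if endogenous ones are only a fraction of the mix
    (assumes the fraction applies to fragment counts)."""
    endo = fragments_for_coverage(endogenous_coverage, fragment_len, genome_size)
    return round(endo / endogenous_fraction)


# What was intended: 30X overall with 100 bp reads.
intended_reads = fragments_for_coverage(30, 100)

# What -c 30 with 25% endogenous implies: 30X endogenous only, 110 bp fragments.
endogenous = fragments_for_coverage(30, 110)
total = total_fragments(30, 0.25, 110)

print(f"intended reads:       {intended_reads:,}")
print(f"endogenous fragments: {endogenous:,}")
print(f"total fragments:      {total:,}")
```

Under these assumptions the total comes to roughly 3.3 billion fragments, about four times what a 30X overall target at 110 bp would need. That would be consistent with the very large intermediate file, though the thread does not confirm the exact cause.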
The ART package cannot take zipped files. Hence we have to use plain files. Can you do an ls -al in the directory data/70-5-25-40x data/ |
can you also try to run art on a subset, do you still get the std::bad_alloc? |
ls -l data/70-5-25-40x* |
art works well with a small subset, no std::bad_alloc |
thanks! I have emailed the developers, let's wait. In the meantime, maybe you can dice up the input using unix split? Very sorry for the trouble.
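One way to dice up the input while keeping FASTA records intact is sketched below. Plain `split` by line count would also work if every record is exactly two lines, but that is an assumption about the file layout, not something confirmed in this thread. The function name, chunk size, and output naming are hypothetical.

```python
def split_fasta(path, records_per_chunk, prefix):
    """Write consecutive groups of FASTA records from `path` to
    `<prefix>.000.fa`, `<prefix>.001.fa`, ... and return the chunk paths.

    Streams line by line, so the whole file never sits in memory.
    """
    chunk_paths = []
    out = None
    n_records = 0
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                # Start a new chunk every `records_per_chunk` headers.
                if n_records % records_per_chunk == 0:
                    if out is not None:
                        out.close()
                    chunk_path = f"{prefix}.{len(chunk_paths):03d}.fa"
                    chunk_paths.append(chunk_path)
                    out = open(chunk_path, "w")
                n_records += 1
            if out is not None:  # skip any stray lines before the first header
                out.write(line)
    if out is not None:
        out.close()
    return chunk_paths
```

For example, `split_fasta("data/70-5-25-40x_a.fa", 10_000_000, "data/70-5-25-40x_a.part")` would write chunks of ten million records each; ART could then be run on each chunk separately. The chunk size here is only a placeholder to tune against available memory.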
|
ok. there are _b, _c files as well, should I repeat with them? What happens after that, is the ART output the final output? |
normally you just need the _a file. it is the one with the adapter ligated on the deaminated fragments. ART produces the final output, yes.
|
Hi
I am trying to simulate aDNA data at high coverage. I assume the "-c" parameter sets the overall depth of coverage. Is this correct, or does it set the endogenous coverage? I do this:
./gargammel.pl -c 30 --comp 0.7,0.05,0.25 -l 110 -rl 100 -SS HS25 -o data/70-5-25-40x data/
after quite a long time gargammel fails:
....
Produced 2,147,400,000
ERROR: Cannot add thousandSeparator to non-integer 2147500000
system cmd /mnt/compgen/homes/calkan/projects/ancient/gargammel/src/adptSim -f AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATTCGATCTCGTATGCCGTCTTCTGCTTG -s AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTT -l 100 -artp data/70-5-25-40x_a.fa data/70-5-25-40x_d.fa.gz failed: 256 at ./gargammel.pl line 79.
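The two counters in the log bracket the largest signed 32-bit integer, which would be consistent with an integer check that tops out at 32 bits. That is an inference from the numbers, not something the log itself states. A quick check:

```python
INT32_MAX = 2**31 - 1    # 2,147,483,647
UINT64_MAX = 2**64 - 1

last_ok = 2_147_400_000  # last "Produced" value in the log
failed = 2_147_500_000   # value in the error message

print(last_ok <= INT32_MAX)   # True
print(failed <= INT32_MAX)    # False
print(failed <= UINT64_MAX)   # True
```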