I suggest we lower the default --target_mono_exonic_pct from 20% to 5%.
For species with smaller gene sets, finding 20% of the 1200 train and test genes as single-exon models won't be possible. This was the case for a recent fungal genome.
REAT Failed, the following file might contain information with the reasons behind the failure
/ei/.project-scratch/e/e701c73c-45b1-4784-9385-6c69cf3272cf/CB-GENANNO-508_ERGA_Spongipellis_delectans/Analysis/reat-dev-issue25/Prediction/cromwell-executions/ei_prediction/d18b476e-faa4-4c2f-98a7-b5797c30ddde/call-SelectAugustusTestAndTrain/execution/stderr
+ generate_augustus_test_and_train /ei/.project-scratch/e/e701c73c-45b1-4784-9385-6c69cf3272cf/CB-GENANNO-508_ERGA_Spongipellis_delectans/Analysis/reat-dev-issue25/Prediction/cromwell-executions/ei_prediction/d18b476e-faa4-4c2f-98a7-b5797c30ddde/call-SelectAugustusTestAndTrain/inputs/-1046222641/with_utr.extra.gff --train_min 400 --train_max 1000 --test_max 200 --target_mono_exonic_pct 20
+ gff2gbSmallDNA.pl test.gff /ei/.project-scratch/e/e701c73c-45b1-4784-9385-6c69cf3272cf/CB-GENANNO-508_ERGA_Spongipellis_delectans/Analysis/reat-dev-issue25/Prediction/cromwell-executions/ei_prediction/d18b476e-faa4-4c2f-98a7-b5797c30ddde/call-SelectAugustusTestAndTrain/inputs/1001504700/gfSpoDele1_1.curated_primary.softmasked.fa 200 test.gb
Couldn't open test.gff.
On inspection, we simply don't have 240 single-exon genes. The generate_augustus_test_and_train script then produces no output and writes nothing to an error log, so it is not transparent to a user what caused the failure.
Note that the -f (force) option does not override the 20% target_mono_exonic_pct requirement, though it does at least give an indication of the error:
generate_augustus_test_and_train /ei/.project-scratch/e/e701c73c-45b1-4784-9385-6c69cf3272cf/CB-GENANNO-508_ERGA_Spongipellis_delectans/Analysis/reat-dev-issue25/Prediction/cromwell-executions/ei_prediction/2482d9fe-d7e9-42dc-bbaa-8259e9e25fb8/call-SelectAugustusTestAndTrain/inputs/-578101069/with_utr.extra.gff --train_min 400 --train_max 1000 --test_max 200 --target_mono_exonic_pct 20 -f
Requested minimum number of mono-exonic models: 240
Real possible minimum number of mono-exonic models: 6
Number of train models: 32
Number of mono-exonic models in train set: 6
Traceback (most recent call last):
  File "/ei/software/cb/reat/dev-issue32/x86_64/bin/generate_augustus_test_and_train", line 138, in <module>
    main()
  File "/ei/software/cb/reat/dev-issue32/x86_64/bin/generate_augustus_test_and_train", line 101, in main
    test_models = random.sample(train_models, args.test_max)
  File "/ei/software/cb/reat/dev-issue32/x86_64/lib/python3.9/random.py", line 449, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative
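The traceback comes from random.sample, which raises ValueError whenever the requested sample size exceeds the population (here 200 test models requested from 32 train models). One possible way to make the failure more transparent is to check the sizes first and report them. This is only a sketch: the helper name sample_test_models and the wording of the message are my own assumptions, not code from generate_augustus_test_and_train.

```python
import random


def sample_test_models(train_models, test_max, rng=random):
    """Hypothetical helper: sample test models, failing with a readable message.

    Assumption: train_models is a list of candidate models and test_max is
    the value passed via --test_max.
    """
    available = len(train_models)
    if test_max > available:
        # Report both numbers so the user can see why selection failed.
        raise ValueError(
            f"Requested {test_max} test models but only {available} "
            "candidate models are available; lower --test_max or relax "
            "the selection criteria."
        )
    return rng.sample(train_models, test_max)


if __name__ == "__main__":
    models = [f"model_{i}" for i in range(32)]
    try:
        sample_test_models(models, 200)
    except ValueError as err:
        print(f"Selection failed: {err}")
```

Writing a message like this to stderr would also cover the non -f case, where at the moment no information reaches the error log.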
The idea was that target_mono_exonic_pct would set a maximum percentage of single-exon genes, but as coded it works as a target. That being the case, I would just lower the default to 5%.
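For comparison, here is a rough sketch of what treating the percentage as a maximum rather than a required target could look like. The function name mono_exonic_quota and its arguments are hypothetical and are not taken from the script; the numbers in the usage lines are the ones from the log above (1200 train and test models, 6 mono-exonic models available).

```python
def mono_exonic_quota(total_models, available_mono, max_pct):
    """Hypothetical helper: number of mono-exonic models to select.

    max_pct is treated as an upper bound, so the quota is capped by what
    the gene set actually contains instead of being a hard requirement.
    """
    if not 0 <= max_pct <= 100:
        raise ValueError("max_pct must be between 0 and 100")
    ceiling = int(total_models * max_pct / 100)
    return min(ceiling, available_mono)


if __name__ == "__main__":
    # 20% of 1200 is 240, but only 6 mono-exonic models exist.
    print(mono_exonic_quota(1200, 6, 20))
    # With a 5% default the ceiling is 60, still capped by the 6 available.
    print(mono_exonic_quota(1200, 6, 5))
```

Under this interpretation a small gene set would still produce a train and test split, just with fewer single-exon genes than the ceiling allows. Lowering the default to 5% alone would still request 60 models here, against 6 available.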