fcs adaptor detects no contamination? #106

Open
MatteoSebastianelli opened this issue Jan 20, 2025 · 3 comments

Comments

@MatteoSebastianelli

Hello,

I ran the run_fcsadaptor.sh script on a FASTA file of scaffolds, before manual curation of the assembly. I found it odd that FCS did not detect any contamination: the report.txt file is empty. This is the first time I have run FCS to check for contamination, and I wanted to double-check that the software behaved properly. Here is the log that was printed:

[WARN tini (14694)] Tini is not running as PID 1 and isn't registered as a child subreaper.
Zombie processes will not be re-parented to Tini, so zombie reaping won't work.
To fix the problem, use the -s option or set the environment variable TINI_SUBREAPER to register Tini as a child subreaper, or run Tini as PID 1.
Output will be placed in: /output-volume
Executing the workflow
Resolved '/app/fcs/progs/ForeignContaminationScreening.cwl' to 'file:///app/fcs/progs/ForeignContaminationScreening.cwl'
[workflow ] start
[workflow ] starting step ValidateInputSequences
[step ValidateInputSequences] start
[job ValidateInputSequences] /scratch/52538381/eq_jduws$ validate_fasta
--jsonl
validate_fasta.log
--fasta-output
validated.fna
/scratch/52538381/zukw5tu_/stg56a37a2a-fc7e-43ca-935c-441fc68fa302/fortisHiC_scaffolds_final.fa.MicroFinder.ordered.fa > /scratch/52538381/eq_jduws/validate_fasta.txt
[job ValidateInputSequences] Max memory used: 597MiB
[job ValidateInputSequences] completed success
[step ValidateInputSequences] completed success
[workflow ] starting step parallel_section
[step parallel_section] start
[workflow parallel_section] start
[workflow parallel_section] starting step SplitInputSequences
[step SplitInputSequences] start
[workflow SplitInputSequences] start
[workflow SplitInputSequences] starting step fasta_split
[step fasta_split] start
[job fasta_split] /scratch/52538381/l7q_0s15$ fasta_split
-in_file
/scratch/52538381/jq_qj688/stg85c0894f-f175-4aaa-a4db-0d3f68066d78/validated.fna_0.fna
-out_file
split_fasta.fna
-logfile
fast_split.log
-mapping_json
seq_mapping.jsonl
protobuf arena allocated space: 10000000, used: 595816
[job fasta_split] Max memory used: 620MiB
[job fasta_split] completed success
[step fasta_split] completed success
[workflow SplitInputSequences] starting step log
[step log] start
[job log] /scratch/52538381/s284i85y$ cxxlog2pb
--stage
SplitInputSequences < /scratch/52538381/dffb15vf/stga2c971ff-7367-4d69-923b-fadc7ca056fb/fast_split.log > /scratch/52538381/s284i85y/45ec615881c92e938694dd3ccaea7a9c03748313
[job log] Max memory used: 29MiB
[job log] completed success
[step log] completed success
[workflow SplitInputSequences] completed success
[step SplitInputSequences] completed success
[workflow parallel_section] starting step AdaptorScreeningAndFilterResults
[step AdaptorScreeningAndFilterResults] start
[workflow AdaptorScreeningAndFilterResults] start
[workflow AdaptorScreeningAndFilterResults] starting step blast
[step blast] start
[job blast] /scratch/52538381/aqwiystn$ vecscreen
-db
adaptors_for_euks
-logfile
vecscreen.log
-out
vs_unfiltered.hit
-query
/scratch/52538381/i0defv1b/stg178e9386-97e6-497c-bca1-636f174e1a5f/split_fasta.fna
-term-flex
25
[job blast] Max memory used: 326MiB
[job blast] completed success
[step blast] completed success
[workflow AdaptorScreeningAndFilterResults] starting step filter
[step filter] start
[job filter] /scratch/52538381/rk_uzqtc$ vecscreen_filter
--filtered
vs_filtered.jsonl
--unfiltered
/scratch/52538381/1fxkkthg/stgc78b026d-eedb-42bf-bad4-12e391e35e83/vs_unfiltered.hit
[job filter] Max memory used: 29MiB
[job filter] completed success
[step filter] completed success
[workflow AdaptorScreeningAndFilterResults] starting step log_2
[step log_2] start
[job log_2] /scratch/52538381/5ai7md9v$ cxxlog2pb
--stage
AdaptorScreening < /scratch/52538381/ftwhqpwg/stgfb57110f-5b98-46da-a905-ebc62d195b81/vecscreen.log > /scratch/52538381/5ai7md9v/45ec615881c92e938694dd3ccaea7a9c03748313
[job log_2] Max memory used: 32MiB
[job log_2] completed success
[step log_2] completed success
[workflow AdaptorScreeningAndFilterResults] completed success
[step AdaptorScreeningAndFilterResults] completed success
[workflow parallel_section] starting step ApplyHeuristicsToMakeExcludeAndTrimCalls
[step ApplyHeuristicsToMakeExcludeAndTrimCalls] start
[workflow ApplyHeuristicsToMakeExcludeAndTrimCalls] start
[workflow ApplyHeuristicsToMakeExcludeAndTrimCalls] starting step make_calls
[step make_calls] start
[job make_calls] /scratch/52538381/u6zpr1u1$ make_calls
-a
/scratch/52538381/queuykcm/stg7c62ec5b-acfc-4268-b192-11cbec9f4d2c/vs_filtered.jsonl
-logfile
make_calls.log
-seq-len
/scratch/52538381/queuykcm/stg61be8cea-1220-403e-9b85-83a54e3379c9/seq_mapping.jsonl > /scratch/52538381/u6zpr1u1/combined.calls.jsonl
[job make_calls] completed success
[step make_calls] completed success
[workflow ApplyHeuristicsToMakeExcludeAndTrimCalls] starting step log_3
[step log_3] start
[job log_3] /scratch/52538381/f7ck7tv_$ cxxlog2pb
--stage
ApplyHeuristicsToMakeExcludeAndTrimCalls < /scratch/52538381/2mfqgzei/stge52161fa-d3c4-4029-a011-735c4ea49e2d/make_calls.log > /scratch/52538381/f7ck7tv_/45ec615881c92e938694dd3ccaea7a9c03748313
[job log_3] Max memory used: 32MiB
[job log_3] completed success
[step log_3] completed success
[workflow ApplyHeuristicsToMakeExcludeAndTrimCalls] completed success
[step ApplyHeuristicsToMakeExcludeAndTrimCalls] completed success
[workflow parallel_section] starting step log_merging
[step log_merging] start
[job log_merging] /scratch/52538381/qjaovj_1$ cat
/scratch/52538381/m_t_rysj/stgde69e61d-27ec-4f5d-a595-722aa0cc7fb8/45ec615881c92e938694dd3ccaea7a9c03748313
/scratch/52538381/m_t_rysj/stgea39f734-1f48-462b-8060-9a2a8ef5f9a2/45ec615881c92e938694dd3ccaea7a9c03748313
/scratch/52538381/m_t_rysj/stga1f563a3-747b-4f15-908b-ca8e4d7e346e/45ec615881c92e938694dd3ccaea7a9c03748313 > /scratch/52538381/qjaovj_1/par_sec.log
[job log_merging] completed success
[step log_merging] completed success
[workflow parallel_section] completed success
[step parallel_section] completed success
[workflow ] starting step adaptor_calls
[step adaptor_calls] start
[job adaptor_calls] /scratch/52538381/9qheeb7y$ cat
/scratch/52538381/omxb204h/stg6fe661aa-e099-4388-9502-a030d3db9dc4/combined.calls.jsonl > /scratch/52538381/9qheeb7y/adaptor_calls.jsonl
[job adaptor_calls] completed success
[step adaptor_calls] completed success
[workflow ] starting step gather_logs
[step gather_logs] start
[job gather_logs] /scratch/52538381/0gl48dhl$ cat
/scratch/52538381/bgaabvdn/stg033cfb93-9d68-48b4-b651-68c3fbd5194c/par_sec.log > /scratch/52538381/0gl48dhl/par_sec_logs.log
[job gather_logs] completed success
[step gather_logs] completed success
[workflow ] starting step seq_mapping
[step seq_mapping] start
[job seq_mapping] /scratch/52538381/7gc7ztvj$ cat
/scratch/52538381/a_mbc5cx/stg3ab619d7-f3c3-4b72-9e26-31876d894043/seq_mapping.jsonl > /scratch/52538381/7gc7ztvj/seq_mapping.jsonl
[job seq_mapping] completed success
[step seq_mapping] completed success
[workflow ] starting step post_processor
[step post_processor] start
[workflow post_processor] start
[workflow post_processor] starting step postproc_calls
[step postproc_calls] start
[job postproc_calls] /scratch/52538381/e920ckzh$ postproc_calls
-in_calls
/scratch/52538381/_92yfwib/stg9b3151df-81d4-4a58-9042-befcb0b7ae65/adaptor_calls.jsonl
-logfile
postproc_calls.log
-input_mapping
/scratch/52538381/_92yfwib/stgd9a7f527-852e-48a7-ae5c-195fa08f5bd4/seq_mapping.jsonl
-out_file
combined.calls.jsonl
[job postproc_calls] completed success
[step postproc_calls] completed success
[workflow post_processor] starting step log_4
[step log_4] start
[job log_4] /scratch/52538381/e8fcu245$ cxxlog2pb
--stage
PostProcessCalls < /scratch/52538381/7280k2p0/stg9dcf3509-9e64-4950-b0a9-fd6320feb941/postproc_calls.log > /scratch/52538381/e8fcu245/3bc8758dc9026d571fb8c6b8383da3db38612251
[job log_4] Max memory used: 30MiB
[job log_4] completed success
[step log_4] completed success
[workflow post_processor] completed success
[step post_processor] completed success
[workflow ] starting step collect_logs
[step collect_logs] start
[job collect_logs] /scratch/52538381/fj6tqbcm$ cat
/scratch/52538381/urnvcdjn/stg38be2718-d627-4a34-94ca-078d7d229cc6/validate_fasta.log
/scratch/52538381/urnvcdjn/stg6927fc9e-4ddb-4195-82d3-f7252f940371/par_sec_logs.log
/scratch/52538381/urnvcdjn/stgddbb15cb-b142-4f50-8b69-03be826128b7/3bc8758dc9026d571fb8c6b8383da3db38612251 > /scratch/52538381/fj6tqbcm/logs.jsonl
[job collect_logs] completed success
[step collect_logs] completed success
[workflow ] starting step GenerateReport
[step GenerateReport] start
[workflow GenerateReport] start
[workflow GenerateReport] starting step log_step
[step log_step] start
[job log_step] /scratch/52538381/ywvt4es4$ log_jl2tsv
--infile
/scratch/52538381/7u7tjcfl/stga23b4a6c-dd27-4dd9-8b46-1b1fc11e7eb8/logs.jsonl
--outfile
fcs.log
[job log_step] Max memory used: 29MiB
[job log_step] completed success
[step log_step] completed success
[workflow GenerateReport] starting step calls_step
[step calls_step] start
[job calls_step] /scratch/52538381/bv2j8gu9$ pbcalls2tsv < /scratch/52538381/2wqd4xwf/stg9fc67b85-9cab-4086-8289-1d201fa826d1/combined.calls.jsonl > /scratch/52538381/bv2j8gu9/fcs_adaptor_report.txt
[job calls_step] Max memory used: 29MiB
[job calls_step] completed success
[step calls_step] completed success
[workflow GenerateReport] completed success
[step GenerateReport] completed success
[workflow ] starting step GenerateCleanedFasta
[step GenerateCleanedFasta] start
[workflow GenerateCleanedFasta] start
[workflow GenerateCleanedFasta] starting step prepare_xml_step
[step prepare_xml_step] start
[job prepare_xml_step] /scratch/52538381/h7muifzo$ pbcalls2seqtransform
--skipped
skipped_trims.jsonl < /scratch/52538381/wxi1sh85/stgfd2cbff5-61eb-4494-b125-e2d37eaac778/combined.calls.jsonl > /scratch/52538381/h7muifzo/fcs_calls.xml
[job prepare_xml_step] Max memory used: 29MiB
[job prepare_xml_step] completed success
[step prepare_xml_step] completed success
[workflow GenerateCleanedFasta] starting step seqtransform_step
[step seqtransform_step] start
[job seqtransform_step] /scratch/52538381/rjd7uabg$ seqtransform
-out
validated.fna_0.cleaned_fa
-in
/scratch/52538381/l94x1g0m/stgf60105fc-a016-4a3b-86dd-ebc074cecc28/validated.fna_0.fna
-seqaction-xml-file
/scratch/52538381/l94x1g0m/stg77c906b9-4939-48c7-a2ff-f64f4ffc619a/fcs_calls.xml
-report
seqtransform.log
[job seqtransform_step] Max memory used: 804MiB
[job seqtransform_step] completed success
[step seqtransform_step] completed success
[workflow GenerateCleanedFasta] completed success
[step GenerateCleanedFasta] completed success
[workflow ] starting step all_skipped_trims
[step all_skipped_trims] start
[job all_skipped_trims] /scratch/52538381/4e3exgrc$ cat
/scratch/52538381/sgj0w0fd/stge6a69c4e-2861-4913-ad37-188b2c0b0b51/skipped_trims.jsonl > /scratch/52538381/4e3exgrc/skipped_trims.jsonl
[job all_skipped_trims] completed success
[step all_skipped_trims] completed success
[workflow ] starting step all_cleaned_fasta
[step all_cleaned_fasta] start
[step all_cleaned_fasta] completed success
[workflow ] completed success

Thanks in advance!
Matteo
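
One way to confirm that the report is empty in the sense of "no calls", rather than missing or truncated, is to count its data rows. This is only a minimal sketch: it assumes the report is the tab-separated `fcs_adaptor_report.txt` that the `calls_step` job writes in the log above, and it assumes that any header line begins with `#`. The function name `count_report_rows` and the path in the usage comment are hypothetical.

```python
import sys
from pathlib import Path


def count_report_rows(report_path):
    """Count data rows in a report, skipping blank lines and '#' lines.

    Assumption: header/comment lines start with '#'. Adjust if the
    report produced by your FCS version is laid out differently.
    """
    path = Path(report_path)
    if not path.exists():
        raise FileNotFoundError(f"report not found: {path}")
    rows = 0
    with path.open() as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                rows += 1
    return rows


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_rows.py outputdir/fcs_adaptor_report.txt
    print(f"{count_report_rows(sys.argv[1])} call(s) in report")
```

A count of zero on a file that exists is consistent with a run that finished and simply made no adaptor calls; a missing file would instead point to a problem with the run or the output directory.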

@etvedte
Contributor

etvedte commented Jan 21, 2025

I didn't see any pipeline failures in your log that should be cause for concern. Do you expect to see adaptor contamination in this genome?

If you want to check that FCS-adaptor is operating normally, please try the usage example on the wiki, which uses sequences with known contamination.
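
The "no pipeline failures" check can also be done mechanically on a long log. The sketch below assumes the status lines keep the shape seen in the log above (`[job NAME] completed success`, `[step NAME] completed success`, `[workflow NAME] completed success`) and reports any `completed` line whose status is something other than `success`. The function name `find_non_success` is hypothetical, and the sample text is a short excerpt of the posted log.

```python
import re

# Matches status lines such as "[job blast] completed success"
# or "[workflow ] completed success" (the name may be empty).
STATUS_RE = re.compile(r"^\[(job|step|workflow) ([^\]]*)\] completed (\S+)")


def find_non_success(log_text):
    """Return (kind, name, status) for each 'completed' line not marked 'success'."""
    problems = []
    for line in log_text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match and match.group(3) != "success":
            problems.append((match.group(1), match.group(2).strip(), match.group(3)))
    return problems


sample = """\
[job blast] Max memory used: 326MiB
[job blast] completed success
[step blast] completed success
[workflow ] completed success
"""
print(find_non_success(sample))  # prints []
```

An empty list only says that every job, step, and workflow reported success; it says nothing about whether adaptor sequence is present, which is why a run on known contaminated input is the better functional test.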

@MatteoSebastianelli
Author

MatteoSebastianelli commented Jan 21, 2025 via email

@etvedte
Contributor

etvedte commented Jan 21, 2025

Although short sequences are more likely to contain contaminants, that is not always the case. FCS-adaptor in particular applies the same alignment size and score thresholds regardless of the size of the source sequence.
