nosetests showing 1 error in python3 (and multiple for python 2.7 with appropriate umi tools version) #450

Closed
alexander-e-f-smith opened this issue Jan 7, 2021 · 11 comments

alexander-e-f-smith commented Jan 7, 2021

Hi
I have not been able to get a completely error-free run of your nose tests after a fresh installation. Details of how I ran the tests and of the environment set-up are below. With the Python 3 set-up I get 1 failure, in a whitelist test. With an older installation of umi_tools (0.5.1) under Python 2.7 I get multiple failures, although I am not sure the nose tests are appropriate there (the test file I ran matches the installed version of umi_tools). Details:

In python 3.7
export PYTHONHASHSEED=0

root@feac0438ff76:/usr/umitools# which umi_tools
/opt/venv/bin/umi_tools

root@feac0438ff76:/usr/umitools# umi_tools --version
UMI-tools version: 1.1.1

root@e83a97661f88:/usr/umitools/UMI-tools-1.1.1# nosetests tests/test_umi_tools.py

OUTPUT:

../usr/umitools/UMI-tools-1.1.1/tests/test_umi_tools.py:240: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  tool_tests = yaml.load(open(fn))
......F.................................................
======================================================================
FAIL: umi_tools.py/whitelist_scrb_seq
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/venv/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/usr/umitools/UMI-tools-1.1.1/tests/test_umi_tools.py", line 202, in check_script
    ok_(not fail, msg)
AssertionError: files /tmp/tmp17lqxxuh/stdout and tests/whitelist_umi_output.sam are not the same
/bin/bash -c 'umi_tools dedup --paired --log=test.log --filter-umi --umi-whitelist=/usr/umitools/UMI-tools-1.1.1/tests/umi_whitelist.tsv --umi-whitelist-paired=/usr/umitools/UMI-tools-1.1.1/tests/umi_whitelist.tsv --out-sam --random-seed=123456789 --stdin=/usr/umitools/UMI-tools-1.1.1/tests/whitelist_umi_input.bam |sort > /tmp/tmp17lqxxuh/stdout'
md5: output=1531, ed9d1ea5c354335f290837080ffb1115 reference=1531, 4d05033cdda6ca1bbbfa412409247d42first 10 differences: @PG	ID:hisat2	PN:hisat2	VN:2.1.0	CL:"/home/PROT-FILESVR1/proteomics/tss38/cgat-install/conda-install/envs/py36-v1/bin/hisat2-align-s --wrapper basic-0 --threads 1 --rna-strandness RF -x /home/PROT-FILESVR1/proteomics/tss38/references/genomes/hisat2/hg38 --known-splicesite-infile geneset.dir/geneset_all.junctions --novel-splicesite-outfile hisat_no_filter.dir/CL-1-1.hisat.bam_novel_junctions -1 /tmp/10587.inpipe1 -2 /tmp/10587.inpipe2"
NS500105:308:HCGGHAFXY:1:11101:20827:5614_CTCAGTCTTTCGTTCG	99	chr1	22053	60	1S67M	=	192659	170684	TGGGCTGCCCACAGGGCTCCTCAGTCTAAGCCAAGTGGTGTGTCATAGTCCCCTGGCCCCAGTAATGG	/EEEEEEEEEEEEEAEEEAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEE	AS:i:-11	ZS:i:-11	XN:i:0	XM:i:2	XO:i:0	XG:i:0	NM:i:2	MD:Z:13A46T6	YS:i:-1	YT:Z:CP	XS:A:-	NH:i:1
--
@SQ	SN:chr1	LN:248956422
NS500105:308:HCGGHAFXY:1:11101:25537:9008_GTTGTCGAAGACACTC	99	chr1	19796	1	1S65M	=	190403	170684	CCCATCTTTTCTGGCTGGGGAGAGGCCTTCATCTGCTGTAAAGGGTCCTCCAGCACAAGCTGTCTT	/EEAEEEEEEEEEEEEEEEEEEEEEE/EEEEEEEEEEEEAEEEEEEEEEEEAEEEEEEEEEEEEEE	AS:i:-1	ZS:i:-1	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:65	YS:i:-2	YT:Z:CP	XS:A:-	NH:i:4
--
@SQ	SN:chr10	LN:133797422
NS500105:308:HCGGHAFXY:1:11101:8596:6657_GGATAACGAATTCCGG	355	chr1	91436	1	68M	=	91623	255	GCAGAGGTCAGCAAGGCAAACCCGAGCCCAGGGATGCGGGGTGGGGGCAGGTACATCCTCTCTTGAGC	6EEEEEEEEEEEA/EEEAEEEEEEEEEEEEE/EEEEEEEEEAEEEEEEEEEEE6EE6E/EEEEEE/E<	AS:i:0	ZS:i:0	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:68	YS:i:-1	YT:Z:CP	XS:A:-	NH:i:3
--
@SQ	SN:chr10_GL383545v1_alt	LN:179254
NS500105:308:HCGGHAFXY:1:11101:8596:6657_GGATAACGAATTCCGG	403	chr1	91623	1	7M461N60M1S	=	91436	-255	CCAAACTCTTGGTTGTGTTCTTTGATTAGTGCCTGTGACGCAGCTTCAGGAGGTCCTGAGAACGTGTAEE/AA</E//<///A//</E666//666A66EE/A66EE6E6AE//AAA6A/66E/666A6EEA6A</	AS:i:-1	ZS:i:-1	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:67	YS:i:0	YT:Z:CP	XS:A:-	NH:i:3
--
@SQ	SN:chr10_GL383546v1_alt	LN:309802
NS500105:308:HCGGHAFXY:1:11101:8596:6657_GGATAACGAATTCCGG	99	chr1	91436	1	68M	=	268198	176898	GCAGAGGTCAGCAAGGCAAACCCGAGCCCAGGGATGCGGGGTGGGGGCAGGTACATCCTCTCTTGAGC	6EEEEEEEEEEEA/EEEAEEEEEEEEEEEEE/EEEEEEEEEAEEEEEEEEEEE6EE6E/EEEEEE/E<	AS:i:0	ZS:i:0	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:68	YS:i:-1	YT:Z:CP	XS:A:-	NH:i:3
--
@SQ	SN:chr10_KI270824v1_alt	LN:181496
NS500105:308:HCGGHAFXY:1:11102:25272:14512_GTTCTGCTCTGAGTGT	99	chr1	21354	60	68M	=	191926	170649	TCCTGACTACAATAACAGATTCTGGGTGTCCCTGGCATCCACTCTCTCTCCCTTATTATCCCTTCAGT	E</E6EAEEEEE/EE/E//EEEA//AAAE/<A///EEA//EE///<//<//A66//E//AE/A////A	AS:i:-12	ZS:i:-12	XN:i:0	XM:i:4	XO:i:0	XG:i:0	NM:i:4	MD:Z:18C35C2G7C2	YS:i:-1	YT:Z:CP	XS:A:-	NH:i:1
--
@SQ	SN:chr10_KI270825v1_alt	LN:188315
NS500105:308:HCGGHAFXY:1:11103:19873:1262_TGAGAGTGGAAGTGCA	355	chr1	18999	1	1S65M	=	19051	121	TCTTGCATCTCATGGAACGCCATTTCCCCAGACATCCCTGTGGCTGGCTCCTGATGCCCGAGGCCC	AEEEEE/EEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEE<EEEEEEEEEEEEEAE	AS:i:-1	ZS:i:-1	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:65	YS:i:-1	YT:Z:CP	XS:A:-	NH:i:5
--
@SQ	SN:chr11	LN:135086622
NS500105:308:HCGGHAFXY:1:11103:19873:1262_TGAGAGTGGAAGTGCA	403	chr1	19051	1	67M1S	=	18999	-121	ATGCCCGAGGCCCAAGTGTCTGATGCTTTAAGGCACATCACCCCACTCATGCTTTTCCATGTTCTTTA	EEEEEEEEEEEEEEEAEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAE/	AS:i:-1	ZS:i:-1	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:67	YS:i:-1	YT:Z:CP	XS:A:-	NH:i:5
--
@SQ	SN:chr11_GL383547v1_alt	LN:154407
NS500105:308:HCGGHAFXY:1:11103:8261:10925_ATCCATGGTCGTAGGT	355	chr1	14224	0	5S63M	=	184948	170804	GCCCGTCTGCAACAGCTGCCCCTGCTGACGGCCCTTCTCTCCTCCCTCTCATCCCAGAGAAACAGGTC	/EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEAEEEEEEEEEEEEEEEEE	AS:i:-10	ZS:i:-10	XN:i:0	XM:i:1	XO:i:0	XG:i:0	NM:i:1	MD:Z:24T38	YS:i:-10	YT:Z:CP	XS:A:-	NH:i:2
--
@SQ	SN:chr11_JH159136v1_alt	LN:200998
NS500105:308:HCGGHAFXY:1:11104:19418:19384_GTTCACCTAGTCGAGA	355	chr1	17033	1	4S64M	=	17218	257	CGTTACATAGAAGTAGTTCTCTGGGACCTGCAAGATTAGGCAGGGACATGTGAGAGGTGACAGGGACC	/EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEAEEEEEEEE	AS:i:-4	ZS:i:-4	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:64	YS:i:0	YT:Z:CP	XS:A:-	NH:i:5
--
@SQ	SN:chr11_JH159137v1_alt	LN:191409
NS500105:308:HCGGHAFXY:1:11104:19418:19384_GTTCACCTAGTCGAGA	403	chr1	17218	1	68M	=	17033	-257	TCCCACCCCTCCCACCTGCTGTTCCAGCTGCTCTCTCTTGCTGATGGACAAGGGGGCATCAAACAGCT	EEEEEEEEEEEAE/EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE</	AS:i:0	ZS:i:0	XN:i:0	XM:i:0	XO:i:0	XG:i:0	NM:i:0	MD:Z:68	YS:i:-4	YT:Z:CP	XS:A:-	NH:i:5

----------------------------------------------------------------------
Ran 58 tests in 97.134s

FAILED (failures=1)

I also ran this on an older version of UMI-tools (0.5.1) in Python 2.7, which gave many more failures under the same test conditions, although I am not sure whether the tests are valid in Python 2.7 at all.

In Python 2.7

export PYTHONHASHSEED=0

nosetests tests/test_umi_tools.py (extract of output below)

FAIL: umi_tools.py/whitelist_scrb_seq
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/tmp/UMI-tools/tests/test_umi_tools.py", line 202, in check_script
    ok_(not fail, msg)
AssertionError: files /tmp/tmplFA85o/stdout and tests/group_unsorted.sam are not the same
/bin/bash -c 'umi_tools group -L test.log --out-sam --random-seed=123456789 --method=directional --no-sort-output  --output-bam --stdin=/tmp/UMI-tools/tests/chr19.bam |sort > /tmp/tmplFA85o/stdout'
md5: output=55221, fb36cbc9c92278137833599787af76ba reference=55221, 793edea471eb96ad61cf38d5b8d916f4first 10 differences: @SQ	SN:chr1	LN:197195432
@SQ	SN:chr10	LN:129993255
--
@SQ	SN:chr10	LN:129993255
@SQ	SN:chr11	LN:121843856
--
@SQ	SN:chr11	LN:121843856
@SQ	SN:chr12	LN:121257530
………………………………………..
……………………………………….
Ran 35 tests in 55.604s

FAILED (failures=16)
@TomSmithCGAT
Member

Hi. For the python 3.7 test, can you try running export PYTHONHASHSEED=0 before nosetests.
See #440 for a related outstanding issue.
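For context on why the hash seed matters: Python randomizes string hashes per process unless PYTHONHASHSEED is fixed, so anything that depends on hash-based ordering can differ between runs. The sketch below only illustrates that general Python behaviour. The helper name is hypothetical, and how much UMI-tools output actually depends on hashing is an assumption, not something confirmed in this thread.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(text, hash_seed):
    """Return hash(text) as computed by a newly started Python process.

    The seed is handed to the child process through the PYTHONHASHSEED
    environment variable, the same variable exported before nosetests.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # With the seed fixed to 0, two separate processes agree on the hash.
    first = string_hash_in_fresh_interpreter("ACGTACGT", 0)
    second = string_hash_in_fresh_interpreter("ACGTACGT", 0)
    print("same hash with PYTHONHASHSEED=0:", first == second)
```

If the two values agree here but the test still fails, hash randomization is unlikely to be the cause.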

@alexander-e-f-smith
Author

Hi, thanks for the response.
That's how I ran it (see the details in my previous post: I pasted the relevant environment at the top, including PYTHONHASHSEED=0).
Best

@TomSmithCGAT
Member

Very sorry. Yes, you clearly mentioned that...

@TomSmithCGAT
Member

Hmm... can't reproduce with clean conda or pip installation.

How are you installing?

What do you get when you run the command manually?
umi_tools dedup --paired --log=test.log --filter-umi --umi-whitelist=/usr/umitools/UMI-tools-1.1.1/tests/umi_whitelist.tsv --umi-whitelist-paired=/usr/umitools/UMI-tools-1.1.1/tests/umi_whitelist.tsv --out-sam --random-seed=123456789 --stdin=/usr/umitools/UMI-tools-1.1.1/tests/whitelist_umi_input.bam |sort > tmp

@alexander-e-f-smith
Author

alexander-e-f-smith commented Jan 7, 2021 via email

@alexander-e-f-smith
Author

alexander-e-f-smith commented Jan 7, 2021 via email

@IanSudbery
Member

No, I'm not sure attachments work in GitHub issues.

The problem in the nosetests for the Python 3 version appears to be that your umi_tools is outputting a SAM file with a header, whereas the reference does not contain a header.

I wonder if this is a pysam/samtools version issue. Has a new version of pysam/samtools been released since we last did a rebuild?
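One way to check the installed versions is sketched below. It assumes pysam exposes `__version__`, `__samtools_version__` and `__htslib_version__` at package level; any attribute that is missing is reported as "not exposed" rather than guessed. The function name is hypothetical.

```python
import importlib


def report_versions(module_name="pysam"):
    """Collect version strings exposed by a module.

    Returns None when the module cannot be imported. The attribute names
    are an assumption about what pysam exposes, so a missing attribute is
    reported explicitly instead of raising.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    names = ("__version__", "__samtools_version__", "__htslib_version__")
    return {name: getattr(module, name, "not exposed") for name in names}


if __name__ == "__main__":
    print(report_versions())
```

Running this in both the failing environment and a clean conda or pip installation would show whether the two differ in pysam or its bundled samtools.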

@alexander-e-f-smith
Author

alexander-e-f-smith commented Jan 7, 2021 via email

@alexander-e-f-smith
Author

alexander-e-f-smith commented Jan 8, 2021 via email

@IanSudbery
Member

That's very peculiar. The problem is that in your tests the header is being output, and our reference file doesn't have headers. This doesn't seem like it should be an OS-related problem.
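To confirm that the header is the only difference, the output and the reference can be compared with header lines set aside. The sketch below relies on the SAM convention that header lines start with `@`; the function names and the toy records are hypothetical, and in practice the line lists would come from the test's stdout file and tests/whitelist_umi_output.sam.

```python
def split_sam_lines(lines):
    """Separate SAM header lines (starting with '@') from alignment records."""
    header, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        (header if line.startswith("@") else records).append(line)
    return header, records


def compare_ignoring_header(output_lines, reference_lines):
    """Count header lines on each side and compare records regardless of order."""
    out_header, out_records = split_sam_lines(output_lines)
    ref_header, ref_records = split_sam_lines(reference_lines)
    return {
        "output_header_lines": len(out_header),
        "reference_header_lines": len(ref_header),
        "records_match": sorted(out_records) == sorted(ref_records),
    }


if __name__ == "__main__":
    # Toy data: the output carries one @SQ line that the reference lacks.
    output = [
        "@SQ\tSN:chr1\tLN:248956422\n",
        "read1\t99\tchr1\t22053\n",
        "read2\t355\tchr1\t91436\n",
    ]
    reference = ["read2\t355\tchr1\t91436\n", "read1\t99\tchr1\t22053\n"]
    print(compare_ignoring_header(output, reference))
```

If `records_match` is true while the header counts differ, the deduplicated reads themselves agree and only the header handling has changed.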

@TomSmithCGAT
Member

Closing due to inactivity
