Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setting start and end regions position for wig files #5

Open
felipealbrecht opened this issue Jan 28, 2014 · 6 comments
Open

Setting start and end regions position for wig files #5

felipealbrecht opened this issue Jan 28, 2014 · 6 comments

Comments

@felipealbrecht
Copy link
Collaborator

Hello, in the line 85 of the file parser/wig.py:

feature = [params['start'], params['start'] + params['span'], line]
params['start'] += params['span'] + params['step'] - 1

I think that it is wrong. Because it is using the span to set the starting position.
The right code is:

feature = [params['start'], params['start'] + params['span'] - 1, line]
params['start'] += params['step']

The explanation is here (taken from http://genome.ucsc.edu/goldenPath/help/wiggle.html)

fixedStep chrom=chr3 start=400601 step=100 span=5
11
22
33
causes the values 11, 22, and 33 to be displayed as 5-base regions on chromosome 3 at positions 400601-400605, 400701-400705, and 400801-400805, respectively.

(If we use the actual code from parser/wig.py, the positions will be 400601-400606, 400705-400709, 400809-400814, and so on...)

@xapple
Copy link
Owner

xapple commented Jan 28, 2014

Hi !

Yeah that looks like an error indeed. One should add a test case with a WIG file that uses the step and span instructions at the same. I don't think there is one at the moment. One should also add to the WIG parser a line that checks that the step is equal or smaller than the span.

However considering the numbering convention used throughout the package (it's the same as UCSC) I would say the right code is:

feature = [params['start'], params['start'] + params['span'], line]
params['start'] += params['step']

I corrected it in 84ec769! Would you like to add a test case and the sanity check ?

@felipealbrecht
Copy link
Collaborator Author

Hi! Thank you for the very fast feedback.

Unhappily I cant change the code right now, I am working in a project in my work and I do not have time.
If you are not in rush, I can do it latter.

Another question: Which test does use the files in the samples/wig/ ?

@xapple
Copy link
Owner

xapple commented Jan 28, 2014

Yeah I don't have time either. But the parser should be fixed now !

The tests for the wig format are in https://github.com/xapple/track/blob/master/track/test/format_wig.py

@felipealbrecht
Copy link
Collaborator Author

Okay, thank you.
I will do and commit latter.

@felipealbrecht
Copy link
Collaborator Author

Hello,

Using part of the scores4.wig file as an example:

fixedStep chrom=chr1 start=10 span=5
10.0
20.0

In this case, the step value will be 1, the default value.
So, it will trigger to this features:

chr1 10 15
chr1 11 20

That is invalid because the overlapping.

By the UCSC definition:

  fixedStep  chrom=chrN  start=position  step=stepInterval  [span=windowSize]
  dataValue1
  dataValue2
  ... etc ...

all fixedStep data should contain the step value, but in the line 68, the code attributes a default value:

params['step'] = int(params.get('step',1))

I believe that the code should throws an exception in case that the step value is not found.

@xapple
Copy link
Owner

xapple commented Jan 29, 2014

Yeah you're right. I don't know what I was thinking when writing that code. Feel free to change it, I'm giving you commit access.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants