Issue about MotifSeq.py. #48

Open
Goatofmountain opened this issue Apr 9, 2021 · 1 comment

@Goatofmountain

Hi James,
I ran MotifSeq.py in the example directory of SquiggleKit like this:
`python ../MotifSeq.py -p test.fast5 -m CATCTATCCAGGGTTAAATT.model > test_kmer.tsv`
It fails with the error below:
`**********************************************************
* z-score, p-value, probability, etc. are based on *
* preliminary experimental modeling only             *
*            Use at own risk                         *

Traceback (most recent call last):
  File "../MotifSeq.py", line 520, in <module>
    main()
  File "../MotifSeq.py", line 153, in main
    model, m_order, L = read_bait_model(args.model)
  File "../MotifSeq.py", line 420, in read_bait_model
    L = int(l[1])
IndexError: list index out of range`

I've tried other ways to seek the motif, but they all end in similar errors. I guess this may be a problem with how the code assigns these values.

By the way, I read through MotifSeq.py this afternoon and I cannot understand how local DTW works in the motif-finding process (in the function "get_region_multi"). Does this function align the simulated signal of each base in the motif sequence to a segment of the original signal? If so, how can I choose the best alignment location of the motif in the raw signal?

Psy-Fer self-assigned this Apr 12, 2021
Psy-Fer added the "help wanted" label Apr 12, 2021

Psy-Fer (Owner) commented Apr 12, 2021

Hello,

Here are a few suggestions to help.

The -p flag expects a top-level path, not an individual file. So to use the example fast5 file, point it at the ./example/ folder.

The -m flag is not for that kind of model, though I can understand how that could be confusing (my bad).
Instead, use the CATCTATCCAGGGTTAAATT.fa file with the -i flag. This takes the sequence you give it, converts it into a model file, and uses that model with the DTW to find the best hit.
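
To illustrate why a file in an unexpected layout can end in the `IndexError` from the traceback, here is a minimal sketch. It is not the code of `read_bait_model`; it only assumes that `l` is a list produced by splitting a line of the file, and the function names and the sample `"L 20"` line are hypothetical.

```python
def parse_length_unguarded(line):
    # Same shape as the failing statement: take field 1 and convert it.
    # Assumption: the fields come from splitting one line on whitespace.
    fields = line.split()
    return int(fields[1])  # IndexError if the line has fewer than 2 fields


def parse_length_guarded(line):
    # Same parse, but with an explicit check so the failure names the cause.
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(
            f"expected at least 2 fields, got {len(fields)}: {line!r}"
        )
    return int(fields[1])


if __name__ == "__main__":
    print(parse_length_guarded("L 20"))  # hypothetical well-formed line
    try:
        parse_length_unguarded("CATCTATCCAGGGTTAAATT")  # only one field
    except IndexError as err:
        print("IndexError:", err)
```

A line with a single field reproduces the same exception type, which is consistent with the file passed to -m not being in the layout that flag expects.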

The DTW method returns only the best hit for that motif in each read it is run against, along with some metrics if using the med-scaling. I assume from your comment that this is what you are looking for?
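
As a rough picture of how a "best hit" can be located, below is a minimal sketch of generic subsequence (local) DTW in plain Python. It is not the implementation in `get_region_multi`; the function name `subsequence_dtw`, the absolute-difference cost, and the toy values are all assumptions for illustration. The query (a simulated motif signal) may start and end anywhere in the longer signal, and the best location is the end position with the lowest accumulated cost.

```python
def subsequence_dtw(query, signal):
    """Return (cost, start, end) of the best match of query inside signal.

    start and end are inclusive indices into signal.
    """
    n, m = len(query), len(signal)
    inf = float("inf")
    # cost[i][j]: lowest cost of aligning query[0..i] so that it ends at signal[j]
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: signal index where that lowest-cost alignment began
    start = [[0] * m for _ in range(n)]

    # Free start: the first query point may match any signal position.
    for j in range(m):
        cost[0][j] = abs(query[0] - signal[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            # Predecessors: advance the query only, the signal only, or both.
            best_cost, best_start = cost[i - 1][j], start[i - 1][j]
            if j > 0:
                for c, s in (
                    (cost[i][j - 1], start[i][j - 1]),
                    (cost[i - 1][j - 1], start[i - 1][j - 1]),
                ):
                    if c < best_cost:
                        best_cost, best_start = c, s
            cost[i][j] = abs(query[i] - signal[j]) + best_cost
            start[i][j] = best_start

    # Free end: pick the signal position where the full query ends most cheaply.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


if __name__ == "__main__":
    motif = [1.0, 5.0, 2.0]                               # toy "model" signal
    raw = [9.0, 9.0, 1.0, 5.0, 5.0, 2.0, 9.0, 9.0]        # toy raw signal
    print(subsequence_dtw(motif, raw))  # → (0.0, 2, 5)
```

In this toy case the motif is found at signal positions 2 to 5, with the repeated 5.0 absorbed by warping. Real signals would normally be scaled before comparison, which this sketch leaves out.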

Let me know if you have any other questions or can't get it to work with my suggestions above.

James
