Can the input motifs be in the PWM format? #19

Open
kerenzhou062 opened this issue Mar 20, 2022 · 3 comments

@kerenzhou062

kerenzhou062 commented Mar 20, 2022

Hi, can the input motifs be in PWM format, like the output of the HOMER program (motif1.motif)? Please see the example below:

>TGCATG	1-TGCATG,BestGuess:hsa-miR-4262 MIMAT0016894 Homo sapiens miR-4262 Targets (miRBase)(0.647)	5.179177	-34261.033795	0	T:22354.0(48.77%),B:2355.3(5.54%),P:1e-14879
0.001	0.044	0.001	0.954
0.001	0.001	0.997	0.001
0.001	0.997	0.001	0.001
0.997	0.001	0.001	0.001
0.001	0.001	0.001	0.997
0.001	0.001	0.997	0.001

Best,

Keren

@ghuls
Member

ghuls commented Mar 21, 2022

The motifs need to be in Cluster-Buster format.

The following function will create them (put one HOMER motif per file).

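# Convert a single HOMER motif file to Cluster-Buster format.
homer_to_clusterbuster () {
    local homer_motif_file="${1}";
    # Keep only the motif name from the header line; scale each probability row by 100.
    awk -F '\t' -v 'OFS=\t' '{ if ($1 ~ /^>/) { print $1 } else if (NF == 4) { print $1 * 100, $2 * 100, $3 * 100, $4 * 100; } }' "${homer_motif_file}";
}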
$ cat /tmp/motif.homer 
>TGCATG	1-TGCATG,BestGuess:hsa-miR-4262 MIMAT0016894 Homo sapiens miR-4262 Targets (miRBase)(0.647)	5.179177	-34261.033795	0	T:22354.0(48.77%),B:2355.3(5.54%),P:1e-14879
0.001	0.044	0.001	0.954
0.001	0.001	0.997	0.001
0.001	0.997	0.001	0.001
0.997	0.001	0.001	0.001
0.001	0.001	0.001	0.997
0.001	0.001	0.997	0.001

$ homer_to_clusterbuster /tmp/motif.homer 
>TGCATG
0.1	4.4	0.1	95.4
0.1	0.1	99.7	0.1
0.1	99.7	0.1	0.1
99.7	0.1	0.1	0.1
0.1	0.1	0.1	99.7
0.1	0.1	99.7	0.1
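
Since the function above expects one HOMER motif per file, a file that holds several motifs has to be split first. The following is only a sketch of how splitting and conversion might be combined in one pass. The function name `split_homer_to_clusterbuster`, the output directory argument, and the `motifN.cb` file naming are assumptions made for illustration and are not part of the original reply.

```shell
# Hypothetical helper (not from the original reply): split a multi-motif
# HOMER file and write one Cluster-Buster file per motif into an output
# directory. Output names (motif1.cb, motif2.cb, ...) are an assumption.
split_homer_to_clusterbuster () {
    local homer_motifs_file="${1}";
    local out_dir="${2}";

    mkdir -p "${out_dir}";

    awk -F '\t' -v 'OFS=\t' -v out_dir="${out_dir}" '
        /^>/ {
            # New motif header: close the previous output file, start a new one.
            if (out != "") { close(out); }
            n += 1;
            out = sprintf("%s/motif%d.cb", out_dir, n);
            # Keep only the first header field, as in the single-motif function.
            print $1 > out;
            next;
        }
        NF == 4 && out != "" {
            # Scale the four probability columns by 100.
            print $1 * 100, $2 * 100, $3 * 100, $4 * 100 > out;
        }
    ' "${homer_motifs_file}";
}
```

It would be called as `split_homer_to_clusterbuster all_motifs.homer cb_motifs/`, where both paths are placeholders. Whether downstream tools expect a particular file name or extension is not stated in this thread, so adjust the naming as needed.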

@kerenzhou062
Author

Thank you for your explanation!

Best,

Keren

@ghuls
Member

ghuls commented Apr 18, 2023

Our SCENIC+ public motif collection is now available: https://resources.aertslab.org/cistarget/motif_collections/
