[factor] orffinder-to-sequence tool
ChocoParrot committed Sep 13, 2021
1 parent da6ccad commit adf1827
Showing 6 changed files with 813 additions and 1 deletion.
238 changes: 238 additions & 0 deletions src/cline_tools/gene.fasta
@@ -0,0 +1,238 @@
>NC_001643.1 Pan troglodytes mitochondrion, complete genome
GTTTATGTAGCTTACCCCCTCAAAGCAATACACTGAAAATGTTTCGACGGGTTTACATCACCCCATAAAC
AAACAGGTTTGGTCCTAGCCTTTCTATTAGCTCTTAGTAAGATTACACATGCAAGCATCCCCGCCCCGTG
AGTCACCCTCTAAATCGCCATGATCAAAAGGAACAAGTATCAAGCACGCAGCAATGCAGCTCAAAACGCT
TAGCCTAGCCACACCCCCACGGGAGACAGCAGTGATAAACCTTTAGCAATAAACGAAAGTTTAACTAAGC
CATACTAACCTCAGGGTTGGTCAATTTCGTGCTAGCCACCGCGGTCATACGATTAACCCAAGTCAATAGA
AACCGGCGTAAAGAGTGTTTTAGATCACCCCCCCATAAAGCTAAAATTCACCTGAGTTGTAAAAAACTCC
AGCTGATACAAAATAAACTACGAAAGTGGCTTTAACACATCTGAATACACAATAGCTAAGACCCAAACTG
GGATTAGATACCCCACTATGCTTAGCCCTAAACTTCAACAGTTAAATTAACAAAACTGCTCGCCAGAACA
CTACGAGCCACAGCTTAAAACTCAAAGGACCTGGCGGTGCTTCATATCCCTCTAGAGGAGCCTGTTCTGT
AATCGATAAACCCCGATCAACCTCACCGCCTCTTGCTCAGCCTATATACCGCCATCTTCAGCAAACCCTG
ATGAAGGTTACAAAGTAAGCACAAGTACCCACGTAAAGACGTTAGGTCAAGGTGTAGCCTATGAGGTGGC
AAGAAATGGGCTACATTTTCTACCCCAGAAAATTACGATAACCCTTATGAAACCTAAGGGTCAAAGGTGG
ATTTAGCAGTAAACTAAGAGTAGAGTGCTTAGTTGAACAGGGCCCTGAAGCGCGTACACACCGCCCGTCA
CCCTCCTCAAGTATACTTCAAAGGATACTTAACTTAAACCCCCTACGTATTTATATAGAGGAGATAAGTC
GTAACATGGTAAGTGTACTGGAAAGTGCACTTGGACGAACCAGAGTGTAGCTTAACATAAAGCACCCAAC
TTACACTTAGGAGATTTCAACTCAACTTGACCACTCTGAGCCAAACCTAGCCCCAAACCCCCTCCACCCT
ACTACCAAACAACCTTAACCAAACCATTTACCCAAATAAAGTATAGGCGATAGAAATTGTAAACCGGCGC
AATAGACATAGTACCGCAAGGGAAAGATGAAAAATTATACCCAAGCATAATACAGCAAGGACTAACCCCT
GTACCTTTTGCATAATGAATTAACTAGAAATAACTTTGCAAAGAGAACCAAAGCTAAGACCCCCGAAACC
AGACGAGCTACCTAAGAACAGCTAAAAGAGCACACCCGTCTATGTAGCAAAATAGTGGGAAGATTTATAG
GTAGAGGCGACAAACCTACCGAGCCTGGTGATAGCTGGTTGTCCAAGATAGAATCTTAGTTCAACTTTAA
ATTTACCTACAGAACCCTCTAAATCCCCTTGTAAACTTAACTGTTAGTCCAAAGAGGAACAGCTCTTTAG
ACACTAGGAAAAAACCTTGTAAAGAGAGTAAAAAATTTAACACCCATAGTAGGCCTAAAAGCAGCCACCA
ATTAAGAAAGCGTTCAAGCTCAACACCCACAACCTTAAAGATCCCAAACATACAACCGAACTCCTTACAC
CCAATTGGACCAATCTATTACCCCATAGAAGAACTAATGTTAGTATAAGTAACATGAAAACATTCTCCTC
CGCATAAGCCTACATCAGACCAAAATATTAAACTGACAATTAACAGCCTAATATCTACAATCAACCAACA
AGCCATTATTACCCCCGCTGTTAACCCAACACAGGCATGCCCACAAGGAAAGGTTAAAAAAAGTAAAAGG
AACTCGGCAAATCTTACCCCGCCTGTTTACCAAAAACATCACCTCTAGCATTACCAGTATTAGAGGCACC
GCCTGCCCGGTGACATATGTTTAACGGCCGCGGTACCCTAACCGTGCAAAGGTAGCATAATCACTTGTTC
CTTAAATAGGGACTTGTATGAATGGCTCCACGAGGGTTTAGCTGTCTCTTACTTTCAACCAGTGAAATTG
ACCTACCCGTGAAGAGGCGGGCATAACATAACAAGACGAGAAGACCCTATGGAGCTTTAATTCATTAATG
CAAACAATACTTAACAAACCTACAGGTCCTAAACTATTAAACCTGCATTAAAAATTTCGGTTGGGGCGAC
CTCGGAGCACAACCCAACCTCCGAGCAATACATGCTAAGACCTCACCAGTCAAAGCGAATTACTACATCC
AATTGATCCAATGACTTGACCAACGGAACAAGTTACCCTAGGGATAACAGCGCAATCCTATTCCAGAGTC
CATATCAACAATAGGGTTTACGACCTCGATGTTGGATCAGGACATCCCGATGGTGCAGCCGCTATTAAAG
GTTCGTTTGTTCAACGATTAAAGTCCTACGTGATCTGAGTTCAGACCGGAGTAATCCAGGTCGGTTTCTA
TCTGTTCTAAATTTCTCCCTGTACGAAAGGACAAGAGAAATGAGGCCTACTTCACAAAGCGCCTTCCCCA
ATAAATGATATTATCTCAATTTAGCGCCATGCCAACACCCACTCAAGAACAGAGTTTGTTAAGATGGCAG
AGCCCGGTAATTGCATAAAACTTAAAACTTTACAATCAGAGGTTCAATTCCTCTTCTTGACAACACACCC
ATGACCAACCTCCTACTCCTCATTGTACCCATCCTAATCGCAATAGCATTCCTAATGCTAACCGAACGAA
AAATTCTAGGCTACATACAACTACGCAAAGGTCCCAACATTGTAGGTCCTTACGGGCTATTACAGCCCTT
CGCTGACGCCATAAAACTCTTCACTAAAGAACCCTTAAAACCCTCCACTTCAACCATTACCCTCTACATC
ACCGCCCCAACCCTAGCCCTCACCATTGCCCTCTTACTATGAACCCCCCTCCCCATACCCAACCCCCTAG
TCAATCTTAACTTAGGCCTCCTATTTATTCTAGCCACCTCCAGCCTAGCCGTTTACTCAATCCTCTGATC
AGGGTGAGCATCAAACTCGAACTACGCCTTAATCGGTGCACTACGAGCAGTAGCCCAAACAATCTCATAC
GAAGTCACTCTAGCCATTATCCTACTGTCAACGCTACTAATAAGTGGCTCCTTCAATCTCTCTACCCTTG
TCACAACACAAGAGCACCTCTGACTAATCCTGCCAACATGACCCCTGGCCATAATATGATTTATCTCTAC
ACTAGCAGAGACCAACCGAACTCCCTTCGACCTTACTGAAGGAGAATCTGAACTAGTCTCAGGCTTTAAT
ATCGAGTATGCCGCAGGCCCCTTTGCCCTATTTTTCATAGCCGAATACATAAACATTATTATAATAAACA
CCCTCACTGCTACAATCTTCCTAGGAGCAACATACAATACTCACTCCCCTGAACTCTACACGACATATTT
TGTCACCAAAGCTCTACTTCTAACCTCCCTGTTCCTATGAATTCGAACAGCATATCCCCGATTTCGCTAC
GACCAGCTCATACACCTCCTATGAAAAAACTTCCTACCACTCACCCTAGCATCACTCATGTGATATATCT
CCATACCCACTACAATCTCCAGCATCCCCCCTCAAACCTAAGAAATATGTCTGATAAAAGAATTACTTTG
ATAGAGTAAATAATAGGAGTTCAAATCCCCTTATTTCTAGGACTATAAGAATCGAACTCATCCCTGAGAA
TCCAAAATTCTCCGTGCCACCTATCACACCCCATCCTAAAGTAAGGTCAGCTAAATAAGCTATCGGGCCC
ATACCCCGAAAATGTTGGTTACACCCTTCCCGTACTAATTAATCCCCTAGCCCAACCCATCATCTACTCT
ACCATCCTTACAGGCACGCTCATTACAGCGCTAAGCTCACACTGATTTTTCACCTGAGTAGGCCTAGAAA
TAAATATACTAGCTTTTATCCCAATCCTAACCAAAAAAATAAGCCCCCGCTCCACAGAAGCCGCCATCAA
ATACTTTCTCACACAAGCAACTGCGTCCATAATTCTCCTGATAGCTATCCTCTCCAACAGCATACTCTCC
GGACAATGAACCATAACCAATACTACCAATCAATACTCATCATTAATAATTATAATAGCAATGGCAATAA
AACTAGGAATAGCCCCCTTTCACTTTTGAGTTCCAGAAGTTACCCAAGGCACCCCCCTAATATCCGGCCT
ACTCCTCCTCACATGACAAAAATTAGCCCCTATTTCAATTATATACCAAATCTCCTCATCACTGAACGTA
AACCTTCTCCTCACCCTTTCAATCTTGTCCATTATAGCAGGCAGCTGAGGCGGACTAAACCAAACCCAAC
TACGCAAAATCCTAGCATACTCCTCAATCACCCACATAGGCTGAATAATAGCAGTCCTACCATATAACCC
TAACATAACCATTCTTAATTTAACCATTTACATCATCCTAACTACTACCGCATTTCTGCTACTCAACTTA
AACTCCAGCACCACAACCCTACTACTATCTCGCACCTGAAACAAGCTAACATGATTAACTCCCCTAATTC
CATCCACCCTCCTCTCCCTAGGAGGCCTACCCCCACTAACTGGCTTCTTACCCAAATGAGTTATCATCGA
AGAATTCACAAAAAATAATAGCCTCATCATCCCCACCATCATAGCCATCATCACTCTCCTTAACCTCTAT
TTCTACCTACGCCTAATCTACTCCACCTCAATTACACTACTTCCCATATCTAATAACGTAAAAATAAAAT
GACAATTCGAACATACAAAACCCACCCCCTTCCTCCCTACACTCATCACCCTTACCACACTGCTTCTACC
CATCTCCCCCTTCATACTAATAATCTTATAGAAATTTAGGTTAAGCACAGACCAAGAGCCTTCAAAGCCC
TCAGCAAGTTACAATACTTAATTTCTGCAACAACTAAGGACTGCAAAACCCCACTCTGCATCAACTGAAC
GCAAATCAGCCACTTTAATTAAGCTAAGCCCTTACTAGATTAATGGGACTTAAACCCACAAACATTTAGT
TAACAGCTAAACACCCTAATCAACTGGCTTCAATCTACTTCTCCCGCCGCAAGAAAAAAAGGCGGGAGAA
GCCCCGGCAGGTTTGAAGCTGCTTCTTCGAATTTGCAATTCAATATGAAAATCACCTCAGAGCTGGTAAA
AAGAGGCTTAACCCCTGTCTTTAGATTTACAGTCCAATGCTTCACTCAGCCATTTTACCCCACCCTACTG
ATGTTCACCGACCGCTGACTATTCTCTACAAACCACAAAGATATTGGAACACTATACCTACTATTCGGTG
CATGAGCTGGAGTCCTGGGCACAGCCCTAAGTCTCCTTATTCGGGCTGAACTAGGCCAACCAGGCAACCT
CCTAGGTAATGACCACATCTACAATGTCATCGTCACAGCCCATGCATTCGTAATAATCTTCTTCATAGTA
ATGCCTATTATAATCGGAGGCTTTGGCAACTGGCTAGTTCCCTTGATAATTGGTGCCCCCGACATGGCAT
TCCCCCGCATAAACAACATAAGCTTCTGGCTCCTGCCCCCTTCTCTCCTACTTCTACTTGCATCTGCCAT
AGTAGAAGCCGGCGCGGGAACAGGTTGAACAGTCTACCCTCCCTTAGCGGGAAACTACTCGCATCCTGGA
GCCTCCGTAGACCTAACCATCTTCTCCTTACATCTGGCAGGCATCTCCTCTATCCTAGGAGCCATTAACT
TCATCACAACAATTATTAATATAAAACCTCCTGCCATGACCCAATACCAAACACCCCTCTTCGTCTGATC
CGTCCTAATCACAGCAGTCTTACTTCTCCTATCCCTCCCAGTCCTAGCTGCTGGCATCACCATACTATTG
ACAGATCGTAACCTCAACACTACCTTCTTCGACCCAGCCGGGGGAGGAGACCCTATTCTATATCAACACT
TATTCTGATTTTTTGGCCACCCCGAAGTTTATATTCTTATCCTACCAGGCTTCGGAATAATTTCCCACAT
TGTAACTTATTACTCCGGAAAAAAAGAACCATTTGGATATATAGGCATGGTTTGAGCTATAATATCAATT
GGCTTCCTAGGGTTTATCGTGTGAGCACACCATATATTTACAGTAGGGATAGACGTAGACACCCGAGCCT
ATTTCACCTCCGCTACCATAATCATTGCTATTCCTACCGGCGTCAAAGTATTCAGCTGACTCGCTACACT
TCACGGAAGCAATATGAAATGATCTGCCGCAGTACTCTGAGCCCTAGGGTTTATCTTTCTCTTCACCGTA
GGTGGCCTAACCGGCATTGTACTAGCAAACTCATCATTAGACATCGTGCTACACGACACATACTACGTCG
TAGCCCACTTCCACTACGTTCTATCAATAGGAGCTGTATTCGCCATCATAGGAGGCTTCATTCACTGATT
CCCCCTATTCTCAGGCTATACCCTAGACCAAACCTATGCCAAAATCCAATTTGCCATCATGTTCATTGGC
GTAAACCTAACCTTCTTCCCACAGCACTTCCTTGGCCTATCTGGGATGCCCCGACGTTACTCGGACTACC
CCGATGCATACACCACATGAAATGTCCTATCATCCGTAGGCTCATTTATCTCCCTGACAGCAGTAATATT
AATAATTTTCATGATTTGAGAAGCCTTTGCTTCAAAACGAAAAGTCCTAATAGTAGAAGAGCCCTCCGCA
AACCTGGAATGACTATATGGATGCCCCCCACCCTACCACACATTCGAAGAACCCGTATACATAAAATCTA
GACAAAAAAGGAAGGAATCGAACCCCCTAAAGCTGGTTTCAAGCCAACCCCATGACCTCCATGACTTTTT
CAAAAAGATATTAGAAAAACTATTTCATAACTTTGTCAAAGTTAAATTACAGGTTAACCCCCGTATATCT
TAATGGCACATGCAGCGCAAGTAGGTCTACAAGATGCTACTTCCCCTATCATAGAAGAACTTATTATCTT
TCACGACCATGCCCTCATAATTATCTTTCTCATCTGCTTTCTAGTCCTATACGCCCTTTTCCTAACACTC
ACAACAAAACTAACTAATACTAGTATTTCAGACGCCCAGGAAATAGAAACCGTCTGAACTATCCTGCCCG
CCATCATCCTAGTCCTTATTGCCCTACCATCCCTGCGTATCCTTTACATAACAGACGAGGTCAACGACCC
CTCCTTTACTATTAAATCAATCGGCCATCAATGATATTGAACCTACGAATACACCGACTACGGCGGGCTA
ATCTTCAACTCCTACATACTCCCCCCATTATTTCTAGAACCAGGTGATCTACGACTCCTTGACGTTGATA
ACCGAGTGGTCCTCCCAGTTGAAGCCCCCGTTCGTATAATAATTACATCACAAGATGTTCTACACTCATG
AGCTGTTCCCACATTAGGCCTAAAAACAGACGCAATTCCCGGACGCCTAAACCAAACCACTTTCACCGCC
ACACGACCAGGAGTATACTACGGCCAATGCTCAGAAATCTGTGGAGCAAACCACAGTTTTATACCCATCG
TCCTAGAATTAATCCCTCTAAAAATCTTTGAAATAGGACCCGTATTCACTCTATAGCACCTTCTCTACCC
CTCTCCAGAGCTCACTGTAAAGCTAACCTAGCATTAACCTTTTAAGTTAAAGATTAAGAGGACCGACACC
TCTTTACAGTGAAATGCCCCAACTAAATACCGCCGTATGACCCACCATAATTACCCCCATACTCCTGACA
CTATTTCTCGTCACCCAACTAAAAATATTAAATTCAAATTACCATCTACCCCCCTCACCAAAACCCATAA
AAATAAAAAACTACAATAAACCCTGAGAACCAAAATGAACGAAAATCTATTCGCTTCATTCGCTGCCCCC
ACAATCCTAGGCTTACCCGCCGCAGTACTAATCATTCTATTCCCCCCTCTACTGGTCCCCACTTCTAAAC
ATCTCATCAACAACCGACTAATTACCACCCAACAATGACTAATTCAACTGACCTCAAAACAAATAATAAC
TATACACAGCACTAAAGGACGAACCTGATCTCTCATACTAGTATCCTTAATCATTTTTATTACCACAACC
AATCTTCTTGGGCTTCTACCCCACTCATTCACACCAACCACCCAACTATCTATAAACCTAGCCATGGCTA
TCCCCCTATGAGCAGGCGCAGTAGTCATAGGCTTTCGCTTTAAGACTAAAAATGCCCTAGCCCACTTCTT
ACCGCAAGGCACACCTACACCCCTTATCCCCATACTAGTTATCATCGAAACTATTAGCCTACTCATTCAA
CCAATAGCCTTAGCCGTACGTCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATTGGAA
GCGCCACACTAGCATTATCAACTATCAATCTACCCTATGCACTCATTATCTTCACAATTCTAATCCTACT
GACTATTCTAGAGATCGCCGTCGCCTTAATCCAAGCCTACGTTTTTACACTTCTAGTGAGCCTCTACCTG
CACGACAACACATAATGACCCACCAATCACATGCCTACCACATAGTAAAACCCAGCCCATGACCCCTAAC
AGGGGCCCTCTCGGCCCTCCTAATAACCTCCGGCCTGGCCATATGATTCCACTTCTACTCCACAACACTA
CTCACACTAGGCTTACTAACTAACACATTGACCATATATCAATGATGACGCGATGTTATACGAGAAGGCA
CATACCAAGGCCACCACACACCACCCGTCCAAAAAGGTCTCCGATATGGGATAATTCTTTTTATTACCTC
AGAAGTTTTTTTCTTTGCAGGATTTTTTTGAGCTTTCTACCACTCCAGCCTAGCCCCTACCCCCCAGCTA
GGAGGACACTGGCCCCCAACAGGTATTACCCCACTAAATCCCCTAGAAGTCCCACTCCTAAACACATCTG
TATTACTCGCATCAGGAGTATCAATTACTTGAGCCCATCACAGCTTAATAGAAAATAACCGAAACCAAAT
AATTCAAGCACTGCTTATTACGATTCTACTAGGTCTTTATTTTACCCTCCTACAAGCCTCAGAATATTTC
GAATCCCCTTTTACCATTTCCGATGGCATCTACGGCTCAACATTCTTTGTAGCCACAGGCTTCCACGGAC
TCCACGTCATTATTGGATCAACTTTCCTCACTATCTGCCTCATCCGCCAACTAATATTTCACTTCACATC
CAAACATCACTTCGGCTTTCAAGCCGCCGCCTGATACTGACACTTCGTAGATGTAGTCTGACTATTTCTA
TATGTCTCTATTTACTGATGAGGATCTTACTCTTTTAGTATAAGTAGTACCGTTAACTTCCAATTAACTA
GTTTTGACAACATTCAAAAAAGAGTAATAAACTTCGTCCTAATTTTAATAACCAATACCCTTCTAGCCCT
ACTACTGATAATTATCACATTCTGACTACCACAACTCAACAGCTACATAGAAAAATCTACCCCTTACGAA
TGTGGCTTCGACCCTATATCCCCCGCCCGCGTCCCCTTCTCCATAAAATTTTTCCTAGTAGCCATCACCT
TCCTATTATTTGACCTAGAAATTGCCCTCCTATTGCCCTTACCTTGAGCCCTACAAACGGCCAACCTACC
ACTAATAGTCACATCATCCCTCTTATTAATTACTATCCTAGCCCTAAGCCTCGCCTACGAATGATTACAA
AAAGGGTTAGACTGAACCGAATTGGTATATAGTTTAAATAAAACGAATGATTTCGACTCATTAAATTATG
ATAATCATATTTACCAAATGCCCCTTATTTATATAAATATTATACTAGCATTTACCATCTCACTTCTAGG
AATACTAGTATATCGCTCACACCTAATATCTTCCCTACTATGCCTAGAAGGAATAATACTATCACTGTTC
ATCATAGCCACCCTCATAACCCTCAATACTCACTCCCTCTTAGCCAATATTGTACCCATCACCATACTAG
TCTTTGCTGCCTGCGAAGCAGCAGTAGGTCTAGCACTACTAGTTTCAATCTCTAACACATATGGCTTAGA
CTACGTACATAACCTAAACCTACTCCAATGCTAAAACTAATCATCCCGACAATTATATTACTACCACTAA
CATGATTCTCTAAAAAACGTATAATTTGAATCAACACAACCACTCACAGCCTAATTATCAGCACCATTCC
CTTACTATTTTTTAACCAAATTAACAACAACCTATTCAGCTGTTCCCTGCCCTTCTCCTCCGACCCCTTA
ACAACTCCCCTCCTAATATTAACTGCTTGACTTCTACCCCTCACAATCATAGCAAGCCAGCGCCACCTAT
CCAACGAACCACTATCACGAAAAAAACTCTACCTCTCCATGCTAATTTCCCTCCAAATCTCCTTAATTAT
AACATTCTCGGCCACAGAGCTAATTATATTTTATATCTTCTTCGAAACCACACTTATCCCCACCCTGGCT
ATCATCACCCGATGGGGTAACCAACCAGAACGCCTGAACGCAGGTACATACTTCCTATTCTATACCCTAG
TAGGCTCCCTCCCCCTACTCATCGCACTAATCTATACCCACAACACCCTAGGCTCACTAAATATCCTATT
ACTCACTCTTACAACCCAAGAACTATCAAACACCTGAGCCAACAACTTAATATGACTAGCGTACACGATG
GCTTTCATGGTAAAAATACCCCTTTACGGACTCCACCTATGACTCCCTAAAGCCCATGTCGAAGCCCCTA
TTGCCGGGTCAATGGTACTTGCTGCAGTACTCTTAAAATTAGGTGGCTATGGCATAATACGCCTCACACT
CATCCTCAACCCCCTAACAAAACATATAGCCTATCCCTTCCTCATGTTGTCCTTATGAGGTATAATCATA
ACAAGCTCCATCTGCCTGCGACAAACAGACCTAAAATCGCTCATTGCATACCCTTCAGTCAGCCACATAG
CCCTCGTAGTAACAGCCATTCTCATCCAAACCCCCTGAAGCTTCACCGGCGCAATTATCCTCATAATCGC
CCACGGACTTACATCCTCATTATTATCCTGCCTAGCAAACTCAAATTATGAACGCACCCACAGTCGCATC
ATAATTCTCTCCCAAGGACTTCAAACTCTACTCCCACTAATAGCCTTTTGATGACTCCTGGCAAGCCTCG
CTAACCTCGCCCTACCCCCTACCATTAATCTCCTAGGGGAACTCTCCGTGCTAGTAACCTCATTCTCCTG
ATCAAATACCACTCTCCTACTCACAGGATTCAACATACTAATCACAGCCCTGTACTCCCTCTACATGTTT
ACCACAACACAATGAGGCTCACTCACCCACCACATTAATAGCATAAAGCCCTCATTCACACGAGAAAACA
CTCTCATATTTTTACACCTATCCCCCATCCTCCTTCTATCCCTCAATCCTGATATCATCACTGGATTCAC
CTCCTGTAAATATAGTTTAACCAAAACATCAGATTGTGAATCTGACAACAGAGGCTCACGACCCCTTATT
TACCGAGAAAGCTTATAAGAACTGCTAACTCGTATTCCCATGCCTAACAACATGGCTTTCTCAACTTTTA
AAGGATAACAGTTATCCATTGGTCTTAGGCCCCAAAAATTTTGGTGCAACTCCAAATAAAAGTAATAACC
ATGTATGCTACCATAACCACCTTAGCCCTAACTTCCTTAATTCCCCCCATCCTCGGCGCCCTCATTAACC
CTAACAAAAAAAACTCATACCCCCATTACGTGAAATCCATTATCGCATCCACCTTTATCATTAGCCTTTT
CCCCACAACAATATTCATATGCCTAGACCAAGAAACTATTATCTCGAACTGACACTGAGCAACAACCCAA
ACAACCCAACTCTCCCTGAGCTTTAAACTAGACTATTTCTCCATAACATTTATCCCCGTAGCACTGTTCG
TTACATGATCCATCATAGAATTCTCACTATGATATATAGACTCAGACCCCAACATCAACCAATTCTTCAA
ATACTTACTTATCTTCCTAATTACTATACTAATCCTAGTCACCGCTAACAACCTATTCCAACTCTTCATC
GGCTGAGAAGGCGTAGGAATTATATCCTTTCTACTCATTAGCTGATGGTACGCCCGAACAGATGCCAACA
CAGCAGCCATCCAAGCAATCCTATATAACCGTATCGGTGATATTGGTTTTGTCCTAGCCCTAGCATGATT
TCTCCTACACTCCAACTCATGAGATCCACAACAAATAATCCTCCTAAGTACTAATACAGACCTTACTCCA
CTACTAGGCTTCCTCCTAGCAGCAGCAGGCAAATCAGCTCAACTAGGCCTTCACCCCTGACTCCCCTCAG
CCATAGAAGGCCCTACCCCTGTTTCAGCCCTACTCCACTCAAGCACCATAGTCGTAGCAGGAATCTTCCT
ACTCATCCGCTTCTACCCCCTAGCAGAGAATAACCCACTAATCCAAACTCTCACGCTATGCCTAGGCGCT
ATCACCACCCTATTCGCAGCAGTCTGCGCCCTCACACAAAATGACATCAAAAAAATCGTGGCCTTCTCCA
CTTCAAGCCAACTAGGACTCATAATAGTTACAATCGGTATCAACCAACCACACCTAGCATTCCTTCACAT
CTGCACCCACGCTTTCTTCAAAGCCATACTATTCATATGCTCCGGATCCATTATTCACAACCTCAATAAT
GAGCAAGACATTCGAAAAATAGGAGGATTACTCAAAACCATACCCCTCACTTCAACCTCCCTCACCATTG
GGAGCCTAGCATTAGCAGGAATACCCTTCCTCACAGGTTTCTACTCCAAAGACCTCATCATCGAAACCGC
TAACATATCATACACAAACGCCTGAGCCCTATCTATTACTCTCATCGCCACCTCTCTGACAAGCGCCTAC
AGCACCCGAATAATCCTCCTCACCCTAACAGGTCAACCTCGCTTCCCAACCCTCACCAACATTAACGAAA
ACAACCCCACTCTGTTAAATCCCATTAAACGCCTAACCATTGGAAGCTTATTTGCAGGATTTCTCATTAC
CAACAACATTCTCCCCATATCTACTCCCCAAGTGACAATTCCCCTTTACTTAAAACTTACAGCCCTAGGC
GTTACTTCCCTAGGACTTCTAACAGCCCTAGACCTCAATTACCTAACCAGCAAGCTCAAAATAAAATCCC
CACTATATACATTTCACTTCTCTAATATACTCGGATTCTACCCTAACATTATACACCGCTCGATCCCCTA
TCTAGGCCTTCTTACAAGCCAAAACCTACCCCTACTTCTTCTAGACCTGACCTGACTAGAGAAACTATTA
CCTAAAACAATTTCACAGTACCAAATCTCCGCTTCCATTACCACCTCAACCCAAAAAGGCATGATCAAAC
TTTATTTCCTCTCTTTTTTCTTCCCTCTCATCTTAACCTTACTCCTAATCACATAACCTATTCCCCCGAG
CAATCTCAATCACAATGTATACACCAACAAACAATGTCCAACCAGTAACTACTACTAACCAACGCCCATA
ATCATATAAGGCCCCCGCACCAATAGGATCCTCCCGAATCAGCCCTGGCCCCTCCCCTTCATAAATTATT
CAACTTCCCACGCTATTAAAATTTACCACAACCACCATCCCATCATACCCTTTTACCCATAACACTAATC
CTACCTCCATCGCCAGTCCTACTAAAACACTAACCAAAACCTCAACCCCTGACCCCCATGCCTCAGGATA
CTCCTCAATAGCCATAGCCGTAGTATACCCAAAAACAACCATTATTCCCCCCAAATAAATTAAAAAAACC
ATTAAACCTATATAACCTCCCCCATAATTCAAAATGATGGCACACCCAACTACACCACTAACAATCAATA
CTAAACCCCCATAAATGGGAGAAGGCTTAGAAGAAAACCCCACAAACCCTATCACTAAACTCACACTCAA
TAAAAATAAAGCATATGTCATTATTCTCGCACGGACTACAACCACGACCAATGATATGAAAAACCATCGT
TGTATTTCAACTACAAGAACACCAATGACCCCGACACGCAAAATTAACCCACTAATAAAATTAATTAATC
ACTCATTTATCGACCTCCCCACCCCATCCAACATTTCCGCATGATGGAACTTCGGCTCACTTCTCGGCGC
CTGCCTAATCCTTCAAATTACCACAGGATTATTCCTAGCTATACACTACTCACCAGACGCCTCAACCGCC
TTCTCGTCGATCGCCCACATCACCCGAGACGTAAACTATGGTTGGATCATCCGCTACCTCCACGCTAACG
GCGCCTCAATATTTTTTATCTGCCTCTTCCTACACATCGGCCGAGGTCTATATTACGGCTCATTTCTCTA
CCTAGAAACCTGAAACATTGGCATTATCCTCTTGCTCACAACCATAGCAACAGCCTTTATGGGCTATGTC
CTCCCATGAGGCCAAATATCCTTCTGAGGAGCCACAGTAATTACAAACCTACTGTCCGCTATCCCATACA
TCGGAACAGACCTGGTCCAGTGAGTCTGAGGAGGCTACTCAGTAGACAGCCCTACCCTTACACGATTCTT
CACCTTCCACTTTATCTTACCCTTCATCATCACAGCCCTAACAACACTTCATCTCCTATTCTTACACGAA
ACAGGATCAAATAACCCCCTAGGAATCACCTCCCACTCCGACAAAATTACCTTCCACCCCTACTACACAA
TCAAAGATATCCTTGGCTTATTCCTTTTCCTCCTTATCCTAATGACATTAACACTATTCTCACCAGGCCT
CCTAGGCGATCCAGACAACTATACCCTAGCTAACCCCCTAAACACCCCACCCCACATTAAACCCGAGTGA
TACTTTCTATTTGCCTACACAATCCTCCGATCCATCCCCAACAAACTAGGAGGCGTCCTCGCCCTACTAC
TATCTATCCTAATCCTAACAGCAATCCCTGTCCTCCACACATCCAAACAACAAAGCATAATATTTCGCCC
ACTAAGCCAACTGCTTTACTGACTCCTAGCCACAGACCTCCTCATCCTAACCTGAATCGGAGGACAACCA
GTAAGCTACCCCTTCATCACCATCGGACAAATAGCATCCGTATTATACTTCACAACAATCCTAATCCTAA
TACCAATCGCCTCTCTAATCGAAAACAAAATACTTGAATGAACCTGCCCTTGTAGTATAAACTAATACAC
CGGTCTTGTAAACCGGAAACGAAAACTTTCTTCCAAGGACAAATCAGAGAAAAAGTAATTAACTTCACCA
TCAGCACCCAAAGCTAAGATTCTAATTTAAACTATTCTCTGTTCTTTCATGGGGAAGCAAATTTAGGTAC
CACCTAAGTACTGGCTCATTCATTACAACCGCTATGTATTTCGTACATTACTGCCAGCCACCATGAATAT
CGTACAGTACCATATCACCCAACTACCTATAGTACATAAAATCCACTCCCACATCAAAACCTTCACTCCA
TGCTTACAAGCACGCACAACAATCAACTCCCAACTGTCGAACATAAAACACAATTCCAACGACACCCCTC
CCCCACCCCGATACCAACAGACCTATCTCCCCTTGACAGAACATAGTACATACAACCATACACCGTACAT
AGCACATTACAGTCAAACCCCTCCTCGCCCCCACGGATGCTCCCCCTCAGATAGGAATCCCTTGGTCACC
ATCCTCCGTGAAATCAATATCCCGCACAAGAGTGACTCTCCTCGCTCCGGGCCCATAACATCTGGGGGTA
GCTAAAGTGAACTGTATCCGACATCTGGTTCCTACCTCAGGGCCATGAAGTTCAAAAGACTCCCACACGT
TCCCCTTAAATAAGACATCACGATGGATCACAGGTCTATCACCCTATTAACCAGTCACGGGAGCCTTCCA
TGCATTTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAAACGCTGGCCCCGGAGCACCCT
ATGTCGCAGTATCTGTCTTTGATTCCTGCCCCATTGTATTATTTATCGCACCTACGTTCAATATTACGAC
CTAGCATACCTACTAAAGTGTGTTGATTAATTAATGCTTGCAGGACATAACAACAGCAGCAAAATGCTCA
CATAACTGCTTTCCACACCAACATCATAACAAAAAATTCCCACAAACCCCCCCTTCCCCCCGGCCACAGC
ACTCAAACAAATCTCTGCCAAACCCCAAAAACAAAGAACCCAGACGCCAGCCTAGCCAGACTTCAAATTT
CATCTTTAGGCGGTATGCACTTTTAACAGTCACCCCTCAATTAACATGCCCTCCCCCCTCAACTCCCATT
CTACTAGCCCCAGCAACGTAACCCCCTACTCACCCTACTCAACACATATACCGCTGCTAACCCCATACCC
TGAACCAACCAAACCCCAAAGACACCCCTACACA
35 changes: 35 additions & 0 deletions src/cline_tools/help_pages/orffinder-to-gtf.txt
@@ -0,0 +1,35 @@
USAGE
orffinder-to-gtf [-in input] [-infmt format] [-out output] [-outfmt format] [-orf_size int]
[-remove_nested boolean] [-trim_trailing boolean] [-max_orfs_per_sequence int]
[-attr_name string]

DESCRIPTION
ORFFinder Python v1.5

PARAMETERS
[-h]
Shows this interface.

[-in (string)]:
Input nucleotide sequence to extract ORFs from.

[-infmt (string)]:
Can be "fasta", "genbank", or any other Biopython supported format. Default: "fasta"

[-out (string)]
Optional output file. If not specified, will output to stdout.

[-orf_size (integer)]
Minimum size (in nucleotides) of ORF. Default: 75

[-remove_nested (boolean)]
Remove ORFs that are completely nested in another ORF. Default: False

[-trim_trailing (boolean)]
Remove ORFs that have a start codon but no stop codon at the edges of the sequence. Default: False

[-max_orfs_per_sequence (integer)]
Maximum number of ORFs to return per sequence, sorted by length. Default: -1 (no limit)

[-attr_name (string)]
Attribute ID name in GTF file. Suffixed by ORF index number. Default: "ORF_"
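
For orientation, a GTF record is one tab-separated line with nine fields, the last of which holds the attributes. The sketch below shows how an identifier built from `-attr_name` plus the ORF index might end up in such a line. The helper name `format_gtf_line`, the `source` and `feature` labels, and the `gene_id` attribute key are illustrative assumptions; the tool's actual output layout is not shown in this commit.

```python
def format_gtf_line(seqname, start, end, strand, index, attr_name="ORF_"):
    """Build one GTF-style line for an ORF (hypothetical helper, not the tool's code)."""
    # The attribute ID is the -attr_name prefix followed by the ORF index.
    orf_id = f"{attr_name}{index}"
    fields = [
        seqname,      # sequence the ORF was found on
        "ORFFinder",  # source column (assumed label)
        "ORF",        # feature column (assumed label)
        str(start),   # start coordinate
        str(end),     # end coordinate
        ".",          # score: not applicable
        strand,       # "+" or "-"
        ".",          # frame: left unset in this sketch
        f'gene_id "{orf_id}";',
    ]
    return "\t".join(fields)


line = format_gtf_line("NC_001643.1", 100, 400, "+", 1)
print(line)
```

Changing `attr_name` only changes the prefix of the identifier in the last column; the index suffix stays the same.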
41 changes: 41 additions & 0 deletions src/cline_tools/help_pages/orffinder-to-sequence.txt
@@ -0,0 +1,41 @@
USAGE
orffinder-to-sequence [-in input] [-infmt format] [-out output] [-outfmt format] [-orf_size int]
[-remove_nested boolean] [-trim_trailing boolean] [-max_orfs_per_sequence int]
[-attr_name string] [-outtype protein/nucleotide]

DESCRIPTION
ORFFinder Python v1.5

PARAMETERS
[-h]
Shows this interface.

[-in (string)]:
Input nucleotide sequence to extract ORFs from.

[-infmt (string)]:
Can be "fasta", "genbank", or any other Biopython supported format. Default: "fasta"

[-out (string)]
Optional output file. If not specified, will output to stdout.

[-outfmt (string)]:
Can be "fasta" or "fasta-2line". Default: "fasta"

[-orf_size (integer)]
Minimum size (in nucleotides) of ORF. Default: 75

[-remove_nested (boolean)]
Remove ORFs that are completely nested in another ORF. Default: False

[-trim_trailing (boolean)]
Remove ORFs that have a start codon but no stop codon at the edges of the sequence. Default: False

[-max_orfs_per_sequence (integer)]
Maximum number of ORFs to return per sequence, sorted by length. Default: -1 (no limit)

[-attr_name (string)]
Attribute ID name in GTF file. Suffixed by ORF index number. Default: "ORF_"

[-outtype (string)]
Can be "protein" or "nucleotide". Default: "nucleotide"
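
The filtering options above (`-orf_size`, `-remove_nested`, `-max_orfs_per_sequence`) can be pictured as a post-processing pass over the ORFs that were found. The following is a minimal sketch under stated assumptions: ORFs are plain `(start, end)` tuples with half-open coordinates, strand is ignored, and `filter_orfs` is a hypothetical helper rather than a function of the `orffinder` package.

```python
def filter_orfs(orfs, orf_size=75, remove_nested=False, max_orfs_per_sequence=-1):
    """Apply size, nesting and count filters to (start, end) ORF tuples.

    Hypothetical helper for illustration; coordinates are half-open [start, end).
    """
    # Keep only ORFs at least orf_size nucleotides long.
    kept = [orf for orf in orfs if orf[1] - orf[0] >= orf_size]
    # Longest first, since the help text says results are sorted by length.
    kept.sort(key=lambda orf: orf[1] - orf[0], reverse=True)
    if remove_nested:
        outer = []
        for start, end in kept:
            # Drop an ORF that lies entirely inside an already kept, longer one.
            if not any(s <= start and end <= e for s, e in outer):
                outer.append((start, end))
        kept = outer
    # A negative limit means "no limit", matching the documented default of -1.
    if max_orfs_per_sequence >= 0:
        kept = kept[:max_orfs_per_sequence]
    return kept


orfs = [(0, 300), (30, 120), (250, 400), (10, 40)]
result = filter_orfs(orfs, remove_nested=True, max_orfs_per_sequence=2)
print(result)
```

Here `(10, 40)` falls below the 75-nucleotide minimum and `(30, 120)` is nested inside `(0, 300)`, so only the two remaining ORFs survive. `(250, 400)` overlaps `(0, 300)` but is not fully contained in it, so it is kept.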
8 changes: 7 additions & 1 deletion src/cline_tools/orffinder-to-gtf.py
@@ -1,8 +1,9 @@
 import sys
+import os
 from Bio import SeqIO
 from orffinder import orffinder

-arguments = sys.argv
+arguments = sys.argv + [""]
 classed_arguments = {"orf_size": "75", "max_orfs_per_sequence": "-1", "remove_nested": "False", "trim_trailing": "False", "infmt": "fasta", "attr_name": "ORF_"}

 try:
@@ -14,6 +15,11 @@

 classed_arguments[argument[1:]] = arguments[i + 1]

+if "h" in classed_arguments.keys():
+    help_output = open("help_pages/orffinder-to-gtf.txt", "r").read()
+    print(help_output)
+    os._exit(1)
+
 sequences = SeqIO.parse(classed_arguments["in"], classed_arguments["infmt"])

 orf_size = int(classed_arguments["orf_size"])
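
The change above pads `sys.argv` with an empty string so that a trailing flag such as `-h` still has a "next" element to pair with. The parsing loop itself is collapsed in this diff, so the following is only a sketch of how such flag/value parsing could work, assuming every flag is followed by exactly one value. The names `parse_arguments` and `help_path` are hypothetical. The sketch also resolves the help page relative to the script file, which is one possible way to avoid depending on the current working directory.

```python
from pathlib import Path

DEFAULTS = {"orf_size": "75", "max_orfs_per_sequence": "-1", "remove_nested": "False",
            "trim_trailing": "False", "infmt": "fasta", "attr_name": "ORF_"}


def parse_arguments(argv):
    """Collect '-name value' pairs into a dict, starting from the defaults."""
    # The padding entry gives a trailing flag such as -h a value to pair with.
    arguments = list(argv) + [""]
    parsed = dict(DEFAULTS)
    i = 1  # skip the script name
    while i < len(arguments) - 1:
        if arguments[i].startswith("-"):
            parsed[arguments[i][1:]] = arguments[i + 1]
            i += 2  # skip the value so that "-1" is not mistaken for a flag
        else:
            i += 1
    return parsed


def help_path(script_file, tool_name):
    """Locate the help page next to the script instead of the working directory."""
    return Path(script_file).resolve().parent / "help_pages" / f"{tool_name}.txt"


args = parse_arguments(["orffinder-to-gtf.py", "-in", "gene.fasta", "-h"])
print("h" in args, args["in"], args["orf_size"])
```

With this layout, a caller could print `help_path(__file__, "orffinder-to-gtf").read_text()` and then call `sys.exit(0)`, the conventional way to end a successful help request.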