todo

- Do the read/write_motifs() functions work with the gap slots?
- Scan for motif clusters?
- motif_peaks() is in dire need of a re-write
- Check that everything in convert_motifs() still works
- Check that all read/write_*() functions still work
- Add tests
- Visualisation of scan_sequences()/enrich_motifs() --> mention EnrichedHeatmap
- Work on Rmd output of compare_motifs()
- motif_peaks(): compare two sets of sequences (enrichment analysis)
- motif_peaks(): serious code cleanup needed
- dependencies to consider removing:
    + !! yaml (write my own parser, doesn't matter if it's slow)
- Remove some of the input parameter checking in merge_motifs()/view_motifs()/
  motif_tree(), rely solely on compare_motifs()
- in read_*() files, check to input file exists
- change internal motif precision limit from 1e-3 to 1e-6 (will require changes
  to how allow.nonfinite works)
- add a vignette section about pseudocounts and how the functions interact with
  them
- small meme p-values solution: keep the pval/qval/eval slots as log-transformed,
  un-log for print and `[` methods?
- de novo motifs:
    + first search 4/5/6/7/8-mers for enrichment differences (either
      against background or statistically expected based on 1/2/3-let frequencies)
    + then for significantly enriched k-mers, start changing PWM positions to
      see if that increases significance/enrichment OR start merging enriched
      k-mers? (fast version: only k-mer merging?)
    + for every enriched k-mer, scan with a logodds threshold of 0 -- look
      for enrichment of lower score hits?
    + finally extend motif until desired length
    + trim useless edges
- is the MEME E/p-value log10 or natural log?
- let switch_alph() be used to change to arbitrary alphabets (but keep default
  behaviour consistent with previous version!)
- time for another make_DBscores() re-write I think --> use dynamic pvals?
- need a function which can allow for motifs with different background freqs
   to be comparable: maybe something like PPM --> PWM (change background) PWM
   --> PPM?
- create_motif(): is there a check for the "type" input?
    + why does create_motif(type="asdf") work?
- read_*(): allow to set pseudocount
- make n^k protection in motif_pvalue() optional
- make sure no errors are possible inside c++ code -- can lead to memory leaks
  since c++/R may not communicate properly about what needs to be freed
- new view_logo() example idea: show how to plot A->C mutation change (need to
  add arrows to fontDFroboto)
- read_matrix(): make it so that it's possible to read all other formats with
  it via tweaking parsing options
- go through vignettes again
- use NULL instead of missing for optional args
- get rid of DataFrame and just go pure data.frame? (store metadata in attributes?)
- better manipulation of alphabet slot
- stop this from working? motif["name"] <- NA
- motif_pvalue() vignette bit: explain why exact calculations dont give exact
  results (rounding to three digits for use as ints)
- filter_motifs() example in vignette: nsites column returning NAs?
- weird interaction:
  + mydf %>% dplyr::rename(name = altname, altname = name) %>% to_list()
  + if altname was blank, this causes "0" to be filled in the name slot
- scan_sequences() and large datasets: spends a LOT of time in pre- and
  post-processing... need to optimize this. The actual scanning is quite fast
  even with billions of letters.
- Make use of average_ic() to filter motifs in more function examples? Right
  now it just shows up in compare_motifs().
- merge_motifs() 2nd example: also plot submotifs
- make sure window size can't be set too low (shuffle_sequences, get_bkg)
- motif_score(create_motif("ATCGTACGTG"), 0, allow.nonfinite = TRUE) --> broken?
- should I change the default comparison metric back to PCC...? I feel bad for
  constantly switching but ALLR is kind of annoying to work with in some cases
- add an extranum slot
- does view_motifs(use.type="ICM") work with KL motifs? (is ylim properly set)
- pretty sure the allow.nonfinite option is borked in motif_pvalue()
- native logo plotting with trees
- motif p-values vignette section: show effects of k using plots instead of tables
- emit a warning whenever the multifreq/gap slots are lost
- motif_tree() + view_motifs():
  + ggtree::inset
  + tree + ggtree::geom_subview(mot_plot_list [+theme_inset()], ...) [see inset code]
  + ggtree::facet_plot
  + ggtree::geom_fruit --> geom_subview for circular layouts
- would it be faster to handle chars from XString directly instead of converting to
  integer?
- should def rework shuffle so that it doesn't create unconnected vertices!
- dont think enrichment is taking into account respect.strand in enrich_motifs(),
  only looks at RC=TRUE/FALSE
- shuffle_motifs() adds a pseudocount??
- klet --> kmer
- multifreq --> higher order ? (maybe nmer motif)
- work with methylated DNA?
- scan_sequences(motifs[1:100], athaliana, warn.NA=F,threshold=.9,
    threshold.type="logodds",nthreads=6) --> crashes!!! (out of memory I think)
- enrich_motifs with size 1 sequences causes crash --> check size of seqs
- stop using multifreq slot; make motif objects represent a single order level
- comparison: add option to use sum to find best alignment, then apply mean/etc
  (or p-value? see gupta et al 2007)
- create a separate repo for a universalmotif website; otherwise the vignettes
  in the package will become too big/annoying
- scan_sequences() P-values: should they use the bkg of the input sequences instead
  of the motif? wouldn't this calculate a better null for calc.qvals.method="fdr"?
- Really need to get straight what background probabilities to use in motif_pvalue()
  when scanning; the motifs own, or those of the sequences being scanned? Also need
  to make sure that feeding bkg.probs to motif_pvalue() actually uses those in the
  create of the PWMs!
- Add an example in vignettes about methylated DNA, referring to
  https://www.biorxiv.org/content/10.1101/043794v1
  e.g. create_motif(alphabet="ACGTm", bkg = c(A=.25,C=.125,G=.25,T=.25,m=.125))
  + however this means no reverse complement scanning...
- to_df() et al: don't show warning when printing if errors are present?
- change the default pseudocount value to 0.1? (instead of 1) 
  --> or... change default nsites to 1000?
- fix motif_peaks()
- compare_motifs(): allow overhangs to be compared to background fequencies (see
  homer)
- compare_motifs(): dynamic P-values!
- create a separate container for higher order motifs
- scan_sequences(): for large jobs, save temporary results on disk; maybe use the qs pkg?
- combine view_motifs() with EnrichedHeatmap/ComplexHeatmap?
- compare_motifs() w/ normalization: does this still work for scores w/ -ve values?
- scan_sequences(): needs a serious performance overhaul for large jobs