Poisson context likelihood and better cli flexibility #126
This PR optionally (and by default) replaces "mutability parsimony" with a Poisson S5F context-based likelihood, described in https://github.com/matsengrp/poisson-subs-models/blob/main/main.tex
This likelihood assumes MLE branch lengths, and doesn't require any parameter fitting on the parsimony forest.
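
For readers who want some intuition for why MLE branch lengths remove the need for parameter fitting, here is a minimal sketch of a generic per-branch Poisson likelihood. It is an illustration under stated assumptions, not necessarily the exact formula in the linked document or the code in this PR: each site has a context-dependent rate, each observed mutation is treated as a single event, and the function names (`loglik_at`, `mle_branch_loglik`) are hypothetical.

```python
import math


def loglik_at(t, mutated_rates, all_rates):
    """Log-likelihood of one branch at branch length t (t > 0).

    mutated_rates: rates of the sites that mutated on this branch.
    all_rates: rates of every site in the parent sequence.
    Assumed model: log L(t) = n*log(t) + sum(log r_i) - t * sum(all rates).
    """
    n = len(mutated_rates)
    return (
        n * math.log(t)
        + sum(math.log(r) for r in mutated_rates)
        - t * sum(all_rates)
    )


def mle_branch_loglik(mutated_rates, all_rates):
    """Log-likelihood with the branch length set to its MLE, t_hat = n / sum(all_rates)."""
    n = len(mutated_rates)
    if n == 0:
        # With no mutations the likelihood is maximised as t -> 0, giving log L = 0.
        return 0.0
    t_hat = n / sum(all_rates)
    return loglik_at(t_hat, mutated_rates, all_rates)


# Toy example: three sites, the highest-rate site mutated.
all_rates = [1.0, 1.0, 2.0]
mutated_rates = [2.0]
print(mle_branch_loglik(mutated_rates, all_rates))
```

Under these assumptions the branch length has a closed-form maximiser, so the likelihood of a tree depends only on the observed mutations and the context rates.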
The PR also adds two new command line arguments to `gctree infer`:

- `--use_old_mut_parsimony` allows the user to use the old version of mutability parsimony instead of the new context-based likelihood.
- `--branching_process_ranking_coeff` allows the user to specify the ranking coefficient for branching process likelihood, in particular allowing branching process likelihood to be ignored altogether. The default value remains `-1` as before, but if the provided value is zero, branching process parameter fitting is skipped entirely.

The `CollapsedForest.filter_trees` method was largely rewritten. It should be somewhat faster and provides much better log messages (if the `verbose` flag is set) indicating how tree ranking is performed, which is reassuring when you want to be sure that the ranking coefficients line up with the proper weights. It also warns the user if the sign of a ranking coefficient doesn't match the appropriate optimization function for that weight (e.g. coefficients for likelihoods should be negative, since higher likelihoods are better).

More test cases are now included in `tests/smalltest.sh`, testing more of the possible CLI parameter combinations. There is also a new test in `tests/test_likelihoods.py` comparing the new context-based likelihood to a couple of hand-computed simple cases.
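
To make the ranking-coefficient behaviour concrete, here is a minimal sketch of ranking by a linear combination of weights. It is not the actual `CollapsedForest.filter_trees` implementation: the weight names, the `OPTIMIZE` table, and `rank_trees` are hypothetical, and the sketch assumes the combined score is minimised, which is consistent with the note that likelihood coefficients should be negative.

```python
import warnings

# Hypothetical weights and the direction in which each one is "better".
OPTIMIZE = {
    "branching_process_loglik": max,  # higher likelihood is better
    "context_loglik": max,            # higher likelihood is better
    "allele_count": min,              # fewer alleles is better
}


def rank_trees(trees, coeffs):
    """Sort trees best-first by sum(coeff * weight), where lower is better.

    trees: list of dicts mapping weight name -> value.
    coeffs: dict mapping weight name -> ranking coefficient.
    A coefficient of zero drops that weight from the ranking entirely.
    """
    active = {}
    for name, coeff in coeffs.items():
        if coeff == 0:
            continue  # weight is ignored, so it never needs to be computed
        better = OPTIMIZE[name]
        if (better is max and coeff > 0) or (better is min and coeff < 0):
            warnings.warn(
                f"coefficient {coeff} for '{name}' has the wrong sign "
                f"for a weight that should be {better.__name__}imized"
            )
        active[name] = coeff

    def score(tree):
        return sum(coeff * tree[name] for name, coeff in active.items())

    return sorted(trees, key=score)


trees = [
    {"branching_process_loglik": -10.0, "context_loglik": -5.0, "allele_count": 3},
    {"branching_process_loglik": -8.0, "context_loglik": -7.0, "allele_count": 3},
]
# Rank by context likelihood only; the branching process weight is skipped.
best = rank_trees(trees, {"branching_process_loglik": 0, "context_loglik": -1})[0]
print(best)
```

Setting a coefficient to zero removes that weight from the score, mirroring how a zero `--branching_process_ranking_coeff` lets the branching process likelihood be ignored.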