- Tests
- Workflow test with example data
- Trivial examples for each function
- Unit tests for SSI
- Unit tests for density features
- Integrate DiffNets.
- Lay out module structure in separate branch.
- Copy core network from DiffNets repo.
- Try to use existing featurization.
- Include existing DiffNets featurization and compare.
- Exploratory analysis via correlation coefficients of the features
- First tests --> not very promising.
- Try a different metric
- Find useful application or leave it out.
- Unified tutorial in documentation. Make one page for each subpackage
- preprocessing
- coordinates
- densities
- featurization
- structure features
- water features
- atom features
- comparison
- dimensionality reduction
- clusters (show how to cluster on PCs)
- SSI
- preprocessing
- Try using MDAnalysis instead of biotite for water featurization
- Integrate more options for features from PyEMMA (think carefully about how to make it more flexible)
- More example tcl scripts for VMD
- Facilitate calculation of JSD etc. on principal components
- Facilitate calculation of SSI on results of joint clustering.
- Weighted PCA/tICA? (to account for varying simulation lengths or uncertainty)
- Feature comparison of more than two ensembles
- with respect to the joint ensemble (all metrics)
- with respect to a reference ensemble (will not always work for KLD)
- Implement t-distributed Stochastic Neighbor Embedding (t-SNE)
- Read up on t-SNE for molecular trajectories
- See if we can import or adapt existing code.
- First tests with (regular) t-SNE
- Test time-lagged t-SNE. How to handle time-dependence across simulations/ensembles?
- Write module
- Write unit tests
- Implement a clustering algorithm designed for structural ensembles
- Read up about CLoNe
- First tests
- Write module
- Write unit tests
- Put shared functionality of PCA and TICA into shared functions.
- Make file format (png/pdf?) for matplotlib optional.
- Implement Linear Discriminant Analysis
- Logo
- Hydrogen bonds as features
- Contacts as features (can PyEMMA do this?)
- Position deviations as features (similar to components of RMSD)
- Estimate thresholds for significance of feature differences
- Calculate correlation times within trajectories
- modify p-value of KS test using correlation time
- modify p-value of KS test using number of simulation runs per ensemble
- Wasserstein distance to compare ensembles
- Add options to save and load calculated features
- Add option to whiten features
- Featurizers for other molecule types
- ligands
- lipids
- nucleic acids
- Simplify adding hand-crafted features
- Implement conformational entropy calculations
- Implement multi-dimensional scaling
- Try to integrate functional mode analysis.
- Try to integrate VAMPnets.
- Try to integrate network analysis.
- Colab Tutorial
- Put notebook on Colab and get it to run.
- Add visualizations.
- Fix installation via pip.
- Fix animations (they only show white canvas).
- Add TICA to Colab tutorial.
- Include TICA in unit tests
- Write "getting started" for documentation
- Refactoring and fixes for release 0.2
- Restructure modules to subpackages
- Adapt README
- Adapt API documentation
- Include SSI in the comparison example script
- Numbering of principal component trajectories starts at 0 but should start at 1
- Axis labels and legend name for distance matrix plot
- Function pca_features() does not have labels
- Function compare_projections() does not have labels or legend
- Slack channel for all developers and testers, and to provide support for the user community.
- Implement clustering in principal component space
- Frame classification via CNN on features
- Prototype to classify simulation frames --> DiffNets probably more powerful.
- Interpret weights as relevance of features
- Write module
- Write unit tests
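
Sketch for the items "Calculate correlation times within trajectories" and "modify p-value of KS test using correlation time": one possible approach, not a settled design. It estimates an integrated autocorrelation time per feature, converts the frame counts into effective sample sizes, and evaluates the asymptotic Kolmogorov distribution with those sizes. The function names `correlation_time` and `corrected_ks_pvalue` are hypothetical and not part of the package; truncating the autocorrelation sum at the first non-positive lag is an assumption made for simplicity.

```python
import numpy as np
from scipy.special import kolmogorov
from scipy.stats import ks_2samp


def correlation_time(x, max_lag=None):
    """Integrated autocorrelation time in frames (hypothetical helper).

    Sums the normalized autocorrelation up to the first non-positive lag,
    so the result is always >= 1.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    if var == 0.0:
        return 1.0
    max_lag = n // 2 if max_lag is None else max_lag
    tau = 1.0
    for lag in range(1, max_lag):
        rho = np.dot(x[:-lag], x[lag:]) / ((n - lag) * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau


def corrected_ks_pvalue(a, b):
    """KS statistic and asymptotic p-value using effective sample sizes."""
    stat = ks_2samp(a, b).statistic
    n_a = len(a) / correlation_time(a)  # effective number of samples in a
    n_b = len(b) / correlation_time(b)  # effective number of samples in b
    n_eff = n_a * n_b / (n_a + n_b)
    return stat, kolmogorov(np.sqrt(n_eff) * stat)


def ar1(n, phi, rng):
    """Toy correlated time series standing in for a feature trajectory."""
    noise = rng.normal(size=n)
    x = np.empty(n)
    x[0] = noise[0]
    for i in range(1, n):
        x[i] = phi * x[i - 1] + noise[i]
    return x


rng = np.random.default_rng(0)
traj_a = ar1(5000, 0.9, rng)
traj_b = ar1(5000, 0.9, rng) + 0.2
stat, p_corrected = corrected_ks_pvalue(traj_a, traj_b)
# Asymptotic p-value that treats all frames as independent, for comparison.
n_naive = len(traj_a) * len(traj_b) / (len(traj_a) + len(traj_b))
p_naive = kolmogorov(np.sqrt(n_naive) * stat)
print(f"D = {stat:.3f}, naive p = {p_naive:.3g}, corrected p = {p_corrected:.3g}")
```

Because the effective sample size can only shrink relative to the frame count, the corrected p-value is never smaller than the naive asymptotic one. The same effective-size idea could be extended to the number of simulation runs per ensemble.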