- Hotfix to improve GFR numerical stability when adjacency is nearly zero.
This version formalizes the inclusion of the new features introduced from 0.8a1 to 0.8a18. An (incomplete) list of features includes:
- Dataset downloaders (graphdot.dataset)
- Graph Hausdorff distance metric (graphdot.metric.maximin)
- Gaussian field regressor (graphdot.model.gfr)
- Kernel-induced distance metrics (graphdot.metric)
- Low-rank GPR via Nystrom approximation (graphdot.model.gpr.nystrom)
- Multiplicative regularization for GPR
- Maintenance update.
- Fixed a QM9 downloader issue.
- Fixed a bug with the Maximin metric when some hyperparameters are fixed.
- Downloader for QM9.
- Minor tweaks to look-ahead rewriter logic.
- A new and experimental calling convention to allow evaluations of the graph kernels at a list of specific indices.
- A sequence-based rewriter for Monte Carlo tree search.
- Convert any kernel into a metric via KernelInducedDistance (see the sketch below).
- Convert any norm into a kernel via KernelOverMetric.
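For context, the kernel-induced distance follows the standard construction d(x, y) = sqrt(k(x, x) + k(y, y) - 2 k(x, y)). Below is a minimal sketch of that construction, assuming only a callable kernel; the actual KernelInducedDistance constructor is not documented in this entry.

```python
# Sketch of the standard kernel-induced distance; not graphdot's
# implementation, just the construction it presumably wraps.
import numpy as np

def kernel_induced_distance(k, x, y):
    # Clamp to zero to guard against small negative round-off.
    return np.sqrt(max(k(x, x) + k(y, y) - 2.0 * k(x, y), 0.0))

# Usage with a plain RBF kernel on vectors:
def rbf(a, b):
    return float(np.exp(-0.5 * np.sum((np.asarray(a) - np.asarray(b)) ** 2)))

print(kernel_induced_distance(rbf, [0.0, 0.0], [1.0, 0.0]))  # ~0.887
```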
- Improvements to the active learning hierarchical drafter.
- Performance optimization for GFR leave-one-out cross validation gradients.
- Leave-one-out cross validation for Gaussian field regressor.
- Normalized the MaxiMin graph distance metric to [0, 1].
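For reference, the MaxiMin metric is a Hausdorff-style reduction over nodal distances. The sketch below shows just the reduction, applied to a stand-in nodal distance matrix; graphdot derives the actual nodal distances from the marginalized graph kernel, which is elided here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in: D[i, j] = distance between node i of graph 1 and node j of graph 2.
D = rng.random((4, 5))

# For each node, take the distance to the closest node of the other graph,
# then take the worst case, symmetrically in both directions.
d_maximin = max(D.min(axis=1).max(), D.min(axis=0).max())
print(d_maximin)
```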
- More data downloaders: METLIN SMRT, AMES, and a custom downloader.
- Gradient evaluation for the MaxiMin graph metric.
- Gradient evaluation for Gaussian field regressor prediction loss.
- Optimized the evaluation of the gradient of the loss function for the Gaussian field regressor.
- Implemented a finite-difference based graph kernel nodal gradient.
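As a reference for what a finite-difference gradient means here, a generic central-difference sketch follows; graphdot's internal step size and bookkeeping are not described in this entry.

```python
import numpy as np

def central_diff_grad(f, x, eps=1e-6):
    # Central difference: df/dx_i ~ (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps)
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad

print(central_diff_grad(lambda v: float(np.sum(v ** 2)), [1.0, -2.0]))  # ~[2, -4]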
- Added a downloader for the QM7 dataset.
- Prototype implementation of a Gaussian field harmonic function regressor.
- Added a multiplicative regularization option to GPR, which may perform better when the kernel is not normalized.
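A sketch of the distinction, assuming multiplicative regularization scales the noise term by the kernel matrix diagonal rather than adding a constant; the exact form graphdot uses is inferred from the wording above.

```python
import numpy as np

K = np.array([[2.0, 0.5],
              [0.5, 1.0]])    # an unnormalized kernel matrix
sigma2 = 1e-2

K_additive = K + sigma2 * np.eye(len(K))              # K + s^2 * I
K_multiplicative = K + sigma2 * np.diag(np.diag(K))   # K + s^2 * diag(K): noise scales with k(x, x)
```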
- Fixed a linear algebra type error when the GPR kernel matrix is solved with pseudoinverse.
- Added an experimental Monte Carlo tree search model.
- Enabled Low-rank GPR (Nystrom) training with missing target values.
- Enabled GPR training with missing target values.
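A sketch of the usual convention for feeding such data, assuming missing targets are encoded as NaN and masked out of the labeled set; graphdot's internal handling is not detailed in this entry.

```python
import numpy as np

graphs = ['g0', 'g1', 'g2']          # placeholders for graph objects
y = np.array([1.2, np.nan, 0.7])     # NaN marks a missing target value

mask = np.isfinite(y)
labeled_graphs = [g for g, keep in zip(graphs, mask) if keep]
labeled_y = y[mask]
```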
This version formalizes the inclusion of the new features introduced from 0.7a1 to 0.7b2. An (incomplete) list of features includes:
- A redesigned active learning module (graphdot.model.active_learning).
- The PBR graph reordering algorithm for graph kernel acceleration (graphdot.graph.reorder.pbr).
- LOOCV predictions using the low-rank approximate GPR.
- Significant improvements to the robustness of the training methods of the GPR and Low-rank GPR models.
- Allow kernel/microkernel hyperparameters to be declared as 'fixed' via the *_bounds arguments.
- Added a DotProduct microkernel for vector-valued node and edge features.
- Added a .normalized attribute to all elementary and composite microkernels.
- Graph representation strings can now be directly deserialized using eval.
- New atomic adjacency options, such as alternative bell-shaped compact adjacency functions (compactbell[a,b]) and new length scale choices using covalent radii, etc.
- Perform a value range check on the node and edge kernels during graph kernel creation.
- Added a to_networkx() method to graphdot.Graph.
- Enhanced the readability of the string representations of kernel hyperparameters using an indented print layout.
- Various performance and bug fixes.
- Added a DotProduct microkernel for vector-valued node and edge features.
- Added a .normalized attribute to all elementary and composite microkernels.
- Perform a value range check on the node and edge kernels during graph kernel creation.
- Performance improvements to the variance minimizing active learner.
- Further improvements to the robustness of the GPR training process.
- Uses a more robust pseudoinverse algorithm for GPR when the kernel matrix is nearly singular.
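For context, a pseudoinverse-based solve degrades gracefully where a plain Cholesky factorization would fail on a nearly singular matrix. A minimal sketch of the idea; the specific algorithm and tolerance graphdot uses are not stated here.

```python
import numpy as np

K = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-12]])   # a nearly singular kernel matrix
y = np.array([0.3, 0.7])

# Singular values below the cutoff are discarded instead of amplified.
alpha = np.linalg.pinv(K, rcond=1e-8) @ y   # weights for the GPR mean
```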
- Added bell-shaped compact adjacency functions.
- Redesigned the active learning module.
- Enhanced the readability of the string representations of kernel hyperparameters.
- New atomic adjacency options.
- Improved numerical stability tolerance of the GPR and Low-rank GPR models.
- Added a to_networkx() method to graphdot.Graph.
- Graph representation strings can now be directly deserialized using eval.
- Optimized GPU gradient evaluation performance.
- predict_loocv is now available for the LowRankApproximateGPR model.
- Unified the fit and fit_loocv methods of GaussianProcessRegressor.
- Fixed a bug related to the bounds of kernels that contain fixed hyperparameters.
- Allow kernel/microkernel hyperparameters to be declared as 'fixed' via the *_bounds arguments.
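For illustration, a hedged sketch of fixing a hyperparameter through a *_bounds argument. The microkernel classes are real, but the exact argument names here are assumptions inferred from the wording above.

```python
from graphdot.microkernel import KroneckerDelta, SquareExponential

# 'fixed' excludes the hyperparameter from optimization (argument names assumed).
node_kernel = KroneckerDelta(0.5, h_bounds='fixed')
edge_kernel = SquareExponential(1.0)   # length scale remains trainable
```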
- Fixed a memory layout issue that slowed down computations using normalized kernels.
- The PBR graph reordering algorithm as proposed in 10.1109/IPDPS47924.2020.00080 is now available.
- Improved the performance of gradient evaluation for the marginalized graph kernel.
- Introduced a new MaxiMin distance metric between graphs.
- Added save and load methods to the Gaussian process regressor models.
- Fixed a bug related to the lmin=1 option of the marginalized graph kernel.
- Fixed a bug regarding target value normalization in the fit_loocv method of GPR.
- Fixed a performance degradation due to inconsistent lexical sorting behavior between numpy.lexsort and numpy.unique.
- Fixed a bug in computing the gradient of diagonal kernel entries.
- Fixed a bug in kernel normalization.
This version formally releases the new features introduced in the various 0.6 alpha versions, such as:
- Nystrom low-rank approximate Gaussian process regressor
- Graphs with self-looping edges
- Graph permutation and reordering operations for GPU performance boost.
- Hyperparameterized and optimizable starting probabilities for the graph kernel.
- Supports graphs with self-looping edges.
- Made the Graph.from_rdkit method optional in case RDKit is not available.
- Ensures that graph cookies are not pickled.
- Fixed a problem associated with converting permuted graphs to octile graphs.
- Fixed a problem with caching graphs on the GPU.
- Introduced a graph reordering mechanism to improve computational performance on GPUs.
- The default starting probability of the marginalized graph kernel is now hyperparameterized and will be optimized by default during training.
- Allow users to specify custom starting probability distributions.
- Performance improvements due to the in situ computation of starting probabilities instead of loading from memory.
- Added repeat, theta_jitter and tol options to the Gaussian process regressor.
- Fixed a normalization bug in GaussianProcessRegressor.fit_loocv.
- Added a verbose training progress option to the GPR module.
- The graphdot.kernel.basekernel package has been redesigned and renamed to graphdot.microkernel.
- Initial formal release of the Gaussian Process regression module.
- Implemented base kernel exponentiation semantics, i.e. k**a (see the sketch below).
- Minor docstring fixes.
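A sketch of the semantics, assuming the ** operator composes a new kernel whose value is the base kernel's value raised to the given power; the module path reflects the later graphdot.microkernel rename noted above.

```python
from graphdot.microkernel import SquareExponential

k = SquareExponential(1.0)
k_squared = k ** 2   # evaluates as k(x, y) ** 2
```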
- Fixed a regression that causes data frame unpickling errors.
- Added the leave-one-out cross-validation prediction and training to GPR.
- Fixed an automatic documentation issue.
- Added a check on the shape of hyperparameter bounds specifications to prevent users from unknowingly providing invalid values.
- Fixed a bug related to Jacobian dimensionality.
- Added a built-in Gaussian process regression (GPR) module.
- Fixed an issue that prevented the pickling of graphs.
- Fixed a minor bug in Graph.from_rdkit.
- Replaced from_smiles with a more robust from_rdkit function with additional ring stereochemistry features. Thanks to Yan Xiang for the contribution.
- Added a new Compose method for creating base kernels beyond tensor product base kernels.
- Fixed a performance degradation issue (#57).
- Ensure that graphs can be pickled.
- Ensure graph feature data layout consistency involving a mixture of scalar and variable-length features. Fixes #56.
- Fixed an integer sign issue introduced with graph type unification.
- Renamed Graph.normalize_types to Graph.unify_datatype.
- Now allowing variable-length node and edge features thanks to a redesign of the Python/C++ data interoperation mechanism.
- Introduced a Convolution base kernel for composing kernels on variable-length attributes using scalar base kernels.
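A hedged sketch of the composition, assuming a Convolution kernel sums a scalar base kernel over all pairs of elements drawn from two variable-length attributes; the constructor details are assumptions, and the module path reflects the later microkernel rename.

```python
from graphdot.microkernel import Convolution, KroneckerDelta

# k_set(X, Y) = sum over x in X and y in Y of k_elem(x, y), possibly normalized.
k_elem = KroneckerDelta(0.5)
k_set = Convolution(k_elem)
```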
- Added a dtype option to the MarginalizedGraphKernel to specify the type of returned matrix elements.
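A sketch of the option in a typical construction, assuming TensorProduct-composed node and edge kernels over feature names like 'element' and 'length'; the feature names and surrounding arguments are illustrative.

```python
import numpy as np
from graphdot.kernel.marginalized import MarginalizedGraphKernel
from graphdot.microkernel import TensorProduct, KroneckerDelta, SquareExponential

kernel = MarginalizedGraphKernel(
    TensorProduct(element=KroneckerDelta(0.5)),    # node kernel (feature name assumed)
    TensorProduct(length=SquareExponential(0.1)),  # edge kernel (feature name assumed)
    dtype=np.float32,                              # type of returned matrix elements
)
```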
- Specified the minimum version of sympy in installation requirements.
- Allow M3 metric to use partial charge information.
- Made the element, bond, and charge parameters adjustable in the M3 metric.
- Miscellaneous bug fixes.
- Analytic computation of graph kernel derivatives against hyperparameters.
- Users can now define new base kernels easily using SymPy expression #45.
- Better scikit-learn interoperability.
- Fixed a bug related to atomic adjacency #43.
- Added an experimental 'M3' distance metric.
- Bug fixes and stability improvements.
- Improved the performance of hyperparameter optimization by enabling lightweight re-parameterization.
- Implemented a few properties and methods for scikit-learn interoperability.
- Improved the performance of successive graph kernel evaluations.
- Improved the performance of graph format conversion for the GPU kernel by 3 times.
- Incorporated many new optimizations as detailed in https://arxiv.org/abs/1910.06310.
- Preparing for faster memory allocation and job creation.
- Fixes #32, #33.
- Reduced kernel launch preparation time by 50% to address #28.
- Fixed a memory leak issue #31.
- Changed the return type of the diag() method of MarginalizedGraphKernel to fix #30.
- Fixed an edge label consistency issue with graphs generated from SMILES strings.
- Added a freshly-designed atomic adjacency rule.
- Significantly accelerated conversion from ASE molecules to graphs.
- Documentation update.
- Added the diag() method to Tang2019MolecularKernel.
- Fixed a regression in the CUDA kernel that caused an order-of-magnitude slowdown.
- Switched to single-precision floating point for edge lengths in Graph.from_ase.
- Added performance benchmark code to example/perfbench.