Commit
EBjerrum committed Sep 22, 2023
1 parent 21229a8, commit 0848f6d
Showing 3 changed files with 41 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Contribution | ||
|
||
Thanks for your interest in contributing to the project. Please read on in the sections that apply. | ||
|
||
|
||
## Installation | ||
Clone the repository and install it in editable mode from the repository root like this: | ||
|
||
git clone git@github.com:EBjerrum/scikit-mol.git | ||
pip install -e . | ||
|
||
## Adding transformers | ||
The project's transformers subclass the BaseEstimator and TransformerMixin classes from sklearn. The scikit-learn developer guide describes the requirements this entails: [https://scikit-learn.org/stable/developers/develop.html](https://scikit-learn.org/stable/developers/develop.html). Most notably: | ||
* The arguments accepted by __init__ should all be keyword arguments with a default value. | ||
* Every keyword argument accepted by __init__ should correspond to an attribute on the instance. | ||
* There should be no logic in __init__, not even input validation, and the parameters should not be changed. | ||
Scikit-learn classes depend on this for e.g. .get_params(), .set_params(), cloning and representation rendering to work. | ||
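To see why these rules matter, the sketch below uses a toy base class, `ToyEstimator`, that mimics how parameter introspection and cloning can be driven by the `__init__` signature alone. It is a hypothetical stand-in written for illustration, not scikit-learn's implementation, and `ToyFingerprintTransformer` with its `radius` and `n_bits` parameters is likewise made up.

```python
import inspect


class ToyEstimator:
    """Toy stand-in for a scikit-learn style base class (not sklearn itself)."""

    def get_params(self):
        # Read the parameter names from the __init__ signature, then look up
        # an attribute with the same name on the instance.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self

    def clone(self):
        # A clone is a fresh instance built from the parameters alone.
        return type(self)(**self.get_params())

    def __repr__(self):
        args = ", ".join(f"{k}={v!r}" for k, v in self.get_params().items())
        return f"{type(self).__name__}({args})"


class ToyFingerprintTransformer(ToyEstimator):
    # Hypothetical transformer: keyword arguments with defaults, stored as-is.
    def __init__(self, radius=2, n_bits=2048):
        self.radius = radius
        self.n_bits = n_bits


original = ToyFingerprintTransformer(radius=3)
duplicate = original.set_params(n_bits=1024).clone()
print(duplicate)
```

If `__init__` renamed, validated or modified an argument before storing it, the attribute lookup in `get_params` would either fail or return something other than what the caller passed in, and the clone would no longer match the original.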
|
||
### Docstrings | ||
We should ultimately consolidate on the numpy docstring format [https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) which is also used by SciPy and other scikits. | ||
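As a rough illustration of the numpydoc layout, here is a docstring on a small helper. The function `count_on_bits` is hypothetical and not part of the project; only the section structure (Parameters, Returns, Examples) is the point.

```python
def count_on_bits(fingerprint, start=0):
    """Count the bits set to 1 in a fingerprint.

    Parameters
    ----------
    fingerprint : sequence of int
        Bit vector given as a sequence of 0 and 1 values.
    start : int, default=0
        Index of the first position to include in the count.

    Returns
    -------
    int
        Number of positions from ``start`` onwards that are set to 1.

    Examples
    --------
    >>> count_on_bits([1, 0, 1, 1])
    3
    >>> count_on_bits([1, 0, 1, 1], start=1)
    2
    """
    return sum(1 for bit in fingerprint[start:] if bit == 1)
```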
|
||
### Testing | ||
New transformer classes should be added to the pytest tests in the tests directory. A new transformer may need tests specific to it, but it should also be added to the general tests that check the aspects sklearn requires of a transformer, such as clonability. | ||
|
||
### Tips | ||
* We have observed that some external tools use "exotic" types such as np.int64 when doing hyperparameter tuning. It is thus necessary to cast to standard types before making calls to RDKit functions. This behaviour is tested in the test_parameter_types test. | ||
|
||
* @property getters and setters can be used if additional logic is needed when setting the attributes from the keywords, while still adhering to the sklearn requirements. | ||
|
||
* Some RDKit features use generator objects which may not be picklable. If such an object is instantiated once and stored on the transformer, rather than instantiated at each function call for individual molecules, it should be removed before pickling and recreated afterwards by overriding the __getstate__() and __setstate__() methods. See MHFingerprintTransformer for an example. | ||
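The tips above can be sketched together in one toy class. Everything here is an assumption made for illustration: `ToyHashTransformer` and `make_encoder` are hypothetical, a locally defined lambda stands in for an unpicklable RDKit generator object, and `Fraction` stands in for a non-builtin numeric type such as np.int64 so that the sketch needs only the standard library. It is not the code used in MHFingerprintTransformer.

```python
import pickle
from fractions import Fraction


def make_encoder(n_bits):
    # Hypothetical stand-in for an RDKit generator object. A locally defined
    # lambda cannot be pickled, much like some wrapped C++ objects.
    return lambda text: sum(text.encode()) % n_bits


class ToyHashTransformer:
    def __init__(self, n_bits=2048):
        self.n_bits = n_bits  # routed through the property setter below

    @property
    def n_bits(self):
        return self._n_bits

    @n_bits.setter
    def n_bits(self, value):
        # Store the parameter unchanged, but cast to a built-in int before
        # handing it to the (stand-in) external library.
        self._n_bits = value
        self._encoder = make_encoder(int(value))

    def transform(self, texts):
        return [self._encoder(text) for text in texts]

    def __getstate__(self):
        # Drop the unpicklable encoder before pickling.
        state = self.__dict__.copy()
        state.pop("_encoder", None)
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then recreate the encoder.
        self.__dict__.update(state)
        self._encoder = make_encoder(int(self._n_bits))


transformer = ToyHashTransformer(n_bits=Fraction(1024))
restored = pickle.loads(pickle.dumps(transformer))
print(restored.transform(["CCO"]) == transformer.transform(["CCO"]))
```

Keeping the cast next to the external call, rather than rewriting the stored value, leaves the parameter exactly as the caller supplied it, which is what the `__init__` rules in the section on adding transformers ask for.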
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -59,11 +59,6 @@ Bleeding edge | |
|
||
pip install git+https://github.com/EBjerrum/scikit-mol.git | ||
|
||
Developers | ||
|
||
git clone git@github.com:EBjerrum/scikit-mol.git | ||
pip install -e . | ||
|
||
## Documentation | ||
|
||
There is a collection of notebooks in the notebooks directory which demonstrate some different aspects and use cases | ||
|
@@ -76,6 +71,9 @@ There are a collection of notebooks in the notebooks directory which demonstrate | |
* [Integrated hyperparameter tuning of Scikit-Learn estimator and Scikit-Mol transformer](https://github.com/EBjerrum/scikit-mol/blob/main/notebooks/06_hyperparameter_tuning.ipynb) | ||
* [Using parallel execution to speed up descriptor and fingerprint calculations](https://github.com/EBjerrum/scikit-mol/blob/main/notebooks/07_parallel_transforms.ipynb) | ||
|
||
## Contributing | ||
|
||
There is more information about how to contribute to the project in [CONTRIBUTION.md](https://github.com/EBjerrum/scikit-mol/CONTRIBUTION.md) | ||
|
||
## BUGS | ||
Probably still present; please check the issues on GitHub and report new ones there | ||
|