Safe inference and parallel bug fixes #55
Merged
+97
−61
Hi @EBjerrum, thank you and the other contributors for making this package possible.
I had planned to contribute to the library during the Christmas holidays, but after forking it I realized that a lot of tests are failing. This PR aims to fix that before I make any other contributions.
Overview
This pull request includes various changes aimed at improving the handling of invalid SMILES strings, enhancing the SafeInferenceWrapper functionality, and updating test fixtures for better consistency. The most important changes include adjustments to chunk processing, updates to the SafeInferenceWrapper class, and modifications to test fixtures.
Enhancements to chunk processing:
scikit_mol/descriptors.py: Added a check to ensure n_chunks does not exceed the length of X, to avoid empty chunks.
scikit_mol/fingerprints/baseclasses.py: Added a check to ensure n_chunks does not exceed the length of X, to avoid empty chunks.
Updates to SafeInferenceWrapper:
In the current version, the value of replace_value is not propagated from the SafeInferenceWrapper to filter_invalid_rows, so the fill value is always np.nan instead of the configured value. Also, when the input contains only invalid SMILES in safe inference mode, an empty array is passed to the estimator, causing an error.
scikit_mol/safeinference.py: Added __all__ to export specific classes and functions, updated the filter_invalid_rows function to use replace_value from the class it is applied to, added a check for replace_value in the SafeInferenceWrapper, and added a check for the case where all inputs are invalid in safe inference mode.
Modifications to test fixtures:
tests/fixtures.py: Replaced the invalid_smiles_list fixture with smiles_list_with_invalid, which includes both valid and invalid SMILES strings, and added a new invalid_smiles_list fixture that includes only invalid SMILES strings.
tests/test_safeinferencemode.py: Updated tests to use the new smiles_list_with_invalid fixture and added new tests for handling a single invalid SMILES and for using different fill values.
Other test updates:
tests/test_sanitizer.py: Updated tests to use the smiles_list_with_invalid fixture.
tests/test_smilestomol.py: Updated tests to use the smiles_list_with_invalid fixture.