Version 0.6.0
Changelog
Changed models
..............
The following models might give some different sampling due to changes in
scikit-learn:
- :class:
imblearn.under_sampling.ClusterCentroids
- :class:
imblearn.under_sampling.InstanceHardnessThreshold
The following samplers will give different results due to change linked to
the random state internal usage:
- :class:
imblearn.over_sampling.SMOTENC
Bug fixes
.........
-
:class:
imblearn.under_sampling.InstanceHardnessThreshold
now take into
account therandom_state
and will give deterministic results. In addition,
cross_val_predict
is used to take advantage of the parallelism.
:pr:599
by :user:Shihab Shahriar Khan <Shihab-Shahriar>
. -
Fix a bug in :class:
imblearn.ensemble.BalancedRandomForestClassifier
leading to a wrong computation of the OOB score.
:pr:656
by :user:Guillaume Lemaitre <glemaitre>
.
Maintenance
...........
-
Update imports from scikit-learn after that some modules have been privatize.
The following import have been changed:
:class:sklearn.ensemble._base._set_random_states
,
:class:sklearn.ensemble._forest._parallel_build_trees
,
:class:sklearn.metrics._classification._check_targets
,
:class:sklearn.metrics._classification._prf_divide
,
:class:sklearn.utils.Bunch
,
:class:sklearn.utils._safe_indexing
,
:class:sklearn.utils._testing.assert_allclose
,
:class:sklearn.utils._testing.assert_array_equal
,
:class:sklearn.utils._testing.SkipTest
.
:pr:617
by :user:Guillaume Lemaitre <glemaitre>
. -
Synchronize :mod:
imblearn.pipeline
with :mod:sklearn.pipeline
.
:pr:620
by :user:Guillaume Lemaitre <glemaitre>
. -
Synchronize :class:
imblearn.ensemble.BalancedRandomForestClassifier
and add
parametersmax_samples
andccp_alpha
.
:pr:621
by :user:Guillaume Lemaitre <glemaitre>
.
Enhancement
...........
-
:class:
imblearn.under_sampling.RandomUnderSampling
,
:class:imblearn.over_sampling.RandomOverSampling
,
:class:imblearn.datasets.make_imbalance
accepts Pandas DataFrame in and
will output Pandas DataFrame. Similarly, it will accepts Pandas Series in and
will output Pandas Series.
:pr:636
by :user:Guillaume Lemaitre <glemaitre>
. -
:class:
imblearn.FunctionSampler
accepts a parametervalidate
allowing
to check or not the inputX
andy
.
:pr:637
by :user:Guillaume Lemaitre <glemaitre>
. -
:class:
imblearn.under_sampling.RandomUnderSampler
,
:class:imblearn.over_sampling.RandomOverSampler
can resample when non
finite values are present inX
.
:pr:643
by :user:Guillaume Lemaitre <glemaitre>
. -
All samplers will output a Pandas DataFrame if a Pandas DataFrame was given
as an input.
:pr:644
by :user:Guillaume Lemaitre <glemaitre>
. -
The samples generation in
:class:imblearn.over_sampling.SMOTE
,
:class:imblearn.over_sampling.BorderlineSMOTE
,
:class:imblearn.over_sampling.SVMSMOTE
,
:class:imblearn.over_sampling.KMeansSMOTE
,
:class:imblearn.over_sampling.SMOTENC
is now vectorize with giving
an additional speed-up whenX
in sparse.
:pr:596
by :user:Matt Eding <MattEding>
.
Deprecation
...........
-
The following classes have been removed after 2 deprecation cycles:
ensemble.BalanceCascade
andensemble.EasyEnsemble
.
:pr:617
by :user:Guillaume Lemaitre <glemaitre>
. -
The following functions have been removed after 2 deprecation cycles:
utils.check_ratio
.
:pr:617
by :user:Guillaume Lemaitre <glemaitre>
. -
The parameter
ratio
andreturn_indices
has been removed from all
samplers.
:pr:617
by :user:Guillaume Lemaitre <glemaitre>
. -
The parameters
m_neighbors
,out_step
,kind
,svm_estimator
have been removed from the :class:imblearn.over_sampling.SMOTE
.
:pr:617
by :user:Guillaume Lemaitre <glemaitre>
.