Releases · rasbt/mlxtend

19 Nov 07:24

rasbt

v.0.9.1

6610cb7

Version 0.9.1

Version 0.9.1 (2017-11-19)

New Features

Added mlxtend.evaluate.bootstrap_point632_score to evaluate the performance of estimators using the .632 bootstrap. (#283)
New max_len parameter for the frequent itemset generation via the apriori function to allow for early stopping. (#270)
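
The early stopping that max_len enables can be illustrated with a small level-wise itemset search. This is a minimal pure-Python sketch of the idea, not mlxtend's apriori implementation; the function name frequent_itemsets and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.5, max_len=None):
    """Level-wise frequent itemset search that stops after max_len items."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]
    items = sorted({item for basket in baskets for item in basket})
    result = {}
    # Start with single-item candidates.
    candidates = [frozenset([item]) for item in items]
    k = 1
    while candidates and (max_len is None or k <= max_len):
        # Keep the candidates whose support reaches the threshold.
        level = {}
        for cand in candidates:
            support = sum(cand <= basket for basket in baskets) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Build (k+1)-item candidates whose k-item subsets are all frequent.
        k += 1
        survivors = sorted({item for itemset in level for item in itemset})
        candidates = [
            frozenset(c)
            for c in combinations(survivors, k)
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
    return result


# Hypothetical toy data.
transactions = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
]
all_sets = frequent_itemsets(transactions, min_support=0.6)
short_sets = frequent_itemsets(transactions, min_support=0.6, max_len=1)
```

With max_len=1 the loop exits before any two-item candidates are counted, which is where the savings come from on large item catalogs.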

Changes

All feature index tuples in SequentialFeatureSelector are now in sorted order. (#262)
The SequentialFeatureSelector now runs the continuation of the floating inclusion/exclusion as described in Novovicova & Kittler (1994). Note that this didn't cause any difference in performance on any of the test scenarios but could lead to better performance in certain edge cases. (#262)
utils.Counter now accepts a name variable to help distinguish between multiple counters; the time precision can be set with the 'precision' kwarg, and the new attribute end_time holds the time at which the last iteration completed. (#278 via Mathew Savage)

Bug Fixes

Fixed a deprecation error that occurred with the McNemar test when using SciPy 1.0. (#283)

22 Oct 00:31

rasbt

v0.9.0

1072b27

Version 0.9.0

New Features

Added evaluate.permutation_test, a permutation test for hypothesis testing (or A/B testing) to test if two samples come from the same distribution. In other words, it is a procedure to test the null hypothesis that two groups are not significantly different (e.g., a treatment and a control group). (#250)
Added 'leverage' and 'conviction' as evaluation metrics to the frequent_patterns.association_rules function. (#246 & #247)
Added a loadings_ attribute to PrincipalComponentAnalysis to compute the factor loadings of the features on the principal components. (#251)
Allow grid search over classifiers/regressors in ensemble and stacking estimators. (#259)
New make_multiplexer_dataset function that creates a dataset generated by an n-bit Boolean multiplexer for evaluating supervised learning algorithms. (#263)
Added a new BootstrapOutOfBag class, an implementation of the out-of-bag bootstrap to evaluate supervised learning algorithms. (#265)
The parameters for StackingClassifier, StackingCVClassifier, StackingRegressor, StackingCVRegressor, and EnsembleVoteClassifier can now be tuned using scikit-learn's GridSearchCV (#254 via James Bourbeau)
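
The permutation test described in this list can be sketched in a few lines. The version below is an exact two-sided test on the absolute difference in means, written in pure Python for illustration; it is not the code behind evaluate.permutation_test, and the function name and sample values are assumptions.

```python
from itertools import combinations


def permutation_test_exact(x, y):
    """Exact two-sided permutation test on the absolute difference in means."""
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(pooled) - n_x
    total = sum(pooled)

    def abs_mean_diff(sum_x):
        return abs(sum_x / n_x - (total - sum_x) / n_y)

    observed = abs_mean_diff(sum(x))
    at_least_as_extreme = 0
    n_splits = 0
    # Enumerate every way to relabel n_x of the pooled values as group x.
    for idx in combinations(range(len(pooled)), n_x):
        n_splits += 1
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        if stat >= observed - 1e-12:  # small tolerance for float round-off
            at_least_as_extreme += 1
    return at_least_as_extreme / n_splits


# Hypothetical measurements for a treatment and a control group.
treatment = [12, 14, 15, 16]
control = [1, 2, 3, 4]
p_value = permutation_test_exact(treatment, control)
```

Enumerating all relabelings is only feasible for small samples; for larger ones, a random subset of permutations is the usual approximation.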

Changes

The 'support' column returned by frequent_patterns.association_rules was changed to compute the support of "antecedent union consequent", and new 'antecedant support' and 'consequent support' columns were added to avoid ambiguity. (#245)
Allow the OnehotTransactions to be cloned via scikit-learn's clone function, which is required by e.g., scikit-learn's FeatureUnion or GridSearchCV (via Iaroslav Shcherbatyi). (#249)
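
To make the three support values and the leverage and conviction metrics concrete, the sketch below computes them with their standard textbook definitions. The function name rule_metrics and the input values are hypothetical, and the code is not taken from frequent_patterns.association_rules.

```python
import math


def rule_metrics(support_ac, antecedent_support, consequent_support):
    """Metrics for a rule A -> C, given support(A u C), support(A), support(C)."""
    confidence = support_ac / antecedent_support
    lift = confidence / consequent_support
    # Leverage: observed joint support minus the support expected if A and C
    # were independent.
    leverage = support_ac - antecedent_support * consequent_support
    # Conviction is infinite for rules that never fail (confidence of 1).
    if confidence == 1.0:
        conviction = math.inf
    else:
        conviction = (1.0 - consequent_support) / (1.0 - confidence)
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical supports: A u C in 30%, A in 40%, C in 50% of transactions.
metrics = rule_metrics(0.3, 0.4, 0.5)
```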

Bug Fixes

Fix issues with self._init_time parameter in _IterativeModel subclasses. (#256)
Fix imprecision bug that occurred in plot_ecdf when run on Python 2.7. (#264)
The vectors from SVD in PrincipalComponentAnalysis are now being scaled so that solver='eigen' and solver='svd' store eigenvalues of the same magnitude. (#251)

09 Sep 08:47

rasbt

v0.8.0

e17f49b

Version 0.8.0

New Features

Added mlxtend.evaluate.bootstrap, which implements the ordinary nonparametric bootstrap to bootstrap a single statistic (for example, the mean, median, R^2 of a regression fit, and so forth) #232
SequentialFeatureSelector's k_features now accepts a string argument "best" or "parsimonious" for more "automated" feature selection. For instance, if "best" is provided, the feature selector will return the feature subset with the best cross-validation performance. If "parsimonious" is provided as an argument, the smallest feature subset that is within one standard error of the cross-validation performance will be selected. #238
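
The difference between "best" and "parsimonious" can be sketched as follows. The code reads "within one standard error" as within one standard error of the best score; that reading, the function name select_subset_size, and the scores are assumptions for illustration, not the selector's actual code.

```python
def select_subset_size(results, k_features="best"):
    """Pick a subset size from {size: (mean_cv_score, standard_error)}."""
    best_size = max(results, key=lambda size: results[size][0])
    if k_features == "best":
        return best_size
    if k_features == "parsimonious":
        best_score, best_se = results[best_size]
        threshold = best_score - best_se
        # Smallest subset whose score is within one standard error of the best.
        return min(
            size for size, (score, _) in results.items() if score >= threshold
        )
    raise ValueError("k_features must be 'best' or 'parsimonious'")


# Hypothetical cross-validation results per subset size.
results = {
    1: (0.80, 0.02),
    2: (0.90, 0.02),
    3: (0.93, 0.03),
    4: (0.94, 0.03),
    5: (0.92, 0.02),
}
best = select_subset_size(results, "best")
parsimonious = select_subset_size(results, "parsimonious")
```

Here the four-feature subset scores highest, but the three-feature subset falls within one standard error of it and is therefore the parsimonious choice.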

Changes

SequentialFeatureSelector now uses np.nanmean over normal mean to support scorers that may return np.nan #211 (via mrkaiser)
The skip_if_stuck parameter was removed from SequentialFeatureSelector in favor of a more efficient implementation comparing the conditional inclusion/exclusion results (in the floating versions) to the performances of previously sampled feature sets that were cached #237
ExhaustiveFeatureSelector was modified to consume substantially less memory #195 (via Adam Erickson)

Bug Fixes

Fixed a bug where the SequentialFeatureSelector selected a feature subset larger than the one specified via the k_features tuple max-value #213

23 Jun 03:36

rasbt

v0.7.0

7806f7c

Version 0.7.0

Version 0.7.0 (2017-06-22)

New Features

New mlxtend.plotting.ecdf function for plotting empirical cumulative distribution functions (#196).
New StackingCVRegressor for stacking regressors with out-of-fold predictions to prevent overfitting (#201, via Eike Dehling).
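
For readers unfamiliar with empirical cumulative distribution functions, the values such a plot displays can be computed as in the sketch below. It only prepares the x/y pairs and does no plotting; the function name and data are hypothetical and unrelated to the internals of mlxtend.plotting.ecdf.

```python
def ecdf_points(values):
    """Return sorted values and the fraction of samples <= each value."""
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


# Hypothetical sample.
xs, ys = ecdf_points([3.0, 1.0, 2.0, 4.0])
```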

Changes

The TensorFlow estimators have been removed from mlxtend, since TensorFlow now has very convenient ways to build on estimators, which render those implementations obsolete.
plot_decision_regions now supports plotting decision regions for more than 2 training features (#189, via James Bourbeau).
Parallel execution in mlxtend.feature_selection.SequentialFeatureSelector and mlxtend.feature_selection.ExhaustiveFeatureSelector is now performed over different feature subsets instead of the different cross-validation folds to better utilize machines with multiple processors if the number of features is large (#193, via @whalebot-helmsman).
Raise meaningful error messages if pandas DataFrames or Python lists of lists are fed into the StackingCVClassifier as fit arguments (#198).
The n_folds parameter of the StackingCVClassifier was changed to cv and can now accept any kind of cross validation technique that is available from scikit-learn. For example, StackingCVClassifier(..., cv=StratifiedKFold(n_splits=3)) or StackingCVClassifier(..., cv=GroupKFold(n_splits=3)) (#203, via Konstantinos Paliouras).

Bug Fixes

SequentialFeatureSelector now correctly accepts a None argument for the scoring parameter to infer the default scoring metric from scikit-learn classifiers and regressors (#171).
The plot_decision_regions function now supports pre-existing axes objects generated via matplotlib's plt.subplots. (#184, see example)
Made math.num_combinations and math.num_permutations numerically stable for large numbers of combinations and permutations (#200).
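
One common way to keep such counts stable for large inputs is to avoid floating-point factorials and use exact integer arithmetic. The sketch below shows that approach; it is an assumption about a reasonable technique, not a copy of how math.num_combinations or math.num_permutations were fixed, and it omits any extra parameters those functions may take.

```python
def num_combinations(n, k):
    """Number of ways to choose k of n items, using exact integers."""
    k = min(k, n - k)
    if k < 0:
        return 0
    result = 1
    for i in range(1, k + 1):
        # After step i, result equals C(n - k + i, i), so the division is exact.
        result = result * (n - k + i) // i
    return result


def num_permutations(n, k):
    """Number of ordered arrangements of k out of n items."""
    result = 1
    for factor in range(n - k + 1, n + 1):
        result *= factor
    return result


poker_hands = num_combinations(52, 5)
podiums = num_permutations(10, 3)
```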

18 Mar 22:52

rasbt

v0.6.0

efec082

Version 0.6.0

Version 0.6.0 (2017-03-18)

New Features

A new association_rules function generates rules based on a list of frequent itemsets (via Joshua Goerner).
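
Generating rules from frequent itemsets amounts to splitting each itemset into an antecedent and a consequent and keeping the splits whose confidence is high enough. The following pure-Python sketch shows that idea under the assumption that every subset of a frequent itemset is also present in the input; the function name and supports are hypothetical, and this is not the library's implementation.

```python
from itertools import combinations


def generate_rules(itemset_support, min_confidence=0.7):
    """Return (antecedent, consequent, confidence) tuples.

    itemset_support maps frozensets of items to their support.
    """
    rules = []
    for itemset, support in itemset_support.items():
        if len(itemset) < 2:
            continue
        # Try every non-empty proper subset as the antecedent.
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                confidence = support / itemset_support[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, confidence))
    return rules


# Hypothetical frequent itemsets with their supports.
supports = {
    frozenset(["milk"]): 0.8,
    frozenset(["bread"]): 0.6,
    frozenset(["milk", "bread"]): 0.6,
}
rules = generate_rules(supports, min_confidence=0.9)
```

Only the rule bread -> milk survives the 0.9 threshold here, because every transaction that contains bread also contains milk.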

Changes

Adds a black edgecolor to plots via plotting.plot_decision_regions to make markers more distinguishable from the background in matplotlib>=2.0.
The association submodule was renamed to frequent_patterns.

Bug Fixes

The DataFrame index of apriori results is now unique and ordered.

14 Feb 06:26

rasbt

v0.5.1

86e40d5

Version 0.5.1

Version 0.5.1 (2017-02-14)

The CHANGELOG for the current development version is available at
https://github.com/rasbt/mlxtend/blob/master/docs/sources/CHANGELOG.md.

New Features

The EnsembleVoteClassifier has a new refit attribute that prevents refitting classifiers if refit=False to save computational time.
Added a new lift_score function in evaluate to compute lift score (via Batuhan Bardak).
StackingClassifier and StackingRegressor support multivariate targets if the underlying models do (via kernc).
StackingClassifier has a new use_features_in_secondary attribute like StackingCVClassifier.
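
As background for the lift score entry above: lift compares a classifier's precision on the positive class with the rate at which positives occur overall. The sketch below uses that common definition; the function name lift and the labels are hypothetical, and this is not the evaluate.lift_score source.

```python
def lift(y_true, y_pred, positive_label=1):
    """Precision on the positive class divided by the positive base rate."""
    true_pos = sum(
        t == positive_label and p == positive_label
        for t, p in zip(y_true, y_pred)
    )
    predicted_pos = sum(p == positive_label for p in y_pred)
    actual_pos = sum(t == positive_label for t in y_true)
    precision = true_pos / predicted_pos
    base_rate = actual_pos / len(y_true)
    return precision / base_rate


# Hypothetical labels: 3 of 10 samples are positive; 2 of the 3 predicted
# positives are correct.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
score = lift(y_true, y_pred)
```

A value above 1 means the predicted positives contain a higher share of true positives than random selection would.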

Changes

Changed default verbosity level in SequentialFeatureSelector to 0
The EnsembleVoteClassifier now raises a NotFittedError if the estimator wasn't fit before calling predict. (via Anton Loss)
Added new TensorFlow variable initialization syntax to guarantee compatibility with TensorFlow 1.0

Bug Fixes

Fixed wrong default value for k_features in SequentialFeatureSelector
Cast selected feature subsets in the SequentialFeatureSelector as sets to prevent the iterator from getting stuck if the k_idx are different permutations of the same combination (via Zac Wellmer).
Fixed an issue with learning curves that caused the performance metrics to be reversed (via ipashchenko)
Fixed a bug that could occur in the SequentialFeatureSelector if there are similarly-well performing subsets in the floating variants (via Zac Wellmer).

11 Nov 07:20

rasbt

v0.5.0

d90bd1b

v0.5.0

Version 0.5.0

New Features

New ExhaustiveFeatureSelector estimator in mlxtend.feature_selection for evaluating all feature combinations in a specified range
The StackingClassifier has a new parameter average_probas that is set to True by default to maintain the current behavior. A deprecation warning was added though, and it will default to False in future releases (0.6.0); average_probas=False will result in stacking of the level-1 predicted probabilities rather than averaging these.
New StackingCVClassifier estimator in 'mlxtend.classifier' for implementing a stacking ensemble that uses cross-validation techniques for training the meta-estimator to avoid overfitting (Reiichiro Nakano)
New OnehotTransactions encoder class added to the preprocessing submodule for transforming transaction data into a one-hot encoded array
The SequentialFeatureSelector estimator in mlxtend.feature_selection now is safely stoppable mid-process by control+c, and deprecated print_progress in favor of a more tunable verbose parameter (Will McGinnis)
New apriori function in association to extract frequent itemsets from transaction data for association rule mining
New checkerboard_plot function in plotting to plot checkerboard tables / heat maps
New mcnemar_table and mcnemar functions in evaluate to compute 2x2 contingency tables and McNemar's test
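
The 2x2 contingency table and McNemar's test mentioned in this list can be illustrated without any dependencies. The sketch below uses the continuity-corrected chi-squared statistic with one degree of freedom; the table layout, function names, and counts are assumptions for illustration and may differ from what mcnemar_table and mcnemar do.

```python
import math


def build_table(y_true, pred_a, pred_b):
    """2x2 table: rows = model A right/wrong, columns = model B right/wrong."""
    table = [[0, 0], [0, 0]]
    for target, a, b in zip(y_true, pred_a, pred_b):
        row = 0 if a == target else 1
        col = 0 if b == target else 1
        table[row][col] += 1
    return table


def mcnemar_chi2(table, corrected=True):
    """Return (chi-squared statistic, p-value) from the off-diagonal counts."""
    b = table[0][1]  # A right, B wrong
    c = table[1][0]  # A wrong, B right
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree
    diff = abs(b - c) - (1 if corrected else 0)
    chi2 = diff ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for two classifiers on 10,000 test samples.
chi2, p_value = mcnemar_chi2([[9945, 25], [15, 15]])
```

For small off-diagonal counts an exact binomial test is usually preferred over the chi-squared approximation.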

Changes

All plotting functions have been moved to mlxtend.plotting for compatibility reasons with continuous integration services and to make the installation of matplotlib optional for users of mlxtend's core functionality
Added a compatibility layer for scikit-learn 0.18 using the new model_selection module while maintaining backwards compatibility to scikit-learn 0.17.

Bug Fixes

mlxtend.plotting.plot_decision_regions now draws decision regions correctly if more than 4 class labels are present
Raise AttributeError in plot_decision_regions when the X_highlight argument is a 1D array (chkoar)

25 Aug 02:43

rasbt

v0.4.2

93a4cdb

v0.4.2

Version 0.4.2 (2016-08-24)

New Features

Added preprocessing.CopyTransformer, a mock class that returns copies of input arrays via transform and fit_transform

Changes

Added AppVeyor to CI to ensure MS Windows compatibility
Datasets are now saved as compressed .txt or .csv files rather than being imported as Python objects
feature_selection.SequentialFeatureSelector now supports the selection of k_features using a tuple to specify a "min-max" k_features range
Added "SVD solver" option to the PrincipalComponentAnalysis
Raise an AttributeError with "not fitted" message in SequentialFeatureSelector if transform or get_metric_dict are called prior to fit
Use small, positive bias units in TfMultiLayerPerceptron's hidden layer(s) if the activations are ReLUs in order to avoid dead neurons
Added an optional clone_estimator parameter to the SequentialFeatureSelector that defaults to True, avoiding the modification of the original estimator objects
More rigorous type and shape checks in the evaluate.plot_decision_regions function
DenseTransformer now doesn't raise an error if the input array is not sparse
API clean-up using scikit-learn's BaseEstimator as parent class for feature_selection.ColumnSelector

Bug Fixes

Fixed a problem when a tuple-range was provided as argument to the SequentialFeatureSelector's k_features parameter and the scoring metric was more negative than -1 (e.g., as in scikit-learn's MSE scoring function) via wahutch
Fixed an AttributeError issue when verbose > 1 in StackingClassifier
Fixed a bug in classifier.SoftmaxRegression where the mean values of the offsets were used to update the bias units rather than their sum
Fixed rare bug in MLP _layer_mapping functions that caused a swap between the random number generation seed when initializing weights and biases

02 May 00:17

rasbt

0.4.1

cc54c52

v0.4.1

Version 0.4.1 (2016-05-01)

New Features

New TensorFlow estimator for Linear Regression (tf_regressor.TfLinearRegression)
New k-means clustering estimator (cluster.Kmeans)
New TensorFlow k-means clustering estimator (tf_cluster.Kmeans)

Changes

Due to refactoring of the estimator classes, the init_weights parameter of the fit methods was globally renamed to init_params
Overall performance improvements of estimators due to code clean-up and refactoring
Added several additional checks for correct array types and more meaningful exception messages
Added optional dropout to the tf_classifier.TfMultiLayerPerceptron classifier for regularization
Added an optional decay parameter to the tf_classifier.TfMultiLayerPerceptron classifier for adaptive learning via an exponential decay of the learning rate eta
Replaced old NeuralNetMLP by more streamlined MultiLayerPerceptron (classifier.MultiLayerPerceptron); now also with softmax in the output layer and categorical cross-entropy loss.
Unified init_params parameter for fit functions to continue training where the algorithm left off (if supported)

01 Feb 01:03

rasbt

v0.3.0

974e0ba

v0.3.0

Version 0.3.0 (2016-01-31)

The mlxtend.preprocessing.standardize function now optionally returns the parameters, which are estimated from the array, for re-use. A further improvement makes the standardize function smarter in order to avoid zero-division errors
Added a progress bar tracker to classifier.NeuralNetMLP
Added a function to score predicted vs. target class labels evaluate.scoring
Added confusion matrix functions to create (evaluate.confusion_matrix) and plot (evaluate.plot_confusion_matrix) confusion matrices
Cosmetic improvements to the evaluate.plot_decision_regions function such as hiding plot axes
Renaming of classifier.EnsembleClassifier to classifier.EnsembleVoteClassifier
Improved random weight initialization in Perceptron, Adaline, LinearRegression, and LogisticRegression
Changed learning parameter of mlxtend.classifier.Adaline to solver and added "normal equation" as closed-form solution solver
New style parameter and improved axis scaling in mlxtend.evaluate.plot_learning_curves
Hide y-axis labels in mlxtend.evaluate.plot_decision_regions in 1 dimensional evaluations
Added loadlocal_mnist to mlxtend.data for streaming MNIST from local byte files into numpy arrays
New NeuralNetMLP parameters: random_weights, shuffle_init, shuffle_epoch
Sequential Feature Selection algorithms were unified into a single SequentialFeatureSelector class with parameters to enable floating selection and toggle between forward and backward selection.
New SFS features such as the generation of pandas DataFrame results tables and plotting functions (with confidence intervals, standard deviation, and standard error bars)
Added support for regression estimators in SFS
Stratified sampling of MNIST (now 500x random samples from each of the 10 digit categories)
Added Boston housing dataset
Renaming mlxtend.plotting to mlxtend.general_plotting in order to distinguish general plotting functions from specialized utility functions such as evaluate.plot_decision_regions
Shuffle fix and new shuffle parameter for classifier.NeuralNetMLP
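
The first entry in this list, returning the estimated parameters from standardize for re-use while avoiding zero-division, can be sketched as follows. The code works on plain lists of columns and uses a hypothetical signature; it illustrates the idea only and does not mirror the arguments or return format of mlxtend.preprocessing.standardize.

```python
import math


def standardize_columns(columns, params=None):
    """Standardize each column; estimate (mean, std) pairs unless given."""
    if params is None:
        params = []
        for col in columns:
            mean = sum(col) / len(col)
            variance = sum((v - mean) ** 2 for v in col) / len(col)
            std = math.sqrt(variance)
            # A constant column has std 0; use 1.0 so the division is safe.
            params.append((mean, std if std > 0 else 1.0))
    scaled = [
        [(v - mean) / std for v in col]
        for col, (mean, std) in zip(columns, params)
    ]
    return scaled, params


# Hypothetical training data; the second column is constant.
train = [[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]]
train_scaled, train_params = standardize_columns(train)

# Re-use the training parameters on new data.
test_scaled, _ = standardize_columns([[2.0], [7.0]], params=train_params)
```

Re-using the training parameters keeps new data on the same scale as the data the model was fit on.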
