All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added the event_name argument for LRScheduler for optional recording of LR changes inside net.history. NOTE: supported only in PyTorch >= 1.4 (see the sketch after this list)
- Make it easier to add custom modules or optimizers to a neural net class by automatically registering them where necessary and by making them available to set_params
- Added the step_every argument for LRScheduler to set whether the scheduler step should be taken on every epoch or on every batch.
- Removed support for schedulers with a batch_step() method in LRScheduler.
- Raise FutureWarning in CVSplit when random_state is not used. Will raise an exception in a future release (#620)
- The behavior of method net.get_params changed to make it more consistent with sklearn: it will no longer return "learned" attributes like module_; therefore, functions like sklearn.base.clone, when called with a fitted net, will no longer return a fitted net but instead an uninitialized net; if you want a copy of a fitted net, use copy.deepcopy instead; net.get_params is used under the hood by many sklearn functions and classes, such as GridSearchCV, whose behavior may thus be affected by the change. (#521, #527)
- Raise FutureWarning when using the CyclicLR scheduler, because the default behavior has changed from taking a step every batch to taking a step every epoch. (#626)
- Set train/validation on criterion if it's a PyTorch module (#621)
- Don't pass y=None to NeuralNet.train_split to enable the direct use of split functions without a positional y in their signatures. This is useful when working with unsupervised data (#605).
- Fixed a bug where the CyclicLR scheduler would update during both training and validation rather than just during training.
- Fixed a bug introduced by moving the optimizer.zero_grad() call outside of the train step function, making it incompatible with LBFGS and other optimizers that call the train step several times per batch (#636)
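A minimal sketch of the new event_name and step_every arguments described above; the toy module, max_epochs, and step_size values are illustrative assumptions, not part of skorch:

    # Sketch only: ToyClassifier and the hyperparameter values are stand-ins.
    import torch
    from torch import nn
    from torch.optim.lr_scheduler import StepLR
    from skorch import NeuralNetClassifier
    from skorch.callbacks import LRScheduler

    class ToyClassifier(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(20, 2)

        def forward(self, X):
            return torch.softmax(self.lin(X), dim=-1)

    lr_schedule = LRScheduler(
        policy=StepLR,
        step_size=10,
        event_name='event_lr',  # record the current LR in net.history (PyTorch >= 1.4)
        step_every='epoch',     # or 'batch' to take a scheduler step after every batch
    )

    net = NeuralNetClassifier(ToyClassifier, max_epochs=20, callbacks=[lr_schedule])
    # After net.fit(X, y), the recorded learning rates are in net.history[:, 'event_lr'].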
0.8.0 - 2020-04-11
- Added NeptuneLogger callback for logging experiment metadata to neptune.ai (#586)
- Add DataFrameTransformer, an sklearn compatible transformer that helps working with pandas DataFrames by transforming the DataFrame into a representation that works well with neural networks (#507; see the sketch at the end of this section)
- Added WandbLogger callback for logging to Weights & Biases (#607)
- Added None option to device, which leaves the device(s) unmodified (#600)
- Add PassthroughScoring, a scoring callback that just calculates the average score of a metric determined at batch level and then writes it to the epoch level (#595)
- When using caching in scoring callbacks, no longer uselessly iterate over the data; this can save time if iteration is slow (#552, #557)
- Cleaned up duplicate code in the fit_loop (#564)
- WARNING: In release 0.10.0 of skorch, Python 3.5 support will be officially dropped (#634)
- Make skorch compatible with sklearn 0.22 (#571, #573, #575)
- Fixed a bug that could occur when a new "settable" (via set_params) attribute was added to NeuralNet whose name starts the same as an existing attribute's name (#590)
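A minimal sketch of how DataFrameTransformer, mentioned above, might be used; the DataFrame contents are made-up assumptions, and the comments describe the output only in broad strokes:

    # Sketch only: column names and values are illustrative; see the skorch
    # docs for the exact output layout of DataFrameTransformer.
    import pandas as pd
    from skorch.helper import DataFrameTransformer

    df = pd.DataFrame({
        'age': [22.0, 35.0, 58.0],
        'income': [1500.0, 3200.0, 4100.0],
        'city': pd.Categorical(['london', 'paris', 'london']),
    })

    tf = DataFrameTransformer()
    Xt = tf.fit_transform(df)
    # Xt is a dict of arrays: the numeric columns are collected under 'X',
    # while categorical columns such as 'city' are int-encoded under their own
    # key, which makes the result easy to feed to a module with keyword inputs.
    print({key: arr.shape for key, arr in Xt.items()})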
0.7.0 - 2019-11-29
- More careful check for wrong parameter names being passed to NeuralNet (#500)
- More helpful error messages when trying to predict using an uninitialized model
- Add TensorBoard callback for automatic logging to tensorboard
- Make NeuralNetBinaryClassifier work with sklearn.calibration.CalibratedClassifierCV
- Improve NeuralNetBinaryClassifier compatibility with certain sklearn metrics (#515)
- NeuralNetBinaryClassifier automatically squeezes module output if necessary (#515)
- NeuralNetClassifier now has a classes_ attribute after fit is called, which is inferred from y by default (#465, #486)
- NeuralNet.load_params with a checkpoint now initializes when needed (#497)
- Improve numerical stability when using NLLLoss in NeuralNetClassifier (#491)
- Refactor code to make gradient accumulation easier to implement (#506)
- NeuralNetBinaryClassifier.predict_proba now returns a 2-dim array; to access the "old" y_proba, take y_proba[:, 1] (#515; see the sketch at the end of this section)
- net.history is now a property that accesses net.history_, which stores the History object (#527)
- Remove deprecated skorch.callbacks.CyclicLR; use torch.optim.lr_scheduler.CyclicLR instead
- WARNING: In a future release, the behavior of method net.get_params will change to make it more consistent with sklearn: it will no longer return "learned" attributes like module_. Therefore, functions like sklearn.base.clone, when called with a fitted net, will no longer return a fitted net but instead an uninitialized net. If you want a copy of a fitted net, use copy.deepcopy instead. Note that net.get_params is used under the hood by many sklearn functions and classes, such as GridSearchCV, whose behavior may thus be affected by the change. (#521, #527)
- Fixed a bug that caused LoadInitState not to work with TrainEndCheckpoint (#528)
- Fixed NeuralNetBinaryClassifier wrongly squeezing the batch dimension when using batch_size = 1 (#558)
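A minimal, self-contained sketch of the new 2-dim predict_proba behavior noted above; the toy module and random data are assumptions for illustration only:

    # Sketch only: ToyBinaryModule and the random data are stand-ins.
    import numpy as np
    import torch
    from torch import nn
    from skorch import NeuralNetBinaryClassifier

    class ToyBinaryModule(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(20, 1)

        def forward(self, X):
            # raw logits, squeezed to shape (batch_size,)
            return self.lin(X).squeeze(-1)

    X = np.random.randn(64, 20).astype('float32')
    y = (X[:, 0] > 0).astype('float32')

    net = NeuralNetBinaryClassifier(ToyBinaryModule, max_epochs=3, lr=0.1)
    net.fit(X, y)

    y_proba = net.predict_proba(X)  # now 2-dim, shape (n_samples, 2)
    p_positive = y_proba[:, 1]      # the "old" 1-dim y_proba (probability of class 1)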
0.6.0 - 2019-07-19
- Adds FAQ entry regarding the initialization behavior of NeuralNet when passed instantiated models. (#409)
- Added CUDA pickle test including an artifact that supports testing on CUDA-less CI machines
- Adds train_batch_count and valid_batch_count to history in training loop. (#445)
- Adds score method for NeuralNetClassifier, NeuralNetBinaryClassifier, and NeuralNetRegressor (#469)
- Wrapper class for torch Datasets to make them work with some sklearn features (e.g. grid search). (#443)
- Repository moved to https://github.com/skorch-dev/skorch/, please change your git remotes
- Treat CUDA dependent attributes as prefixes to cover values set using set_params, since previously "criterion_" would not match net.criterion__weight as set by net.set_params(criterion__weight=w) (see the sketch at the end of this section)
- skorch pickle format changed in order to improve CUDA compatibility; if you have pickled models, please re-pickle them to be able to load them in the future
- net.criterion_ and its parameters are now moved to the target device when using criteria that inherit from torch.nn.Module. Previously the user had to make sure that parameters such as class weight are on the compute device
- skorch now assumes PyTorch >= 1.1.0. This mainly affects learning rate schedulers, whose inner workings have been changed with version 1.1.0. This update will also invalidate pickled skorch models after a change introduced in PyTorch optimizers.
- Include requirements in MANIFEST.in
- Add criterion_ to NeuralNet.cuda_dependent_attributes_ to avoid issues with criterion weight tensors from, e.g., NLLLoss (#426)
- TrainEndCheckpoint can be cloned by sklearn.base.clone. (#459)
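A minimal sketch of the criterion__weight pattern referenced above; the toy module and the weight values are assumptions for illustration:

    # Sketch only: ToyClassifier and the weights are illustrative.
    import torch
    from torch import nn
    from skorch import NeuralNetClassifier

    class ToyClassifier(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(20, 2)

        def forward(self, X):
            return torch.softmax(self.lin(X), dim=-1)

    net = NeuralNetClassifier(ToyClassifier, criterion=nn.NLLLoss)

    # Class weights are set like any other nested hyperparameter; with this
    # release the criterion (and hence the weight tensor) is moved to the
    # target device, and "criterion_" is treated as a prefix so that
    # criterion__weight is covered by the CUDA dependent attributes.
    net.set_params(criterion__weight=torch.tensor([0.3, 0.7]))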
0.5.0 - 2018-12-13
- Basic usage notebook now runs on Google Colab
- Advanced usage notebook now runs on Google Colab
- MNIST with scikit-learn and skorch now runs on Google Colab
- Better user-facing messages when module or optimizer are re-initialized
- Added an experimental API (net._register_virtual_param) to register "virtual" parameters on the network with custom setter functions. (#369)
- Setting parameters lr, momentum, optimizer__lr, etc. no longer resets the optimizer. As of now you can do net.set_params(lr=0.03) or net.set_params(optimizer__param_group__0__momentum=0.86) without triggering a re-initialization of the optimizer (#369; see the sketch at the end of this section)
- Support for scipy sparse CSR matrices as input (as, e.g., returned by sklearn's CountVectorizer); note that they are cast to dense matrices during batching
- Helper functions to build command line interfaces with almost no boilerplate, plus an example that shows their usage
- Reduce overhead of BatchScoring when using train_loss_score or valid_loss_score by skipping the superfluous inference step (#381)
- The on_grad_computed callback function will yield an iterable for named_parameters only when it is used, to reduce the run-time overhead of the call (#379)
- Default fn_prefix in TrainEndCheckpoint is now train_end_ (#391)
- Issues a warning when Checkpoint's monitor parameter is set to monitor and the history contains <monitor>_best. (#399)
- Re-initialize optimizer when set_params is called with the lr argument (#372)
- Copying a SliceDict now returns a SliceDict instead of a dict (#388)
- Calling == on SliceDicts now works as expected when values are numpy arrays and torch tensors
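A minimal sketch of updating the learning rate without re-initializing the optimizer, as described in the set_params item above; the toy module is an assumption for illustration:

    # Sketch only: ToyClassifier is a stand-in module.
    import torch
    from torch import nn
    from skorch import NeuralNetClassifier

    class ToyClassifier(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(20, 2)

        def forward(self, X):
            return torch.softmax(self.lin(X), dim=-1)

    net = NeuralNetClassifier(ToyClassifier, lr=0.1)
    net.initialize()

    # As of this release, this updates the existing optimizer's parameter
    # groups in place instead of re-initializing the optimizer (and losing
    # its state).
    net.set_params(lr=0.03)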
0.4.0 - 2018-10-24
- Support for PyTorch 0.4.1
- There is no need to explicitly name callbacks anymore (names are assigned automatically, name conflicts are resolved).
- You can now access the training data in the on_grad_computed event
- There is a new image segmentation example
- Easily create toy network instances for quick experiments using skorch.toy.make_classifier and friends
- New ParamMapper callback to modify/freeze/unfreeze parameters at certain points in time during training:
>>> from skorch.callbacks import Freezer, Unfreezer
>>> net = Net(module, callbacks=[Freezer('layer*.weight'), Unfreezer('layer*.weight', at=10)])
- Refactored EpochScoring for easier sub-classing
- Checkpoint callback now supports saving the optimizer; this avoids problems with stateful optimizers such as Adam or RMSprop (#360)
- Added LoadInitState callback for easy continued training from checkpoints (#360; see the sketch at the end of this section)
- NeuralNet.load_params now supports loading from Checkpoint instances
- Added documentation for saving and loading
- The ProgressBar callback now determines the batches per epoch automatically by default (batches_per_epoch=auto)
- The on_grad_computed event now has access to the current training data batch
- Deprecated filtered_optimizer in favor of the Freezer callback (#346)
- NeuralNet.load_params and NeuralNet.save_params deprecate the f parameter in favor of f_optimizer, f_params, and f_history (#360)
- uses_placeholder_y should not require existence of the y field (#311)
- LR scheduler creates batch_idx on first run (#314)
- Use OrderedDict for callbacks to fix Python 3.5 compatibility issues (#331)
- Make to_tensor work correctly with PackedSequence (#335)
- Rewrite History to not use any recursion to avoid memory leaks during exceptions (#312)
- Use flaky in some neural network tests to hide platform differences
- Fixes ReduceLROnPlateau when mode == max (#363)
- Fix disconnected weights between net and optimizer after copying the net with copy.deepcopy (#318)
- Fix a bug that interfered with loading CUDA models when the model was a CUDA tensor but the net was configured to use the CPU (#354, #358)
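A minimal sketch of continued training with the Checkpoint and LoadInitState callbacks added in this release; the toy module and the 'exp1' directory name are assumptions, and the keyword arguments follow the current skorch API rather than this specific release:

    # Sketch only: ToyClassifier and the checkpoint directory are illustrative.
    import torch
    from torch import nn
    from skorch import NeuralNetClassifier
    from skorch.callbacks import Checkpoint, LoadInitState

    class ToyClassifier(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(20, 2)

        def forward(self, X):
            return torch.softmax(self.lin(X), dim=-1)

    # The checkpoint now also stores the optimizer state, so stateful
    # optimizers such as Adam resume correctly.
    cp = Checkpoint(dirname='exp1')

    net = NeuralNetClassifier(
        ToyClassifier,
        optimizer=torch.optim.Adam,
        callbacks=[cp, LoadInitState(cp)],  # restores the last checkpoint, if any
    )
    # On the first net.fit(X, y) there is nothing to load; on later runs,
    # training continues from the saved parameters, optimizer state and history.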