Releases: catalyst-team/catalyst
Catalyst 20.05.1
v20.05.1 Update CHANGELOG.md
Catalyst 20.05
v20.05 Merge branch 'master' of github.com:catalyst-team/catalyst
Catalyst 20.03.1
tl;dr
We have finally organised Experiment-Runner-State-Callback as it should be.
We also have a great documentation update!
Core
- Experiment - an abstraction that contains information about the experiment – a model, a criterion, an optimizer, a scheduler, and their hyperparameters. It also contains information about the data and transformations used. In general, the Experiment knows what you would like to run.
- Runner - a class that knows how to run an experiment. It contains all the logic of how to run the experiment, stages, epochs and batches.
- State - some intermediate storage between Experiment and Runner that saves the current state of the Experiment – model, criterion, optimizer, schedulers, metrics, loaders, callbacks, etc.
- Callback - a powerful abstraction that lets you customize your experiment run logic. To give users maximum flexibility and extensibility, we allow callback execution anywhere in the training loop:
on_stage_start
    on_epoch_start
        on_loader_start
            on_batch_start
            # ...
            on_batch_end
    on_epoch_end
on_stage_end
on_exception
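To make the nesting of these events concrete, here is a minimal, framework-free sketch of how a runner could fire them. It is an illustration only, not Catalyst's implementation: the `Callback` stand-in, `Recorder`, `run_event`, `run_stage` and the `state` attributes (`epoch`, `loader_name`, `batch`, `exception`) are all hypothetical names chosen for this example.

```python
from types import SimpleNamespace

# Event names from the list above, plus on_loader_end, which the metric
# example in these notes also uses.
EVENTS = (
    "on_stage_start", "on_epoch_start", "on_loader_start", "on_batch_start",
    "on_batch_end", "on_loader_end", "on_epoch_end", "on_stage_end",
    "on_exception",
)


class Callback:
    """Stand-in base class (not catalyst.dl.Callback): every hook is a no-op."""


for _event in EVENTS:
    setattr(Callback, _event, lambda self, state: None)


class Recorder(Callback):
    """Hypothetical callback that records a few of the events it receives."""

    def __init__(self):
        self.seen = []

    def on_epoch_start(self, state):
        self.seen.append(("epoch_start", state.epoch))

    def on_loader_start(self, state):
        self.seen.append(("loader_start", state.loader_name))

    def on_batch_end(self, state):
        self.seen.append(("batch_end", state.loader_name, state.batch))

    def on_exception(self, state):
        self.seen.append(("exception", type(state.exception).__name__))


def run_event(callbacks, event, state):
    """Call the hook named `event` on every callback, passing the shared state."""
    for callback in callbacks:
        getattr(callback, event)(state)


def run_stage(callbacks, loaders, num_epochs=1):
    """Run one stage: epochs contain loaders, loaders contain batches."""
    state = SimpleNamespace(epoch=0, loader_name=None, batch=None, exception=None)
    try:
        run_event(callbacks, "on_stage_start", state)
        for epoch in range(1, num_epochs + 1):
            state.epoch = epoch
            run_event(callbacks, "on_epoch_start", state)
            for loader_name, batches in loaders.items():
                state.loader_name = loader_name
                run_event(callbacks, "on_loader_start", state)
                for batch in batches:
                    state.batch = batch
                    run_event(callbacks, "on_batch_start", state)
                    # a real runner would do the forward pass here
                    run_event(callbacks, "on_batch_end", state)
                run_event(callbacks, "on_loader_end", state)
            run_event(callbacks, "on_epoch_end", state)
        run_event(callbacks, "on_stage_end", state)
    except Exception as exception:
        # any failure inside the loop is routed to on_exception
        state.exception = exception
        run_event(callbacks, "on_exception", state)
    return state


recorder = Recorder()
run_stage([recorder], {"train": [10, 11], "valid": [20]}, num_epochs=1)
```

After the call, `recorder.seen` holds the events in the order they fired, which is a quick way to check where a custom hook sits in the loop.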
How to combine them together?
First of all, just read the docs for the State, Experiment, Runner and Callback abstractions.
Long story short, State stores everything during the experiment and is passed to every Callback in the Experiment through Runner.run_event.
For example, in the usual case of a custom metric implementation, all you need to do is:
from catalyst.dl import Callback, State

class MyPureMetric(Callback):

    def on_batch_end(self, state: State):
        """To store batch-based metrics"""
        state.batch_metrics[{metric_name}] = metric_value

    def on_loader_end(self, state: State):
        """To store loader-based metrics"""
        state.loader_metrics[{metric_name}] = metric_value

    def on_epoch_end(self, state: State):
        """To store epoch-based metrics"""
        state.epoch_metrics[{metric_name}] = metric_value
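As a filled-in illustration of the template above, the sketch below computes accuracy per batch and its exact value per loader. To stay runnable without Catalyst installed, it uses a plain `SimpleNamespace` in place of `State`; the `predictions` and `targets` attributes and the `AccuracyMetric` class are assumptions made for this example, not part of the Catalyst API.

```python
from types import SimpleNamespace


class AccuracyMetric:
    """Hypothetical callback: batch accuracy plus the loader-level accuracy."""

    def __init__(self, name="accuracy"):
        self.name = name
        self._correct = 0
        self._total = 0

    def on_loader_start(self, state):
        # reset the running counters for every loader
        self._correct = 0
        self._total = 0

    def on_batch_end(self, state):
        correct = sum(int(p == t) for p, t in zip(state.predictions, state.targets))
        count = len(state.targets)
        state.batch_metrics[self.name] = correct / count
        self._correct += correct
        self._total += count

    def on_loader_end(self, state):
        # accumulate counts rather than averaging batch accuracies, so that
        # batches of different sizes are weighted correctly
        state.loader_metrics[self.name] = self._correct / self._total


# Stand-in for State: only the fields this example needs.
state = SimpleNamespace(batch_metrics={}, loader_metrics={}, predictions=None, targets=None)

metric = AccuracyMetric()
metric.on_loader_start(state)
for predictions, targets in [([1, 0, 1, 1], [1, 0, 0, 1]), ([0, 0], [0, 1])]:
    state.predictions, state.targets = predictions, targets
    metric.on_batch_end(state)
metric.on_loader_end(state)
```

Here the two batches score 3/4 and 1/2, while the loader-level value is 4/6, which differs from the plain mean of the batch values because the batches have different sizes.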
Many more Catalyst concepts, tutorials and docs are coming in the near future.
Callback updates
- CheckRunCallback - allows you to check the pipeline correctness during long-run training
- TimerCallback - enables / disables the calculation of the work speed of your training, like data time per batch, model time per batch, samples per second
- MetricManagerCallback - transfers torch tensors to numpy and calculates statistics
- ValidationManagerCallback - collects validation metrics and checks if it's the best epoch
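For readers unfamiliar with the quantities TimerCallback reports, the following is a rough sketch of how data time, model time and samples per second can be measured around a training loop. It does not reproduce TimerCallback itself: `timed_loop`, its `clock` parameter and the returned dictionary keys are hypothetical names used only here.

```python
import time


def timed_loop(batches, step, clock=time.perf_counter):
    """Measure, per batch, the time spent fetching data and running `step`."""
    stats = []
    iterator = iter(batches)
    while True:
        fetch_start = clock()
        try:
            batch = next(iterator)
        except StopIteration:
            break
        fetch_end = clock()
        step(batch)  # forward/backward pass in a real pipeline
        step_end = clock()

        batch_time = step_end - fetch_start
        stats.append({
            "data_time": fetch_end - fetch_start,
            "model_time": step_end - fetch_end,
            "samples_per_second": (
                len(batch) / batch_time if batch_time > 0 else float("inf")
            ),
        })
    return stats


# Usage with real time: two batches and a trivial "model" step.
timings = timed_loop([[1, 2, 3, 4], [5, 6]], step=lambda batch: sum(batch))
```

Passing the clock as a parameter is a design choice for testability: a fake clock makes the measurements deterministic.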
Catalyst best practices
When working with Catalyst.DL, it's better to import everything in a straightforward way, like
from catalyst.dl import SomethingGreat
from catalyst.dl import utils
utils.do_something_cool()
Breaking changes
- CriterionAggregatorCallback moved to catalyst.contrib and will be deprecated in the 20.04 release.
- For SchedulerCallback, reduce_metric was renamed to reduced_metric :)
- We have updated the metric recording mechanism for State-Callback.
Future work
During the 20.03 -> 20.04 releases, we are going to deprecate all SomeContribRunner classes and transfer them to SomeContribLogger as a more general-purpose solution.
Catalyst 20.02.2
Overall
- scripts update
- formatting update
- project manifesto #532
- Alchemy intro #563 #564
- Travis -> TeamCity CI #597 #616 #634
- Core init #578
- tests global update #661
DL
- config wizard #499
- Triplet loss contribution #504
- TelegramLogger support #471
- Trace support for fp16 and Nvidia Apex #497
- runner params fix #545 #633
- weighted sum support for loss calculation #535
- classwise IOU support #533
- BERT text classification example #540
- Ralamb contribution #551
- samplers params support #550
- KNNMetricCallback contribution #560
- MetricAggregationCallback contribution #591
- text to embeddings script #596 #601
- Several metric learning features #589 #598 #599
- Neptune integration #571
- transforms params support #595 #604
- SMP integration #600
- GAN 2.0 #607 #585
- Cutmix callback contribution #635
- RMS Normalization contribution #649
- EarlyStoppingCallback fix #664
- better distributed & slurm support #639 #629 #662 #628
Catalyst 19.11
Catalyst 19.10 -> 19.11
Achievements
NeurIPS 2019: Learn to Move - Walk Around, 2nd place
Overall
- per_gpu_scale feature for multi-gpu experiment runs #406
- contribution guide update #422
- image extension check #432
- pytorch.tensorboard support #439
- docker update #463 #476
- documentation update #475
- codestyle formatting update #477
DL
- GAN example #407
- wandb fix #410
- data mixins #412
- jupyter notebooks dump feature #413
- segmentation tutorial #415 #441 #451
- exponential format support for Config API #418
- pytorch 1.3 naming issue fix #447
- MaskReader #446
- Epoch num: 0 to 1 #411
- wandb logging fix #458
- all metrics logging to checkpoint #455
- imread fix #473
- Global Precision/Recall/F1 Callback #433
- Loggers logic update #443
- runner device fix #482
- checkpoint callback path correctness fix #484
RL
Catalyst 19.10
Catalyst 19.09 -> 19.10
Achievements
NeurIPS 2019: Recursion Cellular Image Classification
- 4th place solution writeup
- 8th place solution
Overall
- LAMA refactoring #353
- TemporalConcatPooling #355
- Extensions on/off support #368
- Hall of Glory update #385
- binary_mask_to_overlay_image refactoring #387
- get_utcnow_time feature #389
- image preprocessing refactoring #398
- TF seed fix #399
- shorthand for bool flag #405
DL
- Table data tutorial #351
- Wandb integration tutorial 365ed7d
- lovasz loss fix #359
- SupervisedRunner, one loader handling #362
- SupervisedRunner.predict_loader, model support #367
- resume from previous stage fix #328
- CriterionAggregator #361
- Weight decay decoupling #370
- Binary class support for AccuracyCallback #374
- SupervisedRunner.train, resume feature #377
- Apex distributed synchronised batchnorm support #379
- Catalyst Init feature #372
- Dict transform support for MergeDataset #388
- BCEDiceLoss weights support #394
- Multi-label accuracy support #381
- PyTorch loaders additional params support #397
RL
Catalyst 19.09
Catalyst 19.08 -> 19.09
Ecosystem
We are happy to announce the MLComp release – a distributed DAG (directed acyclic graph) framework for machine learning with a UI. Powered by Catalyst.Team.
We have also released a detailed classification tutorial and a comprehensive classification pipeline.
Slowly but surely, more and more challenges are powered by Catalyst #302.
We have also updated the licence to Apache 2.0, started a Patreon and even launched the catalyst-info repo!
And finally, we have integrated wandb into Catalyst, for both DL & RL!
Overall
- config dump to tensorboard #284 – even more reproducibility for both DL and RL
- bash scripts for parallel-gpu-run and catalyst-rl-run #288
- improved CUDNN deterministic and CUDNN benchmark support #299 #309
- available gpus check feature #300
- SequentialNet update with "soft" residual mode #301
- docs update #303 #336
- layer-wise learning rate support #283
- torchnet dependency removal #304
- environment variables and packages versions dump to tensorboard #313 #317
- tracing script update #319
- additional general case CV scripts #321
- RAdam, LookAhead, Ranger optimizers support #332
- wandb integration #337 #339 #341
DL
- Exception handling support for callbacks #281 #307
- tests update #286
- traced models support for dataset prediction #287
- tensorboard logger fix #294
- state_dict param support for all contrib encoders #292
- bias weight decay autoremove #293
- checkpoint save on exception #295
- classification tutorial #296 #297 #326
- better console logging #298
- segmentation models support for dl registry #329
- one-hot feature #331
- accuracy metric fix #340
- path dataset support #335
- callback ordering support #343
- encoders requires_grad logic update #346
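The "bias weight decay autoremove" item (#293) is not described further in these notes. Assuming it refers to the common practice of excluding bias parameters from weight decay, a minimal sketch of that practice could look like the following. The function name `split_weight_decay` and the name-based test are assumptions for illustration; the returned structure follows the per-parameter-group format accepted by PyTorch optimizers.

```python
def split_weight_decay(named_params, weight_decay):
    """Build two parameter groups: weights keep decay, biases get none.

    `named_params` is any iterable of (name, parameter) pairs, for example
    the result of `model.named_parameters()` in PyTorch.
    """
    decay, no_decay = [], []
    for name, param in named_params:
        # assumption: bias parameters are identified by their name suffix
        if name.endswith("bias"):
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


# Usage with placeholder objects standing in for real tensors.
groups = split_weight_decay(
    [("fc1.weight", "W1"), ("fc1.bias", "b1"), ("fc2.weight", "W2")],
    weight_decay=1e-4,
)
```

With real tensors, the resulting list can be passed as the first argument of an optimizer such as `torch.optim.SGD`, together with a learning rate.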
RL
Catalyst 19.08
Catalyst 19.07 -> 19.08
Overall
DL
- updated InferMaskCallback #252
- scientific notation #248
- tracing tests #255
- BaseExperiment update #269
- BatchSampler support #271
- metrics.json feature #266
- auto accuracy args #274
- lr linear scaling #273
- BaseCheckpointCallback #276
- SupervisedRunner update, new examples #278
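The "lr linear scaling" item (#273) presumably refers to the linear scaling rule, where the learning rate grows in proportion to the batch size relative to a reference batch size. The sketch below shows only that arithmetic; the function name `linear_scaled_lr` and the default reference batch size of 256 are assumptions for this example and may not match Catalyst's own configuration keys or defaults.

```python
def linear_scaled_lr(base_lr, batch_size, base_batch_size=256):
    """Scale `base_lr` linearly with the ratio batch_size / base_batch_size."""
    if base_batch_size <= 0:
        raise ValueError("base_batch_size must be positive")
    return base_lr * batch_size / base_batch_size


# Doubling the batch size relative to the reference doubles the learning rate.
scaled = linear_scaled_lr(base_lr=0.1, batch_size=512, base_batch_size=256)
```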
RL
- critic refactoring #253
- host support for dbWrappers #254
- RawObservationWrapper #247 and #261
- distributional PPO PoC #257
- PPO fix #258
- RL tests: #259, #260, #262
- RL epoch limit feature #263
- additional policy heads support #264
- Advantage & entropy regularization DQN support #265
- mongo wrapper update #267, #268
- atari example update #270
- code saving #277
Catalyst 19.07
Catalyst 19.06.3
Catalyst 19.06 -> 19.06.3
Overall
DL
- DataParallel for fp16 apex #213
- cpu tracing support #217
- config dump fix #221
- batch2device fix #214
- IterationCheckpointCallback & batch-size fix #228
- fp16 flag fix #227
RL
- Max length removal #211
- RL refactoring #212
- training seeds update #220
- Multi-headed Value functions support #198
Breaking changes
- UtilsFactory replaced with from catalyst.dl import utils
- LossCallback replaced with CriterionCallback