Release v0.40

@takuseno takuseno released this 26 Nov 15:28
· 1082 commits to master since this release


  • Support the discrete version of Soft Actor-Critic
  • fit_online now takes an n_steps argument instead of n_epochs, so that the results in the papers can be reproduced exactly.
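Papers typically report online results against a fixed number of environment steps, and episodes vary in length, so a step count is the more direct budget. The sketch below illustrates a step-budgeted interaction loop with a toy environment in plain Python. `ToyEnv` and `run_online` are hypothetical names for illustration only and are not part of d3rlpy.

```python
import random


class ToyEnv:
    """Hypothetical environment whose episodes have random lengths."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.remaining = 0

    def reset(self):
        # each episode lasts between 5 and 20 steps
        self.remaining = self.rng.randint(5, 20)
        return 0.0

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        return 0.0, 1.0, done  # observation, reward, done


def run_online(env, n_steps):
    """Interact for exactly n_steps environment steps, across episodes."""
    episodes = 0
    env.reset()
    for _ in range(n_steps):
        _, _, done = env.step(action=0)
        if done:
            episodes += 1
            env.reset()
    return episodes


episodes = run_online(ToyEnv(), n_steps=1000)
print(f"1000 steps covered {episodes} complete episodes")
```

The number of completed episodes depends on the episode lengths, while the number of environment steps is fixed by the argument.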


d3rlpy provides more flexible controls for optimizer configuration via OptimizerFactory.

from d3rlpy.optimizers import AdamFactory
from d3rlpy.algos import DQN

dqn = DQN(optim_factory=AdamFactory(weight_decay=1e-4))

See more at .
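A factory is a natural fit here because the optimizer can only be built once the parameters it will update exist, while the hyperparameters are known much earlier. The following is a minimal sketch of that pattern in plain Python. `ToySGD` and `ToySGDFactory` are hypothetical stand-ins, not d3rlpy or PyTorch classes.

```python
class ToySGD:
    """Hypothetical optimizer: plain SGD with L2 weight decay on a list of floats."""

    def __init__(self, params, lr, weight_decay):
        self.params = params
        self.lr = lr
        self.weight_decay = weight_decay

    def step(self, grads):
        for i, g in enumerate(grads):
            # weight decay adds weight_decay * param to the gradient
            self.params[i] -= self.lr * (g + self.weight_decay * self.params[i])


class ToySGDFactory:
    """Stores hyperparameters now, builds the optimizer once parameters exist."""

    def __init__(self, lr=0.1, weight_decay=0.0):
        self.lr = lr
        self.weight_decay = weight_decay

    def create(self, params):
        return ToySGD(params, lr=self.lr, weight_decay=self.weight_decay)


# configured before any parameters exist
factory = ToySGDFactory(lr=0.1, weight_decay=1e-4)

# parameters are created later, e.g. when the model is built
params = [1.0, -2.0]
optim = factory.create(params)
optim.step(grads=[0.5, 0.5])
print(params)
```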


d3rlpy provides more flexible controls for the neural network architecture via EncoderFactory.

from d3rlpy.algos import DQN
from d3rlpy.encoders import VectorEncoderFactory

# encoder factory
encoder_factory = VectorEncoderFactory(hidden_units=[300, 400], activation='tanh')

# set EncoderFactory
dqn = DQN(encoder_factory=encoder_factory)

You can also build your own encoders.

import torch
import torch.nn as nn

from d3rlpy.algos import DQN
from d3rlpy.encoders import EncoderFactory

# your own neural network
class CustomEncoder(nn.Module):
    def __init__(self, observation_shape, feature_size):
        super().__init__()  # required before registering submodules
        self.feature_size = feature_size
        self.fc1 = nn.Linear(observation_shape[0], 64)
        self.fc2 = nn.Linear(64, feature_size)

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        h = torch.relu(self.fc2(h))
        return h

    def get_feature_size(self):
        return self.feature_size

# your own encoder factory
class CustomEncoderFactory(EncoderFactory):
    TYPE = 'custom' # this is necessary

    def __init__(self, feature_size):
        self.feature_size = feature_size

    def create(self, observation_shape, action_size=None, discrete_action=False):
        return CustomEncoder(observation_shape, self.feature_size)

    def get_params(self, deep=False):
        return {
            'feature_size': self.feature_size
        }

dqn = DQN(encoder_factory=CustomEncoderFactory(feature_size=64))

See more at .
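A `TYPE` string and a `get_params` method are the kind of hooks that let a factory be saved as plain data and rebuilt later. The sketch below shows such a round trip through JSON in plain Python. `ToyEncoderFactory`, `REGISTRY`, `to_json`, and `from_json` are assumptions made for illustration and do not describe d3rlpy's internals.

```python
import json


class ToyEncoderFactory:
    TYPE = 'toy'  # unique name used to look the class up again

    def __init__(self, feature_size):
        self.feature_size = feature_size

    def get_params(self, deep=False):
        return {'feature_size': self.feature_size}


# hypothetical registry mapping TYPE strings to factory classes
REGISTRY = {ToyEncoderFactory.TYPE: ToyEncoderFactory}


def to_json(factory):
    """Serialize the factory as its type name plus constructor arguments."""
    return json.dumps({'type': factory.TYPE, 'params': factory.get_params()})


def from_json(text):
    """Rebuild a factory by looking up its class and passing the saved arguments."""
    data = json.loads(text)
    return REGISTRY[data['type']](**data['params'])


saved = to_json(ToyEncoderFactory(feature_size=64))
restored = from_json(saved)
print(saved, restored.feature_size)
```

Under this design, `get_params` has to return exactly the constructor arguments, otherwise the rebuilt factory would differ from the saved one.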

Stable Baselines 3 wrapper


  • fix the memory leak problem at fit_online.
    • You can now train online algorithms on image observations with a large replay buffer.
  • fix preprocessing at CQL.
  • fix ColorJitter augmentation.



  • From this version, d3rlpy officially supports Windows.
  • Binary packages for each platform are built in GitHub Actions and uploaded, so you no longer need to install Cython to install this package from PyPI.


  • As of the previous version, d3rlpy is available on conda-forge.