A short description of hyperparameters from config.yml
Training parameters
max_length - maximum file length in seconds. Longer files are omitted. Used to avoid OOM (out-of-memory) errors. Also reduces the number of data samples used for training and validation.
weight_decay - regularization parameter. Can be increased when using a constant learning rate.
use_onecyclelr - use PyTorch's implementation of the one-cycle learning rate policy (OneCycleLR).
spec_params - a set of parameters for generating mel spectrograms. Described here. When changing these, consider adjusting the masking parameters as well.
normalize - apply normalization. By default, the stats saved in assests/stats.npy are applied to each mel channel separately. The saved stats were calculated from LibriTTS. If training on different data, they should be recalculated. More details here.
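As a rough illustration of what recalculating the stats for a different dataset could look like, here is a minimal NumPy sketch of per-channel normalization. The function names, the (n_mels, time) array layout, and the random stand-in data are assumptions made for this example. The layout of the stats file in the repository is not described here, so the sketch keeps the stats in memory.

```python
import numpy as np


def compute_channel_stats(mels):
    """Return per-mel-channel mean and std over a list of (n_mels, time) arrays."""
    # Concatenate along time so every frame of every file contributes equally.
    frames = np.concatenate(mels, axis=1)
    return frames.mean(axis=1), frames.std(axis=1)


def normalize(mel, mean, std, eps=1e-8):
    """Normalize each mel channel (row) with its own mean and std."""
    return (mel - mean[:, None]) / (std[:, None] + eps)


# Stand-in data (hypothetical): three "files" with 80 mel channels each.
rng = np.random.default_rng(0)
mels = [rng.normal(loc=-5.0, scale=2.0, size=(80, t)) for t in (120, 200, 90)]

mean, std = compute_channel_stats(mels)
normalized = normalize(mels[0], mean, std)
```

After this step each mel channel has roughly zero mean and unit variance across the dataset the stats were computed from, which is why stats from one corpus may fit another corpus poorly.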
Augmentation parameters
speed_perturbation - randomly chooses the perturbation rate from the range 1 ± the specified value.
chunk_size - size of each spectrogram segment (along the time axis) that augmentation is applied to. Set to -1 to apply PyTorch's original masking. Examples are shown below.
freq_masking, time_masking - the maximum width of the masking band. More details here.
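To make the interaction between chunk_size and the masking widths concrete, here is a minimal NumPy sketch. It is not the repository's custom function, which is not shown here. It assumes that each chunk along the time axis receives one frequency band and one time band of random width up to the configured maximum, and that chunk_size = -1 means a single pair of bands for the whole spectrogram. All names and values in it are hypothetical.

```python
import numpy as np


def apply_mask(spec, max_width, axis, rng):
    """Zero one randomly placed band of width in [0, max_width] along `axis`."""
    size = spec.shape[axis]
    width = int(rng.integers(0, min(max_width, size) + 1))
    start = int(rng.integers(0, size - width + 1))
    index = [slice(None)] * spec.ndim
    index[axis] = slice(start, start + width)
    spec[tuple(index)] = 0.0  # modifies `spec` in place


def mask_spectrogram(spec, chunk_size, freq_masking, time_masking, rng):
    """Mask a non-empty (n_mels, time) spectrogram, once overall or once per chunk."""
    out = spec.copy()
    n_frames = out.shape[1]
    # chunk_size == -1: treat the whole spectrogram as a single chunk.
    step = n_frames if chunk_size == -1 else chunk_size
    for start in range(0, n_frames, step):
        chunk = out[:, start:start + step]  # a view, so masks land in `out`
        apply_mask(chunk, freq_masking, axis=0, rng=rng)
        apply_mask(chunk, time_masking, axis=1, rng=rng)
    return out


# Illustrative values only; they are not taken from config.yml.
rng = np.random.default_rng(0)
spec = np.ones((80, 400))
whole = mask_spectrogram(spec, -1, 27, 20, rng)
chunked = mask_spectrogram(spec, 100, 27, 20, rng)
```

Under these assumptions, a smaller chunk_size produces more masked bands per utterance, so the total masked area tends to grow as chunk_size shrinks.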
Augmentation examples
Original PyTorch functions applied. Less aggressive augmentation.
Custom function applied. More aggressive augmentation.