Hyperparameters 📚 #2

ivankunyankin · 2021-07-05T13:21:42Z

A short description of the hyperparameters in config.yml:

max_length - maximum file length in seconds. Longer files are omitted. Used to avoid out-of-memory (OOM) errors. Note that this also reduces the number of samples available for training and validation.
weight_decay - regularization parameter. Can be increased when using a constant learning rate.
use_onecyclelr - use PyTorch's implementation of the 1cycle learning rate policy (OneCycleLR).
spec_params - a set of parameters for generating mel spectrograms. Described here. When changing these, consider adjusting the masking parameters as well.
normalize - apply normalization. By default, normalization is applied to each mel channel separately, using the stats saved in assests/stats.npy. The saved stats were calculated from LibriTTS; if training on different data, they should be recalculated. More details here.
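
To illustrate how max_length trades memory safety against dataset size, here is a minimal sketch of the filtering step. The function name filter_by_max_length and the durations dictionary are hypothetical and are not taken from the repository; the actual data loader may do this differently.

```python
def filter_by_max_length(durations, max_length):
    """Keep only files whose duration in seconds does not exceed max_length.

    `durations` maps a file path to its length in seconds (hypothetical layout).
    """
    return {path: sec for path, sec in durations.items() if sec <= max_length}


# Hypothetical file durations in seconds.
durations = {"a.wav": 3.2, "b.wav": 17.5, "c.wav": 9.9}

# With max_length = 10.0 the 17.5 s file is omitted, so one sample is lost.
kept = filter_by_max_length(durations, max_length=10.0)
print(sorted(kept))
```

Lowering max_length shrinks the largest possible batch, but every omitted file is also one fewer training or validation sample.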

speed_perturbation - the perturbation rate is chosen randomly from the range 1 ± the specified value.
chunk_size - size of each spectrogram segment (along the time axis) to apply augmentation to. Set to -1 to apply PyTorch's original masking. Examples are shown below.
freq_masking, time_masking - the maximum width of the masking band. More details here.

Original PyTorch functions applied. Less aggressive augmentation.

Custom function applied. More aggressive augmentation.
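
The difference between the two modes can be sketched roughly as follows. This is a pure-Python illustration under assumptions, not the repository's code: the function mask_chunks is hypothetical, the spectrogram is a plain list of mel rows, masked values are set to 0.0, and one time band plus one frequency band is drawn per chunk. With chunk_size set to -1 the whole spectrogram is treated as a single chunk, which approximates applying one mask of each kind to the full spectrogram.

```python
import random


def mask_chunks(spec, chunk_size, time_masking, freq_masking, rng=random):
    """Return a masked copy of `spec` (a list of mel rows, each a list of frames).

    Hypothetical sketch: one time band and one frequency band per chunk.
    """
    n_mels, n_frames = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input stays untouched
    # chunk_size == -1 means a single chunk spanning the whole time axis.
    step = n_frames if chunk_size == -1 else chunk_size
    for start in range(0, n_frames, step):
        end = min(start + step, n_frames)
        # Time mask: zero up to `time_masking` consecutive frames in this chunk.
        t_width = rng.randint(0, min(time_masking, end - start))
        t0 = rng.randint(start, end - t_width)
        for row in out:
            for t in range(t0, t0 + t_width):
                row[t] = 0.0
        # Frequency mask: zero up to `freq_masking` consecutive mel channels,
        # but only within this chunk's frames.
        f_width = rng.randint(0, min(freq_masking, n_mels))
        f0 = rng.randint(0, n_mels - f_width)
        for m in range(f0, f0 + f_width):
            for t in range(start, end):
                out[m][t] = 0.0
    return out


# Toy spectrogram: 4 mel channels, 12 frames, all ones.
spec = [[1.0] * 12 for _ in range(4)]
masked = mask_chunks(spec, chunk_size=4, time_masking=2, freq_masking=1,
                     rng=random.Random(0))
```

Because every chunk gets its own bands, a smaller chunk_size produces more masked regions per spectrogram, which is why the chunked variant is the more aggressive augmentation.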

ivankunyankin mentioned this issue Jul 6, 2021

Things that are different compared to the article #3

Open

ivankunyankin added the documentation label Jul 6, 2021