
Conv-TasNet

A state-of-the-art time-domain network for speech separation that also performs well on speech enhancement and music separation.

This implementation reaches SDR = 16.7 dB (15.0 dB in the paper) on the WSJ0 two-speaker (WSJ0-2mix) dataset.

You can find the original paper, Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation. The test results are as follows: SDR = 11.7 dB.
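The exact SDR computation lives in trainer.py and test.py; as a point of reference only, the sketch below shows one common scale-invariant definition of the metric (SI-SDR). It is illustrative and not taken from this repository's code.

```python
# Minimal sketch of a scale-invariant SDR (SI-SDR), one common way this
# metric is computed for Conv-TasNet-style evaluation. Illustrative only;
# the repository's own computation is in trainer.py / test.py.
import numpy as np

def si_sdr(estimate: np.ndarray, reference: np.ndarray) -> float:
    """Scale-invariant SDR in dB between a separated signal and its reference."""
    # Remove the mean so the metric ignores DC offsets.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the target component.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + 1e-8)
    target = scale * reference
    noise = estimate - target
    return 10.0 * np.log10(np.sum(target ** 2) / (np.sum(noise ** 2) + 1e-8))
```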

Files

  • dataset.py: reads the data into the model
  • Tasnet_model.py: the forward network
  • Tasnet_train.py: the main script to run
  • trainer.py: computes the loss and handles training and testing
  • utils.py: processes the raw audio and other utility functions
  • train.yaml: all the parameters used in the model (see the loading sketch after this list)
  • test.py: separates the mixed audio and computes SDR
  • loss/convTasnet_batch_12.file: the loss for each epoch during the training stage
  • loss/test_SDR.file: the SDR on the test set (every 10 samples)
  • log/: the loss curves for TensorBoard
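The parameter names inside train.yaml are not listed here; as a hedged illustration, such a config is typically read with PyYAML as below. The keys shown ("batch_size", "temp", "model_path") are assumptions based on the notes in this README, so check train.yaml for the actual structure.

```python
# Illustrative only: reading a YAML config such as train.yaml with PyYAML.
# Key names below are assumptions; consult train.yaml for the real ones.
import yaml

with open("train.yaml") as f:
    config = yaml.safe_load(f)

# Hypothetical accesses; the real structure may differ.
batch_size = config.get("batch_size", 12)
model_path = config.get("temp", {}).get("model_path", "")
print(batch_size, model_path)
```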

Training stage:

  • from the beginning: remove the line with "trainer.rerun" in Tasnet_train.py and use "trainer.run" instead
  • from a trained model: remove the line with "trainer.run" in Tasnet_train.py, use "trainer.rerun" instead, and change "model_path" under temp in train.yaml (see the sketch below)
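As a rough sketch of the two modes above, the switch amounts to calling either trainer.run() or trainer.rerun(); everything else below is a hypothetical stand-in, not the actual code in Tasnet_train.py or trainer.py.

```python
# Rough sketch of the two training entry points described above. Only the
# run()/rerun() method names come from this README; the stub class and the
# resume flag are stand-ins for the real code in Tasnet_train.py / trainer.py.
class _TrainerStub:
    """Stand-in for the repository's trainer; mirrors run()/rerun() only."""
    def run(self) -> None:
        print("training from scratch")

    def rerun(self) -> None:
        print("resuming from the checkpoint at model_path in train.yaml")

def main(resume: bool = False) -> None:
    trainer = _TrainerStub()   # the real script builds this from train.yaml
    if resume:
        trainer.rerun()        # continue training from a saved model
    else:
        trainer.run()          # train from the beginning

if __name__ == "__main__":
    main(resume=False)
```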
