Implementation of the paper: Yin, D., Luo, C., Xiong, Z., & Zeng, W. (2020, April). Phasen: A phase-and-harmonics-aware speech enhancement network. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 05, pp. 9458-9465). The dataset is available at https://datashare.ed.ac.uk/handle/10283/1942
- Run `bash ./phasen_torch/_1_preprocess.sh` to prepare the data.
- Set the "root_dir" parameter in phasen_torch/FLAGS.py to the root of the project, for example `root_dir = "/home/user/PHASEN-PYTORCH"`.
- Ensure `PARAM = PHASEN_009` is set on the last line of phasen_torch/FLAGS.py.
- Run `cp -r phasen_torch PHASEN_009` to create the code directory for the experiment config.
- Run `python -m PHASEN_009._2_train` to start training the experiment config "PHASEN_009".
- Run `python -m PHASEN_009._3_enhance_testsets` to compute the metrics for experiment "PHASEN_009". The last checkpoint is loaded by default. Alternatively, use `--ckpt` to specify the path of the checkpoint to load.
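As an illustration of the two FLAGS.py settings mentioned in the steps above, the end of the file might look roughly like the sketch below. This is an assumption, not the actual file contents: the class layout and the name `BaseConfig` are hypothetical, and the real phasen_torch/FLAGS.py contains many more options. The point is only that `root_dir` is a quoted string and that `PARAM` refers to the experiment config by name.

```python
# Hypothetical excerpt; the real phasen_torch/FLAGS.py may be organised differently.
class BaseConfig:  # hypothetical base class name
    # Path to the project root (example value from the steps above).
    root_dir = "/home/user/PHASEN-PYTORCH"


class PHASEN_009(BaseConfig):
    # Experiment-specific overrides would go here.
    pass


# Last line of the file: selects the active experiment config.
PARAM = PHASEN_009
```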
See "phasen_torch/_1_preprocess.sh", "phasen_torch/_2_train.py" and "phasen_torch/_3_enhance_testsets.py" for details.
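The command sequence described above can be collected in one place. The following is a minimal sketch, not part of the repository: `build_commands` is a hypothetical helper that only assembles and prints the commands, so the FLAGS.py edits still have to be made by hand before running them. The `-r` flag is placed before the operands for portability.

```python
import shlex


def build_commands(exp_name="PHASEN_009", ckpt=None):
    """Return the shell commands for one experiment as argument lists."""
    commands = [
        # 1. Prepare the data.
        ["bash", "./phasen_torch/_1_preprocess.sh"],
        # 2. Create the code directory for the experiment config.
        ["cp", "-r", "phasen_torch", exp_name],
        # 3. Train the experiment.
        ["python", "-m", f"{exp_name}._2_train"],
    ]
    # 4. Enhance the test sets and compute metrics.
    enhance = ["python", "-m", f"{exp_name}._3_enhance_testsets"]
    if ckpt is not None:
        # Optional: load a specific checkpoint instead of the last one.
        enhance += ["--ckpt", ckpt]
    commands.append(enhance)
    return commands


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for command in build_commands():
        print(shlex.join(command))
```

Each printed line can be pasted into a shell from the project root. Passing `ckpt="..."` appends the `--ckpt` option to the last command.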
The code largely reproduces the performance reported in the PHASEN paper (experiment ID: PHASEN_009). The experimental results are as follows.
Name | Csig | Cbak | Covl | PESQ | SegSNR | LSD | ESTOI(%) | other | SNR |
---|---|---|---|---|---|---|---|---|---|
noisy | 3.357 | 2.453 | 2.649 | 1.994 | 1.710 | 8.253 | 78.67 | | |
PHASEN (torch) | | | | | | | | | |
PHASEN_001 (ckpt36) | 4.046 | 3.477 | 3.439 | 2.816 | 10.397 | | | stft+stft | |
PHASEN_001 (ckpt36 noisy_phase) | 4.031 | 3.432 | 3.419 | 2.796 | 9.888 | | | | |
PHASEN_002 (ckpt36) | 4.052 | 3.469 | 3.464 | 2.877 | 10.113 | | | remse+cos | |
PHASEN_002 (ckpt36 noisy_phase) | 4.162 | 3.401 | 3.489 | 2.790 | 9.106 | | | | |
PHASEN_003 (ckpt25) | 4.108 | 3.515 | 3.502 | 2.885 | 10.529 | | | mag+stft | |
PHASEN_004 (ckpt26) | 4.143 | 3.524 | 3.542 | 2.929 | 10.353 | | | mag+normStft | |
PHASEN_005 (ckpt32) | 4.160 | 3.528 | 3.558 | 2.934 | 10.269 | | | stft+normStft | |
PHASEN_fix005 (ckpt39) | 4.192 | 3.528 | 3.574 | 2.935 | 10.198 | | | | |
PHASEN_007 (ckpt37) | 4.185 | 3.539 | 3.588 | 2.961 | 10.141 | | | div normstft 1e-6 | |
PHASEN_008 (ckpt34) | 4.185 | 3.540 | 3.572 | 2.935 | 10.373 | | | div normstft 1e-5 | |
PHASEN_009 (ckpt41) | 4.212 | 3.557 | 3.613 | 2.988 | 10.287 | | | div normstft 1e-7 stft+normStft | |
PHASEN_009 (ckpt41 noisy_phase) | 4.181 | 3.500 | 3.570 | 2.935 | 9.777 | | | | |
PHASEN_009 (ckpt41 rtpghi_phase) | 3.624 | 2.378 | 2.922 | 2.278 | -1.660 | | | | |
PHASEN_009 (ckpt41 pghi_phase) | 4.135 | 2.798 | 3.539 | 2.934 | -1.184 | | | | |
PHASEN_010 (ckpt36) | 4.142 | 3.527 | 3.541 | 2.921 | 10.310 | | | div normstft 1e-7 stft+stft | |
PHASEN_011 (ckpt34) | 4.186 | 3.549 | 3.590 | 2.966 | 10.293 | | | div normstft 1e-7 mag+normstft | |
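To make the table easier to compare, the short sketch below ranks the experiments by a metric and computes the gain over the noisy input. It is an illustration only: the numbers are copied from the table above, and the helper names `best_by` and `gain_over` are hypothetical and not part of the repository.

```python
# Metric order for each tuple: Csig, Cbak, Covl, PESQ, SegSNR.
METRICS = ("Csig", "Cbak", "Covl", "PESQ", "SegSNR")

# Values copied from the results table.
RESULTS = {
    "noisy": (3.357, 2.453, 2.649, 1.994, 1.710),
    "PHASEN_001 (ckpt36)": (4.046, 3.477, 3.439, 2.816, 10.397),
    "PHASEN_002 (ckpt36)": (4.052, 3.469, 3.464, 2.877, 10.113),
    "PHASEN_003 (ckpt25)": (4.108, 3.515, 3.502, 2.885, 10.529),
    "PHASEN_004 (ckpt26)": (4.143, 3.524, 3.542, 2.929, 10.353),
    "PHASEN_005 (ckpt32)": (4.160, 3.528, 3.558, 2.934, 10.269),
    "PHASEN_fix005 (ckpt39)": (4.192, 3.528, 3.574, 2.935, 10.198),
    "PHASEN_007 (ckpt37)": (4.185, 3.539, 3.588, 2.961, 10.141),
    "PHASEN_008 (ckpt34)": (4.185, 3.540, 3.572, 2.935, 10.373),
    "PHASEN_009 (ckpt41)": (4.212, 3.557, 3.613, 2.988, 10.287),
    "PHASEN_009 (ckpt41 noisy_phase)": (4.181, 3.500, 3.570, 2.935, 9.777),
    "PHASEN_010 (ckpt36)": (4.142, 3.527, 3.541, 2.921, 10.310),
    "PHASEN_011 (ckpt34)": (4.186, 3.549, 3.590, 2.966, 10.293),
}


def best_by(metric):
    """Return the row name with the highest value of the given metric."""
    idx = METRICS.index(metric)
    return max(RESULTS, key=lambda name: RESULTS[name][idx])


def gain_over(baseline, name):
    """Return per-metric differences (name minus baseline), rounded to 3 decimals."""
    return {
        metric: round(value - base, 3)
        for metric, value, base in zip(METRICS, RESULTS[name], RESULTS[baseline])
    }


if __name__ == "__main__":
    print(best_by("PESQ"))
    print(gain_over("noisy", "PHASEN_009 (ckpt41)"))
```

With these values, PHASEN_009 (ckpt41) has the highest PESQ and improves on the noisy input by 0.994 PESQ, while PHASEN_003 (ckpt25) has the highest SegSNR.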