Is there a bug in the syncnet training code? #113

Open
iloveOREO opened this issue Dec 26, 2024 · 3 comments

Comments

@iloveOREO

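In the data-loading code https://github.com/anliyuan/Ultralight-Digital-Human/blob/762e3b6de9e82b6927ce7cf414dcef67dd533ff3/syncnet.py#L84C5-L95C31
`y` is set to 1 every time and the `ex` img is never used. Doesn't that mean training only ever sees in-sync data? The model could then just blindly output two identical vectors, and the loss computed afterwards would be tiny.
During training the BCELoss quickly drops to 0.000xxx.
That doesn't seem right.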
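
To make the concern concrete, here is a minimal sketch of a sampler that returns both in-sync and out-of-sync pairs, so that `y` is not always 1. The name `sample_pair` and its parameters are hypothetical and not taken from the repository; a real `__getitem__` would go on to load the image and audio features for the returned indices.

```python
import random

def sample_pair(idx, num_frames, rng=random, p_positive=0.5):
    """Return (img_idx, audio_idx, y) for one training example.

    Hypothetical helper, not the repository's code.
    y = 1.0: frame and audio share the same index (in sync).
    y = 0.0: the frame comes from a different, randomly chosen index.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames to draw a negative")
    if rng.random() < p_positive:
        return idx, idx, 1.0
    # Draw from the other num_frames - 1 indices so that ex_idx != idx.
    ex_idx = rng.randrange(num_frames - 1)
    if ex_idx >= idx:
        ex_idx += 1
    return ex_idx, idx, 0.0
```

With roughly half the labels equal to 0, a model that always emits identical vectors can no longer drive the loss toward zero.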

@drakitLiu

This training method is not right.
You can refer to the lip-sync discriminator approach in wav2lip!
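
For reference, the wav2lip sync discriminator scores a pair by the cosine similarity between its audio and face embeddings and trains that score with binary cross-entropy against the sync label. The sketch below is a plain-Python stand-in for what would normally be `torch.nn.functional.cosine_similarity` followed by `torch.nn.BCELoss`; the function names and the clamping constant `eps` are assumptions made for illustration.

```python
import math

def cosine_similarity(a, v, eps=1e-8):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, v))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / max(norm_a * norm_v, eps)

def sync_loss(audio_emb, video_emb, y, eps=1e-7):
    """Binary cross-entropy between the cosine similarity and label y."""
    d = cosine_similarity(audio_emb, video_emb)
    # Clamp into (0, 1) so both logarithms are defined (illustrative choice).
    d = min(max(d, eps), 1.0 - eps)
    return -(y * math.log(d) + (1.0 - y) * math.log(1.0 - d))

# A collapsed model that emits the same vector for audio and video:
same = [0.5, 0.5, 0.5]
loss_pos = sync_loss(same, same, 1.0)  # tiny when the label is 1
loss_neg = sync_loss(same, same, 0.0)  # large when the label is 0
```

The two values at the end show why negatives matter: identical embeddings are almost free when every label is 1, and heavily penalized as soon as out-of-sync pairs with label 0 appear.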

@xiao-keeplearning

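Feeding only a single frame as the image input when training syncnet is also unreasonable: audio features 16 frames long are paired with just 1 image frame.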
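
One way to address this is to pair a short window of consecutive frames with the audio window covering the same time span. The sketch below is only an illustration: `aligned_windows` is a hypothetical helper, and the defaults (5 frames per window, 25 fps, 80 audio steps per second) are assumptions borrowed from wav2lip-style settings, not values read from this repository, whose audio features may use a different rate.

```python
def aligned_windows(start_frame, num_frames, frames_per_window=5,
                    fps=25.0, audio_steps_per_second=80.0):
    """Return (frame_indices, (audio_start, audio_end)) for one window.

    Returns None if the frame window would run past the clip.
    Hypothetical helper; all default values are assumptions.
    """
    end_frame = start_frame + frames_per_window
    if start_frame < 0 or end_frame > num_frames:
        return None
    frame_indices = list(range(start_frame, end_frame))
    # Convert the window's start time and duration into audio steps.
    audio_start = int(audio_steps_per_second * start_frame / fps)
    audio_len = round(audio_steps_per_second * frames_per_window / fps)
    return frame_indices, (audio_start, audio_start + audio_len)
```

Under these assumed defaults, 5 frames span 0.2 s, which corresponds to 16 audio steps, so the image and audio inputs cover the same interval.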

@feipengheart

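As for the `ex` img not being used: could it be that a randomly drawn audio feature is not necessarily a negative sample? Its mouth shape may be similar to the positive sample's, which would actually make the results worse, so the author left it out.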
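
A common way to reduce this risk is to draw negatives only from frames that are at least some minimum distance away in time. The sketch below assumes a hypothetical helper `sample_negative_index` and an arbitrary `min_gap` of 10 frames; temporal distance does not guarantee a different mouth shape, it only makes near-duplicate negatives less likely.

```python
import random

def sample_negative_index(idx, num_frames, min_gap=10, rng=random):
    """Pick a frame index at least min_gap frames away from idx.

    Returns None when the clip is too short to contain such a frame.
    Hypothetical helper; min_gap=10 is an arbitrary illustrative value.
    """
    candidates = [j for j in range(num_frames) if abs(j - idx) >= min_gap]
    if not candidates:
        return None
    return rng.choice(candidates)
```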
