This repository has been archived by the owner on Oct 19, 2022. It is now read-only.

Training model fails to converge #6

Closed
yjzst opened this issue Jul 9, 2020 · 21 comments

Comments

@yjzst commented Jul 9, 2020

I trained MobileFaceNet, but the results are poor. I split the dataset exactly the way you described, yet the loss only converges to around 4 and I can't reach the performance of the 24.pth you provided. Is there anything I might have missed?

@siriusdemon (Owner)

How long did you train for?

@yjzst (Author) commented Jul 9, 2020 via email

@siriusdemon (Owner)

You could try weight initialization, e.g. Kaiming or Xavier. 150 epochs is definitely enough.
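
For illustration, here is a minimal PyTorch sketch of Kaiming/Xavier initialization as suggested above; the `init_weights` helper and the set of layer types covered are assumptions for illustration, not the repository's actual code:

```python
import torch.nn as nn

def init_weights(model: nn.Module) -> None:
    # Kaiming init for conv layers, Xavier for linear layers,
    # and the usual ones/zeros for BatchNorm affine parameters.
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)
        elif isinstance(m, nn.Linear):
            nn.init.xavier_normal_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)
        elif isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            nn.init.ones_(m.weight)
            nn.init.zeros_(m.bias)

# Hypothetical usage: call once before training starts.
# model = MobileFaceNet()
# init_weights(model)
```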

@yjzst (Author) commented Jul 9, 2020 via email

@yjzst (Author) commented Jul 9, 2020 via email

@siriusdemon (Owner)

I haven't changed the code in this repository since then. As a point of reference, how high is your model's accuracy now?

@yjzst (Author) commented Jul 10, 2020 via email

@siriusdemon (Owner)

Did you change any of the training parameters? I still don't know where the problem is.

@siriusdemon changed the title from "Model performance issue" to "Training model fails to converge" on Jul 10, 2020
@yjzst (Author) commented Jul 10, 2020 via email

@siriusdemon (Owner)

There is something odd about this problem. The model doesn't use explicit weight initialization, so the issue could lie there, but judging from your feedback, it doesn't seem to. You could also try increasing the batch_size. I'm retraining with the default configuration now and will check later whether there is a problem; let's also see what other users report.

@siriusdemon pinned this issue Jul 10, 2020
@yjzst (Author) commented Jul 10, 2020 via email

@Linsongrong

Hi, I'm also running into the convergence problem. I trained for 150 epochs; the loss started at 11 and keeps oscillating between 8 and 9 without converging. Could it be the learning rate? The lr=0.1 in your code might be too large.
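
As an aside, a minimal sketch of one way to test that hypothesis, assuming a standard PyTorch SGD setup (the optimizer settings, milestones, and the `train_one_epoch` helper are illustrative assumptions, not the repository's defaults): either start from a smaller lr, or keep 0.1 and decay it on a step schedule.

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR

# `model`, `num_epochs`, and `train_one_epoch` are hypothetical placeholders.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = MultiStepLR(optimizer, milestones=[40, 80, 120], gamma=0.1)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)  # one pass over the training set
    scheduler.step()                   # lr: 0.1 -> 0.01 -> 0.001 -> 0.0001
```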

@yjzst (Author) commented Jul 10, 2020 via email

@yjzst (Author) commented Jul 10, 2020 via email

@Linsongrong commented Jul 10, 2020 via email

@siriusdemon (Owner) commented Jul 10, 2020

Training with the default configuration, I already get roughly 80% accuracy by the end of epoch 0.

Test Model: checkpoints/0.pth
Accuracy: 0.829
Threshold: 0.481

Test Model: checkpoints/3.pth
Accuracy: 0.884
Threshold: 0.451

Test Model: checkpoints/5.pth
Accuracy: 0.912
Threshold: 0.400

Loss at epoch 6:

Epoch 6/150, Loss: 10.049524307250977

The 24.pth provided in the project is the weight file after 24 epochs of training.

@yjzst (Author) commented Jul 10, 2020 via email

@siriusdemon (Owner)

If you have enough GPU memory during training, I'd suggest increasing the batch_size and training for about 20 epochs. If you want better results, I'd suggest (see the sketch after this list):

  • use a stronger model
  • train for longer (batch_size and the learning rate need to be weighed together)
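
For illustration, one common heuristic for weighing batch_size against the learning rate is the linear scaling rule: scale the learning rate in proportion to the batch size. A minimal sketch, assuming a baseline of batch_size=128 at lr=0.1 (these baseline values and the `model` variable are assumptions, not the project's confirmed defaults):

```python
import torch

base_batch_size = 128   # assumed reference configuration
base_lr = 0.1           # assumed reference learning rate

batch_size = 512        # larger batch, given enough GPU memory
lr = base_lr * batch_size / base_batch_size   # linear scaling rule -> 0.4

# `model` is a hypothetical placeholder for the network being trained.
optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                            momentum=0.9, weight_decay=5e-4)
```

In practice, a short warmup period is often paired with the scaled learning rate when the batch gets large.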

@yjzst (Author) commented Jul 10, 2020 via email

@Comedian1926

I've run into a similar problem. After trying bs=128, 256, and 512 with learning rates of 0.1, 0.01, and 0.001 respectively, the best accuracy I got on LFW is 94.5.

@yjzst (Author) commented Aug 7, 2020 via email
