
If training hangs at prog_bar = tqdm(enumerate(train_data_loader)), it is because filelists/train.txt, filelists/val.txt, and filelists/test.txt are not set up correctly, which sends the Dataset into an infinite loop #6

Open
yuanmaitian opened this issue Jul 25, 2024 · 6 comments

Comments

@yuanmaitian

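If training hangs at prog_bar = tqdm(enumerate(train_data_loader)), it is because filelists/train.txt, filelists/val.txt, and filelists/test.txt are not set up correctly, which sends the Dataset into an infinite loop.
I am now running into this as well. I wrote the full paths of the training and validation samples into train.txt and val.txt, like this: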
/root/workspace/processed_data/train/57f7d1d11ca308f3f24728cb5c930574_15.78_16.16
/root/workspace/processed_data/train/4c3df35b8ffdbd064ce3d00288c0d26a_2.44_2.71
/root/workspace/processed_data/train/f5a9b275800eeaa1561b72f5b412908c_2.88_3.3

/root/workspace/processed_data/val/01b7a9520196fad67d4670fdf85cf59d_1.08_1.32
/root/workspace/processed_data/val/8b57ece4a8a0a07395ec833d712497f6_4.68_5.22
/root/workspace/processed_data/val/ade95c1b7497724a5201de58c7304f59_1.6_1.94
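I have read that the original code falls into an infinite loop when it processes the dataset. How should I modify it? Any advice would be appreciated.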
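A quick way to rule out a filelist problem is to check that every entry resolves to an existing sample directory. The sketch below assumes one sample directory per line, as in the listing above; `check_filelist` and its `data_root` parameter are hypothetical names for illustration, not part of this repository.

```python
from pathlib import Path


def check_filelist(filelist_path, data_root=None):
    """Split filelist entries into those that exist as directories and those that do not.

    Assumption: each non-empty line names one sample directory, either as a
    full path or as a path relative to data_root.
    """
    valid, missing = [], []
    for raw in Path(filelist_path).read_text(encoding="utf-8").splitlines():
        entry = raw.strip()
        if not entry:
            continue  # ignore blank separator lines
        # Joining onto an absolute entry keeps the absolute entry unchanged.
        path = Path(entry) if data_root is None else Path(data_root) / entry
        (valid if path.is_dir() else missing).append(entry)
    return valid, missing
```

Calling `check_filelist("filelists/train.txt")` and printing the `missing` list would show which lines the Dataset has no chance of loading.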

@yuanmaitian
Author

One more question: ever since I fixed the bug that raised an IndexError, the progress bar at prog_bar = tqdm(enumerate(train_data_loader)) has not moved at all.

@bjfrbjx
Owner

bjfrbjx commented Jul 25, 2024

It will be very slow if you are running on CPU.

@bjfrbjx
Owner

bjfrbjx commented Jul 25, 2024

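Set a breakpoint at every continue inside __getitem__, then run under a debugger and see which one is being hit.
The infinite loop happens because every pass through the while True loop ends in a continue and never reaches the return. The original code is written this way so that it automatically skips samples that were not preprocessed properly.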
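The skip-and-retry pattern described above can be given an upper bound so that a broken filelist raises an error instead of hanging. This is a minimal sketch, not the repository's code: `BoundedRetryDataset`, `load_fn`, and `max_tries` are hypothetical names, and `load_fn` is assumed to return None for a sample that should be skipped.

```python
import random


class BoundedRetryDataset:
    """Map-style dataset that skips bad samples but gives up after max_tries."""

    def __init__(self, items, load_fn, max_tries=50, seed=None):
        self.items = list(items)
        self.load_fn = load_fn      # hypothetical loader: sample, or None to skip
        self.max_tries = max_tries
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        skipped = []
        # A bounded loop replaces `while True`, so it always terminates.
        for _ in range(self.max_tries):
            item = self.items[idx]
            sample = self.load_fn(item)
            if sample is not None:
                return sample
            skipped.append(item)
            # Same idea as the `continue` branches: try another random sample.
            idx = self.rng.randrange(len(self.items))
        raise RuntimeError(
            f"no usable sample after {self.max_tries} tries; "
            f"last skipped: {skipped[-3:]}"
        )
```

The error message lists the most recently skipped entries, which points at the same information the breakpoints would reveal.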

@yuanmaitian
Author

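> It will be very slow if you are running on CPU.

Yes. After waiting an hour and a half, the progress bar finally moved. After 1000 steps the loss has dropped from 0.7764 at the start to 0.6998. Is that normal?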

@bjfrbjx
Owner

bjfrbjx commented Jul 25, 2024

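> > It will be very slow if you are running on CPU.
>
> Yes. After waiting an hour and a half, the progress bar finally moved. After 1000 steps the loss has dropped from 0.7764 at the start to 0.6998. Is that normal?

Yes, that is normal. On GPU I ran for a little over a day, 400000 steps, and the loss got down to around 0.35.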

@yuanmaitian
Author

After 35000 steps the loss is now stuck at 0.694 and no longer decreasing. Is the LWR1000 dataset the problem?
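One reference point for reading that number, assuming the training loss is a binary cross-entropy (the thread does not say which loss is used): a model that outputs 0.5 for every sample scores ln 2, about 0.693. The sketch below just works through that arithmetic; `binary_cross_entropy` is a hand-written helper for illustration, not a library call.

```python
import math


def binary_cross_entropy(p, y):
    """Cross-entropy of predicted probability p against a 0/1 label y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# A predictor that always says 0.5 gets the same loss on both classes.
chance_loss = 0.5 * binary_cross_entropy(0.5, 1) + 0.5 * binary_cross_entropy(0.5, 0)
print(round(chance_loss, 4))  # 0.6931
```

If that assumption holds, a plateau at 0.694 would sit at chance level rather than at a converged value.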
