Hello, and congratulations on achieving such excellent results. While reproducing your code, I also used 8 RTX 3090 GPUs, but I still couldn't run with the batch size you used. Looking into it, I found that your multi-GPU training code uses data parallelism, whereas model parallelism is what is typically used to work around insufficient GPU memory in multi-GPU setups. Have you deployed model parallelism for this project? If so, could you please open-source that code, or perhaps we could discuss it? My email is [email protected]. Best regards.
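For context, here is a minimal sketch of the "model parallelism" being asked about, assuming a PyTorch setup (the layer sizes, two-GPU split, and class name are hypothetical illustrations, not code from this repository). Unlike data parallelism, which replicates the whole model on every GPU, this splits the model itself across devices so each GPU only holds part of the weights:

```python
import torch
import torch.nn as nn

class TwoStageModel(nn.Module):
    """Naive model parallelism: stage 1 lives on cuda:0, stage 2 on cuda:1.

    Hypothetical example; layer shapes are placeholders.
    """
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.stage2 = nn.Linear(4096, 10).to("cuda:1")

    def forward(self, x):
        x = self.stage1(x.to("cuda:0"))
        # Move the intermediate activations to the second GPU by hand.
        return self.stage2(x.to("cuda:1"))

model = TwoStageModel()
out = model(torch.randn(8, 1024))  # output tensor lives on cuda:1
loss = out.sum()
loss.backward()  # autograd traces the cross-device graph automatically
```

This halves per-GPU weight memory at the cost of idle time while one stage waits for the other, which is why pipeline schedules are usually layered on top in practice.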