This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

Running the GPT2 example raises RuntimeError: Could not find 'SLURM_PROCID'. Is a SLURM environment required? #161

Open
ZXM1063694570 opened this issue Jul 26, 2022 · 1 comment

Comments

@ZXM1063694570

🐛 Describe the bug

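I used the 0.1.7 image provided on Docker Hub, but running the GPT example fails with RuntimeError: Could not find 'SLURM_PROCID'. The same happens with the 0.1.8 image.
[two screenshots, images not preserved]
This is my run script:
[screenshot, image not preserved]
Switching gpt2_configs to other configurations produces the same problem.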

Environment

docker pull hpcaitech/colossalai:0.1.7 & 0.1.8
pip install transformers
pip install titans

8× A100 GPUs

@feifeibear
Contributor

Add --from_torch to the args of your launch command. Without it, the script launches via SLURM by default.
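To illustrate why the missing flag leads to this error, here is a minimal sketch of the kind of dispatch a training script might do. It is not ColossalAI's actual code: `choose_launcher` is a hypothetical helper, and the only names taken from this thread are the `--from_torch` flag, the `SLURM_PROCID` variable, and the error message.

```python
import argparse
import os


def choose_launcher(argv, env=None):
    """Return "torch" or "slurm" depending on the --from_torch flag.

    Hypothetical helper for illustration only; not part of ColossalAI.
    """
    env = os.environ if env is None else env

    parser = argparse.ArgumentParser()
    # The flag defaults to False, so the SLURM path is taken unless it is passed.
    parser.add_argument("--from_torch", action="store_true", default=False)
    # Ignore any other arguments (e.g. a config path) in this sketch.
    args, _unknown = parser.parse_known_args(argv)

    if args.from_torch:
        # Rank information is expected from the PyTorch launcher instead.
        return "torch"

    # The SLURM path needs variables that only exist inside a SLURM job.
    if "SLURM_PROCID" not in env:
        raise RuntimeError("Could not find 'SLURM_PROCID'")
    return "slurm"
```

Under these assumptions, running inside a plain Docker container without the flag takes the SLURM branch and fails, while passing `--from_torch` avoids the SLURM lookup entirely, so no SLURM installation is needed.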
