GPU memory, number of GPUs, and training time required for training? #9
Comments
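Actually, Qwen2VL-7B only needs its last layer opened up for training, plus a connector that aligns the Qwen2VL-7B feature dimension to Flux. Flux trains only a subset of its layers in each stage. In any given stage, all trainable parameters add up to roughly under 2B.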
How many days does it take? Are the A100s the 80G version?
May I ask why only a subset of FLUX's layers is trained in each stage?
@erwold When you open up part of FLUX's layers for training, are all parameters of a given layer trainable, or only certain MLPs?
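As the title says: these two models add up to about 20B parameters, so training presumably requires A100-80G cards? Could you share the number of training samples, the number of GPUs, and the training time?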
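For a rough sense of scale, the two figures in this thread (about 20B parameters in total, and under 2B trainable per stage) can be turned into a back-of-the-envelope memory estimate. This is only a sketch: the function name and the byte counts are assumptions, not numbers from the maintainers. It assumes bf16 weights and gradients, AdamW with two fp32 states, and an fp32 master copy of the trainable weights, and it ignores activations, the VAE and text encoders, and any sharding across GPUs.

```python
def estimate_training_memory_gb(total_params, trainable_params,
                                weight_bytes=2,         # assumed: bf16 weights
                                grad_bytes=2,           # assumed: bf16 gradients
                                optimizer_bytes=8,      # assumed: AdamW, two fp32 states
                                master_weight_bytes=4): # assumed: fp32 master copy
    """Rough memory estimate in GB (1e9 bytes), excluding activations."""
    # Every parameter, frozen or not, must be held in memory once.
    weights = total_params * weight_bytes
    # Only trainable parameters need gradients and optimizer state.
    per_trainable = grad_bytes + optimizer_bytes + master_weight_bytes
    trainable_overhead = trainable_params * per_trainable
    return {
        "weights_gb": weights / 1e9,
        "trainable_overhead_gb": trainable_overhead / 1e9,
        "total_gb": (weights + trainable_overhead) / 1e9,
    }


# Figures from the thread: ~20B total, under 2B trainable per stage.
partial = estimate_training_memory_gb(20e9, 2e9)
# For comparison: the same models with every parameter trainable.
full = estimate_training_memory_gb(20e9, 20e9)

print(partial)
print(full)
```

Under these assumptions, training 2B of the 20B parameters needs about 68 GB before activations, against about 320 GB if everything were trainable. That gap is why the reported parameter counts matter more than the combined model size, though the actual per-GPU footprint depends on batch size, resolution, and how the training run is sharded.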