About LLaVA preprocessing: left padding or right padding? #190
@yuanzhoulvpi2017 Could you take a look at this?
Take a look at this link: #186 (comment)
OK. I looked at the data: <image> sometimes appears at the front and sometimes at the end, so it seems better to unify it in one position. The official code seems to handle this case by always moving <image> to the front. Should this be added to data.py?
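For illustration, a normalization step like the one described might look roughly like this; the helper name and the exact `<image>` token handling are assumptions, not the repo's actual data.py code:

```python
# Hypothetical sketch: normalize where the <image> token sits in a sample's
# text so every prompt has the same layout. This is NOT the repo's actual
# data.py code; the helper name and token string are assumed for illustration.
IMAGE_TOKEN = "<image>"

def move_image_token_to_front(text: str) -> str:
    if IMAGE_TOKEN in text:
        # Remove the token wherever it appears, then prepend it once.
        text = text.replace(IMAGE_TOKEN, "").strip()
        text = IMAGE_TOKEN + "\n" + text
    return text

# Example: "What is in this picture? <image>"
#       -> "<image>\nWhat is in this picture?"
```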
I can only train with the Qwen1.5-0.5B model since I only have a 4080 card, and batch_size_per_gpu is only 2. The loss seems very large; is that because the LLM is too small? I'll run a few experiments to see whether it has an effect.
If your experiments show that left vs. right padding really does affect the results, it's worth digging into why; you might even get a paper out of it. I personally quite like papers on small details like this that bring real improvements.
Looking at the official code, it seems to pad on the right: by default it moves a trailing <image> to the front and then applies right-side padding.
> by default it moves a trailing <image> to the front
torch.nn.utils.rnn.pad_sequence seems to pad on the right. If <image> is at the front, then padding at the end is probably the better choice.
```python
from dataclasses import dataclass

@dataclass
class DataCollatorForSupervisedDataset(object):
    """Collate examples for supervised fine-tuning."""
```
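For reference, `torch.nn.utils.rnn.pad_sequence` does append padding at the end of each sequence, i.e. right padding. If left padding is wanted instead, a common trick is to flip each sequence, pad, and flip back. A minimal sketch, with an assumed pad token id of 0:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

pad_id = 0  # assumed pad token id; depends on the tokenizer

seqs = [torch.tensor([1, 2, 3]), torch.tensor([4, 5])]

# Default behavior: padding is appended at the end (right padding).
right_padded = pad_sequence(seqs, batch_first=True, padding_value=pad_id)
# tensor([[1, 2, 3],
#         [4, 5, 0]])

# Left padding via the flip trick: reverse each sequence, pad, reverse back.
left_padded = pad_sequence(
    [s.flip(0) for s in seqs], batch_first=True, padding_value=pad_id
).flip(1)
# tensor([[1, 2, 3],
#         [0, 4, 5]])
```

With right padding and <image> always moved to the front, the image token sits at the same position in every sample of the batch, which matches the behavior described above.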