Hi,
I used the code in this repo to fine-tune the Open LLaMA model. To reduce fine-tuning time, when generating the dataset I only use one prompt for the training, validation, and test sets on Beauty. I use random indexing and the original settings from your repo. When I evaluate, the generated output sequences are full of unexpected characters, like '(@*$^)(*Y(8'. Also, when I use the code to fine-tune other LLaMA-series models, the generated sentences are full of '!'.
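To illustrate what I suspect is happening, here is a toy sketch (the vocabularies and `decode` helper below are made up, not from the repo): if the token IDs the model emits are decoded with a tokenizer whose vocabulary doesn't match the one used in training, the same IDs map to unrelated symbols and the output looks like garbage.

```python
# Hypothetical illustration of a tokenizer/model vocabulary mismatch.
# Same token IDs, different vocabularies -> unreadable output.

def decode(token_ids, vocab):
    """Map token IDs to strings; out-of-range IDs fall back to '!'."""
    return "".join(vocab[i] if i < len(vocab) else "!" for i in token_ids)

# Pretend vocabulary the model was trained with.
llama_vocab = ["h", "e", "l", "o", " ", "w", "r", "d"]
# Pretend vocabulary of a different tokenizer loaded at evaluation time.
other_vocab = ["(", "@", "*", "$", "^", ")", "Y", "8"]

ids = [0, 1, 2, 2, 3]  # decodes to "hello" under llama_vocab
print(decode(ids, llama_vocab))  # hello
print(decode(ids, other_vocab))  # garbage symbols instead
```

If this is the cause, checking that the tokenizer's vocabulary size matches the model's embedding size (and that the pad/eos token IDs agree between training and evaluation) might be a first step.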
Can anyone give a hint about this? Is this a problem with the tokenizer?
Thanks