Unexpected generated sentences when using llama #36

xiaxin1998 · 2024-09-29T03:59:58Z

Hi,
I used the code in this repo to fine-tune the open llama model. To reduce fine-tuning time, I used only one prompt when generating the training, validation, and test sets on Beauty. I used random indexing and otherwise kept the original settings from your repo. When I evaluated, I found that the generated output sequences are full of unexpected characters, like '(@*$^)(*Y(8'. Also, when I use the code to fine-tune other llama-series models, the generated sentences are full of '!'.
Can anyone give a hint about this? Is this a tokenizer problem?

Thanks
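
In case it helps narrow down the tokenizer question above, here is a rough sketch of a check that could be run on the raw generated token ids before decoding. The helper name `summarize_token_ids` is hypothetical and not part of this repo, and `vocab_size` is assumed to come from whichever tokenizer is loaded. The idea is an assumption rather than a confirmed diagnosis: ids outside the vocabulary would hint at a tokenizer/model mismatch, while one id repeated almost everywhere (as with the '!' output) might point at the model side instead.

```python
from collections import Counter


def summarize_token_ids(token_ids, vocab_size):
    """Summarise generated token ids to spot degenerate output.

    token_ids  : list of ints produced by generation (before decoding)
    vocab_size : size of the tokenizer vocabulary (assumed to be known)
    """
    if not token_ids:
        return {"out_of_range": [], "most_common_id": None, "most_common_share": 0.0}

    # Ids the tokenizer cannot decode at all (suggests a vocabulary mismatch).
    out_of_range = sorted({t for t in token_ids if t < 0 or t >= vocab_size})

    # How much of the output is a single repeated id.
    most_common_id, count = Counter(token_ids).most_common(1)[0]

    return {
        "out_of_range": out_of_range,
        "most_common_id": most_common_id,
        "most_common_share": count / len(token_ids),
    }
```

If `most_common_share` is close to 1.0, decoding `most_common_id` on its own would show whether it is the '!' token.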
