About DPO Data Preparation #193

Open
pratim808 opened this issue Jan 20, 2025 · 1 comment
@pratim808

After using this code:

```python
# method 2
def format_function(example):
    # Format 'chosen' text
    messages_chosen = [
        {"role": "user", "content": str(example["chosen"])}  # convert list to string
    ]
    formatted_chosen = tokenizer.apply_chat_template(
        messages_chosen,
        tokenize=False,
        add_generation_prompt=False
    )

    # Format 'rejected' text
    messages_rejected = [
        {"role": "user", "content": str(example["rejected"])}  # convert list to string
    ]
    formatted_rejected = tokenizer.apply_chat_template(
        messages_rejected,
        tokenize=False,
        add_generation_prompt=False
    )

    return {
        "formatted_chosen": formatted_chosen,
        "formatted_rejected": formatted_rejected,
    }
```

I get this output: (screenshot of the formatted output)

Is this the right way to prepare data before fine-tuning? I mean, should our data be converted into this form:

```
Conversation with template: <|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
I'm doing well, thank you! How can I assist you today?<|im_end|>
```

Do I need to prepare my data this way, i.e. do I need a 'text' column? Please tell me how I can prepare my data for fine-tuning with DPO.
Please help me out.

Manas.
This is the Colab notebook link: https://colab.research.google.com/github/huggingface/smol-course/blob/main/2_preference_alignment/notebooks/dpo_finetuning_example.ipynb

https://github.com/pratim808/smol-course/blob/main/2_preference_alignment/notebooks/dpo_finetuning_example.ipynb

@wang-jinghui

```
Dataset({
    features: ['chosen', 'rejected', 'prompt'],
    num_rows: 62135
})
```

For example:

```
{'prompt': 'Is the milk produced by a hippopotamus pink in color?',
 'chosen': 'No, the milk produced by a hippopotamus is not pink. It is typically white or beige in color. The misconception arises due to the hipposudoric acid, a red pigment found in hippo skin secretions, which people mistakenly assume affects the color of their milk.',
 'rejected': 'No, hippopotamus milk is not pink in color. It is actually white or grayish-white.'}
```
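For reference, here is a minimal, dependency-free sketch of turning a record like the one above into the `prompt` / `chosen` / `rejected` columns that TRL's `DPOTrainer` expects. `to_chatml` and `format_for_dpo` are hypothetical helper names (not from the notebook); `to_chatml` just mimics the ChatML template that `tokenizer.apply_chat_template` produced in the output quoted earlier:

```python
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts in the ChatML layout
    shown in the question (<|im_start|>role\\n...<|im_end|>\\n)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

def format_for_dpo(example):
    # The prompt is the user turn; 'chosen'/'rejected' are the two candidate
    # assistant replies. The trainer pairs prompt + completion itself, so the
    # completions should not repeat the prompt.
    return {
        "prompt": to_chatml([{"role": "user", "content": example["prompt"]}]),
        "chosen": example["chosen"],
        "rejected": example["rejected"],
    }

row = {
    "prompt": "Is the milk produced by a hippopotamus pink in color?",
    "chosen": "No, hippopotamus milk is typically white or beige.",
    "rejected": "No, it is actually white or grayish-white.",
}
out = format_for_dpo(row)
print(out["prompt"])
```

You could apply such a function with `dataset.map(format_for_dpo)`; note that, unlike the `format_function` in the question, the completions here use the dataset's `prompt` field for the user turn rather than wrapping `chosen`/`rejected` themselves in a `"user"` role.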
