environmental question #7

Open
MaTengSYSU opened this issue Sep 13, 2024 · 3 comments

Comments

@MaTengSYSU

Hello, I am very interested in your outstanding work and would like to reproduce your project. Could you tell me the specific environment used for STEP 2 (Amplifying Image Harmfulness with LLMs) and STEP 3 (Amplifying Image Harmfulness with Grade Update), so that my reproduced results are consistent with yours?

@AoiDragon
Owner

Hello @matengxiaotiancai,

We mainly used the PixArt XL 2 and LLaVA-1.5 models for these two steps. You can refer to their respective repositories for instructions on setting up their environments.

@MaTengSYSU
Author

Thank you for your reply!
I have another question. What does 'flagged' mean when it is True or False in the file 'generate_init_image.py'?
I don't understand the following code:
processed_data = {}
for entry in raw_data:
    entry_id = entry['id']
    print(f"Processing entry {entry_id}")

    if entry_id in processed_data and entry['flagged']:
        if int(entry['step']) < int(processed_data[entry_id]['step']):
            processed_data[entry_id] = entry
    elif entry_id not in processed_data:
        processed_data[entry_id] = entry
    elif not entry['flagged'] and processed_data[entry_id]['flagged']:
        continue
    elif entry['step'] == "3":
        processed_data[entry_id] = entry

@pyogher
Collaborator

pyogher commented Oct 7, 2024

Hi @matengxiaotiancai,

This part of the code picks out the samples that successfully jailbreak the target MLLM during black-box optimization. In our experiments, we run five steps of black-box optimization against the target MLLM. If a successful attack sample appears at any of these five steps, we save it and stop the optimization early; e.g., if the jailbreak succeeds at the second step, we retain the results from the second step. If none of the five steps succeeds, we retain the results from the fifth step.

processed_data = {}
for entry in raw_data:
    entry_id = entry['id']
    
    # If the id is in processed_data and the current entry is flagged, check the step
    if entry_id in processed_data and entry['flagged']:
        # If the current entry's step is smaller, replace it
        if int(entry['step']) < int(processed_data[entry_id]['step']):
            processed_data[entry_id] = entry
    elif entry_id not in processed_data:
        # If the id is not in processed_data, then add it
        processed_data[entry_id] = entry
    # If the current entry is not flagged, but the same id in processed_data is flagged, skip it
    elif not entry['flagged'] and processed_data[entry_id]['flagged']:
        continue
    # If the current entry's step is 5, replace it (only executes if there are no flagged entries)
    elif entry['step'] == "5":
        processed_data[entry_id] = entry

# Collect the retained samples: the flagged entry for each id, or the step-5 entry if no flagged sample exists
output_list = list(processed_data.values())
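
For readers who want to check their understanding of the rule described above, here is a minimal sketch of the same selection written as a single comparison. It is not the code from the repository. It assumes that `flagged` marks a successful jailbreak, that `step` is a string-encoded integer as in the snippet above, and that "the last step" can be taken as the largest step present for an id. The names `select_entries` and `_rank` are hypothetical. Unlike the loop above, which only replaces a stored entry with a flagged one when the flagged step is smaller, this version does not depend on the order of `raw_data`.

```python
def _rank(entry):
    """Sort key: flagged entries come first (earliest step wins);
    unflagged entries come after (latest step wins)."""
    step = int(entry["step"])
    return (0, step) if entry["flagged"] else (1, -step)


def select_entries(raw_data):
    """Keep one entry per id: the earliest flagged step if any entry is
    flagged, otherwise the entry with the largest step (assumed to be
    the final optimization step)."""
    best = {}
    for entry in raw_data:
        entry_id = entry["id"]
        current = best.get(entry_id)
        if current is None or _rank(entry) < _rank(current):
            best[entry_id] = entry
    return list(best.values())


# Hypothetical data: id "a" succeeds at step 2, id "b" never succeeds.
raw_data = [
    {"id": "a", "step": "1", "flagged": False},
    {"id": "a", "step": "2", "flagged": True},
    {"id": "a", "step": "3", "flagged": True},
    {"id": "b", "step": "1", "flagged": False},
    {"id": "b", "step": "5", "flagged": False},
]
output_list = select_entries(raw_data)
```

With this data, id "a" keeps its step-2 entry (the first success) and id "b" keeps its step-5 entry (no success at any step).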
