probe: add Arabic DAN #1018


Eaalghamdi

Signed-off-by: Emad Alghamdi [email protected]

I added Arabic translations of all DAN probes, which were quality-checked by a human to ensure they are suitable for the Arabic language, and added an Arabic detector for the probe. The new probes and detectors passed testing during development.

Signed-off-by: Emad Alghamdi <[email protected]>
Signed-off-by: Emad Alghamdi <[email protected]>
Contributor

github-actions bot commented Nov 21, 2024

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

@Eaalghamdi
Author

I have read the DCO Document and I hereby sign the DCO

@Eaalghamdi
Author

recheck

Collaborator

@erickgalinkin erickgalinkin left a comment

Could you say a bit more about how these prompts were generated? e.g. manual translation, machine translation, scraped from sources
Also, I think we need some logic to handle whether probes are active based on the bcp47 value, lest we run a bunch of Arabic strings against English language models (or vice versa).
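A minimal sketch of what such language gating might look like. `ProbeInfo`, `primary_subtag`, and `runs_against` are hypothetical names for illustration, not garak's API, and the convention that `bcp47` holds comma-separated tags or `"*"` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ProbeInfo:
    """Hypothetical stand-in for a probe's metadata (not garak's Probe class)."""
    name: str
    bcp47: str          # assumed: comma-separated BCP 47 tags, or "*" for any language
    active: bool = True


def primary_subtag(tag: str) -> str:
    """Return the lowercased primary language subtag, e.g. 'ar' for 'ar-SA'."""
    return tag.strip().split("-")[0].lower()


def runs_against(probe: ProbeInfo, target_langs: set[str]) -> bool:
    """True if the probe is active and shares a language with the target model."""
    if not probe.active:
        return False
    if probe.bcp47.strip() == "*":
        return True
    probe_langs = {primary_subtag(t) for t in probe.bcp47.split(",") if t.strip()}
    return bool(probe_langs & {primary_subtag(t) for t in target_langs})
```

Under these assumptions, an Arabic probe would be skipped for a target declared as `{"en"}` and run for one declared as `{"ar-SA"}`.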

Collaborator


Do these differ meaningfully from the detectors in detectors.dan? It seems to be a direct copy. If so, do we want to reference those extant detectors from the probes instead?
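A rough sketch of the "reference, don't copy" option, assuming detectors are looked up by name. The registry, the class names, the `recommended_detector` attribute as used here, and the marker strings are all illustrative assumptions rather than the project's actual code.

```python
# Hypothetical registry standing in for a framework's detector lookup.
DETECTORS = {}


def register(name):
    """Class decorator that records a detector class under a dotted name."""
    def wrap(cls):
        DETECTORS[name] = cls
        return cls
    return wrap


@register("dan.DAN")
class DANDetector:
    # Illustrative markers only; the real detector's strings may differ.
    substrings = ["[DAN]", "DAN:"]

    def detect(self, output: str) -> bool:
        """Flag outputs containing any of the jailbreak markers."""
        return any(s in output for s in self.substrings)


class DanEnglishProbe:
    recommended_detector = ["dan.DAN"]


class DanArabicProbe:
    # Same detector referenced by name instead of a copied class.
    recommended_detector = ["dan.DAN"]


def detectors_for(probe_cls):
    """Instantiate every detector a probe names."""
    return [DETECTORS[name]() for name in probe_cls.recommended_detector]
```

This only makes sense if the markers being matched are language-independent; if the Arabic prompts elicit Arabic markers, a separate detector would still be needed.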

alignment/system prompt. Success is measured by not getting a safety mitigaton
message."""

active = True
Collaborator

I would suggest that we keep active=False on the Arabic probes -- perhaps there's some logic that needs to be written around the language spec.

@erickgalinkin
Collaborator

It also seems that the code here includes donotanswer, à la #1017. Is that intended?

@jmartin-tech
Collaborator

@erickgalinkin, I think the bcp47 selection can be deferred. For now, non-en probes should not be marked active by default, and the detectors will only be selected explicitly by a probe or an override. This puts the onus on the user to force selection of these items via configuration options when performing a run.
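The selection behaviour described here could be sketched roughly as follows. `ProbeSpec`, `select_probes`, and the probe names are hypothetical, and this is not how garak's own configuration handling is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeSpec:
    """Hypothetical probe descriptor: a dotted name plus its default-active flag."""
    name: str
    active: bool


def select_probes(available, requested=None):
    """Pick probe names for a run.

    With no explicit request, only probes marked active are returned.
    An explicit request may name inactive probes, which is how a user
    would opt in to non-en probes under this scheme.
    """
    if requested is None:
        return [p.name for p in available if p.active]
    known = {p.name for p in available}
    unknown = [n for n in requested if n not in known]
    if unknown:
        raise ValueError(f"unknown probes: {unknown}")
    return list(requested)
```

With `available = [ProbeSpec("dan.Example", True), ProbeSpec("dan_ar.Example", False)]`, a default run would pick only the English probe, while `select_probes(available, ["dan_ar.Example"])` would run the Arabic one.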

Also, all comments I made in review of #1017 apply here as well. For now we can have the separate languages as unique classes that can be selected explicitly, and in future iterations we can consolidate those classes into language-specific instances of one class.
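As a sketch of the later consolidation, one class could be instantiated per language instead of defining one class per language. `DanProbe`, `PROMPTS_BY_LANG`, and `build_probes` are hypothetical, and the prompt strings are placeholders rather than real probe content.

```python
from dataclasses import dataclass, field


@dataclass
class DanProbe:
    """Hypothetical single probe class parameterised by language."""
    bcp47: str
    prompts: list[str] = field(default_factory=list)
    active: bool = False  # inactive unless explicitly enabled


# Placeholder prompt tables, keyed by language tag.
PROMPTS_BY_LANG = {
    "en": ["<English DAN prompt>"],
    "ar": ["<Arabic DAN prompt>"],
}


def build_probes(default_lang: str = "en") -> dict[str, DanProbe]:
    """Create one probe instance per language; only the default language is active."""
    return {
        lang: DanProbe(bcp47=lang, prompts=list(prompts), active=(lang == default_lang))
        for lang, prompts in PROMPTS_BY_LANG.items()
    }
```

Adding a language then means adding an entry to the prompt table rather than a new class.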


3 participants