doc: Update README.md to notify of migration (#1002)
* ensure users are aware of move
* add notes about how to update upstream links in git remote
leondz authored Nov 15, 2024
2 parents c775303 + e6305c9 commit 8446013
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion README.md
@@ -21,6 +21,9 @@
[![Downloads](https://pepy.tech/badge/garak/month)](https://pepy.tech/project/garak)


"🚧 garak is moving. This repository is moving to the `NVIDIA` github organization in the near future. This is planned to be a non-disruptive transition with automatic redirection. 🚧"


## Get started
### > See our user guide! [docs.garak.ai](https://docs.garak.ai/)
### > Join our [Discord](https://discord.gg/uVch4puUCs)!
@@ -75,6 +78,13 @@ python -m pip install -e .

OK, if that went fine, you're probably good to go!

**Note**: if you cloned before the move to the `NVIDIA` GitHub organisation, but you're reading this at the `github.com/NVIDIA` URI, please update your remotes as follows:

```
git remote set-url origin https://github.com/NVIDIA/garak.git
```


## Getting started

The general syntax is:
@@ -223,7 +233,7 @@ For testing. This generator repeats back the prompt it received.
| Probe | Description |
| --- | --- |
| blank | A simple probe that always sends an empty prompt. |
-| atkgen | Automated Attack Generation. A red-teaming LLM probes the target and reacts to it in an attempt to get toxic output. Prototype, mostly stateless, for now uses a simple GPT-2 [fine-tuned](https://huggingface.co/leondz/artgpt2tox) on the subset of hhrlhf attempts that yielded detectable toxicity (the only target currently supported for now). |
+| atkgen | Automated Attack Generation. A red-teaming LLM probes the target and reacts to it in an attempt to get toxic output. Prototype, mostly stateless, for now uses a simple GPT-2 [fine-tuned](https://huggingface.co/garak-llm/artgpt2tox) on the subset of hhrlhf attempts that yielded detectable toxicity (the only target currently supported for now). |
| av_spam_scanning | Probes that attempt to make the model output malicious content signatures |
| continuation | Probes that test if the model will continue a probably undesirable word |
| dan | Various [DAN](https://adguard.com/en/blog/chatgpt-dan-prompt-abuse.html) and DAN-like attacks |