forked from openai/gpt-2
Commit 0503b1b (1 parent: d14501a)

This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 5 changed files with 9 additions and 5 deletions.
.gitignore
@@ -1,2 +1,3 @@
 __pycache__
+.mypy_cache/
 models/
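The new entry keeps mypy's cache directory out of version control: the type checker creates `.mypy_cache/` in the directory it is run from (for example after a hypothetical `mypy src/` over this repo's sources), much as Python itself creates the already-ignored `__pycache__`.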
README.md
@@ -2,23 +2,23 @@
 
 Code and samples from the paper ["Language Models are Unsupervised Multitask Learners"](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf).
 
-For now, we have only released a smaller (117M parameter) version of GPT-2.
+We have currently released small (117M parameter) and medium (345M parameter) versions of GPT-2.
 
 See more details in our [blog post](https://blog.openai.com/better-language-models/).
 
 ## Usage
 
-This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2-117M. While GPT-2-117M is less proficient than GPT-2-1.5B, it is useful for a wide range of research and applications which could also apply to larger models.
+This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2.
 
 ### Some caveats
 
-- GPT-2-117M robustness and worst case behaviors are not well-understood. As with any machine-learned model, carefully evaluate GPT-2-117M for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
-- The dataset our GPT-2-117M was trained on contains many texts with [biases](https://twitter.com/TomerUllman/status/1101485289720242177) and factual inaccuracies, and thus GPT-2-117M is likely to be biased and inaccurate as well.
+- GPT-2 models' robustness and worst case behaviors are not well-understood. As with any machine-learned model, carefully evaluate GPT-2 for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
+- The dataset our GPT-2 models were trained on contains many texts with [biases](https://twitter.com/TomerUllman/status/1101485289720242177) and factual inaccuracies, and thus GPT-2 models are likely to be biased and inaccurate as well.
 - To avoid having samples mistaken as human-written, we recommend clearly labeling samples as synthetic before wide dissemination. Our models are often incoherent or inaccurate in subtle ways, which takes more than a quick read for a human to notice.
 
 ### Work with us
 
-Please [let us know](mailto:[email protected]) if you’re doing interesting research with or working on applications of GPT-2-117M! We’re especially interested in hearing from and potentially working with those who are studying
+Please [let us know](mailto:[email protected]) if you’re doing interesting research with or working on applications of GPT-2! We’re especially interested in hearing from and potentially working with those who are studying
 - Potential malicious use cases and defenses against them (e.g. the detectability of synthetic text)
 - The extent of problematic content (e.g. bias) being baked into the models and effective mitigations
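For readers using this commit as a jumping-off point: the "experiment with GPT-2" workflow the updated Usage section refers to boils down to restoring a downloaded checkpoint and running the sampling graph. Below is a minimal sketch condensed from the upstream repository's `generate_unconditional_samples.py`; the module names (`encoder`, `model`, `sample`) and call signatures follow the openai/gpt-2 code of this era and its TensorFlow 1.x dependency, so treat them as assumptions if this fork has diverged.

```python
# Minimal unconditional-sampling sketch, condensed from the repository's
# generate_unconditional_samples.py. Assumes it is run from src/ with the
# 345M weights already downloaded into models/345M/; module names and
# signatures follow upstream openai/gpt-2 (TensorFlow 1.x).
import json
import os

import tensorflow as tf

import encoder  # src/encoder.py: byte-pair-encoding tokenizer
import model    # src/model.py: the transformer and its hyperparameters
import sample   # src/sample.py: autoregressive sampling loop

model_name = '345M'  # the newly released medium model

enc = encoder.get_encoder(model_name)
hparams = model.default_hparams()
with open(os.path.join('models', model_name, 'hparams.json')) as f:
    hparams.override_from_dict(json.load(f))

with tf.Session(graph=tf.Graph()) as sess:
    # Unconditional generation: seed the context with the special
    # <|endoftext|> token instead of a user prompt, then sample 64 tokens
    # with top-k filtering. The slice drops the seed token from the output.
    output = sample.sample_sequence(
        hparams=hparams, length=64,
        start_token=enc.encoder['<|endoftext|>'],
        batch_size=1, temperature=1.0, top_k=40,
    )[:, 1:]

    # Restore the pretrained weights from the downloaded checkpoint.
    saver = tf.train.Saver()
    ckpt = tf.train.latest_checkpoint(os.path.join('models', model_name))
    saver.restore(sess, ckpt)

    print(enc.decode(sess.run(output)[0]))
```

For prompted generation, `sample.sample_sequence` accepts a `context` tensor of encoded prompt tokens in place of `start_token`, which is the approach the repository's `interactive_conditional_samples.py` takes.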