Create detector/README.md
detector/README.md
ADDED
@@ -0,0 +1,51 @@
GPT-2 Output Detector
=====================

This directory contains the code for working with the GPT-2 output detector model, obtained by fine-tuning a
[RoBERTa model](https://ai.facebook.com/blog/roberta-an-optimized-method-for-pretraining-self-supervised-nlp-systems/)
with [the outputs of the 1.5B-parameter GPT-2 model](https://github.com/openai/gpt-2-output-dataset).
For motivations and discussions regarding the release of this detector model, please check out
[our blog post](https://openai.com/blog/gpt-2-1-5b-release/) and [report](https://d4mucfpksywv.cloudfront.net/papers/GPT_2_Report.pdf).

## Downloading a pre-trained detector model

Download the weights for the fine-tuned `roberta-base` model (478 MB):

```bash
wget https://storage.googleapis.com/gpt-2/detector-models/v1/detector-base.pt
```

or for the fine-tuned `roberta-large` model (1.5 GB):

```bash
wget https://storage.googleapis.com/gpt-2/detector-models/v1/detector-large.pt
```

These RoBERTa-based models are fine-tuned on a mixture of temperature-1 and nucleus sampling outputs,
a mixture that should generalize well to outputs generated with other sampling methods.
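
The downloads in this section can also be scripted. The following is a minimal Python sketch and is not part of this repository: the helper names `checkpoint_url` and `fetch_checkpoint` are hypothetical, and only the base URL and the two file names are taken from the `wget` commands in this section.

```python
import os
import urllib.request

# Base URL and file names come from the wget commands in this README.
BASE_URL = "https://storage.googleapis.com/gpt-2/detector-models/v1"
CHECKPOINTS = {"base": "detector-base.pt", "large": "detector-large.pt"}


def checkpoint_url(size: str) -> str:
    """Return the download URL for the 'base' or 'large' detector checkpoint."""
    if size not in CHECKPOINTS:
        raise ValueError(f"unknown model size {size!r}; expected 'base' or 'large'")
    return f"{BASE_URL}/{CHECKPOINTS[size]}"


def fetch_checkpoint(size: str = "base", directory: str = ".") -> str:
    """Download the checkpoint into `directory` unless the file already exists.

    Hypothetical helper: returns the local path of the checkpoint file.
    """
    url = checkpoint_url(size)  # validates `size` before touching the disk
    path = os.path.join(directory, CHECKPOINTS[size])
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path
```

For example, `fetch_checkpoint("large")` would place `detector-large.pt` in the current directory and skip the download if the file is already there.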

## Running a detector model

You can launch a web UI where you enter text and see the detector model's prediction
of whether it was generated by a GPT-2 model.

```bash
# (from the top-level directory of this repository)
pip install -r requirements.txt
python -m detector.server detector-base.pt
```

After the script says "Ready to serve", navigate to http://localhost:8080 to view the UI.
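
When the server is started from a script rather than by hand, it can help to wait until the port accepts connections before opening the UI. The following is a minimal sketch under two assumptions: the server listens on port 8080 as stated above, and an open port is an adequate stand-in for the "Ready to serve" message. `wait_for_port` is a hypothetical helper, not part of this repository.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8080, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # nothing listening yet; retry shortly
    return False
```

A call such as `wait_for_port("localhost", 8080, timeout=120)` returns `True` as soon as the port is reachable and `False` if the timeout expires first.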

## Training a new detector model

You can use the provided training script to train a detector model on a new set of datasets.
We recommend using a GPU machine for this task.

```bash
# (from the top-level directory of this repository)
pip install -r requirements.txt
python -m detector.train
```

The training script supports a number of different options; append `--help` to the command above for usage.