Update README.md
README.md CHANGED

@@ -93,7 +93,7 @@ Use the code below to get started with the model.
 
 ```
 from transformers import AutoTokenizer, AutoModelForCausalLM
-model_id = "Uppaal/Mistral-ProFS-
+model_id = "Uppaal/Mistral-ProFS-safety"
 
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id)

@@ -108,10 +108,8 @@ print(tokenizer.decode(out[0], skip_special_tokens=True))
 ## Training (Editing) Details
 
 ### Data
-We use
+We use the [HH-Golden dataset](https://huggingface.co/datasets/nz/anthropic-hh-golden-rlhf), which manually improves the quality of noisy samples in the HH-RLHF dataset.
 
-- Non-toxic sequences: sampled from WikiText-2.
-- Toxic counterparts: generated using the Plug-and-Play Language Model (PPLM) method to inject toxic content.
 - Data format: (toxic, non-toxic) sentence pairs.
 - Sample size: 500 pairs for ProFS editing (compared to 2,000 pairs used for DPO fine-tuning).
 
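The Data section above describes the editing data as (toxic, non-toxic) sentence pairs, with 500 pairs used for ProFS. A minimal sketch of shaping preference-style records into such pairs is below; the `rejected`/`chosen` field names, the example strings, and the `to_profs_pairs` helper are illustrative assumptions, not the actual HH-Golden schema or the authors' code.

```python
# Illustrative records standing in for preference data; the
# "rejected"/"chosen" field names are an assumed schema, not
# the verified layout of the HH-Golden dataset.
records = [
    {"rejected": "an unsafe completion", "chosen": "a safe completion"},
    {"rejected": "another unsafe completion", "chosen": "another safe completion"},
]

def to_profs_pairs(records, n_pairs=500):
    """Return up to n_pairs (toxic, non-toxic) tuples, matching the
    (toxic, non-toxic) sentence-pair format described in the Data section."""
    return [(r["rejected"], r["chosen"]) for r in records[:n_pairs]]

pairs = to_profs_pairs(records)
print(len(pairs))  # prints 2: only two example records are available
```

The cap at 500 pairs mirrors the sample size stated for ProFS editing; with a real dataset split, `records` would hold 500 or more entries.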