## Training (Editing) Details

### Data

We use the [HH-Golden dataset](https://huggingface.co/datasets/nz/anthropic-hh-golden-rlhf), which manually improves the quality of noisy samples in the HH-RLHF dataset.

- Data format: (toxic, non-toxic) sentence pairs.
- Sample size: 500 pairs for ProFS editing (compared to 2,000 pairs used for DPO fine-tuning).
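The pairing above can be sketched in code. This is a minimal illustration, not the repository's actual preprocessing: the column names (`chosen`, `rejected`) follow the usual HH-RLHF preference schema and the toy record is invented; in practice the rows would come from loading the dataset linked above.

```python
# Sketch: turning HH-style preference rows into (toxic, non-toxic) pairs.
# ASSUMPTION: rows carry "chosen"/"rejected" fields as in HH-RLHF; the
# record below is a made-up stand-in for real dataset rows.
records = [
    {
        "chosen": "\n\nHuman: How do I greet someone?\n\nAssistant: A friendly hello works well.",
        "rejected": "\n\nHuman: How do I greet someone?\n\nAssistant: Just ignore them.",
    },
]

def to_edit_pairs(rows, n_pairs=500):
    """Map preference rows to (toxic, non-toxic) pairs: the rejected
    response is treated as the toxic side, the chosen one as non-toxic.
    Only the first n_pairs rows are kept (500 pairs for ProFS editing)."""
    return [(row["rejected"], row["chosen"]) for row in rows[:n_pairs]]

pairs = to_edit_pairs(records)
toxic, non_toxic = pairs[0]
```

With real data, swapping `records` for the loaded dataset split would yield the 500 editing pairs described above.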