Uppaal committed on
Commit 71dc743 · verified
1 Parent(s): 95e866b

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -108,7 +108,7 @@ print(tokenizer.decode(out[0], skip_special_tokens=True))
 ## Training (Editing) Details
 
 ### Data
-We use [HH-Golden dataset](https://huggingface.co/datasets/nz/anthropic-hh-golden-rlhf), which manually improves the quality of noisy samples in the HH-RLHF dataset.
+We use the [HH-Golden dataset](https://huggingface.co/datasets/nz/anthropic-hh-golden-rlhf), which manually improves the quality of noisy samples in the HH-RLHF dataset.
 
 - Data format: (toxic, non-toxic) sentence pairs.
 - Sample size: 500 pairs for ProFS editing (compared to 2,000 pairs used for DPO fine-tuning).