Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ datasets:
 
 Pythia-1b supervised finetuned with Anthropic-hh-rlhf dataset for 1 epoch (sft-model), before DPO [(paper)](https://arxiv.org/abs/2305.18290) with same dataset for 1 epoch.
 
-[wandb log](https://wandb.ai/pythia_dpo/Pythia_DPO_new/runs/
+[wandb log](https://wandb.ai/pythia_dpo/Pythia_DPO_new/runs/jk09pzqb)
 
 See [Pythia-1b](https://huggingface.co/EleutherAI/pythia-1b) for model details [(paper)](https://arxiv.org/abs/2101.00027).
 