Amu/spin-phi2

Amu committed on Mar 3, 2024

Commit ed108b7 · verified · 1 Parent(s): 5040b8b

Update README.md

Files changed (1)

README.md +3 -0

@@ -16,6 +16,9 @@ This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/m
 I think SPIN can be applied not only to an SFT model but also to a pretrained model.
 Therefore, I applied SPIN to the pretrained model microsoft/phi-2 and got a higher score than the original pretrained model. You can check the [open llm leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
+The best paradigm for training a conversational Large Language Model (LLM):
+pretrain -> dpo(spin) -> sft -> dpo(spin)
 ## Training procedure
 ### Training hyperparameters