gemma-9b-sft / README.md

Trained on 1,600 samples of LM-SYS conversational data that were processed with REFLECT.

Official model fine-tuned on REFLECT-generated data, released for analyzing how KL divergence trends against win rate as training progresses.
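
For readers unfamiliar with the two quantities, the following is a minimal sketch of how each could be computed for a single checkpoint. It is not the evaluation code used for this model: the function names, the tie-counts-as-half convention, and all numbers are assumptions chosen for illustration only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def win_rate(outcomes):
    """Fraction of pairwise comparisons won; a tie counts as half a win (assumed convention)."""
    score = {"win": 1.0, "tie": 0.5, "loss": 0.0}
    return sum(score[o] for o in outcomes) / len(outcomes)


# Hypothetical next-token distributions for the fine-tuned and base models.
finetuned_probs = [0.7, 0.2, 0.1]
base_probs = [0.5, 0.3, 0.2]

# Hypothetical judge verdicts for the fine-tuned model against a baseline.
verdicts = ["win", "loss", "win", "tie"]

kl = kl_divergence(finetuned_probs, base_probs)
wr = win_rate(verdicts)
print(f"KL divergence: {kl:.4f} nats, win rate: {wr:.3f}")
```

Recording one such (KL divergence, win rate) pair per checkpoint gives the kind of curve the analysis above refers to. In practice the KL term would be averaged over tokens and prompts rather than computed on a single distribution.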