This model was trained on 1600 samples of LM-SYS conversational data that had been processed with [REFLECT](https://arxiv.org/pdf/2601.18730).
It is the official model fine-tuned using REFLECT, created for analyzing the trend in KL divergence against win rate as training progresses.
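This card does not specify how the KL divergence is measured. As a minimal sketch, assuming it is the mean per-token KL between the next-token distributions of the fine-tuned model and a reference model, the quantity could be computed from raw logits as follows. The function names and the toy logits are hypothetical and only illustrate the calculation.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    # eps guards against log(0) when a probability underflows to zero.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def mean_token_kl(policy_logits, reference_logits):
    """Average per-position KL(policy || reference) over a sequence.

    Both arguments are lists of per-token logit vectors (hypothetical
    inputs; in practice they would come from two forward passes over
    the same prompt and response).
    """
    kls = [
        kl_divergence(softmax(pl), softmax(rl))
        for pl, rl in zip(policy_logits, reference_logits)
    ]
    return sum(kls) / len(kls)


# Toy example: one position, vocabulary of three tokens.
policy = [[2.0, 0.0, 0.0]]
reference = [[0.0, 0.0, 0.0]]
print(mean_token_kl(policy, reference))
```

Evaluating this on a fixed prompt set at each saved checkpoint, alongside a win rate from whichever judge is used, would give the pairs needed to plot one against the other.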