Intended for computing KL divergence versus win rate at different training checkpoints for the Gemma family: 2B (this repository) and 9B (Samzy17/gemma-9b-sft).
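
As a rough illustration of the two quantities named above, here is a minimal sketch of how a per-checkpoint KL estimate and win rate could be computed. It is not taken from this repository. It assumes you already have, for each sampled response, the per-token log-probabilities under the checkpoint and under a reference model, plus a list of pairwise judge outcomes. The function names `sequence_kl_estimate` and `win_rate` and the 1.0 / 0.5 / 0.0 outcome encoding are hypothetical.

```python
from typing import Sequence


def sequence_kl_estimate(
    policy_logprobs: Sequence[Sequence[float]],
    ref_logprobs: Sequence[Sequence[float]],
) -> float:
    """Monte Carlo estimate of KL(policy || reference).

    Each inner sequence holds the log-probabilities of the tokens of one
    response that was sampled from the policy (the checkpoint), scored under
    the policy and under the reference model respectively. The estimate is
    the mean over responses of sum_t (log p_policy - log p_ref).
    """
    if not policy_logprobs or len(policy_logprobs) != len(ref_logprobs):
        raise ValueError("need the same non-zero number of responses for both models")
    total = 0.0
    for p_seq, r_seq in zip(policy_logprobs, ref_logprobs):
        if len(p_seq) != len(r_seq):
            raise ValueError("token counts differ for one response")
        total += sum(p - r for p, r in zip(p_seq, r_seq))
    return total / len(policy_logprobs)


def win_rate(outcomes: Sequence[float]) -> float:
    """Mean of pairwise outcomes (assumed encoding: 1.0 win, 0.5 tie, 0.0 loss)."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Toy numbers for illustration only, not real measurements.
    kl = sequence_kl_estimate([[-1.0, -2.0], [-0.5]], [[-1.5, -2.5], [-1.0]])
    wr = win_rate([1.0, 0.0, 0.5, 1.0])
    print(f"KL estimate: {kl:.3f}, win rate: {wr:.3f}")
```

Running this once per checkpoint gives one (KL, win rate) point per checkpoint, which can then be plotted against each other.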