tttx/sft_r1_7b

@@ -1,11 +1,14 @@
 ---
 base_model: deepseek-ai/Deepseek-R1-Distill-Qwen-7B
 library_name: peft
 license: mit
 tags:
 - trl
 - sft
-- alignment-handbook
 - generated_from_trainer
 model-index:
 - name: sft_r1_7b
@@ -17,7 +20,7 @@ should probably proofread and complete it, then remove this comment. -->
 # sft_r1_7b
-This model is a fine-tuned version of [deepseek-ai/Deepseek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/Deepseek-R1-Distill-Qwen-7B) on the None dataset.
 ## Model description

 ---
 base_model: deepseek-ai/Deepseek-R1-Distill-Qwen-7B
+datasets:
+- tttx/r1-trajectories-collection-round-2
+- tttx/r1-trajectories-arcagi-barc
 library_name: peft
 license: mit
 tags:
+- alignment-handbook
 - trl
 - sft
 - generated_from_trainer
 model-index:
 - name: sft_r1_7b
 # sft_r1_7b
+This model is a fine-tuned version of [deepseek-ai/Deepseek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/Deepseek-R1-Distill-Qwen-7B) on the tttx/r1-trajectories-collection-round-2 and the tttx/r1-trajectories-arcagi-barc datasets.
 ## Model description