How to use tttx/15k_sft_020525 with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("deepseek-ai/Deepseek-R1-Distill-Qwen-32B")
model = PeftModel.from_pretrained(base_model, "tttx/15k_sft_020525")
```

This model is a fine-tuned version of deepseek-ai/Deepseek-R1-Distill-Qwen-32B on the tttx/r1-trajectories-arcagi-barc, tttx/r1-masked-arcagi-v1, tttx/r1-barc-r1-feb-6, tttx/r1-masked-feb-6-p2, tttx/r1-masked-feb-6-p1, and tttx/r1-trajectories-collection-round-2 datasets. It achieves the following results on the evaluation set:

- Loss: 0.4883
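Once the adapter is attached, generation follows the standard `transformers` API. The sketch below is a minimal, illustrative example, not part of this repository: the `build_prompt` helper and its format are assumptions (check the base model's chat template before relying on it), and loading the 32B base model requires substantial GPU memory, so the heavy steps are kept behind a `__main__` guard.

```python
def build_prompt(task: str) -> str:
    """Illustrative prompt wrapper (an assumption, not the model's official template)."""
    return f"Task:\n{task}\n\nAnswer:"

if __name__ == "__main__":
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = "deepseek-ai/Deepseek-R1-Distill-Qwen-32B"
    tokenizer = AutoTokenizer.from_pretrained(base)

    # bfloat16 + device_map="auto" (requires `accelerate`) to fit the 32B weights.
    model = AutoModelForCausalLM.from_pretrained(
        base, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(model, "tttx/15k_sft_020525")

    inputs = tokenizer(
        build_prompt("Describe the grid transformation."), return_tensors="pt"
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

For single-GPU deployment you can also call `model.merge_and_unload()` after loading to fold the LoRA weights into the base model and drop the PEFT wrapper.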
Training results:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.4154 | 1.0 | 526 | 0.4942 |
| 0.4029 | 2.0 | 1052 | 0.4883 |