How to use tttx/sft-32b-continue with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the fine-tuned PEFT adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("deepseek-ai/Deepseek-R1-Distill-Qwen-32B")
model = PeftModel.from_pretrained(base_model, "tttx/sft-32b-continue")

This model is a fine-tuned version of tttx/sft_r1_32b on the tttx/r1-trajectories-collection-round-2 dataset. After two epochs of training it reaches a validation loss of 0.4810 on the evaluation set.
Training and validation loss at the end of each epoch:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.4298 | 1.0 | 92 | 0.4860 |
| 0.3989 | 2.0 | 184 | 0.4810 |
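
One way to read the table above is to compare how fast the training loss and the validation loss move between epochs. The sketch below copies the four loss values and the step counts from the table; the names `history`, `loss_gaps`, and `deltas` are illustrative and not part of the original training code.

```python
# Values copied from the training results table above.
history = [
    {"epoch": 1, "step": 92, "train_loss": 0.4298, "val_loss": 0.4860},
    {"epoch": 2, "step": 184, "train_loss": 0.3989, "val_loss": 0.4810},
]


def loss_gaps(rows):
    """Validation loss minus training loss for each logged epoch."""
    return [round(r["val_loss"] - r["train_loss"], 4) for r in rows]


def deltas(rows, key):
    """Change in `key` between consecutive logged epochs."""
    return [round(b[key] - a[key], 4) for a, b in zip(rows, rows[1:])]


gaps = loss_gaps(history)                    # gap at epoch 1, gap at epoch 2
train_delta = deltas(history, "train_loss")  # change in training loss
val_delta = deltas(history, "val_loss")      # change in validation loss
steps_per_epoch = history[1]["step"] - history[0]["step"]

print(f"steps per epoch: {steps_per_epoch}")
print(f"train/validation gap per epoch: {gaps}")
print(f"training loss change: {train_delta}, validation loss change: {val_delta}")
```

With these numbers, the training loss falls by about 0.031 in the second epoch while the validation loss falls by only 0.005, so the gap between the two widens from roughly 0.056 to 0.082. With only two logged points this is a rough signal, not a conclusion about overfitting.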
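
Once the adapter is loaded as in the usage snippet, text generation might look like the sketch below. Several things here are assumptions, not settings taken from the model card: the tokenizer is loaded from the base model repository, `bfloat16` weights and `device_map="auto"` (which requires the `accelerate` package) are used to fit a 32B model across the available devices, and the helper names `load_model` and `generate` are illustrative.

```python
def load_model(base_id="deepseek-ai/Deepseek-R1-Distill-Qwen-32B",
               adapter_id="tttx/sft-32b-continue"):
    """Load the tokenizer, the base model, and the PEFT adapter on top of it."""
    # Heavy dependencies are imported lazily, only when loading is requested.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the adapter reuses the base model's tokenizer.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,  # assumption: half precision to save memory
        device_map="auto",           # assumption: needs `accelerate` installed
    )
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, prompt, max_new_tokens=256):
    """Generate a continuation and return only the newly produced text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = len(inputs["input_ids"][0])
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so that only the continuation is decoded.
    new_tokens = output_ids[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `tokenizer, model = load_model()` and then `generate(tokenizer, model, "...")` downloads the full 32B base model on first use, so this needs substantial disk space and accelerator memory. The prompt format the model expects is not stated in the card.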