Small Model Learnability Gap: Models
Collection
24 items • Updated • 2
How to use UWNSL/Llama3.3_70B_Instruct_Long_CoT_lora with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")
model = PeftModel.from_pretrained(base_model, "UWNSL/Llama3.3_70B_Instruct_Long_CoT_lora")This model is a fine-tuned version of meta-llama/Llama-3.3-70B-Instruct on the MATH_training_Qwen_QwQ_32B_Preview dataset. It achieves the following results on the evaluation set:
More information needed
More information needed
More information needed
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.3436 | 0.4338 | 200 | 0.3342 |
| 0.3048 | 0.8677 | 400 | 0.3144 |
| 0.2566 | 1.3015 | 600 | 0.3054 |
| 0.3036 | 1.7354 | 800 | 0.2994 |
Base model
meta-llama/Llama-3.1-70B