SRD_V7 - Standard Reasoning (SRD) Model (V7)

Dataset

  • Source: CoT_reasoning_unsloth.jsonl
  • Examples: 9,340
  • Format: messages[] chat format

Training Configuration

Parameter Value
Learning Rate 0.00015
LoRA Rank 32
LoRA Alpha 64
LoRA Dropout 0.0
Target Modules All (MLP + Attention)
Epochs 2
Batch Size (effective) 16
Warmup 3%
RSLoRA Disabled

Training Results

  • Training Time: 1.38 hours
  • Final Loss: 1.2049

Part of Experiment

  • kinzakhan1/CRD_V7
  • kinzakhan1/SRD_V7 (this model)
  • kinzakhan1/MIXED_V7
Downloads last month
39
Safetensors
Model size
8B params
Tensor type
BF16
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for kinzakhan1/SRD_V7

Finetuned
(2155)
this model