# ToT-Reasoner-Qwen3-1.7B

## Model Description
This model is `ziadrone/oneplusaries1` fine-tuned with Supervised Fine-Tuning (SFT) on the math split of `open-r1/Mixture-of-Thoughts`. The fine-tuning targets mathematical reasoning.

## Training Data
- **Source**: `open-r1/Mixture-of-Thoughts` (math split, up to 50 samples).
- **Format**: Prompts with `<reasoning>...</reasoning><answer>...</answer>` structure.
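
The tag structure above can be illustrated with a small sketch. Only the `<reasoning>...</reasoning><answer>...</answer>` layout comes from this card; the helper names `format_target` and `parse_target` are hypothetical, and the whitespace handling is an assumption rather than the preprocessing actually used for training.

```python
import re
from typing import Optional, Tuple

# Hypothetical helpers: only the tag layout is taken from the card.
_PATTERN = re.compile(
    r"<reasoning>(.*?)</reasoning>\s*<answer>(.*?)</answer>",
    re.DOTALL,  # reasoning traces usually span several lines
)


def format_target(reasoning: str, answer: str) -> str:
    """Wrap a reasoning trace and a final answer in the card's tag structure."""
    return (
        f"<reasoning>{reasoning.strip()}</reasoning>"
        f"<answer>{answer.strip()}</answer>"
    )


def parse_target(text: str) -> Optional[Tuple[str, str]]:
    """Return (reasoning, answer) if the text follows the structure, else None."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()
```

A parser like this may also be useful at inference time, to separate the final answer from the reasoning trace in generated text.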

## Fine-Tuning Process
- **Method**: SFT with learning rate=1e-5, 3 epochs, batch size=1.
- **Setup**: Google Colab Pro with T4 GPU.
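
To give a sense of the scale of this run, the sketch below works out an upper bound on the number of optimizer steps from the stated settings. The `SFTSettings` class and `max_optimizer_steps` function are hypothetical names, and `grad_accum_steps=1` is an assumption, since the card does not mention gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTSettings:
    """Hyperparameters as stated in the card (container name is hypothetical)."""
    learning_rate: float = 1e-5
    epochs: int = 3
    batch_size: int = 1
    max_samples: int = 50      # the card says "up to 50 samples"
    grad_accum_steps: int = 1  # assumption: not stated in the card


def max_optimizer_steps(cfg: SFTSettings) -> int:
    """Upper bound on optimizer steps for a single-device run."""
    effective_batch = cfg.batch_size * cfg.grad_accum_steps
    steps_per_epoch = math.ceil(cfg.max_samples / effective_batch)
    return steps_per_epoch * cfg.epochs


print(max_optimizer_steps(SFTSettings()))  # 50 samples x 3 epochs at batch size 1
```

Under these assumptions the run takes at most 150 optimizer steps, which is a very small fine-tuning budget.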

## Usage