# ToT-Reasoner-Qwen3-1.7B
## Model Description
This model is `ziadrone/oneplusaries1` fine-tuned with Supervised Fine-Tuning (SFT) on the math split of `open-r1/Mixture-of-Thoughts`. It is optimized for mathematical reasoning.
## Training Data
- **Source**: `open-r1/Mixture-of-Thoughts` (math split, up to 50 samples).
- **Format**: Prompts with `<reasoning>...</reasoning><answer>...</answer>` structure.
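
As a rough illustration of the tagged format above, the sketch below wraps a reasoning trace and a final answer in the two tag pairs. The helper name `format_example` is hypothetical, and the exact whitespace between tags is an assumption; the card only specifies the tag order.

```python
def format_example(reasoning: str, answer: str) -> str:
    """Wrap a reasoning trace and a final answer in the tagged structure.

    Hypothetical helper: the card gives the tag order but not the exact
    whitespace, so the tags are joined with no separator here.
    """
    return (
        f"<reasoning>{reasoning.strip()}</reasoning>"
        f"<answer>{answer.strip()}</answer>"
    )


# Illustrative sample, not taken from the dataset.
example = format_example("2 + 2 means adding two to two.", "4")
print(example)
```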
## Fine-Tuning Process
- **Method**: SFT with a learning rate of 1e-5, 3 epochs, and a batch size of 1.
- **Setup**: Google Colab Pro with T4 GPU.
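
To give a sense of how small this run is, the sketch below estimates an upper bound on the number of optimizer steps from the settings listed above. It assumes all 50 samples are used and that there is no gradient accumulation, neither of which the card confirms; `optimizer_steps` is a hypothetical helper, not part of any training library.

```python
import math


def optimizer_steps(num_samples: int, epochs: int, batch_size: int,
                    grad_accum: int = 1) -> int:
    """Simplified estimate of the total optimizer steps in a training run."""
    # Samples consumed per optimizer update.
    effective_batch = batch_size * grad_accum
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return steps_per_epoch * epochs


# Assumption: the full 50 samples, gradient accumulation of 1.
max_steps = optimizer_steps(num_samples=50, epochs=3, batch_size=1)
print(max_steps)
```

Under these assumptions the run performs at most 150 parameter updates.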
## Usage
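
A minimal sketch of running the model with the `transformers` library, assuming the checkpoint is published on the Hugging Face Hub. The repository ID is not stated in this card, so `model_id` is a parameter you must supply. Passing the question as a plain prompt (no chat template) is also an assumption. `extract_answer` is a hypothetical helper that pulls the final answer out of the tagged output.

```python
import re
from typing import Optional


def extract_answer(text: str) -> Optional[str]:
    """Return the contents of the first <answer>...</answer> pair, if any."""
    match = re.search(r"<answer>(.*?)</answer>", text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def generate(model_id: str, question: str, max_new_tokens: int = 512) -> str:
    """Generate a completion for `question` with the checkpoint at `model_id`."""
    # Imported lazily so extract_answer works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    # Assumption: a plain-text prompt; the card does not specify a template.
    inputs = tokenizer(question, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `extract_answer(generate(model_id, "What is 12 * 7?"))`, where `model_id` is the Hub repository that hosts this checkpoint.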