# ToT-Reasoner-Qwen3-1.7B

## Model Description
This model is `ziadrone/oneplusaries1` fine-tuned with Supervised Fine-Tuning (SFT) on the math split of `open-r1/Mixture-of-Thoughts`. The fine-tuning targets mathematical reasoning.

## Training Data
- **Source**: `open-r1/Mixture-of-Thoughts` (math split, limited to at most 50 samples).
- **Format**: Prompts formatted with a `<reasoning>...</reasoning><answer>...</answer>` structure.
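
Text in the tagged format can be split back into its two parts with a small parser. The following is a minimal sketch, assuming exactly one `<reasoning>` block followed by one `<answer>` block; the helper name `split_reasoning_answer` and the sample string are hypothetical and not part of the training code.

```python
import re

# Assumed layout: one <reasoning> block followed by one <answer> block.
TAGGED = re.compile(
    r"<reasoning>(?P<reasoning>.*?)</reasoning>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,  # let reasoning and answers span multiple lines
)


def split_reasoning_answer(text):
    """Return (reasoning, answer) from tagged text, or None if the tags are missing."""
    match = TAGGED.search(text)
    if match is None:
        return None
    return match.group("reasoning").strip(), match.group("answer").strip()


# Hypothetical sample, for illustration only.
sample = "<reasoning>2 + 2 equals 4.</reasoning><answer>4</answer>"
print(split_reasoning_answer(sample))
```

Returning `None` for untagged text lets a caller detect outputs that did not follow the structure instead of silently treating them as answers.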

## Fine-Tuning Process
- **Method**: SFT with a learning rate of 1e-5, 3 epochs, and a batch size of 1.
- **Setup**: Google Colab Pro with a T4 GPU.
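
To give a sense of scale, the upper bound on optimizer steps follows directly from the numbers in this card. This is a rough calculation, assuming a single GPU and no gradient accumulation, neither of which is stated here.

```python
import math

# Values taken from this card.
max_samples = 50  # "at most 50 samples"
epochs = 3
batch_size = 1

# Assumption: gradient accumulation is not used (not stated in this card).
grad_accum_steps = 1

steps_per_epoch = math.ceil(max_samples / (batch_size * grad_accum_steps))
max_optimizer_steps = steps_per_epoch * epochs
print(max_optimizer_steps)  # 150
```

Under those assumptions the run performs at most 150 parameter updates.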

## Usage
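
A minimal sketch of loading the model with the Hugging Face `transformers` library is shown here. The repository id `your-username/ToT-Reasoner-Qwen3-1.7B` is a placeholder, and the prompt wording in `build_prompt` and the `max_new_tokens` default are assumptions; the exact prompt used during training is not documented in this card.

```python
# Hypothetical repository id: replace with the actual Hugging Face path of this model.
MODEL_ID = "your-username/ToT-Reasoner-Qwen3-1.7B"


def build_prompt(question):
    """Build a prompt asking for the tagged structure (assumed wording)."""
    return (
        "Solve the problem. Put your reasoning in <reasoning>...</reasoning> "
        "and the final result in <answer>...</answer>.\n\n"
        f"Problem: {question}\n"
    )


def generate(question, max_new_tokens=512):
    """Load the model and return only the newly generated text."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Inspect the prompt without downloading any weights.
print(build_prompt("What is 12 * 7?"))
```

Calling `generate("What is 12 * 7?")` downloads the weights on first use and runs on CPU unless the model and inputs are moved to a GPU.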