ToT-Reasoner-Qwen3-1.7B

Model Description

A fine-tune of ziadrone/oneplusaries1, trained with Supervised Fine-Tuning (SFT) on the math split of open-r1/Mixture-of-Thoughts and aimed at mathematical reasoning.

Training Data

Source: open-r1/Mixture-of-Thoughts (math split, up to 50 samples).
Format: Prompts with <reasoning>...</reasoning><answer>...</answer> structure.
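
To make the tag structure concrete, here is a minimal sketch of how a sample in this format could be built and checked. The helper names `format_sample` and `is_well_formed` are hypothetical, and the exact whitespace and tag placement used in the real training data are assumptions, not taken from the training script.

```python
import re

# Assumed layout: one <reasoning> block followed by one <answer> block.
SAMPLE_PATTERN = re.compile(
    r"<reasoning>(?P<reasoning>.*?)</reasoning>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,
)


def format_sample(reasoning: str, answer: str) -> str:
    """Wrap a reasoning trace and a final answer in the two tags (hypothetical helper)."""
    return f"<reasoning>{reasoning.strip()}</reasoning><answer>{answer.strip()}</answer>"


def is_well_formed(text: str) -> bool:
    """Return True if the text contains the full reasoning/answer structure."""
    return SAMPLE_PATTERN.search(text) is not None


sample = format_sample("2 + 2 equals 4.", "4")
print(sample)
print(is_well_formed(sample))
```

A check like this can be used to filter out samples where either tag is missing before training.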

Fine-Tuning Process

Method: SFT with learning rate=1e-5, 3 epochs, batch size=1.
Setup: Google Colab Pro with T4 GPU.
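
As a rough sanity check on the scale of this run, the hyperparameters above imply a small number of optimizer steps. The sketch below assumes all 50 samples were used and that there was no gradient accumulation; neither is stated in this card, and `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(num_samples: int, batch_size: int, epochs: int, grad_accum: int = 1) -> int:
    """Upper bound on optimizer steps for a plain SFT loop (hypothetical helper)."""
    batches_per_epoch = math.ceil(num_samples / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs


# 50 samples, batch size 1, 3 epochs, no accumulation (assumed)
print(optimizer_steps(num_samples=50, batch_size=1, epochs=3))  # 150
```

Because the card says "up to 50 samples", 150 steps is an upper bound under these assumptions.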

Usage
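
A minimal sketch of loading the model with the Hugging Face `transformers` library and pulling the final answer out of the generated text. The repository id below is a hypothetical placeholder, since the card does not state the namespace. Passing the question as a plain prompt (no chat template) and the helper names `extract_answer` and `solve` are also assumptions.

```python
import re
from typing import Optional

# Hypothetical placeholder: replace with the actual repository id of this model.
MODEL_ID = "your-namespace/ToT-Reasoner-Qwen3-1.7B"

ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(generated: str) -> Optional[str]:
    """Return the text inside the first <answer> tag, or None if it is absent."""
    match = ANSWER_PATTERN.search(generated)
    return match.group(1).strip() if match else None


def solve(question: str, model_id: str = MODEL_ID, max_new_tokens: int = 512) -> Optional[str]:
    """Generate a completion for the question and extract the tagged answer."""
    # Imported lazily so extract_answer can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    # Assumption: the question is passed as a plain prompt, without a chat template.
    inputs = tokenizer(question, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    text = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return extract_answer(text)
```

Calling `solve("What is 12 * 7?")` downloads the weights on first use. It returns `None` when the output contains no `<answer>` tag, for example if generation is cut off by `max_new_tokens`.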

Safetensors

Model size: 2B params
Tensor type: BF16
