ziadrone
/

oneplusaries2

Model card Files Files and versions

oneplusaries2 / README.md

ziadrone's picture

Upload model card

9d8010b verified 12 months ago

|

history blame contribute delete

529 Bytes

	# ToT-Reasoner-Qwen3-1.7B

	## Model Description
	Fine-tuned `ziadrone/oneplusaries1` using Supervised Fine-Tuning (SFT) on `open-r1/Mixture-of-Thoughts` (math split). Optimized for mathematical reasoning.

	## Training Data
	- Source: `open-r1/Mixture-of-Thoughts` (math split, up to 50 samples).
	- Format: Prompts with `<reasoning>...</reasoning><answer>...</answer>` structure.

	## Fine-Tuning Process
	- Method: SFT with learning rate=1e-5, 3 epochs, batch size=1.
	- Setup: Google Colab Pro with T4 GPU.

	## Usage