| # ToT-Reasoner-Qwen3-1.7B |
|
|
| ## Model Description |
|
|
| This model is a fine-tuned version of Qwen/Qwen3-1.7B using Supervised Fine-Tuning (SFT) on the HuggingFaceH4/MATH-500 dataset. It is optimized for mathematical reasoning and problem-solving tasks. The fine-tuning process was performed by EKAGRATA TECH PRIVATE LIMITED. |
|
|
| ## Training Data |
|
|
| - **Source**: HuggingFaceH4/MATH-500 (50 samples). |
| - **Format**: Prompts with <reasoning>...</reasoning><answer>...</answer> structure. |
|
|
| ## Fine-Tuning Process |
|
|
| - **Method**: Incremental SFT with learning rate=1e-5, 1 epoch per batch, batch size=10. |
| - **Setup**: Google Colab Pro with A100 GPU. |
| - **Date and Time**: Fine-tuning completed at 10:50 PM IST on Friday, June 06, 2025. |
|
|
| ## Usage |
|
|
| To use this model for mathematical reasoning: |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| |
| model = AutoModelForCausalLM.from_pretrained("ziadrone/oneplusaries4") |
| tokenizer = AutoTokenizer.from_pretrained("ziadrone/oneplusaries4") |
| |
| prompt = "SYSTEM: You are a large language model trained to solve mathematical, logical, physics, and general reasoning problems. You must follow the following steps to solve the problem: |
| 1. Carefully analyze the question and identify the key information. |
| 2. Develop a clear and concise plan to approach the problem. |
| 3. Execute your plan step-by-step, providing detailed explanations and intermediate calculations. |
| 4. Verify your solution to ensure it is accurate and makes sense in the context of the problem. |
| 5. Present your final answer in a clear and concise format. |
| 6. Always enclose the reasoning process within <reasoning>...</reasoning> tags. |
| 7. Always enclose the final answer within <answer>...</answer> tags. |
| 8. Do not use any other tags besides <reasoning> and <answer>. |
| 9. Do not include any extra information outside of the reasoning or answer tags.\nUSER: Solve the equation 2x + 3 = 7." |
| inputs = tokenizer(prompt, return_tensors="pt") |
| outputs = model.generate(**inputs, max_length=512) |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
| ``` |
|
|