---
language: en
license: apache-2.0
library_name: transformers
tags:
- text-generation
- mathematics
- reasoning
- qwen
- sft
- fine-tune
base_model: Qwen/Qwen3-1.7B
datasets:
- HuggingFaceH4/MATH-500
---
|
|
|
|
|
# ToT-Reasoner-Qwen3-1.7B |
|
|
|
|
|
## Model Description |
|
|
|
|
|
This model is `Qwen/Qwen3-1.7B` fine-tuned via Supervised Fine-Tuning (SFT) on the `HuggingFaceH4/MATH-500` dataset. It is optimized for mathematical reasoning and problem-solving tasks. The fine-tuning was performed by EKAGRATA TECH PRIVATE LIMITED.
|
|
|
|
|
## Training Data |
|
|
|
|
|
- **Source**: `HuggingFaceH4/MATH-500` (50 samples).
- **Format**: Prompts follow a `<reasoning>...</reasoning><answer>...</answer>` structure, as sketched below.
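
For illustration, a minimal sketch of how a dataset record could be rendered into that format. The field names `problem`, `solution`, and `answer`, and this exact sample, are assumptions; this is not taken from the released training code:

```python
# Hypothetical formatter: renders one MATH-500-style record into the
# <reasoning>/<answer> prompt structure described above.
def format_example(problem: str, solution: str, answer: str) -> str:
    return (
        f"USER: {problem}\n"
        f"ASSISTANT: <reasoning>{solution}</reasoning>"
        f"<answer>{answer}</answer>"
    )

print(format_example(
    "Solve the equation 2x + 3 = 7.",
    "Subtract 3 from both sides to get 2x = 4, then divide by 2 to get x = 2.",
    "x = 2",
))
```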
|
|
|
|
|
## Fine-Tuning Process |
|
|
|
|
|
- **Method**: Incremental SFT with learning rate 1e-5, batch size 10, and one epoch per incremental batch (see the reproduction sketch below).
- **Setup**: Google Colab Pro with an A100 GPU.
- **Uploaded**: 07:23 AM, Friday, July 4, 2025.
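
The exact training script is not published; the following is a minimal reproduction sketch using TRL's `SFTTrainer`, assuming only the hyperparameters listed above. The dataset field names, the choice of the first 50 samples, and the output directory are placeholders:

```python
# Reproduction sketch only: learning rate, batch size, and epoch count come
# from the card above; everything else here is an assumption.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

def to_text(ex):
    # Assumed mapping of MATH-500 fields into the card's tag format.
    return {"text": (f"USER: {ex['problem']}\n"
                     f"ASSISTANT: <reasoning>{ex['solution']}</reasoning>"
                     f"<answer>{ex['answer']}</answer>")}

# MATH-500 ships a single "test" split; which 50 samples were used is not
# stated, so the first 50 stand in here as a placeholder.
train_ds = load_dataset("HuggingFaceH4/MATH-500", split="test").select(range(50)).map(to_text)

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B",
    args=SFTConfig(
        output_dir="tot-reasoner-qwen3-1.7b",  # placeholder
        learning_rate=1e-5,
        per_device_train_batch_size=10,
        num_train_epochs=1,
        dataset_text_field="text",
    ),
    train_dataset=train_ds,
)
trainer.train()
```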
|
|
|
|
|
## Usage |
|
|
|
|
|
To use this model for mathematical reasoning: |
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned weights and tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("ziadrone/oneplusaries55")
tokenizer = AutoTokenizer.from_pretrained("ziadrone/oneplusaries55")

SYSTEM_PROMPT = """You are a large language model trained to solve mathematical, logical, physics, and general reasoning problems. You must follow the following steps to solve the problem:
1. Carefully analyze the question and identify the key information.
2. Develop a clear and concise plan to approach the problem.
3. Execute your plan step-by-step, providing detailed explanations and intermediate calculations.
4. Verify your solution to ensure it is accurate and makes sense in the context of the problem.
5. Present your final answer in a clear and concise format.
6. Always enclose the reasoning process within <reasoning>...</reasoning> tags.
7. Always enclose the final answer within <answer>...</answer> tags.
8. Do not use any other tags besides <reasoning> and <answer>.
9. Do not include any extra information outside of the reasoning or answer tags."""

# Build the prompt in the same SYSTEM/USER format used during fine-tuning.
prompt = f"SYSTEM: {SYSTEM_PROMPT}\nUSER: Solve the equation 2x + 3 = 7."
inputs = tokenizer(prompt, return_tensors="pt")

# max_new_tokens bounds only the generated continuation; max_length would
# also count the long system prompt and could truncate the answer.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
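
Since the model is trained to wrap its final answer in `<answer>` tags, the answer can be pulled out of the decoded output with a small helper (a sketch, not part of the released code):

```python
import re

def extract_answer(text: str) -> str | None:
    """Return the contents of the first <answer>...</answer> block, if present."""
    match = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return match.group(1).strip() if match else None
```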
|
|
|
|
|
## Performance |
|
|
|
|
|
The model is fine-tuned for mathematical reasoning and is expected to perform best on problems similar to its training data, i.e. those that call for explicit step-by-step logical reasoning.
|
|
|
|
|
## Limitations |
|
|
|
|
|
- The model was trained on a limited dataset (50 samples).
- Performance may vary on problems significantly different from the training data.
- Always verify mathematical results for critical applications.
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under the Apache 2.0 license. |
|
|
|