finetunedmodelll — SmolLM2-360M Fine-Tuned on Deepthink-Reasoning
finetunedmodelll is a fine-tuned version of HuggingFaceTB/SmolLM2-360M, trained on the prithivMLmods/Deepthink-Reasoning dataset.
The goal of this model is to generate step-by-step reasoning followed by a final answer for tasks such as arithmetic, logic, and basic conceptual questions. Training examples are formatted as chat turns with the SmolLM2 chat template, and the model is trained with supervised fine-tuning (SFT).
This model is intended for educational and experimental use, not for high-stakes decision-making.
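Since the model is trained on chat-formatted text, inference presumably works best when the prompt goes through the same chat template. The following is a minimal sketch, not the author's documented usage: it assumes the repo id shown on the model page, assumes the uploaded tokenizer carries a chat template, and uses hypothetical helper names (build_messages, generate_answer) and an arbitrary max_new_tokens value.

```python
from typing import Dict, List

# Assumption: repo id as shown on the model page; adjust if it differs.
MODEL_ID = "shahdaboelfotouh/finetunedmodelll"


def build_messages(question: str) -> List[Dict[str, str]]:
    """Wrap a single question as a one-turn chat with the user role."""
    return [{"role": "user", "content": question}]


def generate_answer(
    question: str, model_id: str = MODEL_ID, max_new_tokens: int = 256
) -> str:
    """Generate reasoning and an answer (downloads the model on first use)."""
    # Imported lazily so build_messages works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # add_generation_prompt=True appends the assistant header so the model
    # starts its own turn instead of continuing the user turn.
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    # The template already inserts special tokens, so do not add them again.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling generate_answer("What is 17 + 25? Explain step by step.") should return the generated reasoning text. Output quality will vary, as expected for a 360M-parameter model.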
Features
- Base model: SmolLM2-360M (small, fast, and lightweight).
- Fine-tuned on Deepthink-Reasoning (instruction–response pairs with detailed reasoning).
- Uses chat-style formatting via tokenizer.apply_chat_template with user and assistant roles.
- Produces:
  - Step-by-step reasoning.
  - A final concise answer at the end.
- Suitable for:
  - Simple math and arithmetic reasoning.
  - Logic-style questions.
  - Educational demonstrations of chain-of-thought.
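Because the output is described as reasoning followed by a concise final answer, a demo may want to separate the two. The sketch below is only a heuristic under a stated assumption: it treats the last non-empty line of the generated text as the final answer, which the model card does not guarantee. The function name split_final_answer is hypothetical.

```python
from typing import Tuple


def split_final_answer(generated: str) -> Tuple[str, str]:
    """Split generated text into (reasoning, final_answer).

    Assumption: the final answer is the last non-empty line of the output.
    """
    # Keep only non-empty lines, trimmed of surrounding whitespace.
    lines = [line.strip() for line in generated.splitlines() if line.strip()]
    if not lines:
        return "", ""
    # Everything before the last line is treated as the reasoning trace.
    reasoning = "\n".join(lines[:-1])
    return reasoning, lines[-1]
```

If the model ends with trailing commentary instead of an answer, this heuristic will pick the wrong line, so it is best used for quick inspection rather than automated grading.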
Intended Use
Recommended
- Educational reasoning demos.
- Step-by-step solutions to simple math problems.
- Simple “explain your reasoning” style questions.
- Toy tasks and experimentation with small language models.
Not Recommended
- Medical, legal, or financial advice.
- High-stakes or real-world decision-making.
- Safety-critical applications.
- Factual tasks where reliability is crucial.
This is an experimental model trained on a reasoning dataset and is not designed for reliable factual knowledge or domain-specific professional use.
Training Data
- Dataset: prithivMLmods/Deepthink-Reasoning
- Format:
  - Columns: at least prompt and response.
  - Each example is converted to a chat format:
    {"role": "user", "content": prompt}
    {"role": "assistant", "content": response}
The notebook uses:
texts = [
    tokenizer.apply_chat_template(
        [
            {"role": "user", "content": p},
            {"role": "assistant", "content": r},
        ],
        tokenize=False
    )
    for p, r in zip(prompts, responses)
]
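The snippet above relies on prompts and responses already being defined. One plausible way to obtain them, sketched here under assumptions, is to load the dataset with the datasets library and read the two columns. The "train" split name and the helper names (to_messages, load_prompt_response_pairs) are assumptions, not taken from the notebook.

```python
from typing import Dict, List, Tuple


def to_messages(prompt: str, response: str) -> List[Dict[str, str]]:
    """Build the two-turn chat used for one training example."""
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": response},
    ]


def load_prompt_response_pairs(split: str = "train") -> Tuple[List[str], List[str]]:
    """Return (prompts, responses) from the dataset.

    Assumption: the dataset exposes a split with this name and the
    prompt/response columns described above.
    """
    # Imported lazily so to_messages works without datasets installed.
    from datasets import load_dataset

    ds = load_dataset("prithivMLmods/Deepthink-Reasoning", split=split)
    return list(ds["prompt"]), list(ds["response"])
```

With these helpers, prompts, responses = load_prompt_response_pairs() supplies the inputs for the list comprehension, and to_messages(p, r) could replace the inline message list.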
Model tree for shahdaboelfotouh/finetunedmodelll
Base model
HuggingFaceTB/SmolLM2-360M