---
license: apache-2.0
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- code
- instruction-following
- pytorch
- smollm
- lora
- finetuned
- general-knowledge
- math
- reasoning
- tool-calling
language:
- code
- en
pipeline_tag: text-generation
library_name: transformers
---

# Fyodor SmolLM3-3B v2 Instruct

Fine-tuned SmolLM3-3B with enhanced general knowledge, coding, math, tool calling, reasoning, and instruction-following capabilities.

## Model Details

- **Base Model**: [HuggingFaceTB/SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- **Model Type**: Causal Language Model (3B parameters)
- **Language(s)**: English, plus code in Python and other programming languages
- **License**: Apache 2.0
- **Training Method**: LoRA fine-tuning with mixed precision (bfloat16)
- **Model Size**: ~3B parameters
- **Dtype**: bfloat16

## Training Details

### Training Strategy

This model was trained using LoRA (Low-Rank Adaptation) fine-tuning with the following configuration (a sketch of how these hyperparameters map onto a training setup follows the list):

- **Training Strategy**: smollm3_3b_lora_hard_merge
- **Final Training Loss**: 0.3240
- **Number of Epochs**: 3
- **Learning Rate**: 2e-4
- **Batch Size**: 8
- **Gradient Accumulation Steps**: 8 (effective batch size: 64)
- **Max Sequence Length**: 1024 tokens
- **Warmup Steps**: 100
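
As a rough illustration, these hyperparameters correspond to a `transformers` `TrainingArguments` setup along the following lines. This is a minimal sketch under the assumption that the Hugging Face `Trainer`/TRL `SFTTrainer` API was used; the `output_dir` value is a placeholder, not the actual training script.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="fyodor-smollm3-3b-lora",  # placeholder path
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # effective batch size: 64
    warmup_steps=100,
    bf16=True,  # mixed precision (bfloat16)
)

# The 1024-token max sequence length is enforced when tokenizing or packing
# the dataset (e.g. via SFTTrainer's max_seq_length argument).
```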

### LoRA Configuration

```yaml
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
```
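
The "hard merge" in the strategy name presumably refers to folding the trained adapter weights back into the base model so the released checkpoint is a standalone model. A minimal sketch of how this configuration and merge look with the `peft` library (illustrative only, not the exact training code):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM3-3B")
model = get_peft_model(base, lora_config)

# ... fine-tuning runs here ...

# "Hard merge": bake the LoRA weights into the base model and save a
# standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("fyodor-smollm3-3b-merged")  # placeholder path
```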

### Training Data Distribution

The model was trained on a carefully balanced mix of high-quality datasets:

- **30% General Knowledge**: MuskumPillerum/General-Knowledge, HuggingFaceH4/ultrachat_200k, teknium/OpenHermes-2.5, cognitivecomputations/dolphin
- **20% Coding**: bigcode/starcoderdata (Python), sahil2801/CodeAlpaca-20k, iamtarun/python_code_instructions_18k_alpaca
- **20% Tool Calling**: Salesforce/xlam-function-calling-60k, glaiveai/glaive-function-calling-v2, NousResearch/hermes-function-calling-v1
- **10% Math**: meta-math/MetaMathQA, openai/gsm8k
- **10% Advanced Reasoning**: Open-Orca/OpenOrca
- **10% Instruction Following**: tatsu-lab/alpaca, HuggingFaceH4/ultrachat_200k

## Usage

### Installation

```bash
pip install transformers torch accelerate
```

### Basic Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Kiy-K/Fyodor-Mini-3B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("Kiy-K/Fyodor-Mini-3B")

# Generate text
prompt = """### Instruction:
Write a Python function to calculate Fibonacci numbers using dynamic programming.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
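
For interactive use, `transformers`' `TextStreamer` can print tokens as they are generated instead of waiting for the full completion. A small sketch reusing the `model` and `tokenizer` loaded above:

```python
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
_ = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    streamer=streamer,  # tokens are printed to stdout as they arrive
)
```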

### Code Generation Example

```python
prompt = """### Instruction:
Create a Python class for a binary search tree with insert and search methods.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Tool Calling Example

```python
prompt = """You have access to the following functions:

[
  {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
      "location": {"type": "string", "description": "City name"}
    }
  }
]

User: What's the weather in Paris?
Assistant:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.3, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
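
The model is expected to reply with a JSON-style function call, which your application then parses and executes. Below is a minimal, illustrative sketch of that step; the reply format and the `get_weather` stub are assumptions made for the example, not guaranteed behaviour of the model:

```python
import json
import re

def get_weather(location: str) -> str:
    # Stand-in implementation of the example function declared in the prompt.
    return f"Sunny, 21°C in {location}"

# Decode only the newly generated tokens (skip the prompt).
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                         skip_special_tokens=True)

# Look for the first JSON object in the reply and dispatch it.
match = re.search(r"\{.*\}", reply, re.DOTALL)
if match:
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        call = None
    if call and call.get("name") == "get_weather":
        print(get_weather(**call.get("arguments", {})))
```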

### Math Problem Solving

```python
prompt = """Question: A train travels 120 km in 2 hours. What is its average speed in km/h?
Answer:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Capabilities

This model excels at:

- ✅ **General Knowledge**: Answering questions across various domains
- ✅ **Code Generation**: Writing code in Python, JavaScript, and other programming languages
- ✅ **Mathematical Reasoning**: Solving arithmetic and word problems
- ✅ **Tool/Function Calling**: Understanding and generating function calls
- ✅ **Chain-of-Thought Reasoning**: Step-by-step problem solving
- ✅ **Instruction Following**: Understanding and executing complex instructions

## Recommended Generation Parameters

For best results, use these generation settings based on your use case:

### Code Generation

```python
temperature=0.2
top_p=0.95
max_new_tokens=512
do_sample=True
```

### Creative Writing

```python
temperature=0.8
top_p=0.95
max_new_tokens=1024
do_sample=True
```

### Mathematical Reasoning

```python
temperature=0.1
top_p=0.9
max_new_tokens=512
do_sample=True
```

### General Q&A

```python
temperature=0.7
top_p=0.95
max_new_tokens=512
do_sample=True
```
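
If you switch between these modes often, it can be convenient to bundle the presets into a small helper. The preset names and the function below are illustrative, not part of the model; `model` and `tokenizer` are assumed to be loaded as in Basic Usage:

```python
GENERATION_PRESETS = {
    "code":     dict(temperature=0.2, top_p=0.95, max_new_tokens=512, do_sample=True),
    "creative": dict(temperature=0.8, top_p=0.95, max_new_tokens=1024, do_sample=True),
    "math":     dict(temperature=0.1, top_p=0.9, max_new_tokens=512, do_sample=True),
    "qa":       dict(temperature=0.7, top_p=0.95, max_new_tokens=512, do_sample=True),
}

def generate_with_preset(prompt: str, mode: str = "qa") -> str:
    """Tokenize, generate with the chosen preset, and decode the result."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        pad_token_id=tokenizer.eos_token_id,
        **GENERATION_PRESETS[mode],
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```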

## Limitations

- Training sequences were truncated to 1024 tokens; the base SmolLM3-3B supports much longer contexts (up to 64k tokens)
- May occasionally generate incorrect information or code
- Not specifically optimized for languages other than English
- Should not be used for medical, legal, or other professional advice without expert review
- Generated code should always be reviewed and tested before production use
- May exhibit biases present in the training data

## Ethical Considerations

- This model can generate code that may contain security vulnerabilities; always review before deployment
- The model should not be used to generate malicious code or harmful content
- Be aware of potential biases inherited from training data
- Not suitable for making critical decisions without human oversight
- Users are responsible for ensuring appropriate use of generated content

## Performance Benchmarks

Training metrics:

- **Final Validation Loss**: 0.3240
- **Training Strategy**: Hard LoRA merge
- **Perplexity**: ~1.38 (e^0.3240, estimated from the final loss)

## Model Card Contact

For questions, feedback, or issues, please:

- Open an issue on the [model repository](https://huggingface.co/Kiy-K/Fyodor-Mini-3B)
- Contact the author through Hugging Face

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{fyodor-mini-2025,
  author = {Khoi},
  title = {Fyodor SmolLM3-3B v2 Instruct},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Kiy-K/Fyodor-Mini-3B}
}
```

## Acknowledgments

- Base model by [HuggingFace](https://huggingface.co/HuggingFaceTB)
- Built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- Training data from various open-source datasets (see Training Details)
- Trained using PyTorch and the Transformers library
- GGUF conversions and local hosting accessibility by Team Mradermacher

---

*This model was trained with care and attention to quality. Always verify outputs for your specific use case.*