---
license: apache-2.0
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- code
- instruction-following
- pytorch
- smollm
- lora
- finetuned
- general-knowledge
- math
- reasoning
- tool-calling
language:
- code
- en
pipeline_tag: text-generation
library_name: transformers
---
# Fyodor SmolLM3-3B v2 Instruct
Fine-tuned SmolLM3-3B with enhanced general knowledge, coding, math, tool calling, reasoning, and instruction-following capabilities.
## Model Details
- **Base Model**: [HuggingFaceTB/SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- **Model Type**: Causal Language Model (3B parameters)
- **Language(s)**: English, plus multiple programming languages (primarily Python)
- **License**: Apache 2.0
- **Training Method**: LoRA fine-tuning with mixed precision (bfloat16)
- **Model Size**: ~3B parameters
- **Dtype**: bfloat16
## Training Details
### Training Strategy
This model was trained using LoRA (Low-Rank Adaptation) fine-tuning with the following configuration:
- **Training Strategy**: smollm3_3b_lora_hard_merge
- **Final Training Loss**: 0.3240
- **Number of Epochs**: 3
- **Learning Rate**: 2e-4
- **Batch Size**: 8
- **Gradient Accumulation Steps**: 8 (effective batch size: 64)
- **Max Sequence Length**: 1024 tokens
- **Warmup Steps**: 100
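For readers who want to reproduce a similar setup, here is a minimal sketch of how these hyperparameters might map onto `transformers.TrainingArguments`. The training script itself is not published, so the `output_dir` and any omitted options are placeholders:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the hyperparameters listed above; the actual
# training script is not part of this repository.
training_args = TrainingArguments(
    output_dir="fyodor-mini-3b-lora",  # placeholder path
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,     # effective batch size: 8 * 8 = 64
    warmup_steps=100,
    bf16=True,                         # mixed precision (bfloat16)
)
# Note: the 1024-token max sequence length is enforced at tokenization time,
# not through TrainingArguments.
```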
### LoRA Configuration
```yaml
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
```
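These values map one-to-one onto a `peft.LoraConfig`; the following is a minimal sketch assuming the standard PEFT API (the `task_type` is an assumption based on the causal-LM base model):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",  # assumption: standard causal-LM fine-tuning
)
```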
### Training Data Distribution
The model was trained on a carefully balanced mix of high-quality datasets:
- **30% General Knowledge**: MuskumPillerum/General-Knowledge, HuggingFaceH4/ultrachat_200k, teknium/OpenHermes-2.5, cognitivecomputations/dolphin
- **20% Coding**: bigcode/starcoderdata (Python), sahil2801/CodeAlpaca-20k, iamtarun/python_code_instructions_18k_alpaca
- **20% Tool Calling**: Salesforce/xlam-function-calling-60k, glaiveai/glaive-function-calling-v2, NousResearch/hermes-function-calling-v1
- **10% Math**: meta-math/MetaMathQA, openai/gsm8k
- **10% Advanced Reasoning**: Open-Orca/OpenOrca
- **10% Instruction Following**: tatsu-lab/alpaca, HuggingFaceH4/ultrachat_200k
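A weighted mixture like this can be expressed with `datasets.interleave_datasets`. The sketch below is illustrative only: it flattens every source to a single `text` column so the schemas match, whereas the actual preprocessing used for training is not published (and some of these datasets may require accepting access terms):

```python
from datasets import load_dataset, interleave_datasets

# Illustrative only: a real pipeline would format each dataset's fields
# into proper prompt/response text instead of stringifying whole examples.
def flatten(example):
    return {"text": str(example)}

names_and_weights = [
    ("teknium/OpenHermes-2.5", 0.3),                # general knowledge
    ("sahil2801/CodeAlpaca-20k", 0.2),              # coding
    ("Salesforce/xlam-function-calling-60k", 0.2),  # tool calling
    ("openai/gsm8k", 0.1),                          # math (config "main")
    ("Open-Orca/OpenOrca", 0.1),                    # reasoning
    ("tatsu-lab/alpaca", 0.1),                      # instruction following
]

sources = []
for name, _ in names_and_weights:
    config = "main" if name == "openai/gsm8k" else None
    ds = load_dataset(name, config, split="train")
    sources.append(ds.map(flatten, remove_columns=ds.column_names))

mixed = interleave_datasets(
    sources,
    probabilities=[w for _, w in names_and_weights],  # sums to 1.0
    seed=42,
)
```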
## Usage
### Installation
```bash
pip install transformers torch accelerate
```
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Kiy-K/Fyodor-Mini-3B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Kiy-K/Fyodor-Mini-3B")
# Generate text
prompt = """### Instruction:
Write a Python function to calculate Fibonacci numbers using dynamic programming.
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Code Generation Example
```python
prompt = """### Instruction:
Create a Python class for a binary search tree with insert and search methods.
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Tool Calling Example
```python
prompt = """You have access to the following functions:
[
{
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"location": {"type": "string", "description": "City name"}
}
}
]
User: What's the weather in Paris?
Assistant:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.3, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
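The completion for a tool-calling prompt is typically a JSON object naming the function and its arguments. Since the exact output format is not guaranteed, a defensive parsing sketch (reusing `inputs` and `outputs` from the example above) looks like this:

```python
import json
import re

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
completion = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)

def extract_function_call(text):
    """Best-effort: grab the first {...} span and try to parse it as JSON."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

call = extract_function_call(completion)
if call is not None:
    # e.g. {"name": "get_weather", "arguments": {"location": "Paris"}}
    print(call.get("name"), call.get("arguments"))
```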
### Math Problem Solving
```python
prompt = """Question: A train travels 120 km in 2 hours. What is its average speed in km/h?
Answer:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Capabilities
This model excels at:
- **General Knowledge**: Answering questions across various domains
- **Code Generation**: Writing code in Python, JavaScript, and other programming languages
- **Mathematical Reasoning**: Solving arithmetic and word problems
- **Tool/Function Calling**: Understanding and generating function calls
- **Chain-of-Thought Reasoning**: Step-by-step problem solving
- **Instruction Following**: Understanding and executing complex instructions
## Recommended Generation Parameters
For best results, use these generation settings based on your use case:
### Code Generation
```python
temperature=0.2
top_p=0.95
max_new_tokens=512
do_sample=True
```
### Creative Writing
```python
temperature=0.8
top_p=0.95
max_new_tokens=1024
do_sample=True
```
### Mathematical Reasoning
```python
temperature=0.1
top_p=0.9
max_new_tokens=512
do_sample=True
```
### General Q&A
```python
temperature=0.7
top_p=0.95
max_new_tokens=512
do_sample=True
```
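One convenient pattern is to keep these presets as plain dictionaries and unpack them into `generate`; the preset names below are just illustrative labels:

```python
# Illustrative preset table; values mirror the settings above.
GENERATION_PRESETS = {
    "code": dict(temperature=0.2, top_p=0.95, max_new_tokens=512, do_sample=True),
    "creative": dict(temperature=0.8, top_p=0.95, max_new_tokens=1024, do_sample=True),
    "math": dict(temperature=0.1, top_p=0.9, max_new_tokens=512, do_sample=True),
    "qa": dict(temperature=0.7, top_p=0.95, max_new_tokens=512, do_sample=True),
}

outputs = model.generate(
    **inputs,
    **GENERATION_PRESETS["code"],
    pad_token_id=tokenizer.eos_token_id,
)
```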
## Limitations
- Context window was limited to 1024 tokens during training (the base model itself supports a much longer context, up to 64k tokens)
- May occasionally generate incorrect information or code
- Not specifically optimized for languages other than English
- Should not be used for medical, legal, or other professional advice without expert review
- Generated code should always be reviewed and tested before production use
- May exhibit biases present in the training data
## Ethical Considerations
- This model can generate code that may contain security vulnerabilities; always review before deployment
- The model should not be used to generate malicious code or harmful content
- Be aware of potential biases inherited from training data
- Not suitable for making critical decisions without human oversight
- Users are responsible for ensuring appropriate use of generated content
## Performance Benchmarks
Training metrics:
- **Final Validation Loss**: 0.3240
- **Training Strategy**: Hard LoRA merge
- **Perplexity**: ~1.38 (estimated from loss)
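The perplexity estimate follows directly from the loss, since perplexity is the exponential of the mean cross-entropy loss:

```python
import math

# Perplexity = exp(cross-entropy loss)
print(math.exp(0.3240))  # ≈ 1.3827
```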
## Model Card Contact
For questions, feedback, or issues, please:
- Open an issue on the [model repository](https://huggingface.co/Kiy-K/Fyodor-Mini-3B)
- Contact the author through Hugging Face
## Citation
If you use this model in your research or applications, please cite:
```bibtex
@misc{fyodor-mini-2025,
  author    = {Khoi},
  title     = {Fyodor SmolLM3-3B v2 Instruct},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Kiy-K/Fyodor-Mini-3B}
}
```
## Acknowledgments
- Base model by [HuggingFace](https://huggingface.co/HuggingFaceTB)
- Built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- Training data from various open-source datasets (see Training Details)
- Trained using PyTorch and Transformers library
- GGUF conversions and local hosting support by Team Mradermacher
---
*This model was trained with care and attention to quality. Always verify outputs for your specific use case.*