---
license: apache-2.0
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- code
- instruction-following
- pytorch
- smollm
- lora
- finetuned
- general-knowledge
- math
- reasoning
- tool-calling
language:
- code
- en
pipeline_tag: text-generation
library_name: transformers
---

# Fyodor SmolLM3-3B v2 Instruct

Fine-tuned SmolLM3-3B with enhanced general-knowledge, coding, math, tool-calling, reasoning, and instruction-following capabilities.

## Model Details

- **Base Model**: [HuggingFaceTB/SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- **Model Type**: Causal Language Model (3B parameters)
- **Language(s)**: English, Python, and other programming languages
- **License**: Apache 2.0
- **Training Method**: LoRA fine-tuning with mixed precision (bfloat16)
- **Model Size**: ~3B parameters
- **Dtype**: bfloat16

## Training Details

### Training Strategy

This model was trained using LoRA (Low-Rank Adaptation) fine-tuning with the following configuration:

- **Training Strategy**: smollm3_3b_lora_hard_merge
- **Final Training Loss**: 0.3240
- **Number of Epochs**: 3
- **Learning Rate**: 2e-4
- **Batch Size**: 8
- **Gradient Accumulation Steps**: 8 (effective batch size: 64)
- **Max Sequence Length**: 1024 tokens
- **Warmup Steps**: 100

### LoRA Configuration

```yaml
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
```
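For reference, this configuration maps directly onto `peft.LoraConfig`. The following is a minimal sketch, assuming the standard PEFT + Transformers workflow; the actual training script is not published, so anything beyond the hyperparameters listed above (e.g. `bias`, `task_type`) is an assumption:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Minimal sketch: how the hyperparameters above translate to a peft.LoraConfig.
# The real training script is unpublished; bias/task_type values are assumptions.
base = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM3-3B",
    torch_dtype=torch.bfloat16,
)
config = LoraConfig(
    r=32,                 # lora_r
    lora_alpha=64,        # adapter scaling = lora_alpha / r = 2.0
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",          # assumed PEFT default
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```

The "hard merge" in the strategy name presumably means the adapter deltas were folded back into the base weights after training, which is why the Usage section below loads the checkpoint with plain Transformers and no PEFT dependency.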
### Training Data Distribution

The model was trained on a carefully balanced mix of high-quality datasets:

- **30% General Knowledge**: MuskumPillerum/General-Knowledge, HuggingFaceH4/ultrachat_200k, teknium/OpenHermes-2.5, cognitivecomputations/dolphin
- **20% Coding**: bigcode/starcoderdata (Python), sahil2801/CodeAlpaca-20k, iamtarun/python_code_instructions_18k_alpaca
- **20% Tool Calling**: Salesforce/xlam-function-calling-60k, glaiveai/glaive-function-calling-v2, NousResearch/hermes-function-calling-v1
- **10% Math**: meta-math/MetaMathQA, openai/gsm8k
- **10% Advanced Reasoning**: Open-Orca/OpenOrca
- **10% Instruction Following**: tatsu-lab/alpaca, HuggingFaceH4/ultrachat_200k

## Usage

### Installation

```bash
pip install transformers torch accelerate
```

### Basic Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Kiy-K/Fyodor-Mini-3B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Kiy-K/Fyodor-Mini-3B")

# Generate text
prompt = """### Instruction:
Write a Python function to calculate Fibonacci numbers using dynamic programming.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Code Generation Example

```python
prompt = """### Instruction:
Create a Python class for a binary search tree with insert and search methods.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Tool Calling Example

```python
prompt = """You have access to the following functions:

[
  {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
      "location": {"type": "string", "description": "City name"}
    }
  }
]

User: What's the weather in Paris?
Assistant:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Math Problem Solving

```python
prompt = """Question: A train travels 120 km in 2 hours. What is its average speed in km/h?

Answer:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Capabilities

This model excels at:

- ✅ **General Knowledge**: Answering questions across various domains
- ✅ **Code Generation**: Writing Python, JavaScript, and other programming languages
- ✅ **Mathematical Reasoning**: Solving arithmetic and word problems
- ✅ **Tool/Function Calling**: Understanding and generating function calls
- ✅ **Chain-of-Thought Reasoning**: Step-by-step problem solving
- ✅ **Instruction Following**: Understanding and executing complex instructions

## Recommended Generation Parameters

For best results, use these generation settings based on your use case:

### Code Generation

```python
temperature=0.2
top_p=0.95
max_new_tokens=512
do_sample=True
```

### Creative Writing

```python
temperature=0.8
top_p=0.95
max_new_tokens=1024
do_sample=True
```

### Mathematical Reasoning

```python
temperature=0.1
top_p=0.9
max_new_tokens=512
do_sample=True
```

### General Q&A

```python
temperature=0.7
top_p=0.95
max_new_tokens=512
do_sample=True
```

## Limitations

- Training used a maximum sequence length of 1024 tokens, well below the base model's native context window, so quality may degrade on longer inputs
- May occasionally generate incorrect information or code
- Not specifically optimized for languages other than English
- Should not be used for medical, legal, or other professional advice without expert review
- Generated code should always be reviewed and tested before production use
- May exhibit biases present in the training data

## Ethical Considerations

- This model can generate code that may have security vulnerabilities; always review before deployment
- The model should not be used to generate malicious code or harmful content
- Be aware of potential biases inherited from training data
- Not suitable for making critical decisions without human oversight
- Users are responsible for ensuring appropriate use of generated content

## Performance Benchmarks

Training metrics:

- **Final Validation Loss**: 0.3240
- **Training Strategy**: Hard LoRA merge
- **Perplexity**: ~1.38 (exp of the final loss)
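The perplexity figure follows directly from the reported loss, since perplexity is the exponential of the mean cross-entropy loss. A one-line check:

```python
import math

# Perplexity = exp(mean cross-entropy loss), using the final loss reported above.
perplexity = math.exp(0.3240)
print(round(perplexity, 2))  # 1.38
```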
## Model Card Contact

For questions, feedback, or issues, please:

- Open an issue on the [model repository](https://huggingface.co/Kiy-K/Fyodor-Mini-3B)
- Contact the author through Hugging Face

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{fyodor-mini-2025,
  author = {Khoi},
  title = {Fyodor SmolLM3-3B v2 Instruct},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Kiy-K/Fyodor-Mini-3B}
}
```

## Acknowledgments

- Base model by [HuggingFace](https://huggingface.co/HuggingFaceTB)
- Built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- Training data from various open-source datasets (see Training Details)
- Trained using PyTorch and the Transformers library
- GGUF conversions and local hosting support by Team Mradermacher

---

*This model was trained with care and attention to quality. Always verify outputs for your specific use case.*