---
license: apache-2.0
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- code
- instruction-following
- pytorch
- smollm
- lora
- finetuned
- general-knowledge
- math
- reasoning
- tool-calling
language:
- code
- en
pipeline_tag: text-generation
library_name: transformers
---
# Fyodor SmolLM3-3B v2 Instruct
Fine-tuned SmolLM3-3B with enhanced general knowledge, coding, math, tool calling, reasoning, and instruction-following capabilities.
## Model Details
- Base Model: HuggingFaceTB/SmolLM3-3B
- Model Type: Causal Language Model (3B parameters)
- Language(s): English, Python, and multiple programming languages
- License: Apache 2.0
- Training Method: LoRA fine-tuning with mixed precision (bfloat16)
- Model Size: ~3B parameters
- Dtype: bfloat16
## Training Details

### Training Strategy
This model was trained using LoRA (Low-Rank Adaptation) fine-tuning with the following configuration:
- Training Strategy: smollm3_3b_lora_hard_merge
- Final Training Loss: 0.3240
- Number of Epochs: 3
- Learning Rate: 2e-4
- Batch Size: 8
- Gradient Accumulation Steps: 8 (effective batch size: 64)
- Max Sequence Length: 1024 tokens
- Warmup Steps: 100
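These hyperparameters map directly onto a Hugging Face `TrainingArguments` object. A minimal sketch of how the run could be configured (the output directory and logging settings are illustrative, not taken from the actual run; the max sequence length of 1024 is applied at tokenization time, not here):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the setup described above.
training_args = TrainingArguments(
    output_dir="./fyodor-smollm3-lora",  # hypothetical path
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # effective batch size: 8 * 8 = 64
    warmup_steps=100,
    bf16=True,  # mixed precision (bfloat16)
    logging_steps=10,  # illustrative; not documented for this run
)
```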
### LoRA Configuration

```yaml
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
```
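These values correspond one-to-one with a `peft.LoraConfig`. A minimal sketch of attaching the adapter, assuming the standard `peft` library (`model` is the base SmolLM3-3B loaded with `AutoModelForCausalLM`):

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, lora_config)
# LoRA trains only a small fraction of the 3B parameters:
peft_model.print_trainable_parameters()
```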
### Training Data Distribution
The model was trained on a carefully balanced mix of high-quality datasets:
- 30% General Knowledge: MuskumPillerum/General-Knowledge, HuggingFaceH4/ultrachat_200k, teknium/OpenHermes-2.5, cognitivecomputations/dolphin
- 20% Coding: bigcode/starcoderdata (Python), sahil2801/CodeAlpaca-20k, iamtarun/python_code_instructions_18k_alpaca
- 20% Tool Calling: Salesforce/xlam-function-calling-60k, glaiveai/glaive-function-calling-v2, NousResearch/hermes-function-calling-v1
- 10% Math: meta-math/MetaMathQA, openai/gsm8k
- 10% Advanced Reasoning: Open-Orca/OpenOrca
- 10% Instruction Following: tatsu-lab/alpaca, HuggingFaceH4/ultrachat_200k
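A weighted mixture like this can be built with `datasets.interleave_datasets`, sampling each source with the probabilities listed above. A simplified sketch using just two of the sources (the actual preprocessing pipeline for this model is not published; the shared `{"prompt", "response"}` schema below is an illustrative assumption):

```python
from datasets import load_dataset, interleave_datasets

alpaca = load_dataset("tatsu-lab/alpaca", split="train")
metamath = load_dataset("meta-math/MetaMathQA", split="train")

# interleave_datasets requires a shared schema, so map each source
# to a common format first.
alpaca = alpaca.map(
    lambda ex: {"prompt": ex["instruction"], "response": ex["output"]},
    remove_columns=alpaca.column_names,
)
metamath = metamath.map(
    lambda ex: {"prompt": ex["query"], "response": ex["response"]},
    remove_columns=metamath.column_names,
)

mixed = interleave_datasets(
    [alpaca, metamath],
    probabilities=[0.75, 0.25],  # relative sampling weights, must sum to 1
    seed=42,
)
```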
## Usage

### Installation

```bash
pip install transformers torch accelerate
```

### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Kiy-K/Fyodor-Mini-3B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Kiy-K/Fyodor-Mini-3B")

# Generate text
prompt = """### Instruction:
Write a Python function to calculate Fibonacci numbers using dynamic programming.
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Code Generation Example

```python
prompt = """### Instruction:
Create a Python class for a binary search tree with insert and search methods.
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Tool Calling Example

```python
prompt = """You have access to the following functions:
[
  {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
      "location": {"type": "string", "description": "City name"}
    }
  }
]
User: What's the weather in Paris?
Assistant:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.3, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
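The raw completion must then be parsed and dispatched by your application. The exact output format depends on the generation; a minimal sketch assuming the model emits a single JSON object such as `{"name": "get_weather", "arguments": {"location": "Paris"}}` (the stub handler is illustrative, not part of the model):

```python
import json

def dispatch_tool_call(completion: str) -> str:
    """Parse a JSON function call from model output and execute it.

    Assumes the completion contains one JSON object; real outputs
    may need more robust extraction.
    """
    start, end = completion.find("{"), completion.rfind("}") + 1
    call = json.loads(completion[start:end])

    if call["name"] == "get_weather":
        location = call["arguments"]["location"]
        return f"(stub) weather for {location}"  # replace with a real API call
    raise ValueError(f"Unknown tool: {call['name']}")
```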
### Math Problem Solving

```python
prompt = """Question: A train travels 120 km in 2 hours. What is its average speed in km/h?
Answer:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Capabilities
This model excels at:
- ✅ General Knowledge: Answering questions across various domains
- ✅ Code Generation: Writing Python, JavaScript, and other programming languages
- ✅ Mathematical Reasoning: Solving arithmetic and word problems
- ✅ Tool/Function Calling: Understanding and generating function calls
- ✅ Chain-of-Thought Reasoning: Step-by-step problem solving
- ✅ Instruction Following: Understanding and executing complex instructions
## Recommended Generation Parameters

For best results, use these generation settings based on your use case:

### Code Generation

```python
temperature=0.2
top_p=0.95
max_new_tokens=512
do_sample=True
```

### Creative Writing

```python
temperature=0.8
top_p=0.95
max_new_tokens=1024
do_sample=True
```

### Mathematical Reasoning

```python
temperature=0.1
top_p=0.9
max_new_tokens=512
do_sample=True
```

### General Q&A

```python
temperature=0.7
top_p=0.95
max_new_tokens=512
do_sample=True
```
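These presets can be collected into a small helper so the right settings are applied per task. A sketch reusing the `model` and `tokenizer` from Basic Usage (the preset names are illustrative, not part of the model):

```python
# Illustrative wrapper around the presets above.
GENERATION_PRESETS = {
    "code":     dict(temperature=0.2, top_p=0.95, max_new_tokens=512,  do_sample=True),
    "creative": dict(temperature=0.8, top_p=0.95, max_new_tokens=1024, do_sample=True),
    "math":     dict(temperature=0.1, top_p=0.9,  max_new_tokens=512,  do_sample=True),
    "qa":       dict(temperature=0.7, top_p=0.95, max_new_tokens=512,  do_sample=True),
}

def generate(prompt: str, task: str = "qa") -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        pad_token_id=tokenizer.eos_token_id,
        **GENERATION_PRESETS[task],
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```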
## Limitations

- Context window limited to 1024 tokens during training (the base model supports much longer contexts, up to 64k tokens); over-long prompts can be truncated at tokenization time, as shown after this list
- May occasionally generate incorrect information or code
- Not specifically optimized for languages other than English
- Should not be used for medical, legal, or other professional advice without expert review
- Generated code should always be reviewed and tested before production use
- May exhibit biases present in the training data
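For the first limitation, inputs can be kept within the training context by truncating when tokenizing. A minimal sketch (`prompt` and `tokenizer` as in the usage examples above):

```python
# Keep prompts within the 1024-token training context.
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=1024,
).to(model.device)
```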
## Ethical Considerations

- This model can generate code that may contain security vulnerabilities; always review before deployment
- The model should not be used to generate malicious code or harmful content
- Be aware of potential biases inherited from training data
- Not suitable for making critical decisions without human oversight
- Users are responsible for ensuring appropriate use of generated content
## Performance Benchmarks

Training metrics:

- Final Validation Loss: 0.3240
- Training Strategy: Hard LoRA merge (adapter weights folded back into the base model; see the sketch below)
- Perplexity: ~1.38 (estimated from loss as exp(0.3240) ≈ 1.38)
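A "hard merge" means the published checkpoint loads as a plain causal LM, with no separate adapter needed at inference. A minimal sketch of such a merge with `peft` (the adapter path and output path are illustrative):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM
import torch

base = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM3-3B", torch_dtype=torch.bfloat16
)
# "adapter_dir" is a hypothetical path to the trained LoRA weights.
merged = PeftModel.from_pretrained(base, "adapter_dir").merge_and_unload()
merged.save_pretrained("Fyodor-Mini-3B")  # standalone merged checkpoint
```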
## Model Card Contact
For questions, feedback, or issues, please:
- Open an issue on the model repository
- Contact the author through Hugging Face
## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{fyodor-mini-2025,
  author = {Khoi},
  title = {Fyodor SmolLM3-3B v2 Instruct},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Kiy-K/Fyodor-Mini-3B}
}
```
## Acknowledgments

- Base model by Hugging Face
- Built on SmolLM3-3B
- Training data from various open-source datasets (see Training Details)
- Trained using PyTorch and the Transformers library
- GGUF conversions and local hosting support by Team Mradermacher
This model was trained with care and attention to quality. Always verify outputs for your specific use case.