---
license: apache-2.0
base_model: HuggingFaceTB/SmolLM3-3B
tags:
- code
- instruction-following
- pytorch
- smollm
- lora
- finetuned
- general-knowledge
- math
- reasoning
- tool-calling
language:
- code
- en
pipeline_tag: text-generation
library_name: transformers
---

# Fyodor SmolLM3-3B v2 Instruct

Fine-tuned SmolLM3-3B with enhanced general knowledge, coding, math, tool calling, reasoning, and instruction-following capabilities.

## Model Details

- **Base Model**: [HuggingFaceTB/SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- **Model Type**: Causal Language Model (3B parameters)
- **Language(s)**: English, plus code in Python and other programming languages
- **License**: Apache 2.0
- **Training Method**: LoRA fine-tuning with mixed precision (bfloat16)
- **Model Size**: ~3B parameters
- **Dtype**: bfloat16

## Training Details

### Training Strategy

This model was trained using LoRA (Low-Rank Adaptation) fine-tuning with the following configuration (a sketch of how these hyperparameters map onto a training setup follows the list):

- **Training Strategy**: smollm3_3b_lora_hard_merge
- **Final Training Loss**: 0.3240
- **Number of Epochs**: 3
- **Learning Rate**: 2e-4
- **Batch Size**: 8
- **Gradient Accumulation Steps**: 8 (effective batch size: 64)
- **Max Sequence Length**: 1024 tokens
- **Warmup Steps**: 100
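
As a rough illustration, these hyperparameters correspond to a `transformers` `TrainingArguments` setup along the following lines. This is a minimal sketch under the assumption that the Hugging Face `Trainer`/TRL `SFTTrainer` API was used; the `output_dir` value is a placeholder, not the actual training script.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="fyodor-smollm3-3b-lora",  # placeholder path
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # effective batch size: 64
    warmup_steps=100,
    bf16=True,  # mixed precision (bfloat16)
)

# The 1024-token max sequence length is enforced when tokenizing or packing
# the dataset (e.g. via SFTTrainer's max_seq_length argument).
```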

### LoRA Configuration

```yaml
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
```
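
The "hard merge" in the strategy name presumably refers to folding the trained adapter weights back into the base model so the released checkpoint is a standalone model. A minimal sketch of how this configuration and merge look with the `peft` library (illustrative only, not the exact training code):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM3-3B")
model = get_peft_model(base, lora_config)

# ... fine-tuning runs here ...

# "Hard merge": bake the LoRA weights into the base model and save a
# standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("fyodor-smollm3-3b-merged")  # placeholder path
```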

### Training Data Distribution

The model was trained on a carefully balanced mix of high-quality datasets:

- **30% General Knowledge**: MuskumPillerum/General-Knowledge, HuggingFaceH4/ultrachat_200k, teknium/OpenHermes-2.5, cognitivecomputations/dolphin
- **20% Coding**: bigcode/starcoderdata (Python), sahil2801/CodeAlpaca-20k, iamtarun/python_code_instructions_18k_alpaca
- **20% Tool Calling**: Salesforce/xlam-function-calling-60k, glaiveai/glaive-function-calling-v2, NousResearch/hermes-function-calling-v1
- **10% Math**: meta-math/MetaMathQA, openai/gsm8k
- **10% Advanced Reasoning**: Open-Orca/OpenOrca
- **10% Instruction Following**: tatsu-lab/alpaca, HuggingFaceH4/ultrachat_200k

## Usage

### Installation

```bash
pip install transformers torch accelerate
```

### Basic Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Kiy-K/Fyodor-Mini-3B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("Kiy-K/Fyodor-Mini-3B")

# Generate text
prompt = """### Instruction:
Write a Python function to calculate Fibonacci numbers using dynamic programming.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
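
For interactive use, `transformers`' `TextStreamer` can print tokens as they are generated instead of waiting for the full completion. A small sketch reusing the `model` and `tokenizer` loaded above:

```python
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
_ = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    streamer=streamer,  # tokens are printed to stdout as they arrive
)
```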

### Code Generation Example

```python
prompt = """### Instruction:
Create a Python class for a binary search tree with insert and search methods.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Tool Calling Example

```python
prompt = """You have access to the following functions:

[
  {
    "name": "get_weather",
    "description": "Get current weather for a location",
    "parameters": {
      "location": {"type": "string", "description": "City name"}
    }
  }
]

User: What's the weather in Paris?
Assistant:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.3, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
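
The model is expected to reply with a JSON-style function call, which your application then parses and executes. Below is a minimal, illustrative sketch of that step; the reply format and the `get_weather` stub are assumptions made for the example, not guaranteed behaviour of the model:

```python
import json
import re

def get_weather(location: str) -> str:
    # Stand-in implementation of the example function declared in the prompt.
    return f"Sunny, 21°C in {location}"

# Decode only the newly generated tokens (skip the prompt).
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                         skip_special_tokens=True)

# Look for the first JSON object in the reply and dispatch it.
match = re.search(r"\{.*\}", reply, re.DOTALL)
if match:
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        call = None
    if call and call.get("name") == "get_weather":
        print(get_weather(**call.get("arguments", {})))
```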

### Math Problem Solving

```python
prompt = """Question: A train travels 120 km in 2 hours. What is its average speed in km/h?
Answer:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Capabilities

This model excels at:

- ✅ **General Knowledge**: Answering questions across various domains
- ✅ **Code Generation**: Writing code in Python, JavaScript, and other programming languages
- ✅ **Mathematical Reasoning**: Solving arithmetic and word problems
- ✅ **Tool/Function Calling**: Understanding and generating function calls
- ✅ **Chain-of-Thought Reasoning**: Step-by-step problem solving
- ✅ **Instruction Following**: Understanding and executing complex instructions

## Recommended Generation Parameters

For best results, use these generation settings based on your use case:

### Code Generation

```python
temperature=0.2
top_p=0.95
max_new_tokens=512
do_sample=True
```

### Creative Writing

```python
temperature=0.8
top_p=0.95
max_new_tokens=1024
do_sample=True
```

### Mathematical Reasoning

```python
temperature=0.1
top_p=0.9
max_new_tokens=512
do_sample=True
```

### General Q&A

```python
temperature=0.7
top_p=0.95
max_new_tokens=512
do_sample=True
```
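
If you switch between these modes often, it can be convenient to bundle the presets into a small helper. The preset names and the function below are illustrative, not part of the model; `model` and `tokenizer` are assumed to be loaded as in Basic Usage:

```python
GENERATION_PRESETS = {
    "code":     dict(temperature=0.2, top_p=0.95, max_new_tokens=512, do_sample=True),
    "creative": dict(temperature=0.8, top_p=0.95, max_new_tokens=1024, do_sample=True),
    "math":     dict(temperature=0.1, top_p=0.9, max_new_tokens=512, do_sample=True),
    "qa":       dict(temperature=0.7, top_p=0.95, max_new_tokens=512, do_sample=True),
}

def generate_with_preset(prompt: str, mode: str = "qa") -> str:
    """Tokenize, generate with the chosen preset, and decode the result."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        pad_token_id=tokenizer.eos_token_id,
        **GENERATION_PRESETS[mode],
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```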

## Limitations

- Training sequences were truncated to 1024 tokens; the base SmolLM3-3B supports much longer contexts (up to 64k tokens)
- May occasionally generate incorrect information or code
- Not specifically optimized for languages other than English
- Should not be used for medical, legal, or other professional advice without expert review
- Generated code should always be reviewed and tested before production use
- May exhibit biases present in the training data

## Ethical Considerations

- This model can generate code that may contain security vulnerabilities; always review before deployment
- The model should not be used to generate malicious code or harmful content
- Be aware of potential biases inherited from training data
- Not suitable for making critical decisions without human oversight
- Users are responsible for ensuring appropriate use of generated content

## Performance Benchmarks

Training metrics:

- **Final Validation Loss**: 0.3240
- **Training Strategy**: Hard LoRA merge
- **Perplexity**: ~1.38 (e^0.3240, estimated from the final loss)

## Model Card Contact

For questions, feedback, or issues, please:

- Open an issue on the [model repository](https://huggingface.co/Kiy-K/Fyodor-Mini-3B)
- Contact the author through Hugging Face

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{fyodor-mini-2025,
  author = {Khoi},
  title = {Fyodor SmolLM3-3B v2 Instruct},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Kiy-K/Fyodor-Mini-3B}
}
```

## Acknowledgments

- Base model by [HuggingFace](https://huggingface.co/HuggingFaceTB)
- Built on [SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- Training data from various open-source datasets (see Training Details)
- Trained using PyTorch and the Transformers library
- GGUF conversions and local hosting accessibility by Team Mradermacher

---

*This model was trained with care and attention to quality. Always verify outputs for your specific use case.*