# SmolQwen-0.5B-Chat
SmolQwen-0.5B-Chat is a lightweight, fine-tuned instruction-following model designed for edge devices and local deployment. It is based on Qwen2.5-0.5B-Instruct and has been fine-tuned using LoRA (Low-Rank Adaptation) on the Magpie-Ultra subset of the SmolTalk2 dataset.
This model is optimized to be a "smart assistant" that adheres to strict constraints (e.g., negative instructions) and writes robust, production-style code with error handling, correcting common failure modes observed in the base model.
## Model Details

### Model Description
- Developed by: LVN-Architech
- Model type: Causal Language Model (Fine-tuned with LoRA)
- Language(s): English
- License: Apache 2.0
- Finetuned from model: Qwen/Qwen2.5-0.5B-Instruct
## Training Details

### Training Data
The model was trained on 10,000 examples from the Magpie-Ultra subset of the SmolTalk2 dataset.
## Uses

### Direct Use
This model is designed for:
- Edge Computing: Running on Raspberry Pi, mobile phones, or consumer laptops without a dedicated GPU.
- Smart Home Commands: Parsing natural language into JSON or strict commands.
- Coding Assistance: Writing simple Python functions with built-in error handling and input validation.
- Chatbots: General conversation with a consistent, helpful persona.
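For smart-home use, a common pattern is to prompt the model for strict JSON and then validate its output before acting on it. Below is a minimal sketch of the validation side; the `device`/`action` schema and the sample string standing in for a model response are illustrative assumptions, not part of this model's training format.

```python
import json

# Hypothetical schema you might instruct the model to emit for smart-home commands.
REQUIRED_KEYS = {"device", "action"}

def parse_command(model_output: str) -> dict:
    """Validate a JSON command produced by the model before executing it."""
    try:
        command = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model did not return valid JSON: {exc}") from exc
    missing = REQUIRED_KEYS - command.keys()
    if missing:
        raise ValueError(f"Command is missing keys: {sorted(missing)}")
    return command

# The string below stands in for a model response.
print(parse_command('{"device": "living_room_light", "action": "off"}'))
```

Validating before execution matters on edge devices: a malformed model response should raise a clear error rather than trigger an unintended device action.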
### Out-of-Scope Use
- Complex Reasoning: Do not use for advanced math, logic puzzles, or medical/legal advice. The 0.5B parameter size limits its "world knowledge."
- Long Context: Performance may degrade with extremely long prompts (>8k tokens).
## Key Improvements vs. Base Model
We benchmarked this adapter against the original Qwen-0.5B-Instruct. The fine-tuning significantly alters behavior in three key areas:
| Test Case | Base Model (Qwen-0.5B) | SmolQwen-0.5B-Chat (This Model) |
|---|---|---|
| Instruction: "Decline a job offer." | ❌ Failed: Wrote an email accepting an interview request. | ✅ Success: Correctly wrote a polite email declining the offer. |
| Coding Style | ⚠️ Basic: Writes simple, functional code. | ✅ Robust: Defaults to adding try/except blocks and input validation. |
| Tone | 📖 Dry: Sounds like a textbook. | 🗣️ Engaging: Uses warmer language and better analogies. |
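To illustrate the "robust" coding style described above, here is the kind of function the fine-tuned model tends to produce: validated inputs plus a try/except around the failure-prone operation. This sample was written for this card as an illustration, not generated by the model.

```python
def safe_divide(numerator, denominator):
    """Divide two numbers with input validation and error handling."""
    # Input validation: reject non-numeric arguments up front.
    if not isinstance(numerator, (int, float)) or not isinstance(denominator, (int, float)):
        raise TypeError("Both arguments must be numbers.")
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Graceful handling instead of crashing the caller.
        return None

print(safe_divide(10, 2))  # 5.0
print(safe_divide(10, 0))  # None
```

The base model typically emits only the bare `numerator / denominator` expression; the fine-tuned model adds the guard rails shown here.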
## How to Get Started with the Model
You can use this model with the `peft` library: load the base model first, then apply the LoRA adapter on top.
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load the base model
base_model_name = "Qwen/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# 2. Load the LoRA adapter
adapter_model_name = "LVN-Architech/SmolQwen-0.5B-Chat"
model = PeftModel.from_pretrained(model, adapter_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# 3. Run inference
prompt = "Write a professional email declining a job offer politely."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Send inputs to wherever device_map placed the model (works on CPU-only machines too)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Use a low temperature for consistent results; sampling must be enabled
# for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```