# Model Card: SmolLM3-Chat-v1
SmolLM3-Chat-v1 is a finetune of the SmolLM3-3B-Base model, designed to be casual, witty, and human-like. Unlike standard assistants that sound robotic and overly formal, this model captures a distinct "internet-native" vibe.
It was trained on a curated mix of high-quality instruction data and custom conversation logs to balance intelligence with personality.
Note: This is the full merged version. If you are looking for the LoRA adapter, please check SmolLM3-Chat-v1-adapter.
## ⚠️ Important: System Instructions
Less is more.
This model relies on a specific "vibe" learned during training. Over-prompting it with complex system instructions (e.g., "You are a helpful assistant who is polite, follows rules X, Y, Z...") will degrade output quality.
For the system instruction, simply leave it empty for the rawest, most casual experience.
Ironically, less instruction = more human.
## ⚠️ Quantization Warning
Avoid re-quantizing this merged model.
This model was trained using QLoRA (on a 4-bit base model) and then merged back to Float16. Compressing this merged model again (e.g., converting it to 4-bit GGUF, AWQ, or GPTQ) causes "double quantization" noise.
This often breaks the specific "vibe" of the model, leading to:
- Broken grammar or incoherent responses.
- Loss of the casual/witty personality.
- Looping issues.
If you need a low-VRAM (4-bit) version:
- ❌ Do not re-quantize this merged model.
- ✅ Use the adapter instead: SmolLM3-Chat-v1-adapter.
- ✅ Load the base SmolLM3-3B in 4-bit and attach the adapter. This preserves the original training quality.
## 💻 Usage
To get the best performance (and prevent repetition loops), you must use the specific generation configuration below.
```python
import torch
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

MODEL_ID = "igidn/SmolLM3-Chat-v1"

# 1. Load Model & Tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)

# 2. Define Conversation (no system prompt -- see note above)
messages = [
    {"role": "user", "content": "hellooooo"}
]

# 3. Apply Chat Template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

# 4. Streamer Setup
streamer = TextIteratorStreamer(
    tokenizer,
    timeout=10.0,
    skip_prompt=True,
    skip_special_tokens=True
)

# 5. Generation Configuration (CRITICAL)
generate_kwargs = dict(
    **inputs,
    streamer=streamer,
    max_new_tokens=512,
    do_sample=True,
    # Core parameters for the "vibe"
    temperature=0.8,          # High creativity
    top_p=0.85,               # Nuanced sampling
    # Stability parameters
    repetition_penalty=1.15,  # Prevents "I'll be gone in 5 mins" loops
    no_repeat_ngram_size=3,   # Hard block on repeated 3-grams
    pad_token_id=tokenizer.eos_token_id
)

# 6. Run Inference (generate on a worker thread, stream tokens here)
thread = Thread(target=model.generate, kwargs=generate_kwargs)
thread.start()
print("Assistant: ", end="")
for new_text in streamer:
    print(new_text, end="", flush=True)
thread.join()
print()
```
## 📊 Training Details
The model was trained for 2 epochs using SFTTrainer with a cosine learning rate scheduler.
### Dataset Composition
- OpenHermes-2.5 (5k subset): Provides logic, reasoning, and general helpfulness.
- Custom Dataset (15k): Focused on casual chat, roleplay, and human-like interaction patterns.
- Total: 20,000 examples.
### Training Metrics
The model showed steady convergence without catastrophic overfitting. The final loss indicates a strong grasp of the training data without losing generalization capabilities.
| Metric | Start | End |
|---|---|---|
| Loss | 2.47 | 1.41 |
| Token Accuracy | 53.3% | 65.9% |
| Epochs | 0 | 2.0 |
Loss Curve:
- Epoch 0.2: Loss 1.59 (Rapid initial learning)
- Epoch 1.0: Loss 1.65 (Transition point)
- Epoch 2.0: Loss 1.41 (Final convergence)
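As a quick sanity check on the table above: assuming the reported loss is the usual per-token cross-entropy in nats, perplexity is simply `exp(loss)`, so training moved the model from roughly 11.8 to roughly 4.1 perplexity on its own data.

```python
import math

# Perplexity = exp(mean per-token cross-entropy loss), assuming the
# card's "Loss" is cross-entropy in nats.
start_ppl = math.exp(2.47)  # ~11.8
final_ppl = math.exp(1.41)  # ~4.1
print(f"start: {start_ppl:.1f}, final: {final_ppl:.1f}")
```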
## 🛠️ Hyperparameters
- Base Model: HuggingFaceTB/SmolLM3-3B-Base
- Precision: Float16 (Training) / Float16 (Inference)
- LoRA Config: r=32, alpha=64
- Learning Rate: 3e-5 (Cosine Schedule)
- Optimizer: paged_adamw_32bit
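The hyperparameters above can be expressed as a TRL + PEFT configuration. This is a sketch of a plausible setup, not the author's actual training script: only `r`/`alpha`, the learning rate, scheduler, epoch count, and optimizer come from this card; the output directory and task type are assumptions, and the dataset wiring is omitted entirely.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA settings from the card (r=32, alpha=64); task_type is assumed.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    task_type="CAUSAL_LM",
)

# SFTTrainer arguments matching the card's stated hyperparameters.
training_args = SFTConfig(
    output_dir="smollm3-chat-v1",   # assumption
    num_train_epochs=2,
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    optim="paged_adamw_32bit",
    fp16=True,                      # Float16 training, per the card
)
```

These objects would then be passed to `SFTTrainer` along with the 20k-example dataset described under Dataset Composition.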
Created with <3 by me