LoRA adapter for LiquidAI/LFM2.5-1.2B-Instruct, fine-tuned on SmolTalk via the official SFT tutorial.

How to use LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
model = PeftModel.from_pretrained(base_model, "LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA")
| Parameter | Value |
|---|---|
| Base model | LiquidAI/LFM2.5-1.2B-Instruct |
| Dataset | HuggingFaceTB/smoltalk (5k examples) |
| Rank (r) | 8 |
| Alpha | 16 |
| Dropout | 0.1 |
| Learning rate | 5e-5 |
| Epochs | 1 |
| Batch size | 1 |
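For reference, a minimal training sketch that matches this configuration with PEFT and TRL's SFTTrainer (assuming the setup of the official SFT tutorial; the dataset subset, output path, and any SFTConfig fields beyond the table are assumptions). Target modules are listed in the table below.

from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

# 5k-example subset of SmolTalk (subset/split choice is an assumption)
dataset = load_dataset("HuggingFaceTB/smoltalk", "all", split="train[:5000]")

# LoRA hyperparameters from the table above; target modules per the table below
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj",
                    "w1", "w2", "w3", "in_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="lfm2.5-1.2b-smoltalk-lora",  # example path
    learning_rate=5e-5,
    num_train_epochs=1,
    per_device_train_batch_size=1,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
    peft_config=peft_config,
)
trainer.train()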
LoRA target modules:

| Component | Modules |
|---|---|
| Attention | q_proj, k_proj, v_proj, out_proj |
| Feed-forward | w1, w2, w3 |
| Convolution | in_proj, out_proj |

To compare base and LoRA outputs with Transformers and PEFT:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"
LORA_ID = "LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
prompt = "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"
messages = [{"role": "user", "content": prompt}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(base_model.device)
# Generate with base model (no LoRA)
with torch.no_grad():
    outputs = base_model.generate(**inputs, max_new_tokens=100, do_sample=False)
print("Base:", tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# Load LoRA and generate
lora_model = PeftModel.from_pretrained(base_model, LORA_ID)
with torch.no_grad():
    outputs = lora_model.generate(**inputs, max_new_tokens=100, do_sample=False)
print("LoRA:", tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
With greedy decoding (do_sample=False), base and LoRA models produce different outputs:
| Model | Output (first 100 chars) |
|---|---|
| Base | What a fascinating concept! A city on a generation ship, a moon orbiting a gas g... |
| LoRA | Imagine a city on a generation ship hurtling through the vast expanse of space, ... |
To serve the base model with this adapter via vLLM's OpenAI-compatible server:

vllm serve LiquidAI/LFM2.5-1.2B-Instruct \
--host 0.0.0.0 \
--port 30000 \
--dtype float16 \
--enable-lora \
--max-lora-rank 8 \
--lora-modules "smoltalk=LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA"
To use the LoRA adapter, set the model field to the LoRA adapter name (smoltalk):
curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "smoltalk",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
To query the base model without the adapter, set the model field to the base model name:

curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
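The same comparison can be made from Python with the openai client (a sketch assuming the vLLM server above is running on localhost:30000; the api_key value is a placeholder, as vLLM does not check it by default):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

prompt = "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"

# "smoltalk" selects the LoRA adapter; the base model name gives a no-LoRA baseline
for model_name in ["smoltalk", "LiquidAI/LFM2.5-1.2B-Instruct"]:
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=100,
        temperature=0.0,
    )
    print(model_name, "->", response.choices[0].message.content[:100])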
To serve the adapter with SGLang (note: this requires the internal SGLang implementation with LFM2 LoRA support):
python -m sglang.launch_server \
--model-path LiquidAI/LFM2.5-1.2B-Instruct \
--port 30000 \
--enable-lora \
--max-lora-rank 8 \
--lora-paths "smoltalk=LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA" \
--lora-target-modules q_proj k_proj v_proj out_proj w1 w2 w3 in_proj
Option 1: Using lora_path parameter
curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct",
"lora_path": "smoltalk",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
Option 2: Using colon syntax in model name
curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct:smoltalk",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
To query the base model without the adapter:

curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
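A Python equivalent using requests (a sketch assuming the SGLang server above; it passes lora_path in the request body exactly as in Option 1):

import requests

prompt = "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"

payload = {
    "model": "LiquidAI/LFM2.5-1.2B-Instruct",
    "lora_path": "smoltalk",  # drop this field to query the base model instead
    "messages": [{"role": "user", "content": prompt}],
    "max_tokens": 100,
    "temperature": 0.0,
}
response = requests.post("http://localhost:30000/v1/chat/completions", json=payload)
print(response.json()["choices"][0]["message"]["content"])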