LoRA adapter for LiquidAI/LFM2.5-1.2B-Instruct, fine-tuned on SmolTalk via the official SFT tutorial.

How to use LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
model = PeftModel.from_pretrained(base_model, "LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA")
| Parameter | Value |
|---|---|
| Base model | LiquidAI/LFM2.5-1.2B-Instruct |
| Dataset | HuggingFaceTB/smoltalk (5k examples) |
| Rank (r) | 8 |
| Alpha | 16 |
| Dropout | 0.1 |
| Learning rate | 5e-5 |
| Epochs | 1 |
| Batch size | 1 |
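For reference, a minimal training sketch that matches this configuration with PEFT and TRL's SFTTrainer (assuming the setup of the official SFT tutorial; the dataset subset, output path, and any SFTConfig fields beyond the table are assumptions). Target modules are listed in the table below.

from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

# 5k-example subset of SmolTalk (subset/split choice is an assumption)
dataset = load_dataset("HuggingFaceTB/smoltalk", "all", split="train[:5000]")

# LoRA hyperparameters from the table above; target modules per the table below
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj",
                    "w1", "w2", "w3", "in_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="lfm2.5-1.2b-smoltalk-lora",  # example path
    learning_rate=5e-5,
    num_train_epochs=1,
    per_device_train_batch_size=1,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
    peft_config=peft_config,
)
trainer.train()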
LoRA target modules:

| Component | Modules |
|---|---|
| Attention | q_proj, k_proj, v_proj, out_proj |
| Feed-forward | w1, w2, w3 |
| Convolution | in_proj, out_proj |

To compare base and LoRA outputs with Transformers and PEFT:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
MODEL_ID = "LiquidAI/LFM2.5-1.2B-Instruct"
LORA_ID = "LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
prompt = "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"
messages = [{"role": "user", "content": prompt}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(base_model.device)
# Generate with base model (no LoRA)
with torch.no_grad():
    outputs = base_model.generate(**inputs, max_new_tokens=100, do_sample=False)
print("Base:", tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# Load LoRA and generate
lora_model = PeftModel.from_pretrained(base_model, LORA_ID)
with torch.no_grad():
    outputs = lora_model.generate(**inputs, max_new_tokens=100, do_sample=False)
print("LoRA:", tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
With greedy decoding (do_sample=False), base and LoRA models produce different outputs:
| Model | Output (first 100 chars) |
|---|---|
| Base | What a fascinating concept! A city on a generation ship, a moon orbiting a gas g... |
| LoRA | Imagine a city on a generation ship hurtling through the vast expanse of space, ... |
To serve the base model with this adapter via vLLM's OpenAI-compatible server:

vllm serve LiquidAI/LFM2.5-1.2B-Instruct \
--host 0.0.0.0 \
--port 30000 \
--dtype float16 \
--enable-lora \
--max-lora-rank 8 \
--lora-modules "smoltalk=LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA"
To use the LoRA adapter, set the model field to the LoRA adapter name (smoltalk):
curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "smoltalk",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
To query the base model without the adapter, set the model field to the base model name:

curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
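The same comparison can be made from Python with the openai client (a sketch assuming the vLLM server above is running on localhost:30000; the api_key value is a placeholder, as vLLM does not check it by default):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

prompt = "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"

# "smoltalk" selects the LoRA adapter; the base model name gives a no-LoRA baseline
for model_name in ["smoltalk", "LiquidAI/LFM2.5-1.2B-Instruct"]:
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=100,
        temperature=0.0,
    )
    print(model_name, "->", response.choices[0].message.content[:100])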
To serve the adapter with SGLang (note: this requires the internal SGLang implementation with LFM2 LoRA support):
python -m sglang.launch_server \
--model-path LiquidAI/LFM2.5-1.2B-Instruct \
--port 30000 \
--enable-lora \
--max-lora-rank 8 \
--lora-paths "smoltalk=LiquidAI/LFM2.5-1.2B-Instruct-smoltalk-LoRA" \
--lora-target-modules q_proj k_proj v_proj out_proj w1 w2 w3 in_proj
Option 1: Using lora_path parameter
curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct",
"lora_path": "smoltalk",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
Option 2: Using colon syntax in model name
curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct:smoltalk",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
To query the base model without the adapter:

curl -s http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "LiquidAI/LFM2.5-1.2B-Instruct",
"messages": [{"role": "user", "content": "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"}],
"max_tokens": 100,
"temperature": 0.0
}'
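A Python equivalent using requests (a sketch assuming the SGLang server above; it passes lora_path in the request body exactly as in Option 1):

import requests

prompt = "What are some ideas for a good short story about a city not on a planet, but rather a generation ship, or on the moon of a gas giant, or somewhere else unusual?"

payload = {
    "model": "LiquidAI/LFM2.5-1.2B-Instruct",
    "lora_path": "smoltalk",  # drop this field to query the base model instead
    "messages": [{"role": "user", "content": prompt}],
    "max_tokens": 100,
    "temperature": 0.0,
}
response = requests.post("http://localhost:30000/v1/chat/completions", json=payload)
print(response.json()["choices"][0]["message"]["content"])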