Instructions to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with libraries, inference providers, notebooks, and local apps.

Libraries

How to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="edbuildingstuff/LFM2.5-1.2B-Instruct-ertas", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the snippet also shows output when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("edbuildingstuff/LFM2.5-1.2B-Instruct-ertas", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("edbuildingstuff/LFM2.5-1.2B-Instruct-ertas", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
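
The same request can be issued from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `http://localhost:8000`; the helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes the OpenAI-style response layout (choices -> message -> content).
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("What is the capital of France?")` would return the reply text; passing `base_url="http://localhost:30000"` points the same helper at a server on another port.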

Use Docker

docker model run hf.co/edbuildingstuff/LFM2.5-1.2B-Instruct-ertas

SGLang

How to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

LFM2.5-1.2B-Instruct (linear-name variant)

A drop-in, numerically identical variant of LiquidAI/LFM2.5-1.2B-Instruct whose linear sub-modules are renamed to the Llama convention. With these names, LoRA tooling that defaults to o_proj / gate_proj / up_proj / down_proj targets the full attention and MLP surface instead of only q/k/v_proj.

This is not a new model. The weights are bit-for-bit those of the base; only the attribute and tensor names change. The variant was verified to be numerically identical to the base on the same input (max absolute logit difference = 0.0).

Why this exists

LFM2 names its attention output out_proj and its SwiGLU MLP w1/w3/w2. A LoRA config that lists the standard Llama names matches only q/k/v_proj and silently skips the attention output and the MLP. This variant renames those modules so the same default trains the whole linear surface.

| stock LFM2 | this variant |
| --- | --- |
| `self_attn.out_proj` | `self_attn.o_proj` |
| `feed_forward.w1` | `feed_forward.gate_proj` |
| `feed_forward.w3` | `feed_forward.up_proj` |
| `feed_forward.w2` | `feed_forward.down_proj` |

Conv blocks (conv.in_proj, conv.out_proj) are unchanged.
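
The effect of the rename on a default LoRA target list can be illustrated without loading the model. This is a rough sketch, assuming targets are resolved by suffix match on module names (a name matches if it equals a target or ends with `.<target>`). The module lists are shortened illustrations built from the names in this card, the placement of `q_proj`/`k_proj`/`v_proj` under `self_attn` is an assumption, and `matched_modules` is a hypothetical helper.

```python
# A typical Llama-style default target list (assumed for illustration).
LLAMA_TARGETS = ["q_proj", "k_proj", "v_proj", "o_proj",
                 "gate_proj", "up_proj", "down_proj"]

# Linear sub-modules as named in stock LFM2 (layer prefixes omitted).
STOCK_MODULES = [
    "self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj",
    "self_attn.out_proj",
    "feed_forward.w1", "feed_forward.w3", "feed_forward.w2",
    "conv.in_proj", "conv.out_proj",
]

# The same sub-modules as named in this variant.
VARIANT_MODULES = [
    "self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj",
    "self_attn.o_proj",
    "feed_forward.gate_proj", "feed_forward.up_proj", "feed_forward.down_proj",
    "conv.in_proj", "conv.out_proj",
]


def matched_modules(module_names, targets):
    """Return the names that equal a target or end with '.<target>'."""
    return [
        name for name in module_names
        if any(name == t or name.endswith("." + t) for t in targets)
    ]


# Stock names: only q/k/v match, so attention output and the MLP are skipped.
print(matched_modules(STOCK_MODULES, LLAMA_TARGETS))
# Variant names: all seven attention and MLP projections match.
print(matched_modules(VARIANT_MODULES, LLAMA_TARGETS))
```

In both cases the conv projections stay out of the match, consistent with the conv blocks being left unchanged.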

Usage

The renamed attributes come from the bundled modeling_lfm2_ertas.py, so loading needs trust_remote_code=True:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("<this-repo>", trust_remote_code=True)

For inference or GGUF export, use the stock base and map adapter names back (o_proj to out_proj, gate/up/down_proj to w1/w3/w2).
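
The mapping back to stock names can be sketched as a key-renaming pass over an adapter's state dict. This is a minimal sketch, assuming adapter keys embed the module path as dot-separated segments (for example `...self_attn.o_proj.lora_A.weight`); the example key layout and the helpers `rename_adapter_key` and `rename_adapter_state` are hypothetical and not part of this repository.

```python
# Variant name -> stock LFM2 name, taken from the table in this card.
VARIANT_TO_STOCK = {
    "self_attn.o_proj": "self_attn.out_proj",
    "feed_forward.gate_proj": "feed_forward.w1",
    "feed_forward.up_proj": "feed_forward.w3",
    "feed_forward.down_proj": "feed_forward.w2",
}


def rename_adapter_key(key):
    """Rewrite one adapter key from variant names to stock LFM2 names.

    The key is padded with dots so only whole path segments are replaced.
    """
    padded = f".{key}."
    for variant_name, stock_name in VARIANT_TO_STOCK.items():
        padded = padded.replace(f".{variant_name}.", f".{stock_name}.")
    return padded[1:-1]


def rename_adapter_state(state):
    """Return a new dict with every key renamed; values are left untouched."""
    return {rename_adapter_key(k): v for k, v in state.items()}
```

Keys for `q_proj`, `k_proj`, `v_proj`, and the conv blocks pass through unchanged, since those names are the same in both layouts. Any target module list stored alongside the adapter would presumably need the same renaming, which is not shown here.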

Attribution and license

Derived from LiquidAI/LFM2.5-1.2B-Instruct, copyright Liquid AI, distributed under the LFM Open License v1.0 (see LICENSE). This naming variant was prepared by Ertas AI for internal LoRA fine-tuning tooling. All model capabilities, weights, and credit belong to Liquid AI.

Safetensors

Model size: 1B params
Tensor type: BF16

Model tree for edbuildingstuff/LFM2.5-1.2B-Instruct-ertas

Base model: LiquidAI/LFM2.5-1.2B-Base
Finetuned: LiquidAI/LFM2.5-1.2B-Instruct
Finetuned (97): this model