Instructions to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="edbuildingstuff/LFM2.5-1.2B-Instruct-ertas", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("edbuildingstuff/LFM2.5-1.2B-Instruct-ertas", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("edbuildingstuff/LFM2.5-1.2B-Instruct-ertas", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/edbuildingstuff/LFM2.5-1.2B-Instruct-ertas
```
- SGLang
How to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "edbuildingstuff/LFM2.5-1.2B-Instruct-ertas",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Docker Model Runner
How to use edbuildingstuff/LFM2.5-1.2B-Instruct-ertas with Docker Model Runner:
```shell
docker model run hf.co/edbuildingstuff/LFM2.5-1.2B-Instruct-ertas
```
LFM2.5-1.2B-Instruct (linear-name variant)
A drop-in, numerically identical variant of
LiquidAI/LFM2.5-1.2B-Instruct
whose linear sub-modules are renamed to the Llama convention, so LoRA tooling that
defaults to o_proj / gate_proj / up_proj / down_proj targets the full
attention + MLP surface instead of q/k/v_proj only.
This is not a new model. The weights are bit-for-bit those of the base; only the attribute and tensor names change. Verified numerically identical to the base on the same input (max absolute logit difference = 0.0).
Why this exists
LFM2 names its attention output out_proj and its SwiGLU MLP w1/w3/w2. A LoRA
config that lists the standard Llama names matches only q/k/v_proj and silently skips
the attention output and the MLP. This variant renames those modules so the same default
target list trains the full attention + MLP linear surface.
| stock LFM2 | this variant |
|---|---|
| self_attn.out_proj | self_attn.o_proj |
| feed_forward.w1 | feed_forward.gate_proj |
| feed_forward.w3 | feed_forward.up_proj |
| feed_forward.w2 | feed_forward.down_proj |
Conv blocks (conv.in_proj, conv.out_proj) are unchanged.
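The effect of the rename on LoRA targeting can be illustrated without loading any weights. The following is a minimal sketch, assuming target names are matched against module paths by suffix (as PEFT does when `target_modules` is a list). The `model.layers.N` prefixes and the choice of which layer is a conv block are hypothetical; only the leaf names come from the table above.

```python
# Default Llama-style LoRA targets mentioned in this card.
LLAMA_TARGETS = ["q_proj", "k_proj", "v_proj", "o_proj",
                 "gate_proj", "up_proj", "down_proj"]

# Hypothetical module paths: one conv layer and one attention layer.
# The "model.layers.N" prefix is illustrative, not read from the checkpoint.
MLP = ["feed_forward.w1", "feed_forward.w3", "feed_forward.w2"]
STOCK = (
    [f"model.layers.0.{m}" for m in ["conv.in_proj", "conv.out_proj"] + MLP]
    + [f"model.layers.2.{m}" for m in
       ["self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj",
        "self_attn.out_proj"] + MLP]
)

# Rename table from this card (stock name -> variant name).
RENAMES = {
    "self_attn.out_proj": "self_attn.o_proj",
    "feed_forward.w1": "feed_forward.gate_proj",
    "feed_forward.w3": "feed_forward.up_proj",
    "feed_forward.w2": "feed_forward.down_proj",
}


def matches(name, targets):
    """True if the module path equals a target or ends with '.<target>'."""
    return any(name == t or name.endswith("." + t) for t in targets)


def to_variant(name):
    """Apply the rename table to the tail of a module path."""
    for old, new in RENAMES.items():
        if name == old or name.endswith("." + old):
            return name[: len(name) - len(old)] + new
    return name  # conv.in_proj / conv.out_proj and q/k/v_proj are unchanged


VARIANT = [to_variant(n) for n in STOCK]
stock_hits = [n for n in STOCK if matches(n, LLAMA_TARGETS)]
variant_hits = [n for n in VARIANT if matches(n, LLAMA_TARGETS)]

print(f"stock:   {len(stock_hits)} of {len(STOCK)} modules targeted")
print(f"variant: {len(variant_hits)} of {len(VARIANT)} modules targeted")
```

Under these assumptions the stock names match only the three q/k/v projections, while the renamed paths match every attention and MLP projection and leave the conv projections untargeted.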
Usage
The renamed attributes come from the bundled modeling_lfm2_ertas.py, so loading needs
trust_remote_code=True:
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("<this-repo>", trust_remote_code=True)
```
For inference or GGUF export, use the stock base and map adapter names back
(o_proj to out_proj; gate_proj, up_proj, down_proj to w1, w3, w2 respectively).
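One way to do that mapping is to rewrite the keys of the adapter's state dict. This is a rough sketch, not the tooling used to prepare this repo: `to_stock_key`, `remap_adapter`, and the example keys are hypothetical, and a real adapter's key layout should be checked before use. Renaming is scoped to the parent module (`self_attn` or `feed_forward`) so that unrelated names are never touched.

```python
# Variant name -> stock LFM2 name, keyed by (parent module, leaf module).
VARIANT_TO_STOCK = {
    ("self_attn", "o_proj"): "out_proj",
    ("feed_forward", "gate_proj"): "w1",
    ("feed_forward", "up_proj"): "w3",
    ("feed_forward", "down_proj"): "w2",
}


def to_stock_key(key):
    """Rename variant module names inside a dotted state-dict key."""
    parts = key.split(".")
    for i in range(1, len(parts)):
        stock_leaf = VARIANT_TO_STOCK.get((parts[i - 1], parts[i]))
        if stock_leaf is not None:
            parts[i] = stock_leaf
    return ".".join(parts)


def remap_adapter(state):
    """Return a copy of an adapter state dict with stock LFM2 module names."""
    remapped = {to_stock_key(k): v for k, v in state.items()}
    if len(remapped) != len(state):
        raise ValueError("key collision after renaming")
    return remapped


# Hypothetical adapter keys; the values stand in for tensors.
adapter = {
    "base_model.model.model.layers.2.self_attn.q_proj.lora_A.weight": "A0",
    "base_model.model.model.layers.2.self_attn.o_proj.lora_A.weight": "A1",
    "base_model.model.model.layers.2.feed_forward.up_proj.lora_B.weight": "B0",
}
for key in remap_adapter(adapter):
    print(key)
```

A saved adapter usually records its target module names in its config as well, so that list would likely need the same renaming before the adapter is loaded onto the stock base.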
Attribution and license
Derived from LiquidAI/LFM2.5-1.2B-Instruct, copyright Liquid AI, distributed under the
LFM Open License v1.0 (see LICENSE). This naming variant was prepared by Ertas AI for
internal LoRA fine-tuning tooling. All model capabilities, weights, and credit belong to
Liquid AI.
Model tree for edbuildingstuff/LFM2.5-1.2B-Instruct-ertas
Base model
LiquidAI/LFM2.5-1.2B-Base