LFM2.5-1.2B-Instruct (linear-name variant)

A drop-in, numerically identical variant of LiquidAI/LFM2.5-1.2B-Instruct whose linear sub-modules are renamed to the Llama convention. LoRA tooling that defaults to the Llama target names (q_proj / k_proj / v_proj / o_proj / gate_proj / up_proj / down_proj) then covers the full attention + MLP surface instead of q/k/v_proj only.

This is not a new model. The weights are bit-for-bit those of the base; only the attribute and tensor names change. The logits were verified to be identical to the base on the same input (max absolute logit difference = 0.0).
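A check of this kind could be reproduced roughly as follows. This is a minimal sketch, not the script used for the verification above: the helper names max_abs_diff and compare_logits, the default prompt, and the choice to compare flattened logits are all assumptions made for illustration.

```python
def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two equal-length sequences."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def compare_logits(base_id, variant_id, prompt="Hello"):
    """Run the same prompt through both checkpoints and compare their logits.

    Hypothetical helper: downloads both models, so it is not called here.
    """
    # Imported locally so max_abs_diff stays usable without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    inputs = tokenizer(prompt, return_tensors="pt")

    base = AutoModelForCausalLM.from_pretrained(base_id)
    # The variant ships its own modeling file, hence trust_remote_code=True.
    variant = AutoModelForCausalLM.from_pretrained(variant_id, trust_remote_code=True)

    with torch.no_grad():
        base_logits = base(**inputs).logits.float().flatten().tolist()
        variant_logits = variant(**inputs).logits.float().flatten().tolist()
    return max_abs_diff(base_logits, variant_logits)
```

Calling compare_logits("LiquidAI/LFM2.5-1.2B-Instruct", "<this-repo>") would be expected to return 0.0 if the two checkpoints really compute the same function on that input.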

Why this exists

LFM2 names its attention output out_proj and its SwiGLU MLP w1/w3/w2. A LoRA config that lists the standard Llama names matches only q/k/v_proj and silently skips the attention output and the MLP. This variant renames those modules so the same default trains the whole linear surface.

stock LFM2            this variant
self_attn.out_proj    self_attn.o_proj
feed_forward.w1       feed_forward.gate_proj
feed_forward.w3       feed_forward.up_proj
feed_forward.w2       feed_forward.down_proj

Conv blocks (conv.in_proj, conv.out_proj) are unchanged.
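The effect of the rename on a Llama-style target list can be illustrated with a small pure-Python sketch. It assumes suffix matching on module paths (a target matches when the path equals it or ends with "." plus the target), which is how list-style LoRA targets are commonly resolved. The "model.layers.0." path prefix and the helper names are illustrative assumptions, not taken from the bundled modeling file.

```python
# Llama-style LoRA target names that many tools use by default.
LLAMA_TARGETS = ["q_proj", "k_proj", "v_proj", "o_proj",
                 "gate_proj", "up_proj", "down_proj"]

# Illustrative linear-module paths for one attention layer of stock LFM2.
# The "model.layers.0." prefix is an assumption for this example.
STOCK = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.out_proj",
    "model.layers.0.feed_forward.w1",
    "model.layers.0.feed_forward.w3",
    "model.layers.0.feed_forward.w2",
]

# Rename table from the model card: stock name -> variant name.
RENAME = {
    "self_attn.out_proj": "self_attn.o_proj",
    "feed_forward.w1": "feed_forward.gate_proj",
    "feed_forward.w3": "feed_forward.up_proj",
    "feed_forward.w2": "feed_forward.down_proj",
}


def to_variant(name):
    """Apply the rename table to one module path; other paths pass through."""
    for old, new in RENAME.items():
        if name.endswith(old):
            return name[: -len(old)] + new
    return name


def matched(names, targets):
    """Return the module paths that a list of target names would select."""
    return [n for n in names
            if any(n == t or n.endswith("." + t) for t in targets)]


VARIANT = [to_variant(n) for n in STOCK]

print(len(matched(STOCK, LLAMA_TARGETS)), "of", len(STOCK))      # 3 of 7
print(len(matched(VARIANT, LLAMA_TARGETS)), "of", len(VARIANT))  # 7 of 7
```

Because the table keys include the parent module (self_attn, feed_forward), a path such as conv.out_proj passes through to_variant unchanged, consistent with the conv blocks being left alone.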

Usage

The renamed attributes come from the bundled modeling_lfm2_ertas.py, so loading needs trust_remote_code=True:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("<this-repo>", trust_remote_code=True)

For inference or GGUF export, use the stock base and map the adapter's module names back (o_proj to out_proj; gate_proj, up_proj, down_proj to w1, w3, w2 respectively).
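Mapping the adapter names back could look like the sketch below. It assumes the adapter is available as a plain dict of tensor names to tensors, and that each key embeds the module path followed by a LoRA suffix (for example "...self_attn.o_proj.lora_A.weight"). The key layout and the helper names to_stock_key and remap_adapter are assumptions for illustration.

```python
# Inverse of the rename table: variant (Llama-style) name -> stock LFM2 name.
BACK = {
    "self_attn.o_proj": "self_attn.out_proj",
    "feed_forward.gate_proj": "feed_forward.w1",
    "feed_forward.up_proj": "feed_forward.w3",
    "feed_forward.down_proj": "feed_forward.w2",
}


def to_stock_key(key):
    """Rewrite one adapter tensor name to use the stock LFM2 module names."""
    for new, old in BACK.items():
        marker = "." + new + "."
        if marker in key:
            return key.replace(marker, "." + old + ".")
    return key  # q/k/v_proj and anything else are left unchanged


def remap_adapter(state_dict):
    """Return a copy of the adapter state dict with stock module names."""
    remapped = {to_stock_key(k): v for k, v in state_dict.items()}
    if len(remapped) != len(state_dict):
        raise ValueError("key collision while remapping adapter names")
    return remapped


# Example with placeholder values standing in for tensors.
adapter = {
    "base_model.model.model.layers.0.self_attn.o_proj.lora_A.weight": "A0",
    "base_model.model.model.layers.0.feed_forward.gate_proj.lora_B.weight": "B0",
    "base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight": "A1",
}
for name in remap_adapter(adapter):
    print(name)
```

If the adapter also ships a config that lists target module names, that list would presumably need the same mapping before the adapter is loaded onto the stock base.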

Attribution and license

Derived from LiquidAI/LFM2.5-1.2B-Instruct, copyright Liquid AI, distributed under the LFM Open License v1.0 (see LICENSE). This naming variant was prepared by Ertas AI for internal LoRA fine-tuning tooling. All model capabilities, weights, and credit belong to Liquid AI.
