
Libraries

How to use MuXodious/LFM2.5-1.2B-Base-absolute-heresy with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MuXodious/LFM2.5-1.2B-Base-absolute-heresy")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MuXodious/LFM2.5-1.2B-Base-absolute-heresy")
model = AutoModelForCausalLM.from_pretrained("MuXodious/LFM2.5-1.2B-Base-absolute-heresy")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use MuXodious/LFM2.5-1.2B-Base-absolute-heresy with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "MuXodious/LFM2.5-1.2B-Base-absolute-heresy"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MuXodious/LFM2.5-1.2B-Base-absolute-heresy",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
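
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM, and the default port assumes the vLLM server started above (an SGLang server as configured in this card would use port 30000 instead).

```python
import json
import urllib.request

MODEL_ID = "MuXodious/LFM2.5-1.2B-Base-absolute-heresy"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage, with a server already running:
#   request = build_chat_request("What is the capital of France?")
#   print(send_chat_request(request))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.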

Use Docker

docker model run hf.co/MuXodious/LFM2.5-1.2B-Base-absolute-heresy

SGLang

How to use MuXodious/LFM2.5-1.2B-Base-absolute-heresy with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "MuXodious/LFM2.5-1.2B-Base-absolute-heresy" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MuXodious/LFM2.5-1.2B-Base-absolute-heresy",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "MuXodious/LFM2.5-1.2B-Base-absolute-heresy" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use MuXodious/LFM2.5-1.2B-Base-absolute-heresy with Docker Model Runner:
```
docker model run hf.co/MuXodious/LFM2.5-1.2B-Base-absolute-heresy
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

A newer version of this model is available: MuXodious/LFM2.5-1.2B-Base-absolute-heresy-MPOA

This is an LFM2.5-1.2B-Base fine-tune, produced with P-E-W's Heretic (v1.1.0) abliteration engine with the Hybrid Layer Support PR merged in.

Note: The base model is intended for fine-tuning.

Heretication Results

Score Metric	Value
Refusals	5/100
KL Divergence	0.0221
Initial Refusals	71/100

Parameter	Value
direction_index	per layer
attn.o_proj.max_weight	1.61
attn.o_proj.max_weight_position	14.57
attn.o_proj.min_weight	0.58
attn.o_proj.min_weight_distance	7.94
conv.out_proj.max_weight	1.36
conv.out_proj.max_weight_position	9.47
conv.out_proj.min_weight	0.94
conv.out_proj.min_weight_distance	4.38
mlp.down_proj.max_weight	1.68
mlp.down_proj.max_weight_position	13.79
mlp.down_proj.min_weight	0.27
mlp.down_proj.min_weight_distance	2.77
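
The four parameters listed per component can be read as describing an ablation-weight curve over the layer indices. The sketch below shows one plausible reading, assuming a linear kernel that peaks at `max_weight_position`, falls to `min_weight` at a distance of `min_weight_distance` layers, and is zero beyond that. This interpretation, the `KernelParams` class, and the `ablation_weight` function are assumptions for illustration and are not confirmed by this card; the layer count of 16 comes from the base model description, and the loop ignores which layers actually contain an attention block.

```python
from dataclasses import dataclass


@dataclass
class KernelParams:
    max_weight: float
    max_weight_position: float
    min_weight: float
    min_weight_distance: float


def ablation_weight(layer_index, p):
    """Assumed linear kernel: peak at max_weight_position, zero outside the window."""
    distance = abs(layer_index - p.max_weight_position)
    if distance > p.min_weight_distance:
        return 0.0  # layer is outside the window and is left untouched
    # Interpolate linearly from max_weight (distance 0) to min_weight (window edge).
    return p.max_weight + (distance / p.min_weight_distance) * (p.min_weight - p.max_weight)


# attn.o_proj values from the table above
ATTN_O_PROJ = KernelParams(1.61, 14.57, 0.58, 7.94)
NUM_LAYERS = 16

for layer in range(NUM_LAYERS):
    print(f"layer {layer:2d}: weight {ablation_weight(layer, ATTN_O_PROJ):.3f}")
```

Under this reading, the attn.o_proj settings leave the earliest layers untouched and apply the strongest ablation near the end of the stack.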

Degree of Heretication

The Heresy Index weighs the corruption that the process inflicts on the resulting model (KL Divergence) against its abolition of doctrine (Refusals) to reach a final classification verdict.

Classification	Analysis
Absolute Heresy	Less than 10/100 Refusals and 0.10 KL Divergence
Tainted Heresy	Around 11-25/100 Refusals and/or 0.11-0.20 KL Divergence
Impotent Heresy	Anything above 25/100 Refusals and 0.21 KL Divergence

Note: This is an arbitrary classification inspired by Warhammer 40K and gives no tangible indication of the model's performance.
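
The classification can be sketched as a small function. The table leaves the boundaries loosely defined ("around", "and/or"), so the exact cut-offs below are assumptions: a model is graded by the worse of its two metrics, and values that fall between the stated ranges go to the lower tier. The function name `classify_heresy` is hypothetical.

```python
def classify_heresy(refusals, kl_divergence):
    """Classify a model from its refusal count (out of 100) and KL divergence.

    Assumed reading of the table: both metrics must clear a tier's limits
    for the model to earn that tier.
    """
    if refusals < 10 and kl_divergence < 0.10:
        return "Absolute Heresy"
    if refusals <= 25 and kl_divergence <= 0.20:
        return "Tainted Heresy"
    return "Impotent Heresy"


# Results reported for this model: 5/100 refusals, 0.0221 KL divergence
print(classify_heresy(5, 0.0221))  # Absolute Heresy
```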


LFM2.5-1.2B-Base

LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.

Find more information about LFM2.5 in our blog post.

🗒️ Model Details

Model	Parameters	Description
LFM2.5-1.2B-Base	1.2B	Pre-trained base model for fine-tuning
LFM2.5-1.2B-Instruct	1.2B	General-purpose instruction-tuned model
LFM2.5-1.2B-JP	1.2B	Japanese-optimized chat model
LFM2.5-VL-1.6B	1.6B	Vision-language model with fast inference
LFM2.5-Audio-1.5B	1.5B	Audio-language model for speech and text I/O

LFM2.5-1.2B-Base is the pre-trained text-only checkpoint, used to create all the LFM2.5-1.2B variants. It has the following features:

Number of parameters: 1.17B
Number of layers: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
Training budget: 28T tokens
Context length: 32,768 tokens
Vocabulary size: 65,536
Languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish
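
As a rough guide to on-device requirements, the parameter count translates into a weight-memory estimate. The sketch below assumes BF16 storage (2 bytes per parameter, the tensor type of this checkpoint) and counts weights only; activations, the KV cache, and framework overhead are not included. The function name is illustrative.

```python
NUM_PARAMS = 1.17e9          # number of parameters listed above
BYTES_PER_PARAM_BF16 = 2     # bfloat16 uses 16 bits per value


def weight_memory_bytes(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in bytes."""
    return num_params * bytes_per_param


total = weight_memory_bytes(NUM_PARAMS, BYTES_PER_PARAM_BF16)
print(f"{total / 1e9:.2f} GB")     # 2.34 GB
print(f"{total / 2**30:.2f} GiB")  # 2.18 GiB
```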

Model	Description
LFM2.5-1.2B-Base	Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM.
LFM2.5-1.2B-Base-GGUF	Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage.
LFM2.5-1.2B-Base-ONNX	ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile).

This pre-trained checkpoint is only recommended for tasks that require heavy fine-tuning, like language-specific (e.g., Japanese) or domain-specific (e.g., medical) assistants, training on proprietary data, or experimenting with novel post-training approaches.

🏃 Inference

LFM2.5 is supported by many inference frameworks. See the Inference documentation for the full list.

Name	Description	Docs
Transformers	Simple inference with direct access to model internals.	Link
vLLM	High-throughput production deployments with GPU.	Link
llama.cpp	Cross-platform inference with CPU offloading.	Link

Here's a quick start example with transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "LiquidAI/LFM2.5-1.2B-Base"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
#   attn_implementation="flash_attention_2" <- uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "What is C. elegans?"

# return_dict=True returns input_ids together with the attention mask
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to(model.device)

# The streamer prints tokens to stdout as they are generated
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
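
To give a feel for the `temperature` and `min_p` settings used above, the following sketch applies both to a toy set of logits in plain Python. It is an illustration of the general technique (temperature scaling followed by min-p filtering, which keeps tokens whose probability is at least `min_p` times that of the most likely token), not the Transformers implementation; the logits and function names are made up for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p):
    """Zero out tokens below min_p * max(probs), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


logits = [2.0, 1.5, 0.5, -1.0]  # toy values for four candidate tokens
probs = softmax_with_temperature(logits, temperature=0.3)
filtered = min_p_filter(probs, min_p=0.15)
print([round(p, 3) for p in filtered])
```

With these toy logits, only the two most likely tokens survive the filter, so sampling stays close to the top candidates while still allowing some variation.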

🔧 Fine-tuning

We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.

Name	Description	Docs
SFT (Unsloth)	Supervised Fine-Tuning with LoRA using Unsloth.	Link
SFT (TRL)	Supervised Fine-Tuning with LoRA using TRL.	Link
DPO (TRL)	Direct Preference Optimization with LoRA using TRL.	Link

Contact

For enterprise solutions and edge deployment, contact sales@liquid.ai.

Citation

@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}

Format: Safetensors
Model size: 1B params
Tensor type: BF16
Base model: LiquidAI/LFM2.5-1.2B-Base
Quantizations: 2 models

Paper: LFM2 Technical Report (arXiv:2511.23404, published Nov 28, 2025)