CIx-LFM2.5-8B-A1B Reasoning SFT
Model Summary
This model is a fine-tuned version of LiquidAI/LFM2.5-8B-A1B, adapted on the angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k dataset for English text-generation and reasoning-style responses.
The fine-tuning run used a custom Convergent Intelligence optimizer stack, CIxOpt, designed for heterogeneous routing across parameter types. The goal of this checkpoint is to test whether a Liquid Foundation Model backbone can be adapted efficiently through targeted sparse participation rather than broad full-model modification.
This is an experimental research checkpoint intended for continued evaluation, domain adaptation, and architecture/optimizer testing.
Base Model
- Base: LiquidAI/LFM2.5-8B-A1B
- Architecture family: Liquid Foundation Model / hybrid causal language model
- Task: Causal language modeling
- Language: English
- License: Apache 2.0, inherited from the released model metadata unless otherwise restricted by upstream dependencies
Dataset
Fine-tuning data:
- angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k
The dataset was processed into chat-style training examples using tokenizer-compatible chat formatting where available. Empty or malformed examples were filtered before tokenization.
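The filtering and formatting step described above could look roughly like the following. This is a minimal sketch, not the actual preprocessing script: the field names `prompt` and `response` and both helper functions are hypothetical, since this card does not document the dataset schema.

```python
from typing import Optional

# Hypothetical field names; the real dataset schema is not documented here.
PROMPT_KEY = "prompt"
RESPONSE_KEY = "response"


def to_chat_messages(example: dict) -> Optional[list]:
    """Convert one raw record into chat-style messages, or None if unusable."""
    prompt = example.get(PROMPT_KEY)
    response = example.get(RESPONSE_KEY)
    # Treat missing or non-string fields as malformed.
    if not isinstance(prompt, str) or not isinstance(response, str):
        return None
    prompt, response = prompt.strip(), response.strip()
    # Treat whitespace-only fields as empty.
    if not prompt or not response:
        return None
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": response},
    ]


def build_chat_examples(records: list) -> list:
    """Keep only the records that convert cleanly to chat messages."""
    converted = (to_chat_messages(record) for record in records)
    return [messages for messages in converted if messages is not None]
```

Each surviving message list could then be rendered with `tokenizer.apply_chat_template(messages, tokenize=False)` when the tokenizer ships a chat template.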
Training Method
This model was trained using the Convergent Intelligence CIxOpt optimizer framework.
Optimizer Design
CIxOpt applies heterogeneous routing based on parameter type and tensor structure:
- Lion-style sign momentum for large projection matrices
- AdamW-style updates for sensitive normalization surfaces
- Adamax-style handling available for embeddings or language-head-style parameters
- fp32 optimizer state for bf16/fp16 model safety
- Gradient centralization for eligible matrix-shaped gradients
- Discrepancy-aware caution filtering for sign updates
- Decoupled weight decay
- Gradient clipping during training
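To make the routing idea concrete, here is a small pure-Python sketch of how parameters might be assigned to update rules, together with gradient centralization and a single Lion-style sign step. It is an illustration under stated assumptions, not CIxOpt itself: the name patterns in `route_parameter` are hypothetical, and the "caution" mask shown (skip an update whose direction disagrees with the current gradient sign) is a simple stand-in for the discrepancy-aware filter, whose exact rule is not given in this card.

```python
def route_parameter(name: str, ndim: int) -> str:
    """Pick an update rule from a parameter's name and rank.

    The name patterns are illustrative assumptions, not CIxOpt's real rules.
    """
    lowered = name.lower()
    if "embed" in lowered or "lm_head" in lowered:
        return "adamax"  # embeddings / language-head-style parameters
    if "norm" in lowered or ndim < 2:
        return "adamw"   # normalization weights, biases, other vectors
    return "lion"        # large 2-D projection matrices


def centralize(grad_rows: list) -> list:
    """Gradient centralization: subtract each row's mean from that row."""
    centered = []
    for row in grad_rows:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def lion_step(weights, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion-style step on flat lists, with a simple cautious mask."""
    new_weights, new_momentum = [], []
    for w, g, m in zip(weights, grads, momentum):
        # Lion takes the sign of an interpolation of momentum and gradient.
        direction = sign(beta1 * m + (1 - beta1) * g)
        # Stand-in caution filter: drop updates that disagree with the gradient.
        if direction * g <= 0:
            direction = 0
        # Decoupled weight decay shrinks the weight independently of the gradient.
        w = w * (1 - lr * weight_decay)
        new_weights.append(w - lr * direction)
        new_momentum.append(beta2 * m + (1 - beta2) * g)
    return new_weights, new_momentum
```

In a real mixed-precision run the momentum buffers would be kept in fp32 even when the weights are bf16 or fp16, which is the point of the fp32 optimizer state listed above.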
Sparse Participation Strategy
The fine-tuning strategy was designed to avoid unnecessary modification of the full pretrained backbone. Instead, training focused on selected adaptation surfaces, especially upper-layer projection and normalization modules.
The intended training philosophy was:
- Freeze most pretrained structure
- Adapt upper reasoning / response-shaping layers
- Preserve lower representational substrate
- Route parameter groups by optimizer behavior
This makes the checkpoint useful for studying efficient adaptation of LFM-family models under constrained compute.
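One way to express this kind of selection is a name-based filter that keeps only projection and normalization parameters in the top few layers trainable. The sketch below is an assumption-laden illustration: the `.layers.<index>.` naming pattern, the `proj` and `norm` fragments, and the `top_k` cutoff are all hypothetical, and the card does not state which layers were actually unfrozen.

```python
import re

# Hypothetical name fragments for the modules left trainable.
TRAINABLE_FRAGMENTS = ("proj", "norm")
# Assumes parameter names embed the layer index as ".layers.<index>.".
LAYER_PATTERN = re.compile(r"\.layers\.(\d+)\.")


def is_trainable(name: str, num_layers: int, top_k: int) -> bool:
    """True if the parameter is a projection/norm weight in the top_k layers."""
    match = LAYER_PATTERN.search(name)
    if match is None:
        # Embeddings, the output head, etc. stay frozen in this sketch.
        return False
    layer_index = int(match.group(1))
    in_upper_block = layer_index >= num_layers - top_k
    return in_upper_block and any(f in name for f in TRAINABLE_FRAGMENTS)


def select_trainable(names, num_layers: int, top_k: int) -> list:
    """Return the parameter names that would remain trainable."""
    return [n for n in names if is_trainable(n, num_layers, top_k)]
```

With a PyTorch model, the same predicate could be applied by looping over `model.named_parameters()` and setting `param.requires_grad = is_trainable(name, num_layers, top_k)`.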
Intended Use
This model is intended for:
- Research on Liquid Foundation Model fine-tuning
- Optimizer experiments with CIxOpt
- Reasoning-style text generation
- Instruction-following experiments
- Lightweight comparative evaluation against other small or sparse-adapted causal LMs
- Continued fine-tuning or domain adaptation
Example use cases:
- Analytical response generation
- Reasoning trace compression
- Technical explanation
- Experimental agent backbones
- Small-scale model behavior studies
Out-of-Scope Use
This model is not intended for high-stakes autonomous deployment without additional evaluation.
Do not use this model as the sole decision-maker for:
- Medical diagnosis
- Legal judgment
- Financial decisions
- Emergency response
- Offensive cyber automation
- Personnel screening
- Surveillance or targeting decisions
- Any setting requiring verified factual accuracy
Limitations
This is an experimental fine-tuned checkpoint. Known or expected limitations include:
- May hallucinate facts, citations, dates, or source attributions
- May inherit biases or artifacts from the base model and fine-tuning data
- May overproduce reasoning-style explanations when shorter answers are preferred
- May be sensitive to prompt formatting
- Has not been fully benchmarked across safety, factuality, coding, mathematics, or instruction-following suites
- Fine-tuning on reasoning-style data does not guarantee correct reasoning
- Sparse or targeted adaptation may leave some capabilities close to the base model while changing others unevenly
Safety Notes
Users should independently validate important outputs. Generated content may be plausible but incorrect.
For deployment-facing use, additional steps are recommended:
- Benchmark against known evaluation suites
- Run toxicity and bias evaluation
- Test refusal behavior
- Evaluate hallucination rate
- Compare against the base model
- Add domain-specific guardrails
- Use retrieval or verification for factual tasks
Example Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "reaperdoesntknow/LFM2.5-8B-A1B-Opus-Distil"

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
)

# Load the weights in bf16 and let Accelerate place them on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Fall back to the EOS token when the tokenizer defines no padding token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompt = "Explain why stable positional encoding matters for long-context language models."

inputs = tokenizer(
    prompt,
    return_tensors="pt",
).to(model.device)

# Sample a response without tracking gradients.
with torch.inference_mode():
    output = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.eos_token_id,
    )

# The decoded text includes the prompt followed by the generated continuation.
print(tokenizer.decode(output[0], skip_special_tokens=True))
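Because the checkpoint was fine-tuned on chat-style examples, formatting the prompt with the tokenizer's chat template may match the training distribution more closely than passing raw text. The variant below is a sketch that assumes the tokenizer ships a chat template; `build_messages` and `generate_reply` are hypothetical helper names, and the heavy imports are deferred so the message builder can be used on its own.

```python
def build_messages(user_prompt, system_prompt=None):
    """Build a chat-style message list, with an optional system message."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def generate_reply(model_id, messages, max_new_tokens=256):
    """Generate one reply using the tokenizer's chat template."""
    # Imported lazily so build_messages works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    # Render the messages and append the assistant-turn prefix.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    with torch.inference_mode():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
```

A call such as `generate_reply("reaperdoesntknow/LFM2.5-8B-A1B-Opus-Distil", build_messages("Who are you?"))` would download the weights on first use.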
Training Configuration
Approximate training configuration used:
base_model: LiquidAI/LFM2.5-8B-A1B
dataset: angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k
task: causal language modeling / instruction-style SFT
optimizer: CIxOpt
state_dtype: fp32
weight_decay: enabled
gradient_clipping: enabled
chat_template: tokenizer-compatible formatting
padding: max_length
loss_masking: padding tokens masked with -100
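The `padding` and `loss_masking` entries can be illustrated with a short sketch. It assumes right-side padding and a hypothetical helper name, `pad_and_mask`; the value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, so positions labeled with it contribute nothing to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def pad_and_mask(token_ids, max_length, pad_token_id):
    """Pad (or truncate) to max_length and build labels that ignore padding."""
    token_ids = list(token_ids)[:max_length]
    num_pad = max_length - len(token_ids)
    return {
        # Right-pad the inputs up to the fixed length.
        "input_ids": token_ids + [pad_token_id] * num_pad,
        # 1 marks real tokens, 0 marks padding.
        "attention_mask": [1] * len(token_ids) + [0] * num_pad,
        # Padding positions are excluded from the loss.
        "labels": token_ids + [IGNORE_INDEX] * num_pad,
    }
```

Masking by position rather than by token id matters when the padding token is the same as the EOS token: a genuine EOS at the end of a response keeps its label, while the padding after it is ignored.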
Exact loss curves, benchmark scores, and hardware details should be added after evaluation.
Evaluation
Formal benchmark results have not yet been added.
Recommended evaluation targets:
- Perplexity on held-out validation data
- MT-Bench-style instruction following
- IFEval
- GSM8K or similar lightweight reasoning checks
- MMLU-style knowledge evaluation
- TruthfulQA-style hallucination testing
- Human preference comparison against the base model
- Side-by-side testing against smaller LFM2.5 checkpoints
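For the first target, perplexity is the exponential of the mean negative log-likelihood per target token. The sketch below assumes the per-token natural-log probabilities have already been collected from a held-out set; `perplexity` and `relative_change` are hypothetical helper names, and no measured numbers for this checkpoint are implied.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of each target token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_change(base_ppl, tuned_ppl):
    """Fractional perplexity change of the tuned model against the base."""
    return (tuned_ppl - base_ppl) / base_ppl
```

A negative `relative_change` would indicate that the fine-tuned checkpoint assigns higher likelihood to the held-out text than the base model does, which is only meaningful when both are scored on the same tokens with the same tokenizer.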
Citation
Base model:
@misc{liquidai_lfm25,
  title = {LFM2.5-8B-A1B},
  author = {Liquid AI},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LiquidAI/LFM2.5-8B-A1B}}
}
Fine-tuning dataset:
@misc{angrygiraffe_reasoning_dataset,
  title = {claude-opus-4.6-4.7-reasoning-8.7k},
  author = {angrygiraffe},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k}}
}
Author / Maintainer
Fine-tuning and optimizer experimentation by: Convergent Intelligence LLC | Research & Development Divisions
Research and development focus: AI systems, intelligence analysis, mathematical frameworks, optimizer design, and efficient model adaptation.
Disclaimer
This model is provided for research and experimentation. It should not be treated as a verified expert system. Outputs require human review, especially in factual, technical, legal, medical, financial, operational, or safety-critical contexts.