---
license: other
language:
- en
- ar
- zh
- fr
- de
- ja
- ko
- es
pipeline_tag: text-generation
library_name: transformers
tags:
- liquid
- lfm2.5
- edge
---
Liquid AI
# LFM2.5-1.2B-Base

LFM2.5 is a new family of hybrid models designed for **on-device deployment**. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.

Find more information about LFM2.5 in our [blog post](https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai).

## 🗒️ Model Details

| Model | Parameters | Description |
|-------|------------|-------------|
| [**LFM2.5-1.2B-Base**](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | 1.2B | Pre-trained base model for fine-tuning |
| [LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | 1.2B | General-purpose instruction-tuned model |
| [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP) | 1.2B | Japanese-optimized chat model |
| [LFM2.5-VL-1.6B](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B) | 1.6B | Vision-language model with fast inference |
| [LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) | 1.5B | Audio-language model for speech and text I/O |

LFM2.5-1.2B-Base is the pre-trained text-only checkpoint used to create all the LFM2.5-1.2B variants. It has the following features:

- **Number of parameters**: 1.17B
- **Number of layers**: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
- **Training budget**: 28T tokens
- **Context length**: 32,768 tokens
- **Vocabulary size**: 65,536
- **Languages**: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish

The checkpoint is available in the following formats:

| Model | Description |
|-------|-------------|
| [**LFM2.5-1.2B-Base**](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
| [LFM2.5-1.2B-Base-GGUF](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
| [LFM2.5-1.2B-Base-ONNX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base-ONNX) | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |

This pre-trained checkpoint is recommended only for tasks that require heavy fine-tuning, such as language-specific (e.g., Japanese) or domain-specific (e.g., medical) assistants, training on proprietary data, or experimenting with novel post-training approaches.

## 🏃 Inference

LFM2.5 is supported by many inference frameworks. See the [Inference documentation](https://docs.liquid.ai/lfm/inference/transformers) for the full list.

| Name | Description | Docs | Notebook |
|------|-------------|------|----------|
| [Transformers](https://github.com/huggingface/transformers) | Simple inference with direct access to model internals. | Link | Colab link |
| [vLLM](https://github.com/vllm-project/vllm) | High-throughput production deployments on GPU. | Link | Colab link |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | Cross-platform inference with CPU offloading. | Link | Colab link |

Here's a quick start example with `transformers`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "LiquidAI/LFM2.5-1.2B-Base"

# Load the weights in bfloat16 and place them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2"  <- uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Print decoded tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "What is C. elegans?"
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,  # return input_ids and attention_mask together
).to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
```

## 🔧 Fine-tuning

We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.

| Name | Description | Docs | Notebook |
|------|-------------|------|----------|
| SFT ([Unsloth](https://github.com/unslothai/unsloth)) | Supervised Fine-Tuning with LoRA using Unsloth. | Link | Colab link |
| SFT ([TRL](https://github.com/huggingface/trl)) | Supervised Fine-Tuning with LoRA using TRL. | Link | Colab link |
| DPO ([TRL](https://github.com/huggingface/trl)) | Direct Preference Optimization with LoRA using TRL. | Link | Colab link |

## Contact

For enterprise solutions and edge deployment, contact [sales@liquid.ai](mailto:sales@liquid.ai).

## Citation

```bibtex
@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}
```
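
As a supplementary note on the deployment formats listed under Model Details, the sketch below gives a back-of-the-envelope estimate of weight storage for the stated 1.17B parameters at different precisions. It is plain arithmetic, not a measurement: the 8-bit and 4-bit rows are idealized assumptions that ignore quantization scales and file metadata, and the helper name `weight_memory_gb` is hypothetical. Activations, the attention KV cache, and convolution state are not included, so actual runtime memory will be higher.

```python
# Rough weight-memory estimate for LFM2.5-1.2B-Base.
# NUM_PARAMS comes from the model card ("Number of parameters: 1.17B").
NUM_PARAMS = 1.17e9

# Bits per stored weight. The 8-bit and 4-bit entries are idealized
# assumptions: real quantized files also store scales and metadata.
BITS_PER_WEIGHT = {
    "float32": 32,
    "bfloat16": 16,
    "8-bit (idealized)": 8,
    "4-bit (idealized)": 4,
}


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Return weight storage in gigabytes, using 1 GB = 1e9 bytes."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        size_gb = weight_memory_gb(NUM_PARAMS, bits)
        print(f"{name:>18}: {size_gb:.2f} GB")
```

Under these assumptions the bfloat16 weights used in the quick start example come to roughly 2.34 GB, which gives a sense of why a quantized format may be preferable for local or CPU-only deployment.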