Update README.md

README.md (CHANGED)
@@ -74,6 +74,7 @@ LFM2.5-1.2B-Instruct is a general-purpose text-only model with the following fea
 | [**LFM2.5-1.2B-Instruct**](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
 | [LFM2.5-1.2B-Instruct-GGUF](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
 | [LFM2.5-1.2B-Instruct-ONNX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-ONNX) | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |
+| [LFM2.5-1.2B-Instruct-MLX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-MLX-8bit) | MLX format for Apple Silicon. Optimized for fast inference on Mac devices using the MLX framework. |
 
 We recommend using it for agentic tasks, data extraction, and RAG. It is not recommended for knowledge-intensive tasks and programming.
 
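For context on the first row of the table above, here is a minimal sketch of running the native checkpoint with Hugging Face Transformers. This is not from the README: it assumes a transformers release that supports the LFM2 architecture, and the prompt and generation settings are illustrative.

```python
# Hypothetical usage sketch (not from the README): load the native
# checkpoint from the table above and run one chat turn.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Instruct"  # repo from the table above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# A data-extraction style prompt, matching the recommended use cases.
messages = [{"role": "user", "content": "Extract all dates from: 'The launch moved from May 3 to June 12.'"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The MLX variant added in this commit can be driven analogously with mlx-lm's `load`/`generate` helpers on Apple Silicon; a GGUF sketch follows the on-device performance table below.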
@@ -182,7 +183,7 @@ We compared LFM2.5-1.2B-Instruct with relevant sub-2B models on a diverse suite
 | Model | GPQA | MMLU-Pro | IFEval | IFBench | Multi-IF | AIME25 | BFCLv3 |
 |-------|------|----------|--------|---------|----------|--------|--------|
 | **LFM2.5-1.2B-Instruct** | 38.89 | 44.35 | 86.23 | 47.33 | 60.98 | 14.00 | 49.12 |
-| Qwen3-1.7B | 34.85 | 42.91 | 73.68 | 21.33 | 56.48 | 9.33 | 46.30 |
+| Qwen3-1.7B (instruct) | 34.85 | 42.91 | 73.68 | 21.33 | 56.48 | 9.33 | 46.30 |
 | Granite 4.0-1B | 24.24 | 33.53 | 79.61 | 21.00 | 43.65 | 3.33 | 52.43 |
 | Llama 3.2 1B Instruct | 16.57 | 20.80 | 52.37 | 15.93 | 30.16 | 0.33 | 21.44 |
 | Gemma 3 1B IT | 24.24 | 14.04 | 63.25 | 20.47 | 44.31 | 1.00 | 16.64 |
@@ -199,9 +200,9 @@ In addition, we are partnering with AMD, Qualcomm, and Nexa AI to bring the LFM2
 
 | Device | Inference | Framework | Model | Prefill (tok/s) | Decode (tok/s) | Memory |
 | ---------------------------------------------------- | --------- | ---------------- | -------------------- | --------------- | -------------- | ------ |
-| Qualcomm Snapdragon® X Elite | NPU | NexaML | LFM2.5-1.2B-…
-| Qualcomm Snapdragon® Gen4 (ROG Phone9 Pro) | NPU | NexaML | LFM2.5-1.2B-…
-| Qualcomm Snapdragon® Gen4 (Samsung Galaxy S25 Ultra) | CPU | llama.cpp (Q4_0) | LFM2.5-1.2B-…
+| Qualcomm Snapdragon® X Elite | NPU | NexaML | LFM2.5-1.2B-Instruct | 2591 | 63 | 0.9GB |
+| Qualcomm Snapdragon® Gen4 (ROG Phone9 Pro) | NPU | NexaML | LFM2.5-1.2B-Instruct | 4391 | 82 | 0.9GB |
+| Qualcomm Snapdragon® Gen4 (Samsung Galaxy S25 Ultra) | CPU | llama.cpp (Q4_0) | LFM2.5-1.2B-Instruct | 335 | 70 | 719MB |
 | Qualcomm Snapdragon® Gen4 (Samsung Galaxy S25 Ultra) | CPU | llama.cpp (Q4_0) | Qwen3-1.7B | 181 | 40 | 1306MB |
 
 These capabilities unlock new deployment scenarios across various devices, including vehicles, mobile devices, laptops, IoT devices, and embedded systems.
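Since the CPU rows above are measured with llama.cpp at Q4_0, here is one way the GGUF build could be exercised from Python via the llama-cpp-python bindings. This is a sketch, not the benchmark setup: the local quant filename, context size, and thread count are assumptions.

```python
# Hypothetical sketch (not the benchmark harness): run the Q4_0 GGUF
# quant on CPU with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="LFM2.5-1.2B-Instruct-Q4_0.gguf",  # assumed local filename
    n_ctx=4096,   # context window; size per device memory
    n_threads=8,  # CPU threads; tune per device
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize this note: the meeting moved to Friday."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

Q4_0 stores weights at roughly 4 bits each, which is what keeps the model's footprint under 1 GB in the table above (719MB on the S25 Ultra).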