simaai/LFM2-VL-1.6B-a16w4

@@ -9,7 +9,8 @@ tags:
 - generative_ai
 - embedded
 - sima
-- liquid
 pipeline_tag: image-text-to-text
 base_model: LiquidAI/LFM2-VL-1.6B
 ---
@@ -17,26 +18,23 @@ base_model: LiquidAI/LFM2-VL-1.6B
 ## Overview
-This repository contains the **LFM2-VL-1.6B** model, optimized and compiled for the **SiMa.ai Modalix** platform.
 - **Model Architecture:** LFM2-VL (1.6B parameters)
 - **Quantization:** Hybrid
   - **Prompt Processing:** A16W8 (16-bit activations, 8-bit weights)
   - **Token Generation:** A16W4 (16-bit activations, 4-bit weights)
 - **Maximum context length:** 2048
-- **Input Resolution:** 512x512
 - **Source Model:** [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
-### Important: This model is compiled with an input resolution of 448×448.
- If you want to use a different resolution (e.g., 512×512), you must recompile the model with the desired input size.
 ## Performance
-The following performance metrics were measured with an image and a text prompt of 50 tokens.
 | Model | Precision | Device | Response Rate (tokens/sec) | Time To First Token (sec) |
 |---|---|---|---|---|
-| LFM2-VL-1.6B-a16w4 | A16W8/A16W4 | Modalix | 55 tokens/sec | 0.45 sec |
 ## Prerequisites
@@ -51,18 +49,20 @@ To run this model, you need:
 Follow these steps to deploy the model to your Modalix device.
-On your Modalix device, install the LLiMa beta runtime using the `sima-cli`:
 ```bash
 # Create a directory for LLiMa
 cd /media/nvme
-mkdir -p llima
 cd llima
-# Install the LLiMa Beta runtime code
-sima-cli install -v 2.1.0 tools/llima -t full
 ```
-> **Note:** To only download the LLiMa runtime code, select **🚫 Skip** when prompted.
 ### 2. Download the Model
@@ -70,8 +70,7 @@ Download the compiled model assets from this repository directly to your device.
 ```bash
 # Download the model to a local directory
-cd /media/nvme/llima/models
-hf download simaai/LFM2-VL-1.6B-a16w4 --local-dir LFM2-VL-1.6B-a16w4
 ```
 Alternatively, you can download the compiled model to a Host and copy it to the Modalix device:
@@ -87,9 +86,8 @@ scp -r LFM2-VL-1.6B-a16w4 sima@<modalix-ip>:/media/nvme/llima/models/
 ```text
 /media/nvme/llima/
 ├── run.sh
-├── simaai-genai-demo/        # The demo app
 └── models/
-    └── LFM2-VL-1.6B-a16w4/   # Compiled model
 ```
 ## Usage
@@ -99,7 +97,7 @@ scp -r LFM2-VL-1.6B-a16w4 sima@<modalix-ip>:/media/nvme/llima/models/
 Navigate to the demo directory and start the application:
 ```bash
-cd /media/nvme/llima/simaai-genai-demo
 ./run.sh
 ```
@@ -115,8 +113,7 @@ https://<modalix-ip>:5000/
 To use the OpenAI-compatible API, run the model in API mode:
 ```bash
-cd /media/nvme/llima/simaai-genai-demo
-./run.sh --httponly --api-only
 ```
 You can interact with it using `curl` or Python.
@@ -124,11 +121,17 @@ You can interact with it using `curl` or Python.
 **Example: Chat Completion**
 ```bash
-# Note: You need to replace <YOUR_BASE64_STRING_HERE> with an actual base64 encoded image string.
 curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
   -H "Content-Type: application/json" \
   -d '{
     "messages": [
       {
         "role": "user",
         "content": [
@@ -140,12 +143,11 @@ curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
           },
           {
             "type": "text",
-            "text": "Describe this image"
           }
         ]
       }
-    ],
-    "stream": true
   }'
 ```
 *Replace \<modalix-ip\> with the IP address of your Modalix device.*
@@ -159,11 +161,11 @@ curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
 ## Troubleshooting
 - **`sima-cli` not found**: Ensure that sima-cli is installed on your Modalix device.
-- **Model can't be run**: Verify the model directory is exactly inside `/media/nvme/llima/models` and not nested (e.g., `/media/nvme/llima/models/LFM2-VL-1.6B-a16w4/LFM2-VL-1.6B-a16w4/`).
 - **Permission Denied**: Ensure you have read/write permissions for the `/media/nvme` directory.
 ## Resources
 - [SiMa.ai Documentation](https://docs.sima.ai)
 - [SiMa.ai Hugging Face Organization](https://huggingface.co/simaai)
-- [Liquid AI Website](https://liquid.ai/)

 - generative_ai
 - embedded
 - sima
+- liquidai
+- lfm2
 pipeline_tag: image-text-to-text
 base_model: LiquidAI/LFM2-VL-1.6B
 ---
 ## Overview
+This repository contains the **LFM2-VL-1.6B-a16w4** model, optimized and compiled for the **SiMa.ai Modalix** platform.
 - **Model Architecture:** LFM2-VL (1.6B parameters)
 - **Quantization:** Hybrid
   - **Prompt Processing:** A16W8 (16-bit activations, 8-bit weights)
   - **Token Generation:** A16W4 (16-bit activations, 4-bit weights)
 - **Maximum context length:** 2048
+- **Input Resolution:** 512x512 (Fixed)
 - **Source Model:** [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
 ## Performance
+The following performance metrics were measured with an image and a text prompt of 7 tokens.
 | Model | Precision | Device | Response Rate (tokens/sec) | Time To First Token (sec) |
 |---|---|---|---|---|
+| LFM2-VL-1.6B-a16w4 | A16W8/A16W4 | Modalix | 81.09 tokens/sec | 0.28 sec |
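As a rough way to read the figures in the table, total response time can be approximated as the time to first token plus the number of generated tokens divided by the response rate. This is a simplified model and an assumption, not a formula published by SiMa.ai. It ignores prompt length and image preprocessing, and `estimate_response_seconds` is a hypothetical helper name.

```python
def estimate_response_seconds(num_tokens: int,
                              tokens_per_sec: float = 81.09,
                              ttft_sec: float = 0.28) -> float:
    """Approximate latency as TTFT plus steady-state generation time.

    Defaults are the measured values from the performance table; the
    additive model itself is an assumption made for illustration.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return ttft_sec + num_tokens / tokens_per_sec


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(f"{n:>4} tokens: ~{estimate_response_seconds(n):.2f} s")
```

Actual latency depends on the prompt and the image, so treat the result as an order-of-magnitude estimate only.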
 ## Prerequisites
 Follow these steps to deploy the model to your Modalix device.
+### 1. Install LLiMa Demo Application
+> **Note:** This is a **one-time setup**. If you have already installed the LLiMa demo application (e.g. for another model), you can skip this step and continue with model download.
+On your Modalix device, install the LLiMa demo application using the `sima-cli`:
 ```bash
 # Create a directory for LLiMa
 cd /media/nvme
+mkdir llima
 cd llima
+# Install the LLiMa runtime code
+sima-cli install -v 2.1.0 tools/llima -t selection
 ```
+> **Note:** To only download the LLiMa runtime code, select **Demo Web App** when prompted.
 ### 2. Download the Model
 ```bash
 # Download the model to a local directory
+llima pull LFM2-VL-1.6B-a16w4
 ```
 Alternatively, you can download the compiled model to a Host and copy it to the Modalix device:
 ```text
 /media/nvme/llima/
 ├── run.sh
 └── models/
+    └── LFM2-VL-1.6B-a16w4/   # The compiled model
 ```
 ## Usage
 Navigate to the demo directory and start the application:
 ```bash
+cd /media/nvme/llima/
 ./run.sh
 ```
 To use the OpenAI-compatible API, run the model in API mode:
 ```bash
+llima run LFM2-VL-1.6B-a16w4 --mode web
 ```
 You can interact with it using `curl` or Python.
 **Example: Chat Completion**
 ```bash
+# Note: Replace <YOUR_BASE64_STRING_HERE> with an actual base64 encoded image string.
 curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
   -H "Content-Type: application/json" \
   -d '{
+    "model": "sima-vlm",
+    "stream": true,
     "messages": [
+      {
+        "role": "system",
+        "content": "You are a helpful assistant."
+      },
       {
         "role": "user",
         "content": [
           },
           {
             "type": "text",
+            "text": "Describe the image in two sentences."
           }
         ]
       }
+    ]
   }'
 ```
 *Replace \<modalix-ip\> with the IP address of your Modalix device.*
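For Python clients, the same request can be sketched with the standard library alone. The model name, port, endpoint path, and message texts are taken from the `curl` example. The layout of the image part is not visible in this excerpt, so the `image_url` data-URL form used below is an assumption based on the common OpenAI chat-completions convention. `build_chat_payload` and `stream_chat` are hypothetical helper names.

```python
import base64
import json
import ssl
import urllib.request


def build_chat_payload(image_bytes: bytes, prompt: str,
                       mime_type: str = "image/jpeg") -> dict:
    """Build a chat-completion request body with one image and one text part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "sima-vlm",
        "stream": True,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {
                "role": "user",
                "content": [
                    # Assumption: OpenAI-style image part carrying a data URL.
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:{mime_type};base64,{encoded}"
                        },
                    },
                    {"type": "text", "text": prompt},
                ],
            },
        ],
    }


def stream_chat(host: str, payload: dict, port: int = 9998):
    """POST the payload and yield non-empty raw response lines as they arrive."""
    request = urllib.request.Request(
        f"https://{host}:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Mirrors curl's -k flag: certificate verification is disabled.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        for raw_line in response:
            line = raw_line.decode("utf-8").strip()
            if line:
                yield line
```

A typical call reads the image file as bytes, passes it to `build_chat_payload` together with a prompt such as "Describe the image in two sentences.", and iterates over `stream_chat("<modalix-ip>", payload)` to print each streamed line. Disabling certificate verification is only appropriate on a trusted local network.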
 ## Troubleshooting
 - **`sima-cli` not found**: Ensure that sima-cli is installed on your Modalix device.
+- **Model can't be run**: Verify the model directory is exactly inside `/media/nvme/llima/models/` and not nested (e.g., `/media/nvme/llima/models/LFM2-VL-1.6B-a16w4/LFM2-VL-1.6B-a16w4`).
 - **Permission Denied**: Ensure you have read/write permissions for the `/media/nvme` directory.
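The nested-directory case from the list above can be checked with a few lines of Python before starting the demo. This is a minimal sketch: `find_model_dir_problem` is a hypothetical helper, and the only layout rule it encodes is the one stated in this README (the model folder sits directly under `models/` and does not contain a folder of the same name).

```python
from pathlib import Path
from typing import Optional


def find_model_dir_problem(models_root: str, model_name: str) -> Optional[str]:
    """Return a description of a layout problem, or None if the layout looks fine."""
    model_dir = Path(models_root) / model_name
    if not model_dir.is_dir():
        return f"model directory not found: {model_dir}"
    nested = model_dir / model_name
    if nested.is_dir():
        return f"model directory is nested one level too deep: {nested}"
    return None


if __name__ == "__main__":
    problem = find_model_dir_problem("/media/nvme/llima/models",
                                     "LFM2-VL-1.6B-a16w4")
    print(problem or "model directory layout looks fine")
```

If the nested case is reported, moving the inner folder's contents up one level should restore the expected layout.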
 ## Resources
 - [SiMa.ai Documentation](https://docs.sima.ai)
 - [SiMa.ai Hugging Face Organization](https://huggingface.co/simaai)
+- [Liquid AI Website](https://liquid.ai/)