Update README.md
README.md
CHANGED
|
@@ -9,7 +9,8 @@ tags:
|
|
| 9 |
- generative_ai
|
| 10 |
- embedded
|
| 11 |
- sima
|
| 12 |
-
-
|
|
|
|
| 13 |
pipeline_tag: image-text-to-text
|
| 14 |
base_model: LiquidAI/LFM2-VL-1.6B
|
| 15 |
---
|
|
@@ -17,26 +18,23 @@ base_model: LiquidAI/LFM2-VL-1.6B
|
|
| 17 |
|
| 18 |
## Overview
|
| 19 |
|
| 20 |
-
This repository contains the **LFM2-VL-1.6B** model, optimized and compiled for the **SiMa.ai Modalix** platform.
|
| 21 |
|
| 22 |
- **Model Architecture:** LFM2-VL (1.6B parameters)
|
| 23 |
- **Quantization:** Hybrid
|
| 24 |
- **Prompt Processing:** A16W8 (16-bit activations, 8-bit weights)
|
| 25 |
- **Token Generation:** A16W4 (16-bit activations, 4-bit weights)
|
| 26 |
- **Maximum context length:** 2048
|
| 27 |
-
- **Input Resolution:** 512x512
|
| 28 |
- **Source Model:** [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
|
| 29 |
|
| 30 |
-
### Important: This model is compiled with an input resolution of 448×448.
|
| 31 |
-
If you want to use a different resolution (e.g., 512×512), you must recompile the model with the desired input size.
|
| 32 |
-
|
| 33 |
## Performance
|
| 34 |
|
| 35 |
-
The following performance metrics were measured with an image and a text prompt of
|
| 36 |
|
| 37 |
| Model | Precision | Device | Response Rate (tokens/sec) | Time To First Token (sec) |
|
| 38 |
|---|---|---|---|---|
|
| 39 |
-
| LFM2-VL-1.6B-a16w4 | A16W8/A16W4 | Modalix |
|
| 40 |
|
| 41 |
|
| 42 |
## Prerequisites
|
|
@@ -51,18 +49,20 @@ To run this model, you need:
|
|
| 51 |
|
| 52 |
Follow these steps to deploy the model to your Modalix device.
|
| 53 |
|
|
|
|
|
|
|
| 54 |
|
| 55 |
-
On your Modalix device, install the LLiMa
|
| 56 |
|
| 57 |
```bash
|
| 58 |
# Create a directory for LLiMa
|
| 59 |
cd /media/nvme
|
| 60 |
-
mkdir
|
| 61 |
cd llima
|
| 62 |
-
# Install the LLiMa
|
| 63 |
-
sima-cli install -v 2.1.0 tools/llima -t
|
| 64 |
```
|
| 65 |
-
> **Note:** To only download the LLiMa runtime code, select **
|
| 66 |
|
| 67 |
### 2. Download the Model
|
| 68 |
|
|
@@ -70,8 +70,7 @@ Download the compiled model assets from this repository directly to your device.
|
|
| 70 |
|
| 71 |
```bash
|
| 72 |
# Download the model to a local directory
|
| 73 |
-
|
| 74 |
-
hf download simaai/LFM2-VL-1.6B-a16w4 --local-dir LFM2-VL-1.6B-a16w4
|
| 75 |
```
|
| 76 |
|
| 77 |
Alternatively, you can download the compiled model to a host machine and copy it to the Modalix device:
|
|
@@ -87,9 +86,8 @@ scp -r LFM2-VL-1.6B-a16w4 sima@<modalix-ip>:/media/nvme/llima/models/
|
|
| 87 |
```text
|
| 88 |
/media/nvme/llima/
|
| 89 |
├── run.sh
|
| 90 |
-
├── simaai-genai-demo/ # The demo app
|
| 91 |
└── models/
|
| 92 |
-
    └── LFM2-VL-1.6B-a16w4/ #
|
| 93 |
```
|
| 94 |
|
| 95 |
## Usage
|
|
@@ -99,7 +97,7 @@ scp -r LFM2-VL-1.6B-a16w4 sima@<modalix-ip>:/media/nvme/llima/models/
|
|
| 99 |
Navigate to the demo directory and start the application:
|
| 100 |
|
| 101 |
```bash
|
| 102 |
-
cd /media/nvme/llima/
|
| 103 |
./run.sh
|
| 104 |
```
|
| 105 |
|
|
@@ -115,8 +113,7 @@ https://<modalix-ip>:5000/
|
|
| 115 |
|
| 116 |
To use the OpenAI-compatible API, run the model in API mode:
|
| 117 |
```bash
|
| 118 |
-
|
| 119 |
-
./run.sh --httponly --api-only
|
| 120 |
```
|
| 121 |
|
| 122 |
You can interact with it using `curl` or Python.
|
|
@@ -124,11 +121,17 @@ You can interact with it using `curl` or Python.
|
|
| 124 |
**Example: Chat Completion**
|
| 125 |
|
| 126 |
```bash
|
| 127 |
-
# Note:
|
| 128 |
curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
|
| 129 |
-H "Content-Type: application/json" \
|
| 130 |
-d '{
|
|
|
|
|
|
|
| 131 |
"messages": [
|
|
|
|
|
|
|
|
|
|
|
|
|
| 132 |
{
|
| 133 |
"role": "user",
|
| 134 |
"content": [
|
|
@@ -140,12 +143,11 @@ curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
|
|
| 140 |
},
|
| 141 |
{
|
| 142 |
"type": "text",
|
| 143 |
-
"text": "Describe
|
| 144 |
}
|
| 145 |
]
|
| 146 |
}
|
| 147 |
-
]
|
| 148 |
-
"stream": true
|
| 149 |
}'
|
| 150 |
```
|
| 151 |
*Replace \<modalix-ip\> with the IP address of your Modalix device.*
|
|
@@ -159,11 +161,11 @@ curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
|
|
| 159 |
## Troubleshooting
|
| 160 |
|
| 161 |
- **`sima-cli` not found**: Ensure that sima-cli is installed on your Modalix device.
|
| 162 |
-
- **Model can't be run**: Verify the model directory is exactly inside `/media/nvme/llima/models` and not nested (e.g., `/media/nvme/llima/models/LFM2-VL-1.6B-a16w4/LFM2-VL-1.6B-a16w4
|
| 163 |
- **Permission Denied**: Ensure you have read/write permissions for the `/media/nvme` directory.
|
| 164 |
|
| 165 |
## Resources
|
| 166 |
|
| 167 |
- [SiMa.ai Documentation](https://docs.sima.ai)
|
| 168 |
- [SiMa.ai Hugging Face Organization](https://huggingface.co/simaai)
|
| 169 |
-
- [Liquid AI Website](https://liquid.ai/)
|
|
|
|
| 9 |
- generative_ai
|
| 10 |
- embedded
|
| 11 |
- sima
|
| 12 |
+
- liquidai
|
| 13 |
+
- lfm2
|
| 14 |
pipeline_tag: image-text-to-text
|
| 15 |
base_model: LiquidAI/LFM2-VL-1.6B
|
| 16 |
---
|
|
|
|
| 18 |
|
| 19 |
## Overview
|
| 20 |
|
| 21 |
+
This repository contains the **LFM2-VL-1.6B-a16w4** model, optimized and compiled for the **SiMa.ai Modalix** platform.
|
| 22 |
|
| 23 |
- **Model Architecture:** LFM2-VL (1.6B parameters)
|
| 24 |
- **Quantization:** Hybrid
|
| 25 |
- **Prompt Processing:** A16W8 (16-bit activations, 8-bit weights)
|
| 26 |
- **Token Generation:** A16W4 (16-bit activations, 4-bit weights)
|
| 27 |
- **Maximum context length:** 2048
|
| 28 |
+
- **Input Resolution:** 512x512 (Fixed)
|
| 29 |
- **Source Model:** [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
|
| 30 |
|
|
|
|
|
|
|
|
|
|
| 31 |
## Performance
|
| 32 |
|
| 33 |
+
The following performance metrics were measured with an image and a text prompt of 7 tokens.
|
| 34 |
|
| 35 |
| Model | Precision | Device | Response Rate (tokens/sec) | Time To First Token (sec) |
|
| 36 |
|---|---|---|---|---|
|
| 37 |
+
| LFM2-VL-1.6B-a16w4 | A16W8/A16W4 | Modalix | 81.09 | 0.28 |
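
As a rough reading of the figures in this table, the total response time can be approximated as the time to first token plus the number of generated tokens divided by the response rate. The sketch below assumes the decode rate stays constant over the whole response, which the table does not state. The function name `estimate_response_time` is illustrative only.

```python
# Rough latency estimate from the performance table.
# Assumption: the decode rate stays constant for the whole response.
TTFT_SEC = 0.28          # Time To First Token, from the table
TOKENS_PER_SEC = 81.09   # Response Rate, from the table


def estimate_response_time(num_tokens: int) -> float:
    """Return the approximate seconds needed to generate num_tokens tokens."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return TTFT_SEC + num_tokens / TOKENS_PER_SEC


if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(f"{n} tokens: ~{estimate_response_time(n):.2f} s")
```

Measured latency on a device may differ, since the figures were taken with one image and a 7-token text prompt.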
|
| 38 |
|
| 39 |
|
| 40 |
## Prerequisites
|
|
|
|
| 49 |
|
| 50 |
Follow these steps to deploy the model to your Modalix device.
|
| 51 |
|
| 52 |
+
### 1. Install LLiMa Demo Application
|
| 53 |
+
> **Note:** This is a **one-time setup**. If you have already installed the LLiMa demo application (e.g., for another model), you can skip this step and continue with the model download.
|
| 54 |
|
| 55 |
+
On your Modalix device, install the LLiMa demo application using `sima-cli`:
|
| 56 |
|
| 57 |
```bash
|
| 58 |
# Create a directory for LLiMa
|
| 59 |
cd /media/nvme
|
| 60 |
+
mkdir llima
|
| 61 |
cd llima
|
| 62 |
+
# Install the LLiMa runtime code
|
| 63 |
+
sima-cli install -v 2.1.0 tools/llima -t selection
|
| 64 |
```
|
| 65 |
+
> **Note:** To only download the LLiMa runtime code, select **Demo Web App** when prompted.
|
| 66 |
|
| 67 |
### 2. Download the Model
|
| 68 |
|
|
|
|
| 70 |
|
| 71 |
```bash
|
| 72 |
# Download the model to a local directory
|
| 73 |
+
llima pull LFM2-VL-1.6B-a16w4
|
|
|
|
| 74 |
```
|
| 75 |
|
| 76 |
Alternatively, you can download the compiled model to a host machine and copy it to the Modalix device:
|
|
|
|
| 86 |
```text
|
| 87 |
/media/nvme/llima/
|
| 88 |
├── run.sh
|
|
|
|
| 89 |
└── models/
|
| 90 |
+
    └── LFM2-VL-1.6B-a16w4/ # The compiled model
|
| 91 |
```
|
| 92 |
|
| 93 |
## Usage
|
|
|
|
| 97 |
Navigate to the demo directory and start the application:
|
| 98 |
|
| 99 |
```bash
|
| 100 |
+
cd /media/nvme/llima/
|
| 101 |
./run.sh
|
| 102 |
```
|
| 103 |
|
|
|
|
| 113 |
|
| 114 |
To use the OpenAI-compatible API, run the model in API mode:
|
| 115 |
```bash
|
| 116 |
+
llima run LFM2-VL-1.6B-a16w4 --mode web
|
|
|
|
| 117 |
```
|
| 118 |
|
| 119 |
You can interact with it using `curl` or Python.
|
|
|
|
| 121 |
**Example: Chat Completion**
|
| 122 |
|
| 123 |
```bash
|
| 124 |
+
# Note: Replace <YOUR_BASE64_STRING_HERE> with an actual base64 encoded image string.
|
| 125 |
curl -N -k -X POST "https://<modalix-ip>:9998/v1/chat/completions" \
|
| 126 |
-H "Content-Type: application/json" \
|
| 127 |
-d '{
|
| 128 |
+
"model": "sima-vlm",
|
| 129 |
+
"stream": true,
|
| 130 |
"messages": [
|
| 131 |
+
{
|
| 132 |
+
"role": "system",
|
| 133 |
+
"content": "You are a helpful assistant."
|
| 134 |
+
},
|
| 135 |
{
|
| 136 |
"role": "user",
|
| 137 |
"content": [
|
|
|
|
| 143 |
},
|
| 144 |
{
|
| 145 |
"type": "text",
|
| 146 |
+
"text": "Describe the image in two sentences."
|
| 147 |
}
|
| 148 |
]
|
| 149 |
}
|
| 150 |
+
]
|
|
|
|
| 151 |
}'
|
| 152 |
```
|
| 153 |
*Replace \<modalix-ip\> with the IP address of your Modalix device.*
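
The same request could be sent from Python. The following is a minimal sketch using only the standard library, not a confirmed client for this API. The image part of the message is elided in the curl example, so the `image_url` entry with a base64 data URL is an assumption based on the general OpenAI chat format. The names `build_payload` and `stream_chat` are hypothetical. Certificate verification is disabled to mirror `curl -k`.

```python
import base64
import json
import ssl
import urllib.request


def build_payload(image_bytes: bytes, prompt: str) -> dict:
    """Build a chat-completion request body mirroring the curl example."""
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "sima-vlm",
        "stream": True,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {
                "role": "user",
                "content": [
                    # Assumption: OpenAI-style image part; the curl example
                    # elides this entry, so the exact shape is not confirmed.
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                    {"type": "text", "text": prompt},
                ],
            },
        ],
    }


def stream_chat(host: str, image_path: str, prompt: str) -> None:
    """POST the request and print the raw streamed lines."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        f"https://{host}:9998/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Equivalent of curl -k: skip certificate verification.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        for line in response:
            print(line.decode("utf-8").rstrip())
```

A call such as `stream_chat("<modalix-ip>", "photo.jpg", "Describe the image in two sentences.")` would print the streamed response lines as they arrive. Here `photo.jpg` is a placeholder file name.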
|
|
|
|
| 161 |
## Troubleshooting
|
| 162 |
|
| 163 |
- **`sima-cli` not found**: Ensure that sima-cli is installed on your Modalix device.
|
| 164 |
+
- **Model can't be run**: Verify the model directory is exactly inside `/media/nvme/llima/models/` and not nested (e.g., `/media/nvme/llima/models/LFM2-VL-1.6B-a16w4/LFM2-VL-1.6B-a16w4`).
|
| 165 |
- **Permission Denied**: Ensure you have read/write permissions for the `/media/nvme` directory.
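
The nested-directory case in the list above can be checked with a few lines of Python. This is a minimal sketch and not part of the LLiMa tooling. The helper name `check_model_dir` and its return values are hypothetical.

```python
from pathlib import Path


def check_model_dir(models_root: str, model_name: str) -> str:
    """Classify the model layout as 'ok', 'nested', or 'missing'."""
    model_dir = Path(models_root) / model_name
    if not model_dir.is_dir():
        return "missing"
    # A same-named directory one level down means the copy was nested.
    if (model_dir / model_name).is_dir():
        return "nested"
    return "ok"


if __name__ == "__main__":
    print(check_model_dir("/media/nvme/llima/models", "LFM2-VL-1.6B-a16w4"))
```

If the result is `nested`, moving the contents of the inner directory up one level should restore the expected layout.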
|
| 166 |
|
| 167 |
## Resources
|
| 168 |
|
| 169 |
- [SiMa.ai Documentation](https://docs.sima.ai)
|
| 170 |
- [SiMa.ai Hugging Face Organization](https://huggingface.co/simaai)
|
| 171 |
+
- [Liquid AI Website](https://liquid.ai/)
|