ykhrustalev committed
Commit 4bb28dd · verified · 1 Parent(s): 50e04ac

Upload README.md with huggingface_hub

Files changed (1):
  README.md +67 -10

README.md CHANGED
@@ -20,19 +20,20 @@ tags:
 - lfm2.5
 - onnx
 - onnxruntime
+- webgpu
 base_model:
 - LiquidAI/LFM2.5-1.2B-Instruct
 ---
 
 <div align="center">
-<img
-src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/2b08LKpev0DNEk6DlnWkY.png"
-alt="Liquid AI"
+  <img
+    src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/2b08LKpev0DNEk6DlnWkY.png"
+    alt="Liquid AI"
     style="width: 100%; max-width: 100%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;"
   />
   <div style="display: flex; justify-content: center; gap: 0.5em; margin-bottom: 1em;">
-<a href="https://playground.liquid.ai/"><strong>Try LFM</strong></a> •
-<a href="https://docs.liquid.ai/lfm"><strong>Documentation</strong></a> •
+    <a href="https://playground.liquid.ai/"><strong>Try LFM</strong></a> •
+    <a href="https://docs.liquid.ai/lfm"><strong>Documentation</strong></a> •
     <a href="https://leap.liquid.ai/"><strong>LEAP</strong></a>
   </div>
 </div>
@@ -45,11 +46,14 @@ LFM2.5 is a hybrid architecture combining multiplicative gates and short convolutions
 
 ## Recommended Variants
 
-| Precision | Size | Use Case |
-|-----------|------|----------|
-| Q4 | ~1.2GB | Recommended for most uses |
-| FP16 | ~2.4GB | Higher quality |
-| Q8 | ~1.7GB | Balance of quality and size |
+| Precision | Size | Platform | Use Case |
+|-----------|------|----------|----------|
+| Q4 | ~1.2GB | WebGPU, Server | Recommended for most uses |
+| FP16 | ~2.4GB | WebGPU, Server | Higher quality |
+| Q8 | ~1.7GB | Server only | Balance of quality and size |
+
+- **WebGPU**: Use Q4 or FP16 (Q8 not supported; see the dtype sketch after this hunk)
+- **Server**: All variants supported
 
 ## Model Files
 
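The platform split above maps one-to-one onto the `dtype` strings that Transformers.js uses later in this README ("q4", "fp16", "q8"). A minimal sketch of enforcing that mapping; the `VARIANTS` table and `pickDtype` helper are illustrative, not part of the model repo:

```javascript
// Hypothetical helper: map a target platform to a Transformers.js dtype,
// following the Recommended Variants table above.
const VARIANTS = {
  webgpu: ["q4", "fp16"],        // Q8 is "Server only" per the table
  server: ["q4", "fp16", "q8"],
};

function pickDtype(platform, preferred = "q4") {
  const allowed = VARIANTS[platform] ?? [];
  if (!allowed.includes(preferred)) {
    throw new Error(`${preferred} is not available on ${platform}`);
  }
  return preferred;
}

console.log(pickDtype("webgpu"));        // "q4"
console.log(pickDtype("server", "q8"));  // "q8"
```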
@@ -142,6 +146,59 @@ for step in range(100): # max tokens
 print(tokenizer.decode(generated_tokens, skip_special_tokens=True))
 ```
 
+## WebGPU (Browser)
+
+### Installation
+
+```bash
+npm install @huggingface/transformers
+```
+
+### Enable WebGPU
+
+WebGPU is required for browser inference. To enable it:
+
+1. **Chrome/Edge**: Navigate to `chrome://flags/#enable-unsafe-webgpu`, enable the flag, and restart the browser
+2. **Verify**: Check `chrome://gpu` for "WebGPU" status
+3. **Test**: Run `navigator.gpu.requestAdapter()` in the DevTools console (see the detection sketch after this hunk)
+
+### Inference
+
+```javascript
+import { AutoModelForCausalLM, AutoTokenizer, TextStreamer } from "@huggingface/transformers";
+
+const modelId = "LiquidAI/LFM2.5-1.2B-Instruct-ONNX";
+
+// Load model and tokenizer
+const tokenizer = await AutoTokenizer.from_pretrained(modelId);
+const model = await AutoModelForCausalLM.from_pretrained(modelId, {
+  device: "webgpu",
+  dtype: "q4", // or "fp16"
+});
+
+// Prepare input
+const messages = [{ role: "user", content: "What is the capital of France?" }];
+const input = tokenizer.apply_chat_template(messages, {
+  add_generation_prompt: true,
+  return_dict: true,
+});
+
+// Generate with streaming
+const streamer = new TextStreamer(tokenizer, { skip_prompt: true });
+const output = await model.generate({
+  ...input,
+  max_new_tokens: 256,
+  do_sample: false,
+  streamer,
+});
+
+console.log(tokenizer.decode(output[0], { skip_special_tokens: true }));
+```
+
+### WebGPU Notes
+
+- Supported dtypes: Q4 and FP16 (Q8 is not supported on WebGPU)
+
 ## License
 
 This model is released under the [LFM 1.0 License](LICENSE).
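The verification steps in "Enable WebGPU" collapse into one async check. A minimal sketch using only the standard `navigator.gpu` browser API, no libraries assumed:

```javascript
// Availability check corresponding to steps 2-3 above; runs in any
// browser console or module script.
async function hasWebGPU() {
  if (!("gpu" in navigator)) return false;          // API not exposed at all
  const adapter = await navigator.gpu.requestAdapter();
  return adapter !== null;                          // null: no usable adapter
}

hasWebGPU().then((ok) => console.log(ok ? "WebGPU available" : "WebGPU unavailable"));
```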
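Combining that check with model loading gives a graceful fallback to the CPU backend when WebGPU is missing. A sketch under the assumption that `device: "wasm"` and the dtype names behave as in current Transformers.js releases; the `loadLFM` wrapper is hypothetical, not from the README:

```javascript
import { AutoModelForCausalLM, AutoTokenizer } from "@huggingface/transformers";

// Hypothetical loader: pick backend and variant per the
// Recommended Variants table.
async function loadLFM(modelId = "LiquidAI/LFM2.5-1.2B-Instruct-ONNX") {
  const webgpu = "gpu" in navigator && (await navigator.gpu.requestAdapter()) !== null;
  const tokenizer = await AutoTokenizer.from_pretrained(modelId);
  const model = await AutoModelForCausalLM.from_pretrained(modelId, {
    device: webgpu ? "webgpu" : "wasm", // "wasm" = CPU backend in Transformers.js
    dtype: webgpu ? "q4" : "q8",        // Q8 only off WebGPU, per the notes above
  });
  return { tokenizer, model };
}
```

The generate and streaming calls from the Inference section should then work unchanged on either backend.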