# Vet-Rate Vision Phi (q4f32_1 WebGPU)
A WebGPU-optimized compilation of Microsoft's Phi-3.5 Vision model for browser-based inference.
## Model Description
This is a quantized (q4f32_1) version of microsoft/Phi-3.5-vision-instruct compiled for WebGPU using MLC-LLM. It's designed for use with WebLLM in standard browsers without requiring experimental Chrome flags.
## Key Features
- **Browser-native**: Runs entirely in-browser via WebGPU
- **Vision-capable**: Supports image understanding and analysis
- **Optimized**: q4f32_1 quantization for efficient memory usage
- **Privacy-first**: All processing happens locally on your device
## Technical Specifications
| Property | Value |
|---|---|
| Base Model | microsoft/Phi-3.5-vision-instruct |
| Quantization | q4f32_1 (int4, float32 model dtype) |
| Model Size | ~2.6 GB (quantized weights) |
| WASM Library | 6.6 MB |
| Context Window | 131,072 tokens |
| Parameters | ~4B |
| Vision Encoder | CLIP ViT-L/14 (336px) |
## Usage with WebLLM
```javascript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Vet-Rate-org/Vet-Rate-Vision-Phi");

// Text-only chat
const textResponse = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }]
});

// Vision chat (with an image passed as a data URL)
const visionResponse = await engine.chat.completions.create({
  messages: [{
    role: "user",
    content: [
      { type: "image_url", image_url: { url: "data:image/jpeg;base64,..." } },
      { type: "text", text: "What do you see in this image?" }
    ]
  }]
});
```
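Because this model id is not in WebLLM's prebuilt model list, the engine typically needs an `appConfig` entry pointing at the weight repository and compiled WASM library before `CreateMLCEngine` can resolve it. A minimal sketch, with assumed URLs (verify against where the weights and WASM are actually hosted):

```javascript
import { CreateMLCEngine, prebuiltAppConfig } from "@mlc-ai/web-llm";

// Hypothetical URLs -- adjust to the actual repo and WASM paths.
const appConfig = {
  ...prebuiltAppConfig,
  model_list: [
    ...prebuiltAppConfig.model_list,
    {
      model: "https://huggingface.co/Vet-Rate-org/Vet-Rate-Vision-Phi",
      model_id: "Vet-Rate-org/Vet-Rate-Vision-Phi",
      model_lib: "https://huggingface.co/Vet-Rate-org/Vet-Rate-Vision-Phi/resolve/main/model.wasm",
    },
  ],
};

// Pass the config (and an optional progress callback) at engine creation.
const engine = await CreateMLCEngine("Vet-Rate-org/Vet-Rate-Vision-Phi", {
  appConfig,
  initProgressCallback: (p) => console.log(p.text),
});
```

The spread of `prebuiltAppConfig.model_list` keeps WebLLM's built-in models available alongside the custom entry.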