Vet-Rate Vision Phi (q4f32_1 WebGPU)

A WebGPU-optimized compilation of Microsoft's Phi-3.5 Vision model for browser-based inference.

Model Description

This is a quantized (q4f32_1) version of microsoft/Phi-3.5-vision-instruct compiled for WebGPU using MLC-LLM. It's designed for use with WebLLM in standard browsers without requiring experimental Chrome flags.

Key Features

  • 🚀 Browser-native: Runs entirely in-browser via WebGPU
  • 📷 Vision capable: Supports image understanding and analysis
  • ⚡ Optimized: q4f32_1 quantization for efficient memory usage
  • 🔒 Privacy-first: All processing happens locally on your device
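
Because everything runs in-browser, WebGPU availability is the main prerequisite. A minimal feature check before loading the model (the `navigator.gpu` check is the standard WebGPU detection; the `typeof` guard simply keeps the snippet safe outside a browser):

```javascript
// Detect WebGPU support before attempting to load the model.
// In a browser, navigator.gpu is defined only when WebGPU is available;
// the typeof guard keeps this safe in non-browser environments.
const hasWebGPU =
  typeof navigator !== "undefined" && "gpu" in navigator;

if (!hasWebGPU) {
  console.warn("WebGPU is not available; the model cannot run here.");
}
```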

Technical Specifications

Property          Value
Base Model        microsoft/Phi-3.5-vision-instruct
Quantization      q4f32_1 (4-bit weights, float32 compute dtype)
Model Size        ~2.6 GB (quantized weights)
WASM Library      6.6 MB
Context Window    131,072 tokens
Parameters        ~4B
Vision Encoder    CLIP ViT-L/14 (336px)
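
Models that are not in WebLLM's prebuilt list typically have to be registered through a custom appConfig so CreateMLCEngine can resolve the weights and the compiled WASM library. A sketch under that assumption — the two URLs below are illustrative placeholders, not confirmed paths for this repository:

```javascript
// Register the model with WebLLM via a custom appConfig.
// Both URLs are placeholders; substitute the actual weight repo
// and compiled WASM library location for this model.
const appConfig = {
  model_list: [
    {
      model: "https://huggingface.co/Vet-Rate-org/Vet-Rate-Vision-Phi", // quantized weights (~2.6 GB)
      model_id: "Vet-Rate-Vision-Phi",
      model_lib: "https://example.com/path/to/phi-3.5-vision-q4f32_1-webgpu.wasm" // 6.6 MB WASM library
    }
  ]
};

// Then pass it alongside the model id:
// const engine = await CreateMLCEngine("Vet-Rate-Vision-Phi", { appConfig });
```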

Usage with WebLLM

import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("Vet-Rate-org/Vet-Rate-Vision-Phi");

// Text-only chat
const textResponse = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }]
});

// Vision chat (with image)
const visionResponse = await engine.chat.completions.create({
  messages: [{
    role: "user",
    content: [
      { type: "image_url", image_url: { url: "data:image/jpeg;base64,..." } },
      { type: "text", text: "What do you see in this image?" }
    ]
  }]
});
