The on-device AI framework ecosystem is blooming these days:

1. llama.cpp - All things Whisper, LLMs & VLMs - runs across Metal, CUDA and other backends (AMD/NPU etc.)
https://github.com/ggerganov/llama.cpp

2. MLC - Deploy LLMs across platforms, especially WebGPU (fastest WebGPU LLM implementation out there)
https://github.com/mlc-ai/web-llm

3. MLX - Arguably the fastest general-purpose framework (Mac only) - supports all major image generation (Flux, SDXL, etc.), transcription (Whisper), and LLMs
https://github.com/ml-explore/mlx-examples

4. Candle - Cross-platform general-purpose framework written in Rust - wide coverage across model categories
https://github.com/huggingface/candle

Honorable mentions:

1. Transformers.js - JavaScript (WebGPU) implementation built on top of ONNX Runtime Web (see the sketch after this list)
https://github.com/xenova/transformers.js

2. mistral.rs - Rust implementation for LLMs & VLMs, built on top of Candle
https://github.com/EricLBuehler/mistral.rs

3. Ratchet - Cross-platform, Rust-based WebGPU framework built for battle-tested deployments
https://github.com/huggingface/ratchet

4. ZML - Cross-platform, Zig-based ML framework
https://github.com/zml/zml

Looking forward to seeing how the ecosystem looks 1 year from now - quite bullish on the top 4 atm - but the open-source ecosystem changes quite a bit! 🤗

Also, which frameworks did I miss?
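To make the browser-side option concrete, here is a minimal sketch of what running a model fully client-side with Transformers.js can look like, using its pipeline API. The model id (Xenova/distilgpt2), prompt, and generation options are illustrative assumptions for this example, not something from the post:

```ts
// Minimal sketch: in-browser text generation with Transformers.js.
// Assumes the @xenova/transformers package; the model id and the
// generation options below are illustrative, not from the post.
import { pipeline } from '@xenova/transformers';

async function main() {
  // First call downloads and caches the ONNX weights, then all
  // inference runs locally via onnxruntime-web - no server round-trips.
  const generator = await pipeline('text-generation', 'Xenova/distilgpt2');

  const output = await generator('On-device AI frameworks are', {
    max_new_tokens: 30,
  });

  // The result is an array of { generated_text } objects.
  console.log(output[0].generated_text);
}

main();
```

The same pipeline call works in Node.js as well, which is part of the appeal: one JavaScript API for both server-side and fully in-browser deployment.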