The on-device AI framework ecosystem is blooming these days:

1. llama.cpp - All things Whisper, LLMs & VLMs - runs across Metal, CUDA and other backends (AMD/NPU etc.)
https://github.com/ggerganov/llama.cpp

2. MLC - Deploy LLMs across platforms, especially WebGPU (fastest WebGPU LLM implementation out there)
https://github.com/mlc-ai/web-llm

3. MLX - Arguably the fastest general-purpose framework (Mac only) - supports all major image generation (Flux, SDXL, etc.), transcription (Whisper), and LLMs
https://github.com/ml-explore/mlx-examples

4. Candle - Cross-platform general-purpose framework written in Rust - wide coverage across model categories
https://github.com/huggingface/candle

Honorable mentions:

1. Transformers.js - JavaScript (WebGPU) implementation built on top of ONNX Runtime Web (see the sketch after this list)
https://github.com/xenova/transformers.js

2. mistral.rs - Rust implementation for LLMs & VLMs, built on top of Candle
https://github.com/EricLBuehler/mistral.rs

3. Ratchet - Cross-platform, Rust-based WebGPU framework built for battle-tested deployments
https://github.com/huggingface/ratchet

4. ZML - Cross-platform, Zig-based ML framework
https://github.com/zml/zml

Looking forward to seeing how the ecosystem looks 1 year from now - quite bullish on the top 4 atm - but the open-source ecosystem changes quite a bit! 🤗

Also, which frameworks did I miss?
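To make the browser-side option concrete, here is a minimal sketch of what running a model fully client-side with Transformers.js can look like, using its pipeline API. The model id (Xenova/distilgpt2), prompt, and generation options are illustrative assumptions for this example, not something from the post:

```ts
// Minimal sketch: in-browser text generation with Transformers.js.
// Assumes the @xenova/transformers package; the model id and the
// generation options below are illustrative, not from the post.
import { pipeline } from '@xenova/transformers';

async function main() {
  // First call downloads and caches the ONNX weights, then all
  // inference runs locally via onnxruntime-web - no server round-trips.
  const generator = await pipeline('text-generation', 'Xenova/distilgpt2');

  const output = await generator('On-device AI frameworks are', {
    max_new_tokens: 30,
  });

  // The result is an array of { generated_text } objects.
  console.log(output[0].generated_text);
}

main();
```

The same pipeline call works in Node.js as well, which is part of the appeal: one JavaScript API for both server-side and fully in-browser deployment.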