Models

351

Active filters: cuda

prism-ml/Ternary-Bonsai-27B-gguf

Text Generation • 4B • Updated 15 days ago • 734k • 1.14k

prism-ml/Bonsai-27B-gguf

Text Generation • 4B • Updated 16 days ago • 2.55M • 719

prism-ml/Bonsai-27B-mlx-1bit

Text Generation • 2B • Updated 19 days ago • 204k • 199

prism-ml/Ternary-Bonsai-27B-mlx-2bit

Text Generation • 3B • Updated 19 days ago • 197k • 163

badtheorylabs/BTL-3-Compact

Text Generation • 8B • Updated 9 days ago • 2.44k • 37

EschaLabs/escha-runtime-qwen3moe

Updated 3 days ago • 6

prism-ml/Bonsai-8B-gguf

Text Generation • 8B • Updated Apr 18 • 66k • 762

owensong/Inflect-Nano-v2-ONNX

Text-to-Speech • Updated 7 days ago • 71 • 9

owensong/Inflect-Micro-v2-ONNX

Text-to-Speech • Updated 7 days ago • 138 • 10

ussoewwin/Flash-Attention-2_for_Windows

Updated 15 days ago • 126

prism-ml/Bonsai-1.7B-gguf

Text Generation • 2B • Updated Apr 18 • 44.7k • 84

neroued/Qwen3.6-27B-nvfp4-NInfer

Image-Text-to-Text • Updated 3 days ago • 245 • 3

prism-ml/bonsai-image-ternary-4B-gemlite-2bit

Text-to-Image • Updated Jun 1 • 4.6k • 130

ValiantLabs/gpt-oss-20b-ShiningValiant3

Text Generation • 21B • Updated Dec 9, 2025 • 26 • 20

TheStageAI/thewhisper-large-v3-turbo

Automatic Speech Recognition • 0.8B • Updated 4 days ago • 1.94k • 26

Thelocallab/CUDA-wheels

Updated 3 days ago • 1

prism-ml/Bonsai-4B-gguf

Text Generation • 4B • Updated Apr 18 • 17.8k • 53

prism-ml/bonsai-image-binary-4B-gemlite-1bit

Text-to-Image • Updated Jun 1 • 410 • 48

ESpeech/milfer_denoiser_v1.0

Audio-to-Audio • Updated Jun 1 • 6

neroued/Qwen3.6-27B-NInfer

Image-Text-to-Text • Updated 3 days ago • 2.54k • 4

neroued/Qwen3.6-35B-A3B-NInfer

Image-Text-to-Text • Updated 3 days ago • 5.81k • 2

petr567/Ornith-1.0-35B-MTP-Strix-Halo-Hybrid-GGUF

Text Generation • 36B • Updated 14 days ago • 990 • 7

drbaph/CrisperWhisper2.0-GGML

Automatic Speech Recognition • Updated 6 days ago • 1

SwinliQ-AIs/Bonsai-27B-gguf

Text Generation • 4B • Updated 3 days ago • 240 • 1

elinas/chronos-13b-4bit

Text Generation • Updated Jun 23, 2023 • 12 • 23

CalderaAI/13B-Ouroboros-GPTQ4bit-128g-CUDA

Text Generation • Updated Jul 20, 2023 • 6

marcorez8/llama-cpp-python-windows-blackwell-cuda

Updated Jun 2, 2025 • 4

Prince-1/sarvam-m-onnx

Updated Jun 14, 2025 • 1

ValiantLabs/Qwen3-8B-ShiningValiant3

Text Generation • 8B • Updated Dec 9, 2025 • 9 • 3

mradermacher/Qwen3-8B-ShiningValiant3-GGUF

8B • Updated Jul 8, 2025 • 581 • 2