Models (725)

Active filters: speculative-decoding

RedHatAI/Llama-4-Maverick-17B-128E-Instruct-speculator.eagle3

2B • Updated Apr 8 • 30

HathoraResearch/qwen3_30b_moe_eagle3-ultra-1k-sample

0.2B • Updated Sep 10, 2025 • 4 • 2

husj576/GTO-llama31-instruct-8B

Text Generation • Updated Mar 4 • 13

nm-testing/Llama4-Maverick-Eagle3-Speculators-64k-vocab

Updated Oct 22, 2025 • 5

jukofyork/Mistral-Large-Instruct-2411-DRAFT-0.4B-v3.0

0.4B • Updated Oct 28, 2025 • 2

jukofyork/Mistral-Large-Instruct-2411-DRAFT-0.4B-v3.0-GGUF

0.4B • Updated Oct 28, 2025 • 12

mradermacher/Mistral-Large-Instruct-2411-DRAFT-0.4B-v3.0-GGUF

0.4B • Updated Oct 29, 2025 • 104

jukofyork/command-a-03-2025-DRAFT-0.8B-v3.0

0.8B • Updated Oct 29, 2025 • 2 • 1

jukofyork/command-a-03-2025-DRAFT-0.8B-v3.0-GGUF

0.8B • Updated Oct 29, 2025 • 14

taobao-mnn/Qwen3-4B-Instruct-2507-Eagle3

Text Generation • 0.2B • Updated Oct 30, 2025 • 67 • 1

taobao-mnn/Qwen3-VL-2B-Instruct-Eagle3

Text Generation • 0.1B • Updated Oct 31, 2025 • 210 • 6

mradermacher/command-a-03-2025-DRAFT-0.8B-v3.0-GGUF

0.8B • Updated Oct 30, 2025 • 32

mradermacher/command-a-03-2025-DRAFT-0.8B-v3.0-i1-GGUF

0.8B • Updated Dec 10, 2025 • 95

taobao-mnn/Qwen3-VL-4B-Instruct-Eagle3

Text Generation • 0.2B • Updated Nov 3, 2025 • 35

taobao-mnn/Qwen3-VL-2B-Thinking-Eagle3

Text Generation • 0.1B • Updated Nov 3, 2025 • 9

taobao-mnn/Qwen3-VL-4B-Thinking-Eagle3

Text Generation • 0.2B • Updated Nov 10, 2025 • 6 • 1

taobao-mnn/Qwen3-4B-Thinking-2507-Eagle

Text Generation • 0.2B • Updated Nov 10, 2025 • 376 • 1

Zjcxy-SmartAI/Eagle3-Qwen3-8B-zh

Text Generation • Updated Dec 15, 2025 • 303 • 3

taobao-mnn/Qwen3-VL-8B-Instruct-Eagle3

Text Generation • 0.4B • Updated Nov 14, 2025 • 6.48k • 1

JinnP/SGLang-EAGLE3-Qwen3-Coder-30B-A3B-Instruct

Text Generation • 0.2B • Updated Nov 25, 2025 • 10 • 2

thomaskiefer/EAGLE3-Apertus-8B-Instruct-2509

Text Generation • 0.5B • Updated Nov 30, 2025 • 3

RyeCatcher/speculative-decoding-cross-domain-analysis

Updated Nov 30, 2025

loganrobbins/parallel-decoder-transformer

Text Generation • Updated Dec 12, 2025 • 1

z-lab/Qwen3-8B-DFlash-b16

Text Generation • 1B • Updated Apr 7 • 97.7k • 25

z-lab/Qwen3-4B-DFlash-b16

Text Generation • 0.5B • Updated Apr 7 • 39.5k • 27

valcore/DSSD-Llama3-8B

Updated Jan 8 • 3

valcore/DSSD-Qwen3-0.6B

Updated Jan 8 • 5

yschoi31/DeepSeek-V2-Lite-Eagle3

0.2B • Updated Jan 13 • 25

yschoi31/Qwen1.5-MoE-A2.7B-Eagle3

0.1B • Updated Jan 13 • 27

ruv/ruvltra-small

Text Generation • 0.5B • Updated Mar 28 • 29