Models: 760

Active filters: speculative-decoding

CosmicRaisins/GLM-5.2-MTP-INT4-15pct

Text Generation • 9B • Updated 5 days ago • 311 • 3

nerkyor/Qwen3.6-27B-DSV4Pro-Thinking-Distill-FP8

Text Generation • 28B • Updated about 11 hours ago • 288 • 3

jcbtc/Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY

Text Generation • 27B • Updated 4 days ago • 453 • 3

skinnyctax/Ornith-1.0-35B-Q6_K-Frankenstein-MTP-GGUF

Text Generation • 0.8B • Updated 2 days ago • 1.35k • 3

lightseekorg/kimi-k2.5-eagle3

3B • Updated Mar 16 • 87.3k • 15

sakamakismile/Huihui-Qwen3.6-27B-abliterated-NVFP4-MTP

Text Generation • 17B • Updated 28 days ago • 56.5k • 63

AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-Multimodal-NVFP4-MTP

Text Generation • 20B • Updated about 14 hours ago • 36.5k • 21

z-lab/gemma-4-26B-A4B-it-DFlash

Text Generation • 0.4B • Updated May 9 • 20k • 53

Youssofal/Qwen3.6-27B-MTPLX-Optimized-Speed

Text Generation • 5B • Updated 22 days ago • 25.2k • 49

rdtand/Qwen3.6-27B-PrismaSCOUT-Blackwell-NVFP4-BF16-vllm

17B • Updated May 4 • 66.5k • 31

kasimat/Qwen3.6-27B-AEON-Ultimate-Uncensored-FP8-MTP

Image-Text-to-Text • 28B • Updated May 6 • 54.6k • 18

z-lab/MiniMax-M2.7-DFlash

Text Generation • 0.6B • Updated 3 days ago • 619 • 25

canada-quant/DeepSeek-V4-Flash-W4A16-FP8-MTP

Text Generation • 51B • Updated 30 days ago • 15.3k • 15

Jackrong/Qwopus3.6-35B-A3B-v1-MTP-GGUF

Image-Text-to-Text • Updated May 28 • 41k • 49

dealignai/Qwen3.6-35B-A3B-MXFP4-CRACK-MTP

Image-Text-to-Text • 6B • Updated May 24 • 7.53k • 26

Jackrong/Qwopus3.5-4B-Coder

Text Generation • 5B • Updated May 28 • 10.2k • 15

PiehSoft/Qwen3.6-40B-Deckard-MTP

Text Generation • 39B • Updated 10 days ago • 2.94k • 12

Janvitos/gemma-4-12B-it-qat-assistant-MTP-Q8_0-GGUF

0.4B • Updated 22 days ago • 45.6k • 30

ManiacLabs/DeepSeek-V4-Flash-EAGLE3.1

Text Generation • Updated 21 days ago • 247 • 6

plunderstruck/Qwen3.6-35B-A3B-MTP-ROCmFP4-GGUF

0.4B • Updated 8 days ago • 5.52k • 6

modal-labs/Qwen3.5-397B-A17B-DFlash

Text Generation • 1B • Updated 13 days ago • 575 • 5

pearsonkyle/Qwopus3.6-27B-Coder-imatrix-MTP-GGUF

Text Generation • 27B • Updated 3 days ago • 5.23k • 4

k-l-lambda/kimi-k2.7-code-eagle3-mla

2B • Updated 4 days ago • 2.38k • 2

AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-Multimodal-MLX-8bit

Image-Text-to-Text • 8B • Updated about 14 hours ago • 925 • 2

inferencerlabs/GLM-5.2-MTP-MLX-Q4

Image-Text-to-Text • 2B • Updated 4 days ago • 1.53k • 2

nerkyor/Qwen3.6-27B-DSV4Pro-Thinking-Distill-NVFP4

Text Generation • 19B • Updated about 13 hours ago • 101 • 2

wang-yang/Ornith-1.0-35B-MTPLX

Text Generation • 35B • Updated 2 days ago • 68 • 2

wang-yang/Ornith-1.0-35B-MTP-GGUF

Text Generation • 36B • Updated 1 day ago • 8 • 2

protoLabsAI/Ornith-1.0-9B-MTP

Updated 1 day ago • 2

neko-legends/Ornith-1.0-35B-AEON-Ultimate-Uncensored-NVFP4-GGUF-MTP

Text Generation • 36B • Updated about 4 hours ago • 2