Models: 178

Active filters: gpu

ConfidentialMind/gte-multilingual-reranker-base-onnx-op14-opt-gpu

Sentence Similarity • Updated Jul 7, 2025 • 8

ConfidentialMind/gte-multilingual-reranker-base-onnx-op19-opt-gpu

Sentence Similarity • Updated Jul 7, 2025 • 17

langutang/protege-lg

Robotics • Updated Apr 26, 2025

sbeierle/fame-pytorch-kit

Updated Apr 28, 2025

excribe/classifer_sgd_longformer_4099

Text Classification • 0.1B • Updated May 6, 2025 • 5

lilbablo/humigencev2

Text Generation • Updated Oct 1, 2025

AhmedAyman/k2-think-cuda-1505

Text Generation • Updated Oct 26, 2025 • 3

Eltamuan/Gravitas-Torch-2.8-Blackwell-Edition

Updated Nov 3, 2025

magiccodingman/Qwen3-4B-Instruct-2507-MXFP4-Hybrid-GGUF

Text Generation • 4B • Updated Dec 3, 2025 • 114

magiccodingman/Qwen3-4B-Thinking-2507-MXFP4-Hybrid-GGUF

Text Generation • 4B • Updated Dec 3, 2025 • 37 • 1

magiccodingman/Qwen3-4B-Thinking-2507-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 4B • Updated Dec 3, 2025 • 32 • 1

magiccodingman/Qwen3-4B-Instruct-2507-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 4B • Updated Dec 3, 2025 • 109 • 2

magiccodingman/Seed-OSS-36B-Instruct-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 36B • Updated Dec 3, 2025 • 14 • 1

magiccodingman/Granite-4.0-H-350M-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 0.3B • Updated Dec 2, 2025 • 23

magiccodingman/Granite-4.0-H-1B-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 1B • Updated Dec 3, 2025 • 36

magiccodingman/Apriel-1.5-15b-Thinker-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 14B • Updated Apr 25 • 291

magiccodingman/Qwen3-VL-8B-Thinking-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 8B • Updated Apr 25 • 326 • 1

magiccodingman/Qwen3-VL-8B-Instruct-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 8B • Updated Dec 3, 2025 • 137 • 2

magiccodingman/Qwen3-VL-32B-Thinking-Unsloth-MXFP4-Hybrid-GGUF

Text Generation • 33B • Updated Dec 3, 2025 • 47

magiccodingman/Granite-4.0-H-350M-Unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 0.3B • Updated Apr 25 • 99 • 1

magiccodingman/Qwen3-4B-Instruct-2507-Unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 4B • Updated Apr 25 • 541 • 8

magiccodingman/Qwen3-4B-Thinking-2507-Unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 4B • Updated Apr 25 • 501 • 2

magiccodingman/Qwen3-30B-A3B-Thinking-2507-unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 31B • Updated Apr 25 • 511 • 5

magiccodingman/Qwen3-30B-A3B-Instruct-2507-unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 31B • Updated Apr 25 • 457 • 4

magiccodingman/Seed-OSS-36B-Instruct-unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 36B • Updated Apr 25 • 1.03k • 10

magiccodingman/Apriel-1.5-15b-Thinker-unsloth-MagicQuant-Hybrid-GGUF

Text Generation • 14B • Updated Apr 25 • 369 • 2

Stan31/quantumflow-prototypes

Updated Dec 19, 2025

Jong-Seong/qwen3-next-gb10-guide

Hellohal2064/vllm-dgx-spark-gb10

Text Generation • Updated Jan 6 • 5

Jens-Duttke/DepthPro-ONNX-HighPerf

Depth Estimation • Updated Jan 30 • 6 • 1