Active filters: 8-bit
FakeRockert543/gemma-4-31b-it-MLX-8bit
Image-Text-to-Text
• 10B • Updated • 1.47k
• 2
prithivMLmods/gemma-4-31B-it-NVFP4
Image-Text-to-Text
• 20B • Updated • 2.22k
• 2
prithivMLmods/gemma-4-31B-it-MXFP4
Image-Text-to-Text
• 19B • Updated • 177
• 2
nightmedia/gemma-4-31B-it-Claude-Opus-Distill-mxfp8-mlx
Image-Text-to-Text
• 9B • Updated • 1.83k
• 2
natfii/Qwen3.5-27B-NVFP4-Opus-GB10
Text Generation
• 16B • Updated • 609
• 2
Neural-ICE/Gemma-4-26B-A4B-it-NVFP4
Text Generation
• 15B • Updated • 805
• 2
ecastera/eva-mistral-catmacaroni-7b-spanish
Text Generation
• 7B • Updated • 14
• 3
MaziyarPanahi/Meta-Llama-3-8B-Instruct-GGUF
Text Generation
• 8B • Updated • 106k
• 102
MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF
Text Generation
• 7B • Updated • 120k
• 135
MaziyarPanahi/Llama-3-Groq-8B-Tool-Use-GGUF
Text Generation
• 8B • Updated • 515
• 7
hxbgsyxh/bitnet_b1_58-3B_bitblas
0.9B • Updated • 4
• 1
HF1BitLLM/Llama3-8B-1.58-100B-tokens
Text Generation
• 3B • Updated • 2.86k
• 209
RedHatAI/Llama-3.2-1B-Instruct-quantized.w8a8
Text Generation
• 1B • Updated • 12.8k
• 8
tiiuae/Falcon3-10B-Instruct-1.58bit
Text Generation
• 3B • Updated • 677
• 25
Text Generation
• 397B • Updated • 5.54k
• 274
nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4
56B • Updated • 63.9k
• 29
nvidia/DeepSeek-V3-0324-NVFP4
Text Generation
• 397B • Updated • 51.9k
• 16
nvidia/DeepSeek-R1-0528-NVFP4
Text Generation
• 397B • Updated • 4.77k
• 43
Thomaschtl/8bit_agressive
0.6B • Updated • 6
• 1
baidu/ERNIE-4.5-300B-A47B-FP8-Paddle
Text Generation
• 300B • Updated • 7
• 17
baidu/ERNIE-4.5-300B-A47B-2Bits-Paddle
Text Generation
• 92B • Updated • 6
• 20
nvidia/Qwen3-235B-A22B-NVFP4
Text Generation
• 133B • Updated • 3.45k
• 15
nvidia/DeepSeek-R1-0528-NVFP4-v2
Text Generation
• 394B • Updated • 585k
• 22
nvidia/DeepSeek-R1-NVFP4-v2
Text Generation
• 394B • Updated • 4.9k
• 6
mlx-community/Qwen3-Coder-30B-A3B-Instruct-8bit
Text Generation
• Updated • 1.09k
• 5
lmstudio-community/gpt-oss-20b-MLX-8bit
Text Generation
• 21B • Updated • 4.74k
• 51
lmstudio-community/gpt-oss-120b-MLX-8bit
Text Generation
• 117B • Updated • 59.9k
• 13
mlx-community/LFM2-VL-450M-8bit
Image-Text-to-Text
• 0.2B • Updated • 811
• 11
mlx-community/LFM2-VL-1.6B-8bit
Image-Text-to-Text
• 0.7B • Updated • 50
• 10
nvidia/Phi-4-reasoning-plus-NVFP4
8B • Updated • 1.48k
• 8