Active filters: kv-cache
flovflo/turboquant-mlx-qwen35-kv
caiovicentino1/Qwen3.5-9B-Claude-Opus-PolarQuant-Q5
Text Generation
• 9B • Updated • 572 • 1
caiovicentino1/Qwopus3.5-9B-v3-PolarQuant-Q5
Text Generation
• 9B • Updated • 454 • 1
fromthesky/PLDR-LLM-v51-104M
Text Generation
• 0.1B • Updated • 2
fromthesky/PLDR-LLM-v51-110M-1
Text Generation
• 0.1B • Updated • 6
fromthesky/PLDR-LLM-v51-110M-2
Text Generation
• 0.1B • Updated • 3
fromthesky/PLDR-LLM-v51-110M-3
Text Generation
• 0.1B • Updated • 7
fromthesky/PLDR-LLM-v51-110M-4
Text Generation
• 0.1B • Updated • 5
fromthesky/PLDR-LLM-v51-110M-5
Text Generation
• 0.1B • Updated • 2
fromthesky/PLDR-LLM-v51-DAG-110M
Text Generation
• 0.1B • Updated • 1
fromthesky/PLDR-LLM-v51G-106M-1
Text Generation
• 0.1B • Updated • 2
fromthesky/PLDR-LLM-v51G-106M-2
Text Generation
• 0.1B • Updated • 5
fromthesky/PLDR-LLM-v51G-106M-3
Text Generation
• 0.1B • Updated • 5
fromthesky/PLDR-LLM-v51G-106M-test
Text Generation
• 0.1B • Updated • 1
fromthesky/PLDR-LLM-v52-81M-FT-SC-1
Text Classification
• 81M • Updated • 2
fromthesky/PLDR-LLM-v52-81M-FT-QA-1
Question Answering
• 81M • Updated • 1
fromthesky/PLDR-LLM-v52-81M-FT-TC-1
Token Classification
• 81M • Updated • 4
fromthesky/PLDR-LLM-v52-110M-1
Text Generation
• 0.1B • Updated • 2
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated