- CPU-Hybrid-MoE/DeepSeek-R1-0528-GPU-FP8-GPTQ4 — Text Generation, 106B params
- RedHatAI/Llama-4-Maverick-17B-128E-Instruct-FP8-block — Text Generation, 402B params
- nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8_channel_weight_static_per_tensor-e2e — 1B params
- nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8_tensor_weight_static_per_tensor_act-e2e — 1B params
- nm-testing/TinyLlama-1.1B-Chat-v1.0-kv_cache_default_gptq_tinyllama-e2e — 0.3B params