nm-testing/Qwen3-0.6B-W4A16-G128
0.2B • Updated • 2
nm-testing/Llama-3.2-1B-Instruct-DEBUG-STRAWBERRY
nm-testing/Llama-3.2-1B-Instruct-DEBUG-COUNTER
nm-testing/TinyLlama-1.1B-compressed-tensors-kv-cache-scheme
Text Generation • 0.4B • Updated • 330
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-attn_head
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-tensor
1B • Updated • 7.08k
nm-testing/Meta-Llama-3-8B-Instruct-awq-NVFP4
nm-testing/testing-llama3.1.8b-2layer-eagle3
nm-testing/CDH-test-nvfp4-awq
5B • Updated • 2
nm-testing/granite-4.0-h-small-FP8-dynamic
Text Generation • 32B • Updated • 10
nm-testing/tinysmokeqwen3moe-W4A16-first-only-CTstable
2.54M • Updated • 18.9k
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/DeepSeek-R1-Distill-Qwen-32B-NVFP4
Text Generation • 19B • Updated • 781 • 2
nm-testing/tinysmokeqwen3moe-W4A16-first-only
2.54M • Updated • 42
nm-testing/tinysmokeqwen3moe
2.93M • Updated • 3
nm-testing/Meta-Llama-3-8B-Instruct-MXFP4
5B • Updated • 4
nm-testing/granite-4.0-h-small-FP8-block
Text Generation • 32B • Updated • 5
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8
8B • Updated • 2
nm-testing/Llama3_2_1B_speculator.eagle3
0.4B • Updated • 42.2k