LLM Compressor testing - a nm-testing Collection

Updated Nov 17, 2025

nm-testing/tinysmokellama-3.2
354k • Updated 19 days ago • 86.3k

nm-testing/llama2.c-stories42M-pruned2.4
Updated Oct 29, 2025 • 324

nm-testing/tinyllama-fp8-dynamic-compressed
1B • Updated Oct 9, 2024 • 487

nm-testing/tinyllama-w4a16-compressed
1B • Updated Oct 9, 2024 • 1.17k

nm-testing/tinyllama-w8a8-compressed
1B • Updated Oct 9, 2024 • 769

nm-testing/tinyllama-w8a16-dense
1B • Updated Mar 7 • 581

nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-compressed
1B • Updated Jan 14, 2025 • 456

nm-testing/TinyLlama-1.1B-Chat-v1.0-FP8-Dynamic-uncompressed
1B • Updated Jan 14, 2025 • 156

nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-compressed
1B • Updated Jan 14, 2025 • 399

nm-testing/TinyLlama-1.1B-Chat-v1.0-W4A16-G128-uncompressed
1B • Updated Jan 14, 2025 • 163

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-compressed
1B • Updated Jan 14, 2025 • 382

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A8-Dynamic-Per-Token-uncompressed
1B • Updated Jan 14, 2025 • 158

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-compressed
1B • Updated Jan 14, 2025 • 433

nm-testing/TinyLlama-1.1B-Chat-v1.0-W8A16-G128-uncompressed
1B • Updated Jan 14, 2025 • 164