| Model | Params | Updated | Downloads |
|---|---|---|---|
| inference-optimization/Llama-3.2-1B-Instruct-6.5-bits-mode-heuristic-per-tensor | 1B | 14 days ago | 35 |
| inference-optimization/Llama-3.2-1B-Instruct-6-bits-mode-noise-per-tensor | 1B | 14 days ago | 40 |
| inference-optimization/Llama-3.2-1B-Instruct-6-bits-mode-hybrid-per-tensor | 1B | 14 days ago | 34 |
| inference-optimization/Llama-3.2-1B-Instruct-6-bits-mode-heuristic-per-tensor | 1B | 14 days ago | 35 |
| inference-optimization/Llama-3.2-1B-Instruct-5.5-bits-mode-noise-per-tensor | 1B | 14 days ago | 35 |
| inference-optimization/Llama-3.2-1B-Instruct-5.5-bits-mode-hybrid-per-tensor | 1B | 14 days ago | 35 |
| inference-optimization/Llama-3.2-1B-Instruct-5.5-bits-mode-heuristic-per-tensor | 1B | 14 days ago | 37 |
| inference-optimization/Llama-3.2-1B-Instruct-5-bits-mode-noise-per-tensor | 1B | 14 days ago | 35 |
| inference-optimization/Llama-3.2-1B-Instruct-5-bits-mode-hybrid-per-tensor | 1B | 14 days ago | 38 |
| inference-optimization/Llama-3.2-1B-Instruct-5-bits-mode-heuristic-per-tensor | 1B | 14 days ago | 34 |
| inference-optimization/Meta-Llama-3-8B-Instruct-spinquantR1R2R4-w4a16-gptq | 2B | 15 days ago | 73 |
| inference-optimization/Meta-Llama-3-8B-Instruct-spinquantR1R2R4-w4a16-qmod | 2B | 15 days ago | 18 |
| inference-optimization/Meta-Llama-3-8B-Instruct-spinquantR1R2R4-nvfp4-qmod | 5B | 15 days ago | 21 |
| inference-optimization/Meta-Llama-3-8B-Instruct-spinquantR1R2R4-nvfp4-gptq | 5B | 15 days ago | 22 |