NexaAI/DeepSeek-R1-Distill-Llama-8B-NexaQuant

alanzhuly committed on Feb 12, 2025

Commit 17cf515 · verified · 1 Parent(s): 93bff76

Update README.md

Files changed (1)

README.md +5 -2

@@ -27,16 +27,19 @@ We’ve solved the trade-off by quantizing the DeepSeek R1 Distilled model to on
 Here’s a comparison of how a standard Q4_K_M and NexaQuant-4Bit handle a common investment banking brain teaser question. NexaQuant excels in accuracy while shrinking the model file size by 4 times.
 Prompt: A Common Investment Banking BrainTeaser Question
-There is a 6x8 rectangular chocolate bar made up of small 1x1 bits. We want to break it into the 48 bits. We can break one piece of chocolate horizontally or vertically, but cannot break two pieces together! What is the minimum number of breaks required?
+A stick is broken into 3 parts, by choosing 2 points randomly along its length. With what probability can it form a triangle?
 Right Answer: 1/4
 <div align="center">
-  <img src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/Su_cnRibwFMDmo2lznQky.png" width="80%" alt="Example" />
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/jOtgsAnr6nttS0mnu0snZ.png" width="80%" alt="Example" />
 </div>
 ## Benchmarks
 The benchmarks show that NexaQuant’s 4-bit model preserves the reasoning capacity of the original 16-bit model, delivering uncompromised performance in a significantly smaller memory & storage footprint. Model's general capacity is also greatly improved by NexaQuant.
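
The "Right Answer: 1/4" for the stick prompt can be sanity-checked numerically. The following is a minimal Monte Carlo sketch and is not part of the README or this commit. The function names `can_form_triangle` and `estimate_triangle_probability` are illustrative, and the sketch assumes the two break points are independent and uniformly distributed along a stick of length 1.

```python
import random


def can_form_triangle(a: float, b: float, c: float) -> bool:
    """Triangle inequality: each piece must be shorter than the other two combined."""
    return a < b + c and b < a + c and c < a + b


def estimate_triangle_probability(trials: int = 200_000, seed: int = 0) -> float:
    """Estimate the chance that three pieces of a unit stick form a triangle.

    Assumption (not stated in the README): both break points are drawn
    independently and uniformly from [0, 1).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        # Sort the two break points so the pieces are x, y - x, and 1 - y.
        x, y = sorted((rng.random(), rng.random()))
        if can_form_triangle(x, y - x, 1.0 - y):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Estimated probability: {estimate_triangle_probability():.4f}")
```

Under these assumptions the estimate should land close to 0.25, consistent with the stated answer. A triangle is possible exactly when no piece is longer than half the stick, which is the condition the triangle inequality reduces to here.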