RiverkanIT/Ling-mini-2.0-Quantized

riverkan committed on Sep 16, 2025

Commit 6e45b1c (verified) · 1 Parent(s): 5aa2b05

Update README.md

Files changed (1)

README.md +11 -1

@@ -131,9 +131,19 @@ In interactive mode (`-i`), simply paste your question and press Enter. The chat
 ## Performance (CPU)
 - Q4_0 on AMD Ryzen 5 5600G with Radeon Graphics (3.90 GHz): ~35 tokens/second (output), measured in a typical chat generation scenario.
 - Actual throughput varies with prompt length, context size, threads, OS, and build flags.
-- Q8_0 generally yields higher quality but slightly lower token/s vs Q4_0 on the same CPU.
 ## Which file should I choose?

 ## Performance (CPU)
+```bash
+./build/bin/main -m ling-mini-2.0-q4.bin --seed 1
+```
 - Q4_0 on AMD Ryzen 5 5600G with Radeon Graphics (3.90 GHz): ~35 tokens/second (output), measured in a typical chat generation scenario.
+```bash
+./build/bin/main -m ling-mini-2.0-q8.bin --seed 1
+```
+- Q8_0 on AMD Ryzen 5 5600G with Radeon Graphics (3.90 GHz): ~20 tokens/second (output), measured in a typical chat generation scenario.
+Notes:
 - Actual throughput varies with prompt length, context size, threads, OS, and build flags.
 ## Which file should I choose?
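
To put the two throughput figures in this change side by side, here is a minimal sketch that converts a tokens/second rate into an approximate generation time. The helper name `estimate_seconds` and the 512-token reply length are assumptions chosen for illustration; only the ~35 and ~20 tokens/second rates come from the README text in this diff.

```bash
#!/usr/bin/env bash
# Rough generation-time estimate from a measured throughput.
# estimate_seconds is a hypothetical helper, not part of the model repo or its tooling.
estimate_seconds() {
  local tokens="$1" rate="$2"
  # awk does the floating-point division; LC_ALL=C keeps "." as the decimal separator.
  LC_ALL=C awk -v t="$tokens" -v r="$rate" 'BEGIN { printf "%.1f\n", t / r }'
}

# 512 output tokens is an arbitrary example length.
estimate_seconds 512 35   # Q4_0 at ~35 tokens/second
estimate_seconds 512 20   # Q8_0 at ~20 tokens/second
```

For a 512-token reply this works out to roughly 14.6 seconds on Q4_0 versus 25.6 seconds on Q8_0, so Q4_0 is about 1.75 times faster on this CPU. As the README notes, actual throughput varies with prompt length, context size, threads, OS, and build flags, so treat these as ballpark numbers only.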