FoolDev committed on
Commit 5c19c97 · 1 Parent(s): b5a3fb0

Modelfile: log measured Z13 bench data point (10.14 tok/s @ Q3_K_S)

The hardware-notes block lists ASUS ROG Flow Z13-class machines
(32 GB unified) as a borderline target. The README claims ~10 tok/s
for this device, but the Modelfile itself carried only estimates. This
commit records the actual measurement that backs the README claim:

Hardware: ASUS ROG Flow Z13 GZ302EA-RU004W
AMD Ryzen AI Max+ 395 + Radeon 8060S iGPU
32 GB unified LPDDR5X, ROCm gfx1151
Service env: OLLAMA_FLASH_ATTENTION=1, OLLAMA_KV_CACHE_TYPE=q8_0
Quant: Q3_K_S
num_ctx: 16384 (Modelfile default, no override)

bench.sh 3-prompt mix:
short (mergesort one-paragraph, 161 tokens): 10.37 tok/s
medium (mergesort full, 1051 tokens): 10.31 tok/s
long (Bloom filter 120w, 6868 tokens): 10.11 tok/s
aggregate (8080 tokens): 10.14 tok/s

Previous '32 GB unified-memory laptops -> borderline' row stays in
place because it covers the Q4_K_M case; the new line attaches a
hard number to the Q3_K_S configuration that does work.
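
As a sanity check on the figures above, here is a minimal sketch of how the aggregate relates to the per-prompt numbers. It assumes the aggregate is total generated tokens divided by total wall time (the 796.5 s figure is taken from the CHANGELOG entry in this commit); the variable names are illustrative and not taken from bench.sh.

```python
# Token counts and per-prompt rates as reported in the commit message.
runs = {
    "short": (161, 10.37),
    "medium": (1051, 10.31),
    "long": (6868, 10.11),
}
TOTAL_SECONDS = 796.5  # total time reported in the CHANGELOG entry

# Aggregate rate: all generated tokens over all elapsed seconds.
total_tokens = sum(tokens for tokens, _ in runs.values())
aggregate = total_tokens / TOTAL_SECONDS

# Cross-check: rebuild the total time from the (rounded) per-prompt rates.
# It will not match exactly because the reported rates are rounded.
estimated_seconds = sum(tokens / rate for tokens, rate in runs.values())

print(f"{total_tokens} tokens, {aggregate:.2f} tok/s aggregate")
print(f"time rebuilt from per-prompt rates: {estimated_seconds:.1f} s")
```

The aggregate is a token-weighted figure, so it sits close to the long prompt's rate, which contributes most of the tokens.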

Files changed (2)
  1. CHANGELOG.md +10 -0
  2. Modelfile +6 -0
CHANGELOG.md CHANGED
@@ -7,6 +7,16 @@ and documentation**, not the underlying base model.
 
 ## [Unreleased]
 
+### Added
+- `Modelfile` hardware notes: log a measured data point for the
+  ASUS ROG Flow Z13 GZ302EA (Ryzen AI Max+ 395 / Radeon 8060S iGPU,
+  32 GB unified, ROCm gfx1151) — Q3_K_S at num_ctx 16384 reads
+  10.14 tok/s aggregate over the bench.sh 3-prompt mix
+  (10.37 / 10.31 / 10.11 short/medium/long, 8080 tokens / 796.5 s).
+  Backs up the README's ~10 tok/s reference number with the exact
+  measurement that produced it. The previous `⚠ 32 GB unified-memory
+  laptops — borderline` row stays because it covers the Q4_K_M case.
+
 ### Fixed
 - `Makefile`: propagate `TAG` to `scripts/build.sh`. The build target
   declares `TAG ?= janus-27b` at the top of the file (and lists it in
Modelfile CHANGED
@@ -123,3 +123,9 @@ Behavior rules:
 # ✓ Linux box with 32 GB+ RAM (CPU-only) — ~1-3 tok/s
 # ⚠ 32 GB unified-memory laptops — borderline at Q4, try Q3_K_S
 # (~12 GB) and trim num_ctx
+#
+# Measured data point (ASUS ROG Flow Z13 GZ302EA, Ryzen AI Max+ 395 +
+# Radeon 8060S iGPU, 32 GB unified, ROCm gfx1151, OLLAMA_FLASH_ATTENTION=1,
+# OLLAMA_KV_CACHE_TYPE=q8_0):
+# Q3_K_S, num_ctx 16384, 3-prompt mix → 10.14 tok/s aggregate
+# (8080 tokens / 796.5 s; 10.37 / 10.31 / 10.11 short/medium/long)
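
For anyone wanting to reproduce a per-prompt figure like the ones logged above, the following is a rough sketch of one way to read a generation rate from Ollama's `/api/generate` response, which reports `eval_count` (generated tokens) and `eval_duration` (nanoseconds). This is not the contents of bench.sh, which the commit does not show; the helper names, the default host, and the use of `janus-27b` as the model tag (borrowed from the Makefile's `TAG` default) are assumptions.

```python
import json
import urllib.request


def generation_rate(response: dict) -> float:
    """Tokens per second for the generation phase of one Ollama response."""
    # eval_duration is reported in nanoseconds, so convert to seconds first.
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def run_prompt(prompt: str, model: str = "janus-27b",
               host: str = "http://localhost:11434") -> dict:
    """Send one non-streaming request and return the parsed JSON reply.

    The model tag and host are assumptions; adjust them to the local setup.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

With a local Ollama service running, `generation_rate(run_prompt("Explain mergesort in one paragraph."))` would give a number comparable to the short-prompt row, though the exact prompts used for the measurement are not part of this commit.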