FoolDev, Claude Opus 4.7 committed
Commit 8a225df · 1 Parent(s): 4208793

CHANGELOG: log the README "27B vs 35B receipts" and Ollama-vision tightening

Two doc commits today (bec5589, 4208793) weren't reflected in the
Unreleased section. Adding them under Changed for parity with how
prior README work was logged in this section — the receipts entry
captures the measured 27B/35B perf comparison, the vision entry
captures the corrected understanding of where the architecture error
actually fires.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Files changed (1)
  1. CHANGELOG.md +15 -0
CHANGELOG.md CHANGED
@@ -49,6 +49,21 @@ and documentation**, not the underlying base model.
  now points 32 GB users at `make build QUANT=Q3_K_S` directly.

  ### Changed
+ - README "Why a 27B variant?": the "slower per token than 35B-A3B"
+ claim now ships with a measured comparison — 27B Q3_K_S clocks ~10
+ tok/s vs ~27 tok/s for the 35B at ~Q4 on the same Ryzen AI Max+ 395
+ / Radeon 8060S iGPU (`make bench`). Same hardware, asymmetric quants
+ (35B doesn't ship a Q3_K_S), but the ~2.7x ratio is what you'd
+ expect from ~3B vs 27B active params.
+ - README + `examples/README.md` Vision sections: tightened the
+ Ollama-0.22 failure-mode description. Previously implied that
+ `ollama create` itself errors with `unknown model architecture`.
+ Empirically (against dense `qwen35` 27B + `mmproj-F16`), `ollama
+ create` succeeds, `ollama show` even reports the `vision`
+ capability with a CLIP projector attached, and the architecture
+ error only fires from the runner on the first inference request —
+ at which point it blocks text inference too. Matches the upstream
+ issue's "blocks ALL inference" phrasing.
  - README: added a TL;DR section right after the intro paragraph so
  someone scanning the page gets a working command without scrolling
  past Why / What's here / Architecture.
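
The ~2.7x figure in the first added entry is the ratio of the two quoted throughputs. A minimal sketch of that arithmetic follows. It uses the approximate numbers from the entry itself. The constant and function names are hypothetical and do not come from the repository or from `make bench`.

```python
# Approximate throughputs quoted in the changelog entry (tokens per second).
# These are rounded figures from the entry, not fresh measurements.
TOK_S_27B_Q3_K_S = 10.0  # dense 27B at Q3_K_S
TOK_S_35B_A3B_Q4 = 27.0  # 35B-A3B at roughly Q4


def speedup(fast_tok_s: float, slow_tok_s: float) -> float:
    """Return how many times faster the first throughput is than the second."""
    if fast_tok_s <= 0 or slow_tok_s <= 0:
        raise ValueError("throughput must be positive")
    return fast_tok_s / slow_tok_s


def ms_per_token(tok_s: float) -> float:
    """Convert tokens per second into milliseconds per generated token."""
    if tok_s <= 0:
        raise ValueError("throughput must be positive")
    return 1000.0 / tok_s


if __name__ == "__main__":
    ratio = speedup(TOK_S_35B_A3B_Q4, TOK_S_27B_Q3_K_S)
    print(f"35B-A3B vs 27B: {ratio:.1f}x")
    print(f"27B latency: {ms_per_token(TOK_S_27B_Q3_K_S):.0f} ms/token")
```

Because the two models were benchmarked at different quantizations, the ratio is a rough comparison on the same hardware, not a like-for-like measurement.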