jedisct1 committed on
Commit
c55021c
·
verified ·
1 Parent(s): 1d3db99

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -75,7 +75,7 @@ The first plain `Q2_K` family candidate was small enough, but it was not reliabl
  - embeddings and output tensors stay higher precision because they are important for token identity and exact syntax
  - attention tensors are protected because tool-call and code prompts are structure-heavy
  - the dense first FFN is protected because early-layer representation quality matters disproportionately after heavy quantization
- - MoE down-expert tensors use `Q3_K`; this was kept from the known-good imatrix recipe rather than isolated as the only required choice
+ - MoE down-expert tensors use `Q3_K`, which was a better quality/memory tradeoff than pushing all expert down-projections lower
 
  That is why this is still a Q2-class build, but not the smallest possible Q2 build.
 
@@ -107,7 +107,7 @@ The important point is not that these small harnesses prove universal coding abi
 
  ## Tool-calling validation
 
- Tool calling was exercised in realistic agent loops rather than only checking toy single-call examples. The harness used for this validation was [Swival](https://swival.dev). Nothing in the build is tied to it, and any OpenAI-compatible agent harness should work just as well.
+ Tool calling was exercised in realistic agent loops rather than only checking toy single-call examples. The harness used for this validation was [Swival](https://swival.dev). Nothing in the build is tied to it, and any OpenAI-compatible agent harness is likely to work in much the same way, but Swival is the only one that has actually been put through its paces here.
 
  Validation included:
 