Update README.md
README.md
CHANGED
@@ -233,6 +233,36 @@ if __name__ == "__main__":
 print(f"{row['D']:5d} {row['avg_cv']:8.4f} {row['in_band_pct']:5.1f}% {row['status']}")
 ```
 
+## Large Vocabulary Ablation
+
+With D=32 fixed, the CV stays between about 0.24 and 0.28 across vocabulary sizes from 32 to 13,000,000 with the pool capped at 512, and across 512 to 500,000 with an uncapped pool. This is consistent with the findings above: at a fixed D, vocabulary size has no systematic effect on the CV.
+
+```
+D=32 fixed. CV across vocab sizes.
+Pool capped at 512 for fair comparison.
+============================================================
+V=        32 D=32 CV=0.2578  0.1s     0MB
+V=       512 D=32 CV=0.2615  0.0s     0MB
+V=     8,192 D=32 CV=0.2578  0.0s     1MB
+V=    65,536 D=32 CV=0.2663  0.0s     8MB
+V=   131,072 D=32 CV=0.2590  0.0s    17MB
+V=   500,000 D=32 CV=0.2745  0.1s    64MB
+V= 1,000,000 D=32 CV=0.2645  0.2s   128MB
+V= 4,000,000 D=32 CV=0.2541  0.9s   512MB
+V=13,000,000 D=32 CV=0.2681  2.9s  1664MB
+
+============================================================
+Now uncapped pool (sample from ALL embeddings):
+============================================================
+V=       512 D=32 CV=0.2591 pool=512
+V=     8,192 D=32 CV=0.2427 pool=8192
+V=    65,536 D=32 CV=0.2684 pool=65536
+V=   500,000 D=32 CV=0.2562 pool=500000
+```
+
+
+
+
 ## Implications for Architecture Design
 
 The band is not a training outcome. It is a geometric property of dimensionality. This means: