Update app.py
app.py
CHANGED
@@ -328,7 +328,7 @@ Crucially, the exact moment the threshold hit **31%**, performance collapsed (-2
 ### 6. Conclusion & Core Findings
 1. **Multiple-Choice Interfaces Distort Calibration:** When standard token generation heads are trapped by layout options, internal confidence drops predictably into a narrow **25% to 29% band**.
 2. **Blind Ensembles Generalize Poorly:** Standard majority voting across different inference tracks penalizes the unique correct responses hidden inside sequence likelihood strings.
-3. **The Optimal Architecture:** The most robust execution pipeline for this system is an **Unsupervised Entropy-Gate Router**. By trusting standard token choices when confidence is
+3. **The Optimal Architecture:** The most robust execution pipeline for this system is an **Unsupervised Entropy-Gate Router**. By trusting standard token choices when confidence is at least 29%, and falling back to the position-blind Perplexity engine when confidence drops below 29%, the pipeline maximizes the model's performance without degrading base performance across unseen data distributions.
 """)
 
 # --- Reactive Event Loop ---
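The routing rule described in finding 3 can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the function name `route_answer`, its arguments, and the constant `CONFIDENCE_GATE` are hypothetical, "confidence" is assumed to be the highest option probability from the token head, and a confidence of exactly 29% is assumed to pass the gate.

```python
import math
from typing import List, Sequence, Tuple

# Gate value taken from finding 3; treating exactly 0.29 as "confident" is an assumption.
CONFIDENCE_GATE = 0.29


def route_answer(
    option_probs: Sequence[float],
    option_token_logprobs: Sequence[Sequence[float]],
    gate: float = CONFIDENCE_GATE,
) -> Tuple[int, str]:
    """Pick an answer index and report which route produced it.

    option_probs: hypothetical token-head probabilities, one per answer option.
    option_token_logprobs: hypothetical per-token log-probabilities of each
        option's full text, used only by the perplexity fallback.
    """
    probs: List[float] = list(option_probs)
    confidence = max(probs)

    # Confident enough: trust the standard token choice.
    if confidence >= gate:
        return probs.index(confidence), "token"

    # Otherwise score each option by perplexity (exp of the mean negative
    # log-likelihood), which ignores the option's position in the layout.
    perplexities = [
        math.exp(-sum(logprobs) / len(logprobs))
        for logprobs in option_token_logprobs
    ]
    return perplexities.index(min(perplexities)), "perplexity"
```

With a token-head confidence of 0.40 the first route is taken; with a flat distribution such as `[0.27, 0.25, 0.24, 0.24]` the call falls through to the perplexity comparison and returns the option whose text has the lowest perplexity.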