Upload folder using huggingface_hub
v2_corrected_evaluation/IDENTITY_V2_REPORT_20260122_144008.md
ADDED
@@ -0,0 +1,89 @@
# Identity Reconstruction V2: Fixed Evaluation

**Date:** 2026-01-22T14:40:08.858456

## Critical Fix

The previous version had inverted PASS/FAIL logic:
- ❌ Old: "PASS if curve is steep" (shape-based)
- ✅ New: "PASS if preserved_rate >= 50% at K >= L/2" (performance-based)

With L = 4096 in this run, L/2 = 2048.
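
The performance-based criterion can be sketched as below. This is a minimal illustration, not the code in identity_reconstruction_experiment_v2.py: the function name `verdict` is hypothetical, and it assumes that every tested K >= L/2 must reach the 50% threshold (the report does not say whether one such K would suffice). The rates are the Preserved Rate values from the Results tables.

```python
def verdict(preserved_rate_by_k, seq_len, threshold=50.0):
    """Return "PASS" or "FAIL" under the performance-based criterion.

    Assumption: every tested K >= seq_len / 2 must meet the threshold,
    and at least one such K must have been tested.
    """
    long_range = [rate for k, rate in preserved_rate_by_k.items() if k >= seq_len / 2]
    if long_range and all(rate >= threshold for rate in long_range):
        return "PASS"
    return "FAIL"


# Preserved Rate (%) per K, copied from the Results tables.
collapsed = {0: 100, 64: 20, 128: 0, 256: 20, 512: 20,
             1024: 0, 2048: 0, 4096: 20, 8192: 20}
regularized = {0: 100, 64: 100, 128: 100, 256: 100, 512: 40,
               1024: 0, 2048: 0, 4096: 0, 8192: 0}

print(verdict(collapsed, 4096), verdict(regularized, 4096))
```

Under this reading both checkpoints fail, because neither reaches 50% at any K >= 2048.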

## Results

### Collapsed (No Regularization)

| Checkpoint | c6a8102da1ad8c77 |
|------------|---|
| tau range | [2.4, 9.8] |
| tau mean | 6.6 |

| K | Preserved Rate | Mean Retention |
|---|----------------|----------------|
| 0 | 100% ✅ | 100.0% |
| 64 | 20% ❌ | 32.2% |
| 128 | 0% ❌ | 15.1% |
| 256 | 20% ❌ | 34.6% |
| 512 | 20% ❌ | 28.7% |
| 1,024 | 0% ❌ | 8.9% |
| 2,048 | 0% ❌ | 23.7% |
| 4,096 | 20% ❌ | 29.6% |
| 8,192 | 20% ❌ | 27.9% |

**Verdict:** FAIL
**Basin Width:** 0 (0.0% of L=4096)
**Explanation:** Identity is preserved only at K=0 and collapses by K=64, the smallest nonzero K tested. No meaningful long-range coherence.

### Regularized

| Checkpoint | c3ca1ce88b2083bf |
|------------|---|
| tau range | [0.5, 6931.1] |
| tau mean | 433.9 |

| K | Preserved Rate | Mean Retention |
|---|----------------|----------------|
| 0 | 100% ✅ | 100.0% |
| 64 | 100% ✅ | 88.2% |
| 128 | 100% ✅ | 77.6% |
| 256 | 100% ✅ | 67.4% |
| 512 | 40% ❌ | 49.0% |
| 1,024 | 0% ❌ | 29.8% |
| 2,048 | 0% ❌ | 19.5% |
| 4,096 | 0% ❌ | 15.5% |
| 8,192 | 0% ❌ | 17.6% |

**Verdict:** FAIL
**Basin Width:** 256 (6.2% of L=4096)
**Explanation:** Identity is fully preserved through K=256 and collapses by K=512 (40% preserved). No meaningful long-range coherence.
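
The reported basin widths are consistent with reading "basin width" as the largest tested K reached before the preserved rate first drops below 50%. The sketch below illustrates that reading; the definition is an assumption inferred from the tables, and `basin_width` is a hypothetical name, not necessarily what the experiment script uses.

```python
def basin_width(preserved_rate_by_k, threshold=50.0):
    """Largest tested K reached before the preserved rate first falls below threshold.

    Assumed definition, inferred from the reported values (0 and 256).
    """
    width = 0
    for k in sorted(preserved_rate_by_k):
        if preserved_rate_by_k[k] < threshold:
            break  # the first failure ends the basin
        width = k
    return width


SEQ_LEN = 4096

# Preserved Rate (%) per K, copied from the two Results tables.
collapsed_rates = {0: 100, 64: 20, 128: 0, 256: 20, 512: 20,
                   1024: 0, 2048: 0, 4096: 20, 8192: 20}
regularized_rates = {0: 100, 64: 100, 128: 100, 256: 100, 512: 40,
                     1024: 0, 2048: 0, 4096: 0, 8192: 0}

for name, rates in [("collapsed", collapsed_rates), ("regularized", regularized_rates)]:
    width = basin_width(rates)
    print(name, width, width / SEQ_LEN)
```

With these inputs the function returns 0 for the collapsed checkpoint and 256 for the regularized one, i.e. 256 / 4096 = 0.0625 of the sequence length.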

## Comparison

| Metric | Collapsed | Regularized |
|--------|-----------|-------------|
| Verdict | FAIL | FAIL |
| Basin Width | 0 | 256 |
| Basin Width Ratio | 0.0% | 6.2% |

**Improvement:** YES
**Improvement Factor:** inf× (256 / 0; the ratio is unbounded because the collapsed baseline has a basin width of 0)
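
A ratio against a zero baseline has no finite value, which is why the factor is reported as infinite. A minimal sketch of a guarded computation follows; `improvement_factor` is a hypothetical helper, and returning 1.0 when both widths are 0 is an assumption made here for illustration.

```python
import math


def improvement_factor(baseline_width, new_width):
    """Ratio of new to baseline basin width, guarded against a zero baseline."""
    if baseline_width == 0:
        # Assumption: any gain over a zero baseline counts as unbounded,
        # and 0 -> 0 counts as "no change".
        return math.inf if new_width > 0 else 1.0
    return new_width / baseline_width


factor = improvement_factor(0, 256)
print(f"{factor}x")  # Python renders float infinity as "inf"
```

Because the factor is unbounded here, the absolute basin widths (0 and 256) are the more informative comparison.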

## Per-Oscillator Half-Lives

### Collapsed
```
[8.191648388447712, 5.5110275180164185, 8.868783359291054, 7.578944232474908, 2.7534187831011967, 9.80497881309404, 8.08911761592282, 8.288514442215634, 3.0249090614043674, 5.603087503164535, 4.966384193860649, 9.414119910788811, 7.150920960645315, 8.582092906166643, 5.54731359061865, 3.8179097742782147, 6.436678296126676, 2.510538048833402, 8.62104937594066, 7.053315192976514, 8.064701920682985, 4.836207745038946, 9.765584195159231, 9.144968970577576, 8.227067976590098, 3.557109662815741, 5.733768029816274, 2.35043012629783, 3.234315936540382, 7.464391625939635, 7.958097247262541, 9.740077859473676]
```

### Regularized
```
[0.5986606020374855, 0.5028913517239931, 1.0160728924654787, 0.5986606020347446, 0.47764157767363097, 1.0160728924446252, 0.5986606020368003, 0.5986606020354299, 0.48694254072470705, 0.5028913517239931, 0.5028913517227875, 1.016072892460572, 0.47764157767363097, 0.7330948926419327, 0.5028913517179652, 0.5025812461450115, 0.7330948926535658, 6931.125226233421, 0.5028913517227875, 0.7330948926435946, 0.5986606020368003, 0.5028913517203764, 1.0160728924507585, 1.0160728924507585, 0.5986606020368003, 0.5025812461377809, 0.4869425407241158, 6931.125226233421, 0.4869425407264809, 4.8104604306517835, 0.5986606020368003, 1.0160728924507585]
```
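
The tau range and tau mean reported for the regularized checkpoint can be recomputed from the half-lives, as sketched below. The values are copied from the Regularized list and rounded to four decimals for brevity; the median and the cutoff of 100 for a "long" half-life are additions for illustration and do not appear in the report.

```python
import statistics

# Regularized per-oscillator half-lives, rounded to 4 decimals.
regularized_tau = [
    0.5987, 0.5029, 1.0161, 0.5987, 0.4776, 1.0161, 0.5987, 0.5987,
    0.4869, 0.5029, 0.5029, 1.0161, 0.4776, 0.7331, 0.5029, 0.5026,
    0.7331, 6931.1252, 0.5029, 0.7331, 0.5987, 0.5029, 1.0161, 1.0161,
    0.5987, 0.5026, 0.4869, 6931.1252, 0.4869, 4.8105, 0.5987, 1.0161,
]

tau_min = min(regularized_tau)
tau_max = max(regularized_tau)
tau_mean = statistics.fmean(regularized_tau)
tau_median = statistics.median(regularized_tau)

# Illustrative cutoff (an assumption): count oscillators with half-life above 100.
n_long = sum(tau > 100 for tau in regularized_tau)

print(f"range=[{tau_min:.1f}, {tau_max:.1f}] mean={tau_mean:.1f} "
      f"median={tau_median:.1f} long={n_long}/{len(regularized_tau)}")
```

The mean of about 433.9 is driven by the two oscillators at roughly 6931; the median is about 0.6, so a median alongside the mean gives a fairer picture of the typical oscillator.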

## Honest Assessment

The regularizer improves basin width from 0 to 256, but the improvement is **marginal**.
Basin width is still only 6.2% of the sequence length (L=4096), far below the K >= L/2 = 2048 needed to pass.

---

*Report generated by identity_reconstruction_experiment_v2.py*
v2_corrected_evaluation/identity_v2_20260122_144008.json
ADDED
The diff for this file is too large to render.