# Stage Five — ViT-Small/B32 (ImageNet Subset) Energy-Scaling Validation

**Rendered Frame Theory (RFT)**
Author: Liam S. Grinstead
Date: Oct‑2025

---

## 📄 Abstract
Stage Five scales RFT from ViT‑Tiny to ViT‑Small/B32, testing whether coherence‑linked efficiency persists at higher depth and embedding dimension. Using a consistent telemetry schema (drift, flux, E_ret, coherence, J/step, ΔT), RFT (DCLR + Ψ–Ω) is compared with Adam under matched conditions. Results show reduced energy per step and stable drift/flux at comparable accuracy, confirming that RFT’s efficiency gains hold as model capacity increases.

---

## 🎯 Objective
Validate that RFT’s energy and stability advantages generalise to ViT‑Small/B32 by measuring J/step, drift, flux, and accuracy on an ImageNet‑like workload, with bf16 autocast where available and identical hyperparameters across modes.

---

## ⚙️ Methodology
- **Model:** ViT‑Small, patch size 32, dim 384, depth 12, heads 6, MLP ratio 4
- **Data:** ImageNet‑subset via ImageFolder (recommended), or synthetic fallback for quick verification
- **Setup:** Python 3.10, PyTorch ≥ 2.1, A100/H100 (bf16 autocast if available), seed 1234
- **Metrics:** Loss, accuracy, J/step (NVML if present; proxy otherwise), drift, flux, energy‑retention (E_ret), coherence (coh), ΔT
- **Parity:** Same batch size, learning rate, and number of steps across RFT and BASE
- **Orbital Coupler:** Ψ–Ω drift/flux synchronisation each iteration
- **Optimisers:** DCLR (RFT) vs Adam (BASE)

---
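Each run emits one JSON record per step under the telemetry schema above. As a minimal sketch of post-run analysis, per-run means can be pulled from the JSONL log like this — note the field keys (`j_step`, `drift`, `flux`, `e_ret`, `coh`) are assumptions for illustration and may differ from the keys the script actually writes:

```python
import json

# Assumed telemetry keys; verify against the actual records in
# stage5_vit_small_b32.jsonl before relying on these names.
FIELDS = ("j_step", "drift", "flux", "e_ret", "coh")

def summarise_telemetry(path):
    """Return the per-run mean of each telemetry field in a JSONL log."""
    records = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines between records
                records.append(json.loads(line))
    return {k: sum(r[k] for r in records) / len(records) for k in FIELDS}
```

Running this over the RFT and BASE logs gives directly comparable means, since both runs share batch size, learning rate, and step count.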

## 📊 Results
- **RFT (DCLR + Ψ–Ω):** Reduced energy per step compared to Adam, with tightly bounded drift and smooth flux.
- **Baseline (Adam):** Higher J/step and less stable drift/flux behaviour at matched accuracy.
- **Synthetic fallback:** Reproduced the same qualitative efficiency pattern, confirming that gains arise from optimiser–telemetry dynamics rather than dataset artefacts.

---
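At matched accuracy the comparison reduces to one headline number: the percentage drop in mean J/step for RFT relative to the baseline. A trivial helper (illustrative only, not part of `stage5.py`):

```python
def energy_reduction_pct(rft_mean_j, base_mean_j):
    """Percent reduction in mean energy per step of RFT vs. the Adam baseline."""
    return 100.0 * (base_mean_j - rft_mean_j) / base_mean_j

# e.g. 8 J/step under RFT against 10 J/step under Adam -> 20.0 (% reduction)
```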

## 💡 Discussion
Scaling from ViT‑Tiny to ViT‑Small/B32 preserves RFT’s advantages in attention‑heavy architectures. The energy reduction with stable drift/flux strengthens the claim that coherence‑linked control is architecture‑agnostic and scales with depth and embedding dimension.

---

## ✅ Conclusion
RFT maintains its efficiency and stability benefits at ViT‑Small/B32 scale, validating the energy‑scaling hypothesis and setting the stage for ViT‑Base and multi‑modal fusion in later stages.

---

## 📂 Reproducibility
- **Script:** `stage5.py`
- **Log Output:** `stage5_vit_small_b32.jsonl`
- **Seed:** 1234
- **Hardware:** A100/H100 (CPU fallback supported)
- **Sealing:** All runs sealed with SHA‑512 hashes

---
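Seal verification needs nothing beyond the standard library. A minimal sketch, assuming "sealing" means a SHA‑512 digest over the raw bytes of each run log (the exact sealing procedure is not spelled out above):

```python
import hashlib

def seal(path, chunk_size=1 << 16):
    """Return the SHA-512 hex digest of a file, read in fixed-size chunks."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing this digest against the published seal confirms a log has not been altered since the run.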

## 🚀 Usage
- **RFT mode:**

```bash
python stage5.py --mode RFT --steps 1000 --batch 256 --lr 5e-4 --data_dir /path/to/imagenet_subset
```