Upload progress.md with huggingface_hub
progress.md +29 -0
progress.md
ADDED
@@ -0,0 +1,29 @@
# Progress Report
## Task: PlainMLP vs ResMLP Comparison on Distant Identity Task
- [x] Step 1: Setup project directory - DONE
- [x] Step 2: Implement PlainMLP architecture - DONE
- [x] Step 3: Implement ResMLP architecture - DONE
- [x] Step 4: Generate synthetic identity data - DONE
- [x] Step 5: Train both models for 500 steps - DONE
- [x] Step 6: Capture activation/gradient statistics - DONE
- [x] Step 7: Generate all 4 plots - DONE
- [ ] Step 8: Create summary report - IN PROGRESS
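
The architectural difference behind Steps 2 to 4 and Step 6 can be sketched without any framework. This is a minimal illustration, not the code in `experiment_final.py`: the width, depth, ReLU activation, He-style initialization, and the helper names `init_layer`, `linear_relu`, `forward`, and `mse` are all assumptions, and it assumes the identity task means reproducing the input at the output. It runs a forward pass only, with no training and no gradient capture.

```python
import math
import random
import statistics

def init_layer(dim, rng):
    """Return a dim x dim weight matrix with He-style Gaussian init (assumed)."""
    scale = math.sqrt(2.0 / dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]

def linear_relu(weights, x):
    """Compute relu(W @ x) for a single input vector x."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def forward(layers, x, residual):
    """Run x through all layers; return the output and per-layer activation stds."""
    stds = []
    h = x
    for weights in layers:
        out = linear_relu(weights, h)
        # Residual variant adds the block input back: h <- h + f(h).
        # Plain variant replaces the activation: h <- f(h).
        h = [a + b for a, b in zip(h, out)] if residual else out
        stds.append(statistics.pstdev(h))
    return h, stds

def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

rng = random.Random(0)
dim, depth = 16, 8  # hypothetical sizes, not taken from the report
layers = [init_layer(dim, rng) for _ in range(depth)]
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]  # identity task: target == input

plain_out, plain_stds = forward(layers, x, residual=False)
res_out, res_stds = forward(layers, x, residual=True)

print("PlainMLP MSE vs. input (untrained):", mse(plain_out, x))
print("ResMLP   MSE vs. input (untrained):", mse(res_out, x))
print("PlainMLP per-layer activation std:", plain_stds)
print("ResMLP   per-layer activation std:", res_stds)
```

One property worth noting in this sketch: if every weight is zero, the residual variant returns its input unchanged, while the plain variant returns all zeros. The residual stack therefore starts from a function that already solves an identity task, whereas the plain stack has to learn it through every layer.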
## Key Results
| Metric | PlainMLP | ResMLP |
|--------|----------|--------|
| Final Loss | 0.3123 | 0.0630 |
| Improvement (final-loss ratio, PlainMLP / ResMLP) | - | **5.0x** |
| Gradient Range | [7.6e-3, 1.0e-2] | [1.9e-3, 3.8e-3] |
| Activation Std Range | [0.36, 0.95] | [0.13, 0.18] |
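
The 5.0x figure is consistent with dividing the PlainMLP final loss by the ResMLP final loss (about 4.96, which rounds to 5.0). Reading "Improvement" as that ratio is an assumption, since the report does not define it. The sketch below redoes the arithmetic from the table values and also computes the max/min spread of each reported gradient range. The `spread` helper is hypothetical.

```python
# Values copied from the Key Results table.
plain_loss, res_loss = 0.3123, 0.0630
plain_grad = (7.6e-3, 1.0e-2)
res_grad = (1.9e-3, 3.8e-3)

def spread(lo, hi):
    """Ratio of the largest to the smallest reported value in a range."""
    return hi / lo

# Assumed definition of "Improvement": ratio of final losses.
loss_ratio = plain_loss / res_loss

print(f"final-loss ratio (PlainMLP / ResMLP): {loss_ratio:.2f}x")
print(f"PlainMLP gradient max/min: {spread(*plain_grad):.2f}")
print(f"ResMLP gradient max/min:   {spread(*res_grad):.2f}")
```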
## Files Generated
- `experiment_final.py` - Main experiment code
- `results.json` - Numerical results
- `plots/training_loss.png` - Training loss comparison
- `plots/gradient_magnitude.png` - Per-layer gradient norms
- `plots/activation_mean.png` - Per-layer activation means
- `plots/activation_std.png` - Per-layer activation stds