# Progress Report

## Task: PlainMLP vs ResMLP Comparison on Distant Identity Task

- [x] Step 1: Setup project directory - DONE
- [x] Step 2: Implement PlainMLP architecture - DONE
- [x] Step 3: Implement ResMLP architecture - DONE
- [x] Step 4: Generate synthetic identity data - DONE
- [x] Step 5: Train both models for 500 steps - DONE
- [x] Step 6: Capture activation/gradient statistics - DONE
- [x] Step 7: Generate all 4 plots - DONE
- [ ] Step 8: Create summary report - IN PROGRESS
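
The difference between the two architectures in Steps 2 and 3 can be illustrated with a minimal sketch. It is not the code in `experiment_final.py`: the ReLU activation, the He-style initialization, the width of 16, the depth of 8, and every function name here are assumptions chosen for illustration. It only shows how a residual connection changes the forward pass and how a per-layer activation standard deviation could be recorded.

```python
import math
import random
import statistics


def linear(x, w):
    """Multiply vector x by a square weight matrix w (a list of rows)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def relu(x):
    return [max(0.0, v) for v in x]


def init_weights(dim, rng):
    # Assumed He-style initialization: std = sqrt(2 / dim).
    std = math.sqrt(2.0 / dim)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(dim)]


def forward(x, weights, residual):
    """Run x through the layer stack.

    Returns the final activation and the activation std after each layer.
    With residual=True each layer computes h + relu(W h); otherwise relu(W h).
    """
    stds = []
    h = x
    for w in weights:
        out = relu(linear(h, w))
        h = [a + b for a, b in zip(h, out)] if residual else out
        stds.append(statistics.pstdev(h))
    return h, stds


rng = random.Random(0)
dim, depth = 16, 8  # hypothetical sizes, not taken from the experiment
weights = [init_weights(dim, rng) for _ in range(depth)]
x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

_, plain_stds = forward(x, weights, residual=False)
_, res_stds = forward(x, weights, residual=True)
print("plain:", [round(s, 3) for s in plain_stds])
print("res:  ", [round(s, 3) for s in res_stds])
```

One property worth noting in this sketch: if every weight matrix is zero, the residual stack returns its input unchanged, while the plain stack returns all zeros. That is why an identity target is a natural probe for the two designs.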

## Key Results

| Metric | PlainMLP | ResMLP |
|--------|----------|--------|
| Final loss (after 500 steps) | 0.3123 | 0.0630 |
| Final-loss ratio (PlainMLP / ResMLP) | - | **5.0x** |
| Per-layer gradient norm range [min, max] | [7.6e-3, 1.0e-2] | [1.9e-3, 3.8e-3] |
| Per-layer activation std range [min, max] | [0.36, 0.95] | [0.13, 0.18] |
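
The 5.0x figure in the table is the ratio of the two final losses. A short check, using only the numbers reported above (the helper name `loss_ratio` is hypothetical):

```python
def loss_ratio(baseline_loss, improved_loss):
    """Return how many times larger the baseline loss is than the improved one."""
    return baseline_loss / improved_loss


plain_final = 0.3123  # PlainMLP final loss from the table
res_final = 0.0630    # ResMLP final loss from the table

print(f"{loss_ratio(plain_final, res_final):.1f}x")  # prints 5.0x
```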

## Files Generated
- `experiment_final.py` - Main experiment code
- `results.json` - Numerical results
- `plots/training_loss.png` - Training loss comparison
- `plots/gradient_magnitude.png` - Per-layer gradient norms
- `plots/activation_mean.png` - Per-layer activation means
- `plots/activation_std.png` - Per-layer activation stds
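
Before finishing Step 8, it may be useful to confirm that every file listed above is present. The following is a minimal sketch, assuming the paths are relative to the project directory; the function name `missing_artifacts` is hypothetical and not part of the experiment code.

```python
from pathlib import Path

# Paths copied from the "Files Generated" list.
EXPECTED_ARTIFACTS = [
    "experiment_final.py",
    "results.json",
    "plots/training_loss.png",
    "plots/gradient_magnitude.png",
    "plots/activation_mean.png",
    "plots/activation_std.png",
]


def missing_artifacts(root="."):
    """Return the expected artifact paths that do not exist under root."""
    base = Path(root)
    return [p for p in EXPECTED_ARTIFACTS if not (base / p).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files are present.")
```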