AmberLJC committed on
Commit 1da57e1 · verified · 1 Parent(s): 9ec0e35

Upload progress.md with huggingface_hub

Files changed (1)
  1. progress.md +29 -0
progress.md ADDED
@@ -0,0 +1,29 @@
+ # Progress Report
+
+ ## Task: PlainMLP vs ResMLP Comparison on Distant Identity Task
+
+ - [x] Step 1: Set up project directory - DONE
+ - [x] Step 2: Implement PlainMLP architecture - DONE
+ - [x] Step 3: Implement ResMLP architecture - DONE
+ - [x] Step 4: Generate synthetic identity data - DONE
+ - [x] Step 5: Train both models for 500 steps - DONE
+ - [x] Step 6: Capture activation/gradient statistics - DONE
+ - [x] Step 7: Generate all 4 plots - DONE
+ - [ ] Step 8: Create summary report - IN PROGRESS
+
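The architectural difference behind Steps 2, 3, and 6 can be sketched in plain Python. This is a minimal illustration, not the code in `experiment_final.py`: the layer width, depth, weight scale, ReLU nonlinearity, and every function name here are assumptions. It also assumes the identity task asks the network to reproduce its input, and it records only forward-pass activation statistics, not gradients.

```python
import math
import random
import statistics

def make_layer(width, rng, scale):
    """Return a width x width weight matrix with entries from N(0, (scale/sqrt(width))^2)."""
    std = scale / math.sqrt(width)
    return [[rng.gauss(0.0, std) for _ in range(width)] for _ in range(width)]

def linear_relu(weights, x):
    """Compute relu(W @ x) for a single input vector x."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def forward(x, layers, residual):
    """Run x through the stack and record the activation std after each layer."""
    stds = []
    h = x
    for weights in layers:
        out = linear_relu(weights, h)
        # A residual block adds its output back onto its input;
        # a plain block replaces the input with the output.
        h = [a + b for a, b in zip(h, out)] if residual else out
        stds.append(statistics.pstdev(h))
    return h, stds

rng = random.Random(0)
width, depth = 16, 8  # hypothetical sizes, not taken from the experiment
layers = [make_layer(width, rng, scale=0.5) for _ in range(depth)]
x = [rng.gauss(0.0, 1.0) for _ in range(width)]

_, plain_stds = forward(x, layers, residual=False)
_, res_stds = forward(x, layers, residual=True)

# With all-zero weights a residual stack is exactly the identity map,
# while a plain stack collapses every input to zero.
zero_layers = [[[0.0] * width for _ in range(width)] for _ in range(depth)]
res_out, _ = forward(x, zero_layers, residual=True)
plain_out, _ = forward(x, zero_layers, residual=False)
```

The zero-weight case shows one common intuition for why skip connections help on an identity target: the residual stack starts from the identity and only has to learn a correction, whereas the plain stack has to pass the signal through every layer.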
+ ## Key Results
+
+ | Metric | PlainMLP | ResMLP |
+ |--------|----------|--------|
+ | Final Loss (after 500 steps) | 0.3123 | 0.0630 |
+ | Loss Reduction vs. PlainMLP | - | **5.0x** |
+ | Per-Layer Gradient Norm Range | [7.6e-3, 1.0e-2] | [1.9e-3, 3.8e-3] |
+ | Per-Layer Activation Std Range | [0.36, 0.95] | [0.13, 0.18] |
+
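The 5.0x figure in the table can be reproduced from the two final losses. The short check below uses only numbers from the table; the `spread` helper is a hypothetical addition that expresses each reported range as a max/min ratio, which is one possible way to compare how much the statistics vary across layers.

```python
plain_loss = 0.3123  # PlainMLP final loss from the table
res_loss = 0.0630    # ResMLP final loss from the table

improvement = plain_loss / res_loss
print(f"{improvement:.1f}x")  # prints "5.0x"

def spread(lo, hi):
    """Ratio between the largest and smallest value in a reported range."""
    return hi / lo

# Ranges copied from the table as (min, max) pairs.
grad_spread = {
    "PlainMLP": spread(7.6e-3, 1.0e-2),
    "ResMLP": spread(1.9e-3, 3.8e-3),
}
act_std_spread = {
    "PlainMLP": spread(0.36, 0.95),
    "ResMLP": spread(0.13, 0.18),
}

for name in ("PlainMLP", "ResMLP"):
    print(f"{name}: gradient spread {grad_spread[name]:.2f}, "
          f"activation std spread {act_std_spread[name]:.2f}")
```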
+ ## Files Generated
+ - `experiment_final.py` - Main experiment code
+ - `results.json` - Numerical results
+ - `plots/training_loss.png` - Training loss comparison
+ - `plots/gradient_magnitude.png` - Per-layer gradient norms
+ - `plots/activation_mean.png` - Per-layer activation means
+ - `plots/activation_std.png` - Per-layer activation standard deviations