Progress Report

Task: PlainMLP vs ResMLP Comparison on Distant Identity Task

  • Step 1: Setup project directory - DONE
  • Step 2: Implement PlainMLP architecture - DONE
  • Step 3: Implement ResMLP architecture - DONE
  • Step 4: Generate synthetic identity data - DONE
  • Step 5: Train both models for 500 steps - DONE
  • Step 6: Capture activation/gradient statistics - DONE
  • Step 7: Generate all 4 plots - DONE
  • Step 8: Create summary report - IN PROGRESS
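The architectural difference behind Steps 2, 3 and 6 can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions, not the code in experiment_final.py: the width, depth, ReLU activation, Gaussian initialization and the helper names `init_layers` and `forward` are all hypothetical. The only point it makes is that a plain block replaces the hidden state, a residual block adds to it, and the per-layer activation standard deviation can be recorded along the way.

```python
import random
import statistics


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def init_layers(depth, dim, scale, seed=0):
    """Hypothetical Gaussian init; the real experiment may use another scheme."""
    rng = random.Random(seed)
    return [([[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)],
             [0.0] * dim)
            for _ in range(depth)]


def forward(x, layers, residual):
    """Run the stack and record the activation std after every layer."""
    h, stds = list(x), []
    for weights, bias in layers:
        out = relu(linear(h, weights, bias))
        # Residual block: h + f(h). Plain block: f(h) replaces h.
        h = [a + b for a, b in zip(h, out)] if residual else out
        stds.append(statistics.pstdev(h))
    return h, stds


# Assumed sizes, chosen only to keep the demo fast.
dim, depth = 16, 8
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
layers = init_layers(depth, dim, scale=(2.0 / dim) ** 0.5)

_, plain_stds = forward(x, layers, residual=False)
_, res_stds = forward(x, layers, residual=True)
print("plain:", [round(s, 3) for s in plain_stds])
print("res:  ", [round(s, 3) for s in res_stds])
```

One property of this sketch is worth noting. If every weight is zero, the residual stack returns its input unchanged while the plain stack returns all zeros, so the residual form starts much closer to an identity mapping.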

Key Results

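| Metric               | PlainMLP         | ResMLP           |
|----------------------|------------------|------------------|
| Final Loss           | 0.3123           | 0.0630           |
| Improvement          | -                | 5.0x             |
| Gradient Range       | [7.6e-3, 1.0e-2] | [1.9e-3, 3.8e-3] |
| Activation Std Range | [0.36, 0.95]     | [0.13, 0.18]     |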
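The reported improvement factor can be checked directly from the two final losses. The short script below does that arithmetic and also computes a max/min spread for each reported range, assuming each range is the minimum and maximum of a per-layer statistic. The `spread` helper is introduced here for illustration only.

```python
# Final losses as reported in Key Results.
plain_loss, res_loss = 0.3123, 0.0630

loss_ratio = plain_loss / res_loss
print(f"loss ratio: {loss_ratio:.2f}x")  # loss ratio: 4.96x


def spread(lo, hi):
    """Ratio of the largest to the smallest value in a reported range."""
    return hi / lo


# Assumption: each range is [min, max] over layers.
grad_spread = {
    "PlainMLP": spread(7.6e-3, 1.0e-2),
    "ResMLP": spread(1.9e-3, 3.8e-3),
}
std_spread = {
    "PlainMLP": spread(0.36, 0.95),
    "ResMLP": spread(0.13, 0.18),
}

for name in ("PlainMLP", "ResMLP"):
    print(f"{name}: grad spread {grad_spread[name]:.2f}x, "
          f"activation std spread {std_spread[name]:.2f}x")
```

The loss ratio of about 4.96 rounds to the 5.0x shown above. Under the same reading of the ranges, the activation std varies by about 2.6x for PlainMLP against about 1.4x for ResMLP.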

Files Generated

  • experiment_final.py - Main experiment code
  • results.json - Numerical results
  • plots/training_loss.png - Training loss comparison
  • plots/gradient_magnitude.png - Per-layer gradient norms
  • plots/activation_mean.png - Per-layer activation means
  • plots/activation_std.png - Per-layer activation stds