AmberLJC
/

gradient_clipping_experiment

gradient_clipping_experiment / todo.md

Upload todo.md with huggingface_hub

d9e1a5d verified 9 days ago

Gradient Clipping Experiment

Objective

Demonstrate how gradient clipping stabilizes training by preventing sudden large weight updates caused by rare, high-loss data points.
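
To make the mechanism concrete, here is a minimal sketch of clipping by global L2 norm in plain Python. It assumes norm-based clipping (the plan only says "threshold", so value clipping is also possible), and `clip_by_global_norm` is a hypothetical helper name used for illustration only.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Small gradients pass through unchanged.
        return list(grads), total_norm
    # Large gradients keep their direction but are shrunk to length max_norm.
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm
```

A gradient of `[3.0, 4.0]` has norm 5, so with a threshold of 1.0 it is scaled by 1/5; the direction of the update is preserved and only its size is limited.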

Task Breakdown

Step 1: Implement simple PyTorch model (Embedding + Linear)
Step 2: Create imbalanced synthetic dataset (990 'A', 10 'B' targets)
Step 3: Training loop WITHOUT gradient clipping - record metrics
Step 4: Training loop WITH gradient clipping (threshold=1.0) - record metrics
Step 5: Generate comparison plots
Step 6: Write summary report with findings
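
Steps 1 to 4 could be sketched roughly as follows. Everything not stated in the plan is an assumption made for illustration: the vocabulary size, embedding dimension, random input tokens, the 0/1 encoding of 'A'/'B', plain SGD with a learning rate of 0.1, one sample per step, and norm-based clipping via `torch.nn.utils.clip_grad_norm_`. The names `TinyModel`, `make_dataset`, and `train` are hypothetical.

```python
import torch
import torch.nn as nn

# Assumed sizes; the plan does not specify them.
VOCAB_SIZE, EMBED_DIM, NUM_CLASSES = 16, 8, 2
CLASS_A, CLASS_B = 0, 1  # assumed label encoding for targets 'A' and 'B'


class TinyModel(nn.Module):
    """Step 1: Embedding followed by a Linear layer."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.linear = nn.Linear(EMBED_DIM, NUM_CLASSES)

    def forward(self, token_ids):
        return self.linear(self.embed(token_ids))


def make_dataset(seed=0):
    """Step 2: 1000 samples, 990 with target 'A' and 10 with target 'B'."""
    gen = torch.Generator().manual_seed(seed)
    inputs = torch.randint(0, VOCAB_SIZE, (1000,), generator=gen)
    targets = torch.full((1000,), CLASS_A, dtype=torch.long)
    rare_positions = torch.randperm(1000, generator=gen)[:10]
    targets[rare_positions] = CLASS_B  # scatter the rare samples through the data
    return inputs, targets


def train(max_norm=float("inf"), lr=0.1, seed=0):
    """Steps 3 and 4: one pass over the data, one sample per step.

    With max_norm=inf the clipping call leaves gradients untouched and only
    reports their norm, which gives the "without clipping" baseline.
    """
    torch.manual_seed(seed)
    model = TinyModel()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    inputs, targets = make_dataset(seed)
    losses, grad_norms = [], []
    for x, y in zip(inputs, targets):
        optimizer.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        # clip_grad_norm_ returns the total norm measured before clipping.
        pre_clip_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
        optimizer.step()
        losses.append(loss.item())
        grad_norms.append(pre_clip_norm.item())
    return losses, grad_norms
```

Calling `train()` gives the unclipped run and `train(max_norm=1.0)` the clipped run; using the same seed for both keeps the initial weights and data order identical, so differences in the curves can be attributed to clipping.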

Key Metrics to Track

Training loss per step
L2 norm of gradients (before clipping)
L2 norm of model weights
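
The two norm metrics above could be computed with helpers along these lines. This is a sketch that assumes "L2 norm" means the global norm over all parameters taken together (rather than per-layer norms); `global_grad_norm` and `global_weight_norm` are hypothetical names.

```python
import torch
import torch.nn as nn


def global_grad_norm(model):
    """L2 norm over all parameter gradients; call after backward() and before clipping."""
    norms = [p.grad.detach().norm(2) for p in model.parameters() if p.grad is not None]
    return torch.norm(torch.stack(norms), 2).item()


def global_weight_norm(model):
    """L2 norm over all model parameters."""
    norms = [p.detach().norm(2) for p in model.parameters()]
    return torch.norm(torch.stack(norms), 2).item()
```

Taking the norm of the per-parameter norms is equivalent to flattening every tensor into one vector and taking its L2 norm, which matches how a global clipping threshold is usually applied.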

Expected Outcome

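Without clipping: Gradient norms spike when the rare 'B' samples are encountered, causing sudden large weight updates
With clipping: The gradient norm applied in each update is capped at the threshold (the pre-clipping norm can still spike), giving more stable training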