Momentum (β): Acts like a physical ball with weight. It keeps moving in the same direction, helping it "roll" through flat valleys and over small local pits.
The Challenge: Rosenbrock and Rastrigin are "non-convex" or have "vanishing gradients." Vanilla GD is often too weak to reach the center without help!
💡 Pro Tip: If the ball gets stuck or vibrates wildly, you must adjust the learning rate or momentum. Or turn on Adaptive Rate to let the algorithm handle it!