| ********* | |
| Callbacks | |
| ********* | |
| Exponential Moving Average (EMA) | |
| ================================ | |
| During training, EMA maintains a moving average of the trained parameters. | |
| EMA parameters can produce significantly better results and faster convergence across a variety of domains and models. | |
| EMA is a simple calculation. The EMA weights are initialized with the model weights at the start of training. | |
| After every training update, the EMA weights are updated based on the new model weights. | |
| .. math:: | |
| \text{ema}_w = \text{ema}_w \cdot \text{decay} + \text{model}_w \cdot (1 - \text{decay}) | |
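The update rule can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the callback's actual implementation; the function name ``ema_update`` and the toy weight lists are assumptions made for the example.

```python
def ema_update(ema_weights, model_weights, decay):
    """Apply ema_w = ema_w * decay + model_w * (1 - decay) element-wise."""
    return [
        e * decay + m * (1.0 - decay)
        for e, m in zip(ema_weights, model_weights)
    ]


# The EMA weights start as a copy of the model weights.
model_w = [1.0, 2.0]
ema_w = list(model_w)

# Pretend one optimizer step moved the model weights.
model_w = [0.0, 4.0]

# One EMA update with decay=0.9 moves each EMA weight 10% of the way
# towards the new model weight (roughly 0.9 and 2.2 here).
ema_w = ema_update(ema_w, model_w, decay=0.9)
```

With a decay close to 1, the EMA weights follow the model weights slowly and smooth out step-to-step noise.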
| Enabling EMA is straightforward: pass an additional argument to the experiment manager at runtime. | |
| .. code-block:: bash | |
| python examples/asr/asr_ctc/speech_to_text_ctc.py \ | |
| model.train_ds.manifest_filepath=/path/to/my/train/manifest.json \ | |
| model.validation_ds.manifest_filepath=/path/to/my/validation/manifest.json \ | |
| trainer.devices=2 \ | |
| trainer.accelerator='gpu' \ | |
| trainer.max_epochs=50 \ | |
| exp_manager.ema.enable=True # pass this additional argument to enable EMA | |
| To change the decay rate, pass ``exp_manager.ema.decay`` as an additional argument. | |
| .. code-block:: bash | |
| python examples/asr/asr_ctc/speech_to_text_ctc.py \ | |
| ... | |
| exp_manager.ema.enable=True \ | |
| exp_manager.ema.decay=0.999 | |
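To get a feel for what a given decay value means, it can help to convert it into a half-life: the number of EMA updates after which an old set of weights contributes half as much to the average. The sketch below follows directly from the update formula; the helper name ``ema_half_life`` is hypothetical and not part of any library.

```python
import math


def ema_half_life(decay):
    """Number of updates after which a past contribution is halved.

    After k updates an old contribution is scaled by decay ** k, so the
    half-life solves decay ** k = 0.5.
    """
    return math.log(0.5) / math.log(decay)


# For decay=0.999 this is a little under 700 updates.
half_life = ema_half_life(0.999)
```

A larger decay gives a longer half-life and a smoother average, at the cost of reacting more slowly to recent weight changes.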
| The following arguments are also available. | |
| .. list-table:: | |
| :header-rows: 1 | |
| * - Argument | |
| - Description | |
| * - ``exp_manager.ema.validate_original_weights=True`` | |
| - Validate the original weights instead of EMA weights. | |
| * - ``exp_manager.ema.every_n_steps=2`` | |
| - Apply EMA every N steps instead of every step. | |
| * - ``exp_manager.ema.cpu_offload=True`` | |
| - Offload EMA weights to CPU. May introduce significant slow-downs. | |
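As a rough mental model of how ``every_n_steps`` and ``validate_original_weights`` interact with the update rule, consider the sketch below. The ``EmaTracker`` class and its method names are hypothetical and written only for this illustration; the real callback may gate its updates differently, and CPU offloading is not modelled here.

```python
class EmaTracker:
    """Hypothetical helper for illustration; not the library's callback."""

    def __init__(self, model_weights, decay=0.999, every_n_steps=1):
        self.decay = decay
        self.every_n_steps = every_n_steps
        # The EMA weights start as a copy of the model weights.
        self.ema_weights = list(model_weights)
        self.num_updates = 0

    def on_train_step_end(self, step, model_weights):
        # Assumption: update only when the step is a multiple of every_n_steps.
        if step % self.every_n_steps != 0:
            return
        self.ema_weights = [
            e * self.decay + m * (1.0 - self.decay)
            for e, m in zip(self.ema_weights, model_weights)
        ]
        self.num_updates += 1

    def weights_for_validation(self, model_weights, validate_original_weights=False):
        # Validate with the EMA weights unless the original weights are requested.
        return model_weights if validate_original_weights else self.ema_weights


tracker = EmaTracker([1.0], decay=0.5, every_n_steps=2)
for step in range(1, 5):  # steps 1, 2, 3, 4
    tracker.on_train_step_end(step, [0.0])

# Only steps 2 and 4 triggered an update, so the EMA weight was halved twice.
```

Updating less often reduces the per-step overhead, but each update then covers several optimizer steps, so the same decay value averages over a longer stretch of training.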