rwightman/timm-optim-caution

rwightman committed on Dec 4, 2024

Commit 9ed845b · verified · 1 Parent(s): 89bea5c

Update README.md

Files changed (1)

README.md +8 -0

@@ -12,6 +12,14 @@ The runs were all performed training a smaller ViT (`vit_wee_patch16_reg1_gap_25
 So far I have results for `adamw`, `laprop`, and `mars`. You can find full results in sub-folders by optimizer names. In all of these runs, the experiments with 'c' prefix in the name have caution enabled.
+This is what the 'caution' addition looks like in an optimizer:
+```python
+    mask = (exp_avg * grad > 0).to(grad.dtype)
+    mask.div_(mask.mean().clamp_(min=1e-3))
+    exp_avg = exp_avg * mask
+```
 # LaProp
 |optim                       |best_epoch|train_loss        |eval_loss         |eval_top1        |eval_top5        |lr                    |
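
To make the arithmetic in the added 'caution' snippet concrete, here is a minimal dependency-free sketch that mirrors its three tensor lines on plain Python lists. The helper name `apply_caution`, the parameter name `min_mean`, and the sample values are assumptions for illustration only; they are not taken from the README or from any optimizer implementation.

```python
def apply_caution(exp_avg, grad, min_mean=1e-3):
    """Plain-list mirror of the three tensor lines (hypothetical helper)."""
    # 1.0 where the momentum and the current gradient share a sign, else 0.0
    mask = [1.0 if m * g > 0 else 0.0 for m, g in zip(exp_avg, grad)]
    # Divide by the fraction of kept entries, floored at min_mean so that
    # an all-zero mask does not cause a division by zero
    scale = max(sum(mask) / len(mask), min_mean)
    mask = [x / scale for x in mask]
    # Zero the disagreeing momentum entries and scale up the survivors
    return [m * x for m, x in zip(exp_avg, mask)]


# Made-up sample values: entries 0 and 3 agree in sign, entries 1 and 2 do not
exp_avg = [0.5, -0.2, 0.1, -0.4]
grad = [1.0, 0.3, -0.2, -0.1]
cautious = apply_caution(exp_avg, grad)
```

In this example half of the entries agree in sign, so the mask mean is 0.5 and the surviving momentum entries are doubled while the disagreeing ones are zeroed. Dividing by the mask mean compensates for the entries that were dropped, and the `1e-3` floor bounds that rescaling when very few entries agree.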