rwightman/timm-optim-caution


rwightman committed on Dec 4, 2024

Commit 58638f1 · verified · 1 Parent(s): 9ed845b

Update README.md

Files changed (1)

README.md +1 -1


@@ -10,7 +10,7 @@ This repo contains summaries of several sets of experiments comparing a number o
 The runs were all performed training a smaller ViT (`vit_wee_patch16_reg1_gap_256`) for 200 epochs (10M samples seen) from scratch on the `timm` 'mini-imagenet' dataset, a 100 class subset of imagenet with same image sizes as originals.
-So far I have results for `adamw`, `laprop`, and `mars`. You can find full results in sub-folders by optimizer names. In all of these runs, the experiments with 'c' prefix in the name have caution enabled.
+So far I have results for `adamw`, `laprop`, and `mars` (https://huggingface.co/papers/2411.10438). You can find full results in sub-folders by optimizer names. In all of these runs, the experiments with 'c' prefix in the name have caution enabled.
 This is what the 'caution' addition looks like in an optimizer:
 ```python