Commit c358db4 · Parent(s): 4690128
ump success

d-adaptation/notes.md CHANGED
@@ -10,7 +10,8 @@ UMP ends up with image generations that look like a single brown square, still t
 As noted in the same GitHub issue, alpha/rank scaling makes the gradient update smaller and thus causes d-adaptation to boost the learning rate. This could be the reason why it goes bad.
 
 UMP redone at dim 8 alpha 8 showed a recognizable character but still significantly degraded aesthetics and prompt coherence.
-
+After redoing UMP at dim 8 alpha 8 with fewer cosine restarts (16 -> 9), the results are much better.
+Cosine restarts likely affect how much time is spent at a high learning rate, which could be the reason for blowing the model apart.
 
 ## Dim
 128 dim shows some local noisy patterns. Reranking the model to a lower dim from 128 doesn't get rid of it. Converting the weights of the last up block in the UNet does, but it also causes a noticeable change in the generated character. Obviously you could reduce the last up block by a smaller amount.
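The alpha/rank interaction mentioned above can be made concrete. This is a sketch, not the author's code, and the exact convention varies between LoRA implementations: a common one applies the LoRA delta as `(alpha / rank) * B @ A`, so the gradients reaching the LoRA matrices are scaled by the same factor, and a learning-rate-free optimizer like D-Adaptation compensates for a small factor by inflating its estimated learning rate.

```python
# Sketch (assumed convention): the LoRA update and its gradients are
# multiplied by alpha / rank. Small factor -> small gradients -> d-adaptation
# boosts the learning rate to compensate, which can destabilize training.

def lora_scale(alpha: float, rank: int) -> float:
    """Effective multiplier applied to the LoRA delta-weight (and its gradient)."""
    return alpha / rank

# dim 8 / alpha 8 keeps the scale at 1, so gradients are not shrunk:
print(lora_scale(8, 8))    # 1.0
# e.g. dim 128 / alpha 1 would shrink updates ~128x:
print(lora_scale(1, 128))  # 0.0078125
```

This is consistent with the dim 8 / alpha 8 runs behaving better: at scale 1 the gradient magnitudes are unmodified, so d-adaptation has no scaling mismatch to overcorrect for.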
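The cosine-restarts hypothesis can also be sketched. The notes don't say which scheduler implementation was used; assuming the common cosine-with-hard-restarts shape (as in HuggingFace's schedulers, warmup ignored), each restart throws the learning rate back to its peak, so more restarts mean more returns to full-strength updates:

```python
import math

def cosine_hard_restarts(step: int, total_steps: int, num_cycles: int) -> float:
    """LR multiplier with the usual hard-restart cosine shape (no warmup):
    decays from 1 to 0 within each cycle, then jumps back to 1."""
    progress = step / total_steps
    return 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0)))

def resets_to_peak(total_steps: int, num_cycles: int, thresh: float = 0.999) -> int:
    """Count how many times the schedule jumps back up to (near) the peak LR."""
    lrs = [cosine_hard_restarts(s, total_steps, num_cycles) for s in range(total_steps)]
    return sum(1 for a, b in zip(lrs, lrs[1:]) if b > a and b >= thresh)

print(resets_to_peak(1000, 16))  # 15
print(resets_to_peak(1000, 9))   # 8
```

Going from 16 to 9 restarts nearly halves the number of times training is kicked back to the peak learning rate, which fits the hypothesis that repeated high-LR phases were blowing the model apart.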
d-adaptation/ump45_(girls'_frontline)/dim8_alpha8_less_restarts_success.jpg ADDED (Git LFS)