## Training Details

Training details can be found here: https://wandb.ai/nousr_laion/conditioned-prior/reports/LAION-DALLE2-PyTorch-Prior--VmlldzoyMDI2OTIx

## Source Code

Models are diffusion trainers from https://github.com/lucidrains/DALLE2-pytorch
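If you don't already have that library, it is published on PyPI under the name `dalle2-pytorch` (a working PyTorch install is assumed):

```shell
# install lucidrains' DALLE2-pytorch, which provides the classes used below
pip install dalle2-pytorch
```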
## Community: LAION

Join Us!: https://discord.gg/uPMftTmrvS

---

# Models

The repo currently has many models (most of which are actually pretty bad). I recommend using the latest EMA checkpoints for now.

> **_DISCLAIMER_**: **I will be removing many of the older models**. They were trained on older versions of the repo and massively underperform recent models. **If for whatever reason you want an old model, please make a backup** (you have 7 days from this README commit timestamp).
### Loading the models might look something like this:

> Note: This repo's documentation will get an overhaul \~soon\~. If you're reading this and having issues loading checkpoints, please reach out on LAION.

```python
import torch
from dalle2_pytorch import DiffusionPrior, DiffusionPriorNetwork, OpenAIClipAdapter
from dalle2_pytorch.trainer import DiffusionPriorTrainer


def load_diffusion_model(dprior_path, device):
    # If you are getting issues with size mismatches, it's likely this configuration
    prior_network = DiffusionPriorNetwork(
        dim=768,
        depth=24,
        dim_head=64,
        heads=32,
        normformer=True,
        attn_dropout=5e-2,
        ff_dropout=5e-2,
        num_time_embeds=1,
        num_image_embeds=1,
        num_text_embeds=1,
        num_timesteps=1000,
        ff_mult=4,
    )

    # currently, only ViT-L/14 models are being trained
    diffusion_prior = DiffusionPrior(
        net=prior_network,
        clip=OpenAIClipAdapter("ViT-L/14"),
        image_embed_dim=768,
        timesteps=1000,
        cond_drop_prob=0.1,
        loss_type="l2",
        condition_on_text_encodings=True,
    )

    # this will load the entire trainer
    # If you only want EMA weights for inference you will need to extract them yourself for now
    # (if you beat me to writing a nice function for that please make a PR on Github!)
    trainer = DiffusionPriorTrainer(
        diffusion_prior=diffusion_prior,
        lr=1.1e-4,
        wd=6.02e-2,
        max_grad_norm=0.5,
        amp=False,
        group_wd_params=True,
        use_ema=True,
        device=device,
        accelerator=None,
    )

    trainer.load(dprior_path)

    return trainer
```
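As the comments above mention, the checkpoint loads the entire trainer, and extracting just the EMA weights is left to the reader for now. A minimal sketch of one way to do it — pulling every entry under a given key prefix out of a state dict. The helper name and the `ema_model.` prefix are assumptions for illustration; inspect your checkpoint's keys (`state_dict.keys()`) to find the actual prefix before relying on this:

```python
def extract_prefixed_state_dict(state_dict, prefix):
    """Return a copy of the entries whose keys start with `prefix`,
    with the prefix stripped off.

    Hypothetical helper: the `ema_model.` prefix used below is an
    assumption -- check your checkpoint's actual key layout first.
    """
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# toy demonstration with a fake checkpoint-style dict
fake_ckpt = {
    "ema_model.net.weight": 1,
    "ema_model.net.bias": 2,
    "net.weight": 3,
}
ema_only = extract_prefixed_state_dict(fake_ckpt, "ema_model.")
# ema_only == {"net.weight": 1, "net.bias": 2}
```

The stripped dict can then be fed to `load_state_dict` on a freshly constructed `DiffusionPrior` for inference-only use.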