Commit dfaef4b (verified): Add files using upload-large-folder tool
scepter [INFO] 2025-06-24 07:35:13,340 [File: logger.py Function: init_logger at line 85] Running task with log file: /home/Ubuntu/Downloads/Unmodel/ACE_plus/./examples/exp_example/20250624073511/std_log.txt
scepter [WARNING] 2025-06-24 07:35:13,437 [File: import_utils.py Function: import_module at line 325] ('DATASETS', 'ACEPlusDataset') not found in ast index file
scepter [INFO] 2025-06-24 07:35:13,438 [File: ace_plus_dataset.py Function: read_data_list at line 151] subject has 5 samples.
scepter [INFO] 2025-06-24 07:35:13,439 [File: registry.py Function: __init__ at line 185] Built dataloader with len 9223372036854775807
scepter [WARNING] 2025-06-24 07:35:13,439 [File: import_utils.py Function: import_module at line 325] ('DATASETS', 'ACEPlusDataset') not found in ast index file
scepter [INFO] 2025-06-24 07:35:13,439 [File: ace_plus_dataset.py Function: read_data_list at line 151] subject has 5 samples.
scepter [INFO] 2025-06-24 07:35:13,439 [File: registry.py Function: __init__ at line 185] Built dataloader with len 5
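The first reported length, 9223372036854775807, is not a real sample count: it equals 2**63 - 1 (Python's `sys.maxsize` on 64-bit platforms), a common sentinel for an endlessly cycling training dataloader, while the second loader reports the true 5 evaluation samples. A quick check:

```python
import sys

# The training dataloader's "len" is a sentinel, not a sample count:
# 9223372036854775807 == 2**63 - 1, which is sys.maxsize on 64-bit builds.
SENTINEL = 9223372036854775807
assert SENTINEL == 2**63 - 1
# On a 64-bit platform (an assumption of this sketch) it is also sys.maxsize:
assert SENTINEL == sys.maxsize
```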
scepter [INFO] 2025-06-24 07:36:29,593 [File: flux.py Function: load_pretrained_model at line 450] Restored from /home/Ubuntu/Downloads/Unmodel/Reference_models/flux1-fill-dev.safetensors with 0 missing and 0 unexpected keys
scepter [INFO] 2025-06-24 07:36:29,611 [File: ace_plus_ldm.py Function: construct_network at line 62] all parameters:11.90B
scepter [INFO] 2025-06-24 07:36:30,231 [File: ae_module.py Function: construct_model at line 76] AE Module XFORMERS_IS_AVAILBLE : True
scepter [INFO] 2025-06-24 07:36:31,018 [File: ae_kl.py Function: init_from_ckpt at line 400] Restored from /home/Ubuntu/Downloads/Unmodel/Reference_models/ae.safetensors with 0 missing and 0 unexpected keys
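The "0 missing and 0 unexpected keys" messages above correspond to the two fields of the result that `torch.nn.Module.load_state_dict` returns; a clean checkpoint restore leaves both lists empty. A minimal sketch with a toy module (not the actual FLUX/AE loading code):

```python
import torch.nn as nn

# Toy module standing in for the real network. Restoring a matching
# state dict yields empty missing_keys / unexpected_keys lists, which
# a framework then reports as "0 missing and 0 unexpected keys".
model = nn.Linear(4, 4)
result = model.load_state_dict(model.state_dict(), strict=False)
missing, unexpected = result.missing_keys, result.unexpected_keys
```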
scepter [INFO] 2025-06-24 07:36:42,689 [File: diffusion_solver.py Function: add_tuner at line 788] Added SwiftLoRA tuner weights. Every tensor follows the pattern base_model.model.double_blocks.{i}.{stream}_{submodule}.lora_{A|B}.0_SwiftLoRA.weight, with stream in {img, txt}, i = 0..17, and identical shapes in every block:

  submodule    lora_A shape    lora_B shape
  mod.lin      [64, 3072]      [18432, 64]
  attn.qkv     [64, 3072]      [9216, 64]
  attn.proj    [64, 3072]      [3072, 64]
  mlp.0        [64, 3072]      [12288, 64]
  mlp.2        [64, 12288]     [3072, 64]

Blocks 0-16 are listed in full; the log line is truncated partway through double_blocks.17, after txt_mod.lin.lora_B.
('base_model.model.double_blocks.17.txt_attn.qkv.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.17.txt_attn.qkv.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.double_blocks.17.txt_attn.proj.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.17.txt_attn.proj.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.double_blocks.17.txt_mlp.0.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.17.txt_mlp.0.lora_B.0_SwiftLoRA.weight', torch.Size([12288, 64])), ('base_model.model.double_blocks.17.txt_mlp.2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 12288])), ('base_model.model.double_blocks.17.txt_mlp.2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.double_blocks.18.img_mod.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.img_mod.lin.lora_B.0_SwiftLoRA.weight', torch.Size([18432, 64])), ('base_model.model.double_blocks.18.img_attn.qkv.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.img_attn.qkv.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.double_blocks.18.img_attn.proj.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.img_attn.proj.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.double_blocks.18.img_mlp.0.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.img_mlp.0.lora_B.0_SwiftLoRA.weight', torch.Size([12288, 64])), ('base_model.model.double_blocks.18.img_mlp.2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 12288])), ('base_model.model.double_blocks.18.img_mlp.2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.double_blocks.18.txt_mod.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.txt_mod.lin.lora_B.0_SwiftLoRA.weight', 
torch.Size([18432, 64])), ('base_model.model.double_blocks.18.txt_attn.qkv.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.txt_attn.qkv.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.double_blocks.18.txt_attn.proj.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.txt_attn.proj.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.double_blocks.18.txt_mlp.0.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.double_blocks.18.txt_mlp.0.lora_B.0_SwiftLoRA.weight', torch.Size([12288, 64])), ('base_model.model.double_blocks.18.txt_mlp.2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 12288])), ('base_model.model.double_blocks.18.txt_mlp.2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.0.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.0.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.0.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.0.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.0.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.0.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.1.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.1.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.1.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.1.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.1.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.1.modulation.lin.lora_B.0_SwiftLoRA.weight', 
torch.Size([9216, 64])), ('base_model.model.single_blocks.2.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.2.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.2.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.2.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.2.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.2.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.3.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.3.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.3.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.3.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.3.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.3.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.4.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.4.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.4.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.4.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.4.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.4.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.5.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.5.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), 
('base_model.model.single_blocks.5.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.5.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.5.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.5.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.6.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.6.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.6.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.6.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.6.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.6.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.7.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.7.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.7.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.7.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.7.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.7.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.8.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.8.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.8.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.8.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), 
('base_model.model.single_blocks.8.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.8.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.9.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.9.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.9.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.9.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.9.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.9.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.10.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.10.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.10.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.10.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.10.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.10.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.11.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.11.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.11.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.11.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.11.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.11.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), 
('base_model.model.single_blocks.12.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.12.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.12.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.12.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.12.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.12.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.13.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.13.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.13.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.13.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.13.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.13.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.14.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.14.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.14.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.14.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.14.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.14.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.15.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.15.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), 
('base_model.model.single_blocks.15.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.15.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.15.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.15.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.16.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.16.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.16.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.16.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.16.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.16.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.17.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.17.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.17.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.17.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.17.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.17.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.18.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.18.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.18.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.18.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), 
('base_model.model.single_blocks.18.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.18.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.19.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.19.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.19.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.19.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.19.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.19.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.20.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.20.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.20.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.20.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.20.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.20.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.21.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.21.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.21.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.21.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.21.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.21.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 
64])), ('base_model.model.single_blocks.22.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.22.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.22.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.22.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.22.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.22.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.23.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.23.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.23.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.23.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.23.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.23.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.24.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.24.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.24.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.24.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.24.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.24.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.25.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.25.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), 
('base_model.model.single_blocks.25.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.25.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.25.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.25.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.26.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.26.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.26.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.26.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.26.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.26.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.27.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.27.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.27.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.27.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.27.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.27.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.28.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.28.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.28.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.28.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), 
('base_model.model.single_blocks.28.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.28.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.29.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.29.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.29.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.29.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.29.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.29.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.30.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.30.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.30.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.30.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.30.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.30.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.31.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.31.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.31.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.31.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.31.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.31.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 
64])), ('base_model.model.single_blocks.32.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.32.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.32.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.32.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.32.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.32.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.33.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.33.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.33.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.33.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.33.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.33.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.34.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.34.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.34.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.34.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.34.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.34.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.35.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.35.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), 
('base_model.model.single_blocks.35.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.35.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.35.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.35.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.36.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.36.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.36.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.36.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.36.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.36.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64])), ('base_model.model.single_blocks.37.linear1.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.37.linear1.lora_B.0_SwiftLoRA.weight', torch.Size([21504, 64])), ('base_model.model.single_blocks.37.linear2.lora_A.0_SwiftLoRA.weight', torch.Size([64, 15360])), ('base_model.model.single_blocks.37.linear2.lora_B.0_SwiftLoRA.weight', torch.Size([3072, 64])), ('base_model.model.single_blocks.37.modulation.lin.lora_A.0_SwiftLoRA.weight', torch.Size([64, 3072])), ('base_model.model.single_blocks.37.modulation.lin.lora_B.0_SwiftLoRA.weight', torch.Size([9216, 64]))]
scepter [INFO] 2025-06-24 07:36:42,703 [File: diffusion_solver.py Function: print_model_params_status at line 996] Load trainable params 306315264 / 17178094051 = 1.78%, train part: {'model.double_blocks': 171835392, 'model.single_blocks': 134479872}.
scepter [INFO] 2025-06-24 07:36:42,703 [File: diffusion_solver.py Function: print_model_params_status at line 1000] Load frozen params 16871778787 / 17178094051 = 98.22%, frozen part: {'model': 11902587968, 'first_stage_model': 83819683, 'cond_stage_model': 4885371136}.
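The trainable-parameter counts reported above follow directly from the rank-64 LoRA shapes listed earlier: each adapted Linear contributes r*(in_features + out_features) weights (lora_A is [r, in], lora_B is [out, r]). A minimal sanity check in plain Python, with all shapes read from the log itself:

```python
# Rank-64 LoRA adds r * (in_features + out_features) weights per adapted Linear.
# All (in, out) pairs below are taken from the tensor shapes logged above.
R = 64

# Layers adapted in each of the 19 DoubleStreamBlocks; the img_* and txt_*
# branches are twins, hence the factor of 2.
double_layers = 2 * [(3072, 18432),   # *_mod.lin
                     (3072, 9216),    # *_attn.qkv
                     (3072, 3072),    # *_attn.proj
                     (3072, 12288),   # *_mlp.0
                     (12288, 3072)]   # *_mlp.2

# Layers adapted in each of the 38 single_blocks (indices 0-37 in the log).
single_layers = [(3072, 21504),       # linear1
                 (15360, 3072),       # linear2
                 (3072, 9216)]        # modulation.lin

double_total = 19 * sum(R * (i + o) for i, o in double_layers)
single_total = 38 * sum(R * (i + o) for i, o in single_layers)

print(double_total)                 # 171835392, as logged for model.double_blocks
print(single_total)                 # 134479872, as logged for model.single_blocks
print(double_total + single_total)  # 306315264 trainable, i.e. 1.78% of 17178094051
```

The totals reproduce the logged figures exactly, confirming that only the double_blocks and single_blocks carry adapters while the 11.9B-parameter base model, VAE (first_stage_model), and text encoders (cond_stage_model) stay frozen.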
scepter [INFO] 2025-06-24 07:37:18,925 [File: diffusion_solver.py Function: set_up at line 230] SwiftModel(
  (base_model): LatentDiffusionACEPlus LatentDiffusionACEPlus(
    (model): FluxMRModiACEPlus FluxMRModiACEPlus(
      (pe_embedder): EmbedND()
      (img_in): Linear(in_features=448, out_features=3072, bias=True)
      (time_in): MLPEmbedder(
        (in_layer): Linear(in_features=256, out_features=3072, bias=True)
        (silu): SiLU()
        (out_layer): Linear(in_features=3072, out_features=3072, bias=True)
      )
      (vector_in): MLPEmbedder(
        (in_layer): Linear(in_features=768, out_features=3072, bias=True)
        (silu): SiLU()
        (out_layer): Linear(in_features=3072, out_features=3072, bias=True)
      )
      (guidance_in): MLPEmbedder(
        (in_layer): Linear(in_features=256, out_features=3072, bias=True)
        (silu): SiLU()
        (out_layer): Linear(in_features=3072, out_features=3072, bias=True)
      )
      (txt_in): Linear(in_features=4096, out_features=3072, bias=True)
      (double_blocks): ModuleList(
        (0-18): 19 x DoubleStreamBlock(
          (img_mod): Modulation(
            (lin): lora.Linear(
              (base_layer): Linear(in_features=3072, out_features=18432, bias=True)
              (lora_dropout): ModuleDict(
                (0_SwiftLoRA): Identity()
              )
              (lora_A): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
              )
              (lora_B): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=64, out_features=18432, bias=False)
              )
              (lora_embedding_A): ParameterDict()
              (lora_embedding_B): ParameterDict()
              (lora_magnitude_vector): ModuleDict()
            )
          )
          (img_norm1): LayerNorm((3072,), eps=1e-06, elementwise_affine=False)
          (img_attn): SelfAttention(
            (qkv): lora.Linear(
              (base_layer): Linear(in_features=3072, out_features=9216, bias=True)
              (lora_dropout): ModuleDict(
                (0_SwiftLoRA): Identity()
              )
              (lora_A): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
              )
              (lora_B): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=64, out_features=9216, bias=False)
              )
              (lora_embedding_A): ParameterDict()
              (lora_embedding_B): ParameterDict()
              (lora_magnitude_vector): ModuleDict()
            )
            (norm): QKNorm(
              (query_norm): RMSNorm()
              (key_norm): RMSNorm()
            )
            (proj): lora.Linear(
              (base_layer): Linear(in_features=3072, out_features=3072, bias=True)
              (lora_dropout): ModuleDict(
                (0_SwiftLoRA): Identity()
              )
              (lora_A): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
              )
              (lora_B): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=64, out_features=3072, bias=False)
              )
              (lora_embedding_A): ParameterDict()
              (lora_embedding_B): ParameterDict()
              (lora_magnitude_vector): ModuleDict()
            )
          )
          (img_norm2): LayerNorm((3072,), eps=1e-06, elementwise_affine=False)
          (img_mlp): Sequential(
            (0): lora.Linear(
              (base_layer): Linear(in_features=3072, out_features=12288, bias=True)
              (lora_dropout): ModuleDict(
                (0_SwiftLoRA): Identity()
              )
              (lora_A): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
              )
              (lora_B): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=64, out_features=12288, bias=False)
              )
              (lora_embedding_A): ParameterDict()
              (lora_embedding_B): ParameterDict()
              (lora_magnitude_vector): ModuleDict()
            )
            (1): GELU(approximate='tanh')
            (2): lora.Linear(
              (base_layer): Linear(in_features=12288, out_features=3072, bias=True)
              (lora_dropout): ModuleDict(
                (0_SwiftLoRA): Identity()
              )
              (lora_A): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=12288, out_features=64, bias=False)
              )
              (lora_B): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=64, out_features=3072, bias=False)
              )
              (lora_embedding_A): ParameterDict()
              (lora_embedding_B): ParameterDict()
              (lora_magnitude_vector): ModuleDict()
            )
          )
          (txt_mod): Modulation(
            (lin): lora.Linear(
              (base_layer): Linear(in_features=3072, out_features=18432, bias=True)
              (lora_dropout): ModuleDict(
                (0_SwiftLoRA): Identity()
              )
              (lora_A): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
              )
              (lora_B): ModuleDict(
                (0_SwiftLoRA): Linear(in_features=64, out_features=18432, bias=False)
              )
              (lora_embedding_A): ParameterDict()
              (lora_embedding_B): ParameterDict()
              (lora_magnitude_vector): ModuleDict()
            )
          )
          (txt_norm1): LayerNorm((3072,), eps=1e-06, elementwise_affine=False)
          (txt_attn): SelfAttention(
(qkv): lora.Linear(
(base_layer): Linear(in_features=3072, out_features=9216, bias=True)
(lora_dropout): ModuleDict(
(0_SwiftLoRA): Identity()
)
(lora_A): ModuleDict(
(0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
)
(lora_B): ModuleDict(
(0_SwiftLoRA): Linear(in_features=64, out_features=9216, bias=False)
)
(lora_embedding_A): ParameterDict()
(lora_embedding_B): ParameterDict()
(lora_magnitude_vector): ModuleDict()
)
(norm): QKNorm(
(query_norm): RMSNorm()
(key_norm): RMSNorm()
)
(proj): lora.Linear(
(base_layer): Linear(in_features=3072, out_features=3072, bias=True)
(lora_dropout): ModuleDict(
(0_SwiftLoRA): Identity()
)
(lora_A): ModuleDict(
(0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
)
(lora_B): ModuleDict(
(0_SwiftLoRA): Linear(in_features=64, out_features=3072, bias=False)
)
(lora_embedding_A): ParameterDict()
(lora_embedding_B): ParameterDict()
(lora_magnitude_vector): ModuleDict()
)
)
(txt_norm2): LayerNorm((3072,), eps=1e-06, elementwise_affine=False)
(txt_mlp): Sequential(
(0): lora.Linear(
(base_layer): Linear(in_features=3072, out_features=12288, bias=True)
(lora_dropout): ModuleDict(
(0_SwiftLoRA): Identity()
)
(lora_A): ModuleDict(
(0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
)
(lora_B): ModuleDict(
(0_SwiftLoRA): Linear(in_features=64, out_features=12288, bias=False)
)
(lora_embedding_A): ParameterDict()
(lora_embedding_B): ParameterDict()
(lora_magnitude_vector): ModuleDict()
)
(1): GELU(approximate='tanh')
(2): lora.Linear(
(base_layer): Linear(in_features=12288, out_features=3072, bias=True)
(lora_dropout): ModuleDict(
(0_SwiftLoRA): Identity()
)
(lora_A): ModuleDict(
(0_SwiftLoRA): Linear(in_features=12288, out_features=64, bias=False)
)
(lora_B): ModuleDict(
(0_SwiftLoRA): Linear(in_features=64, out_features=3072, bias=False)
)
(lora_embedding_A): ParameterDict()
(lora_embedding_B): ParameterDict()
(lora_magnitude_vector): ModuleDict()
)
)
)
)
(single_blocks): ModuleList(
(0-37): 38 x SingleStreamBlock(
(linear1): lora.Linear(
(base_layer): Linear(in_features=3072, out_features=21504, bias=True)
(lora_dropout): ModuleDict(
(0_SwiftLoRA): Identity()
)
(lora_A): ModuleDict(
(0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
)
(lora_B): ModuleDict(
(0_SwiftLoRA): Linear(in_features=64, out_features=21504, bias=False)
)
(lora_embedding_A): ParameterDict()
(lora_embedding_B): ParameterDict()
(lora_magnitude_vector): ModuleDict()
)
(linear2): lora.Linear(
(base_layer): Linear(in_features=15360, out_features=3072, bias=True)
(lora_dropout): ModuleDict(
(0_SwiftLoRA): Identity()
)
(lora_A): ModuleDict(
(0_SwiftLoRA): Linear(in_features=15360, out_features=64, bias=False)
)
(lora_B): ModuleDict(
(0_SwiftLoRA): Linear(in_features=64, out_features=3072, bias=False)
)
(lora_embedding_A): ParameterDict()
(lora_embedding_B): ParameterDict()
(lora_magnitude_vector): ModuleDict()
)
(norm): QKNorm(
(query_norm): RMSNorm()
(key_norm): RMSNorm()
)
(pre_norm): LayerNorm((3072,), eps=1e-06, elementwise_affine=False)
(mlp_act): GELU(approximate='tanh')
(modulation): Modulation(
(lin): lora.Linear(
(base_layer): Linear(in_features=3072, out_features=9216, bias=True)
(lora_dropout): ModuleDict(
(0_SwiftLoRA): Identity()
)
(lora_A): ModuleDict(
(0_SwiftLoRA): Linear(in_features=3072, out_features=64, bias=False)
)
(lora_B): ModuleDict(
(0_SwiftLoRA): Linear(in_features=64, out_features=9216, bias=False)
)
(lora_embedding_A): ParameterDict()
(lora_embedding_B): ParameterDict()
(lora_magnitude_vector): ModuleDict()
)
)
)
)
(final_layer): LastLayer(
(norm_final): LayerNorm((3072,), eps=1e-06, elementwise_affine=False)
(linear): Linear(in_features=3072, out_features=64, bias=True)
(adaLN_modulation): Sequential(
(0): SiLU()
(1): Linear(in_features=3072, out_features=6144, bias=True)
)
)
)
(first_stage_model): AutoencoderKLFlux AutoencoderKLFlux(
(encoder): Encoder Encoder(
(conv_in): Conv2d(3, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(down): ModuleList(
(0): Module(
(block): ModuleList(
(0-1): 2 x ResnetBlock(
(norm1): GroupNorm(32, 128, eps=1e-06, affine=True)
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 128, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(attn): ModuleList()
(downsample): Downsample(
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2))
)
)
(1): Module(
(block): ModuleList(
(0): ResnetBlock(
(norm1): GroupNorm(32, 128, eps=1e-06, affine=True)
(conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 256, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(nin_shortcut): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1))
)
(1): ResnetBlock(
(norm1): GroupNorm(32, 256, eps=1e-06, affine=True)
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 256, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(attn): ModuleList()
(downsample): Downsample(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2))
)
)
(2): Module(
(block): ModuleList(
(0): ResnetBlock(
(norm1): GroupNorm(32, 256, eps=1e-06, affine=True)
(conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(nin_shortcut): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1))
)
(1): ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(attn): ModuleList()
(downsample): Downsample(
(conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2))
)
)
(3): Module(
(block): ModuleList(
(0-1): 2 x ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(attn): ModuleList()
)
)
(mid): Module(
(block_1): ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
(attn_1): MemoryEfficientAttention(
(norm): GroupNorm(32, 512, eps=1e-06, affine=True)
(q): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(k): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(v): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(proj_out): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
)
(block_2): ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(norm_out): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv_out): Conv2d(512, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
(decoder): Decoder Decoder(
(conv_in): Conv2d(16, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(mid): Module(
(block_1): ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
(attn_1): MemoryEfficientAttention(
(norm): GroupNorm(32, 512, eps=1e-06, affine=True)
(q): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(k): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(v): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(proj_out): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
)
(block_2): ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(up): ModuleList(
(0): Module(
(block): ModuleList(
(0): ResnetBlock(
(norm1): GroupNorm(32, 256, eps=1e-06, affine=True)
(conv1): Conv2d(256, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 128, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(nin_shortcut): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1))
)
(1-2): 2 x ResnetBlock(
(norm1): GroupNorm(32, 128, eps=1e-06, affine=True)
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 128, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(attn): ModuleList()
)
(1): Module(
(block): ModuleList(
(0): ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 256, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(nin_shortcut): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
)
(1-2): 2 x ResnetBlock(
(norm1): GroupNorm(32, 256, eps=1e-06, affine=True)
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 256, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(attn): ModuleList()
(upsample): Upsample(
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(2-3): 2 x Module(
(block): ModuleList(
(0-2): 3 x ResnetBlock(
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True)
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True)
(dropout): Dropout(p=0.0, inplace=False)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
(attn): ModuleList()
(upsample): Upsample(
(conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
)
(norm_out): GroupNorm(32, 128, eps=1e-06, affine=True)
(conv_out): Conv2d(128, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
(conv1): Identity()
(conv2): Identity()
)
(cond_stage_model): T5ACEPlusClipFluxEmbedder T5ACEPlusClipFluxEmbedder(
(t5_model): ACEHFEmbedder ACEHFEmbedder(
(hf_module): T5EncoderModel(
(shared): Embedding(32128, 4096)
(encoder): T5Stack(
(embed_tokens): Embedding(32128, 4096)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=4096, out_features=4096, bias=False)
(k): Linear(in_features=4096, out_features=4096, bias=False)
(v): Linear(in_features=4096, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=4096, bias=False)
(relative_attention_bias): Embedding(32, 64)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=4096, out_features=10240, bias=False)
(wi_1): Linear(in_features=4096, out_features=10240, bias=False)
(wo): Linear(in_features=10240, out_features=4096, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-23): 23 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=4096, out_features=4096, bias=False)
(k): Linear(in_features=4096, out_features=4096, bias=False)
(v): Linear(in_features=4096, out_features=4096, bias=False)
(o): Linear(in_features=4096, out_features=4096, bias=False)
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=4096, out_features=10240, bias=False)
(wi_1): Linear(in_features=4096, out_features=10240, bias=False)
(wo): Linear(in_features=10240, out_features=4096, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): T5LayerNorm()
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(clip_model): ACEHFEmbedder ACEHFEmbedder(
(hf_module): CLIPTextModel(
(text_model): CLIPTextTransformer(
(embeddings): CLIPTextEmbeddings(
(token_embedding): Embedding(49408, 768)
(position_embedding): Embedding(77, 768)
)
(encoder): CLIPEncoder(
(layers): ModuleList(
(0-11): 12 x CLIPEncoderLayer(
(self_attn): CLIPSdpaAttention(
(k_proj): Linear(in_features=768, out_features=768, bias=True)
(v_proj): Linear(in_features=768, out_features=768, bias=True)
(q_proj): Linear(in_features=768, out_features=768, bias=True)
(out_proj): Linear(in_features=768, out_features=768, bias=True)
)
(layer_norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(mlp): CLIPMLP(
(activation_fn): QuickGELUActivation()
(fc1): Linear(in_features=768, out_features=3072, bias=True)
(fc2): Linear(in_features=3072, out_features=768, bias=True)
)
(layer_norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
)
)
)
(final_layer_norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
)
)
)
)
)
)
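The dump above shows every attention, MLP, and modulation `Linear` in both the double-stream and single-stream blocks wrapped as `lora.Linear` with a single rank-64 adapter (`0_SwiftLoRA`), while the base weights stay frozen. The trainable footprint of one such adapter can be checked directly from the printed shapes; a minimal sketch in plain Python, using the `img_mod.lin` entry above (base 3072 → 18432, `lora_A` 3072 → 64, `lora_B` 64 → 18432):

```python
# Rank-64 LoRA adapter on the img_mod.lin layer shown in the dump above:
# base Linear is 3072 -> 18432, lora_A is 3072 -> 64, lora_B is 64 -> 18432.
in_features, out_features, rank = 3072, 18432, 64

full_params = in_features * out_features                # frozen base weight
lora_params = rank * in_features + out_features * rank  # trainable A + B

print(full_params)                # 56623104
print(lora_params)                # 1376256
print(lora_params / full_params)  # ~0.0243: adapter is ~2.4% of the base weight
```

The same ratio holds for every wrapped layer, which is why the 11.90B-parameter model reported at construction time can be tuned with a comparatively small LoRA state dict.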
scepter [INFO] 2025-06-24 07:37:18,962 [File: log.py Function: before_solve at line 260] Tensorboard: save to ./examples/exp_example/20250624073511/tensorboard
scepter [INFO] 2025-06-24 07:43:30,805 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [20/100000], data_time: 12.7230(12.7230), time: 18.5911(18.5911), loss: 0.1549(0.1549), throughput: 23664/day, all_throughput: 20, pg0_lr: 0.001000, scale: 1.000000, [8mins 17secs 0.02%(28days 18hours 39mins 9secs)]
scepter [INFO] 2025-06-24 07:49:40,079 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [40/100000], data_time: 12.8183(12.7707), time: 18.4638(18.5274), loss: 0.1357(0.1453), throughput: 23698/day, all_throughput: 40, pg0_lr: 0.001000, scale: 1.000000, [14mins 26secs 0.04%(25days 1hours 35mins 41secs)]
scepter [INFO] 2025-06-24 07:52:42,409 [File: checkpoint.py Function: after_iter at line 109] Saving checkpoint after 50 steps
scepter [INFO] 2025-06-24 07:55:48,532 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [60/100000], data_time: 12.7975(12.7796), time: 18.4226(18.4925), loss: 0.1088(0.1332), throughput: 23757/day, all_throughput: 60, pg0_lr: 0.001000, scale: 1.000000, [20mins 35secs 0.06%(23days 19hours 27mins 38secs)]
scepter [INFO] 2025-06-24 08:01:55,863 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [80/100000], data_time: 12.5829(12.7304), time: 18.3665(18.4610), loss: 0.1000(0.1249), throughput: 23724/day, all_throughput: 80, pg0_lr: 0.001000, scale: 1.000000, [26mins 42secs 0.08%(23days 3hours 57mins 11secs)]
scepter [INFO] 2025-06-24 08:08:02,295 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [100/100000], data_time: 12.4769(12.6797), time: 18.3217(18.4331), loss: 0.0889(0.1177), throughput: 23748/day, all_throughput: 100, pg0_lr: 0.001000, scale: 1.000000, [32mins 48secs 0.10%(22days 18hours 21mins 29secs)]
scepter [INFO] 2025-06-24 08:08:02,296 [File: checkpoint.py Function: after_iter at line 109] Saving checkpoint after 100 steps
scepter [INFO] 2025-06-24 08:14:11,503 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [120/100000], data_time: 12.8457(12.7074), time: 18.4603(18.4377), loss: 0.0917(0.1134), throughput: 23766/day, all_throughput: 120, pg0_lr: 0.001000, scale: 1.000000, [38mins 58secs 0.12%(22days 12hours 34mins 10secs)]
scepter [INFO] 2025-06-24 08:20:09,138 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [140/100000], data_time: 12.2640(12.6440), time: 17.8818(18.3583), loss: 0.1563(0.1195), throughput: 23784/day, all_throughput: 140, pg0_lr: 0.001000, scale: 1.000000, [44mins 55secs 0.14%(22days 6hours 6mins 45secs)]
scepter [INFO] 2025-06-24 08:23:12,656 [File: checkpoint.py Function: after_iter at line 109] Saving checkpoint after 150 steps
scepter [INFO] 2025-06-24 08:26:15,711 [File: log.py Function: _print_iter_log at line 71] Stage [train] iter: [160/100000], data_time: 12.6226(12.6414), time: 18.3286(18.3545), loss: 0.1930(0.1287), throughput: 23818/day, all_throughput: 160, pg0_lr: 0.001000, scale: 1.000000, [51mins 2secs 0.16%(22days 2hours 47mins 39secs)]
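The per-iteration lines above follow a fixed layout (`iter: [cur/total]`, then metrics as `current(running_average)` pairs), so loss curves can be pulled straight out of `std_log.txt` with a regex. A hedged sketch (the pattern is inferred from the lines above; adjust it if your scepter version formats `_print_iter_log` differently):

```python
import re

# Matches e.g. "iter: [20/100000], ... loss: 0.1549(0.1549)" in scepter's
# iteration log lines; captures the current iter, total iters, and the
# instantaneous and running-average loss values.
PATTERN = re.compile(r"iter: \[(\d+)/(\d+)\].*?loss: ([\d.]+)\(([\d.]+)\)")

def parse_iter_line(line: str):
    """Return (iter, total, loss, avg_loss), or None for non-iteration lines."""
    m = PATTERN.search(line)
    if m is None:
        return None
    it, total, loss, avg = m.groups()
    return int(it), int(total), float(loss), float(avg)

sample = ("Stage [train] iter: [20/100000], data_time: 12.7230(12.7230), "
          "time: 18.5911(18.5911), loss: 0.1549(0.1549), throughput: 23664/day")
print(parse_iter_line(sample))  # (20, 100000, 0.1549, 0.1549)
```

Non-iteration lines (e.g. the "Saving checkpoint" messages) simply return `None`, so the whole log file can be filtered in one pass.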