recoilme committed on
Commit 55aa8e7 · 1 Parent(s): 2016da2
samples/sdxs_1b_384x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 04e973e35af92734bfb31877d2a5a3a5d548246b06bb6a2f26803635b333b9e7
  • Pointer size: 130 Bytes
  • Size of remote file: 75.4 kB

Git LFS Details

  • SHA256: 4d4a6c84b707a2b71d28dd6cde56c89737fb96c12ad2873033445309a8578cf1
  • Pointer size: 130 Bytes
  • Size of remote file: 72.6 kB
samples/sdxs_1b_416x768_0.jpg CHANGED

Git LFS Details

  • SHA256: ef3a15fb13c198be3a2f1359e993998f8bf8dcb85f716dea457cc859eb9751c5
  • Pointer size: 131 Bytes
  • Size of remote file: 143 kB

Git LFS Details

  • SHA256: 2732a598b73cc0cd816a08b31cb0c5ca984f5bd67be439c6761917f6dcbded11
  • Pointer size: 130 Bytes
  • Size of remote file: 65.7 kB
samples/sdxs_1b_448x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 3051f142812e47c678363a93dafff38b75845714389790441e50a4ecafe639f5
  • Pointer size: 130 Bytes
  • Size of remote file: 93.2 kB

Git LFS Details

  • SHA256: 34f8fada96c7be0ad2f3369a23f01352afbf4612c3ee1d8306d79a26c3035520
  • Pointer size: 131 Bytes
  • Size of remote file: 147 kB
samples/sdxs_1b_480x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 1f68b43fd1e1a684a8e8a5b34713393081c3097d923db99fa99236a93cea3512
  • Pointer size: 131 Bytes
  • Size of remote file: 301 kB

Git LFS Details

  • SHA256: 0090838f7e862ebec5dde85bb5e02a749b18beadb86f77d20876b00c6cfc5d0f
  • Pointer size: 131 Bytes
  • Size of remote file: 143 kB
samples/sdxs_1b_512x768_0.jpg CHANGED

Git LFS Details

  • SHA256: fb71463171685787a5a3d1cc600dd46ca221eee9d56cc549f2247e936e000457
  • Pointer size: 131 Bytes
  • Size of remote file: 179 kB

Git LFS Details

  • SHA256: b28abae45ac8f86bb8a766b4fa0fe5d707509139379f1b6a8c601811d9aaadc3
  • Pointer size: 130 Bytes
  • Size of remote file: 97.1 kB
samples/sdxs_1b_544x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 9bc771776872e1b17c5ca4d6cdf267b5f1e5de1d3481d66d11fc98d45527cc6c
  • Pointer size: 131 Bytes
  • Size of remote file: 185 kB

Git LFS Details

  • SHA256: e668cd9934ea4c429e756980e118293fa15e4455b7ca0a7668805e80e39f6fce
  • Pointer size: 131 Bytes
  • Size of remote file: 133 kB
samples/sdxs_1b_576x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 68aa07d63bdd9437b0da26a1ed99ab710daca55627494196fa8697520455a18d
  • Pointer size: 131 Bytes
  • Size of remote file: 121 kB

Git LFS Details

  • SHA256: a54eb20ec3fd979bb68c3b6528bc9113f027852f69a05f85ca23bb072c97e571
  • Pointer size: 131 Bytes
  • Size of remote file: 138 kB
samples/sdxs_1b_608x768_0.jpg CHANGED

Git LFS Details

  • SHA256: c6b4c65b62eb6aba01e8cb585d45cb87d6ea8f3a5c99578cd9a61d8f8ac75707
  • Pointer size: 131 Bytes
  • Size of remote file: 270 kB

Git LFS Details

  • SHA256: 01a70540c4fb8496527bd4703a561abc4f7c48908ee0e3fcb826917ffe7e670e
  • Pointer size: 131 Bytes
  • Size of remote file: 239 kB
samples/sdxs_1b_640x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 874c5035246f3b58839bba13f1b25076a5b9d1222bc8f4362d4f9411c56a6589
  • Pointer size: 131 Bytes
  • Size of remote file: 132 kB

Git LFS Details

  • SHA256: 26a486fb877e042c27b76b5ee8f8a368fb70cd3a2e9e972f05fd328653c4f462
  • Pointer size: 131 Bytes
  • Size of remote file: 231 kB
samples/sdxs_1b_672x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 792f7f42a44d290b45d44a07cf6b68d642fb24eca5ec1632612efc6cfb81a1b9
  • Pointer size: 131 Bytes
  • Size of remote file: 190 kB

Git LFS Details

  • SHA256: bd5ccb34eb5fe326da19f3297f99541fc0fda94dcfa2396362a438e743a2e507
  • Pointer size: 131 Bytes
  • Size of remote file: 102 kB
samples/sdxs_1b_704x768_0.jpg CHANGED

Git LFS Details

  • SHA256: c020f7189bc050a687fff7f500019adcccaf331e5300dd3dcbe282547428997c
  • Pointer size: 131 Bytes
  • Size of remote file: 161 kB

Git LFS Details

  • SHA256: a03eb642879aa394d38726cd4d3c1fe77bef0c2a8aff7b0b028d8d65c6baf889
  • Pointer size: 130 Bytes
  • Size of remote file: 89.5 kB
samples/sdxs_1b_736x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 7c7337c4ee1c596b0c7bfdb886eb44e29f5e919b8f314ed51173c6a5bc982b0c
  • Pointer size: 131 Bytes
  • Size of remote file: 145 kB

Git LFS Details

  • SHA256: 55b2262cf5c2ca93aee6bdb11cf3f52c05e58823e2d93649dbe87c42f2ca3c0d
  • Pointer size: 131 Bytes
  • Size of remote file: 276 kB
samples/sdxs_1b_768x384_0.jpg CHANGED

Git LFS Details

  • SHA256: 622ce7eef549cf3a412353120eb0386546bd0583f623e2a462b1845c48099fa1
  • Pointer size: 131 Bytes
  • Size of remote file: 252 kB

Git LFS Details

  • SHA256: d72f2969879e2e86d57115a5aadd1e14f877641a9d600a30189a21d71790fae8
  • Pointer size: 130 Bytes
  • Size of remote file: 76.7 kB
samples/sdxs_1b_768x416_0.jpg CHANGED

Git LFS Details

  • SHA256: 2105f3d6b6a10c0ef4cf3097cf60fb205890c5647500700de58ca9156a9622df
  • Pointer size: 131 Bytes
  • Size of remote file: 131 kB

Git LFS Details

  • SHA256: 1f8ebdbd02c645d3793ca8aeead317e450fa0a98e3c5bb5cbc7b60fa13980bf7
  • Pointer size: 131 Bytes
  • Size of remote file: 172 kB
samples/sdxs_1b_768x448_0.jpg CHANGED

Git LFS Details

  • SHA256: 8e63e99418ceceecb8ecba636c8c4d44bf6c230439055f583b0634119156ab3c
  • Pointer size: 131 Bytes
  • Size of remote file: 151 kB

Git LFS Details

  • SHA256: 14913038596f6f126b1380d9df5a8b5c2f2c64ed5894553bac46f4f83f3a0e1e
  • Pointer size: 130 Bytes
  • Size of remote file: 88.7 kB
samples/sdxs_1b_768x480_0.jpg CHANGED

Git LFS Details

  • SHA256: cbb3fb683b10bbb48039b15371092fc84300e3f8705ca0ae6a93bf406a8da880
  • Pointer size: 131 Bytes
  • Size of remote file: 180 kB

Git LFS Details

  • SHA256: 9f76a0c4cd682c7e06b969d3299804b279533ad33d6b8bd57011005a08aa828a
  • Pointer size: 130 Bytes
  • Size of remote file: 95.8 kB
samples/sdxs_1b_768x512_0.jpg CHANGED

Git LFS Details

  • SHA256: 2e1bcd280987b762c5adbd6eb420e4a7a6db56837d8a429f0c255b362186ef14
  • Pointer size: 131 Bytes
  • Size of remote file: 131 kB

Git LFS Details

  • SHA256: dec45499962026109f348f7eed3b0d0895faa523aafa456b508972ef84f839b2
  • Pointer size: 131 Bytes
  • Size of remote file: 221 kB
samples/sdxs_1b_768x544_0.jpg CHANGED

Git LFS Details

  • SHA256: 24e992127aa14803ae3b82d78d6fede7f0023c03edc5ec701c73a3c4926b25b2
  • Pointer size: 131 Bytes
  • Size of remote file: 152 kB

Git LFS Details

  • SHA256: ffeda26ff6e489f382d4c8be56183cd58a32bc115d52a85bd63b9b1b3cbeadf9
  • Pointer size: 131 Bytes
  • Size of remote file: 146 kB
samples/sdxs_1b_768x576_0.jpg CHANGED

Git LFS Details

  • SHA256: 693106a26aff4d77744c59e7fbd5fcb3edc0f0d30072fdcd2c9a036a9b0a2f50
  • Pointer size: 130 Bytes
  • Size of remote file: 93.7 kB

Git LFS Details

  • SHA256: 3f03be230aa478ebf756f0e3e91e8e64df6725f315fee2a0063be28f770925b3
  • Pointer size: 130 Bytes
  • Size of remote file: 99.4 kB
samples/sdxs_1b_768x608_0.jpg CHANGED

Git LFS Details

  • SHA256: db6fe5c7b64ef2528a838c7f26b5da8549e87bd486146a3b9d0ade0c052d1e81
  • Pointer size: 131 Bytes
  • Size of remote file: 240 kB

Git LFS Details

  • SHA256: f28e4e4971e4ae2f091b61d888b3a35117913f8d1895f0a056a9478f585063e9
  • Pointer size: 131 Bytes
  • Size of remote file: 160 kB
samples/sdxs_1b_768x640_0.jpg CHANGED

Git LFS Details

  • SHA256: b8bf0081508aff64350ca55dbffce57ccf3db295b65de30799e82699bd0a8cbf
  • Pointer size: 131 Bytes
  • Size of remote file: 156 kB

Git LFS Details

  • SHA256: cd262f9dd75542c3910310696ed8085547a3ec0c6609ae65ae1e2adc0db59300
  • Pointer size: 131 Bytes
  • Size of remote file: 133 kB
samples/sdxs_1b_768x672_0.jpg CHANGED

Git LFS Details

  • SHA256: a0198ef31c35bc409b3da00518249ec7b324abfd814d51d6cc6c771c5f1d5fb9
  • Pointer size: 131 Bytes
  • Size of remote file: 231 kB

Git LFS Details

  • SHA256: e7ff172b1c6b5ab8c864367c6a06ab752822a64c405b80b55eaf4b29ec2c59e9
  • Pointer size: 131 Bytes
  • Size of remote file: 161 kB
samples/sdxs_1b_768x704_0.jpg CHANGED

Git LFS Details

  • SHA256: a5d9033fbdadcf6fb88e933c8a3283e74bc54155a4a05a52546495f05c259aca
  • Pointer size: 131 Bytes
  • Size of remote file: 252 kB

Git LFS Details

  • SHA256: bbcc65534512af7fa116dada3305006a9cc5eaa1d6d4b1caa664858151ca50fd
  • Pointer size: 131 Bytes
  • Size of remote file: 205 kB
samples/sdxs_1b_768x736_0.jpg CHANGED

Git LFS Details

  • SHA256: 4cb94b9587b8dcc3b65bf9649eb1f2bd34cfa237c35048c583001285e24912e6
  • Pointer size: 131 Bytes
  • Size of remote file: 195 kB

Git LFS Details

  • SHA256: a22feb64669d2755499ab4215b31be27122d4fd3734b265e9a69de7984037432
  • Pointer size: 131 Bytes
  • Size of remote file: 318 kB
samples/sdxs_1b_768x768_0.jpg CHANGED

Git LFS Details

  • SHA256: 2c11f2d0e23e86c88ff6c6ed158de56f2c46302b56f8f64d9de70af001091f4f
  • Pointer size: 131 Bytes
  • Size of remote file: 222 kB

Git LFS Details

  • SHA256: a4ecab6d6e95bcb5e60c90660f06395c552ebb97f02eb78510904e684cf9a598
  • Pointer size: 131 Bytes
  • Size of remote file: 150 kB
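
The sample filenames listed above follow a regular pattern: one side is fixed at 768 and the other steps from 384 to 768 in increments of 32. The sketch below enumerates those names. The function name and its parameters are hypothetical, and the pattern is inferred only from the filenames in this commit, not from train.py itself.

```python
def sample_filenames(fixed=768, start=384, step=32, prefix="samples/sdxs_1b"):
    """Enumerate sample image names in the order they appear in this commit.

    Hypothetical helper: the naming scheme is inferred from the file list only.
    """
    sides = range(start, fixed, step)  # 384, 416, ..., 736
    names = [f"{prefix}_{a}x{fixed}_0.jpg" for a in sides]   # 384x768 ... 736x768
    names += [f"{prefix}_{fixed}x{b}_0.jpg" for b in sides]  # 768x384 ... 768x736
    names.append(f"{prefix}_{fixed}x{fixed}_0.jpg")          # 768x768
    return names


names = sample_filenames()
print(len(names), names[0], names[-1])
```

This yields the 25 sample images that the commit marks as CHANGED.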
sdxs_1b/diffusion_pytorch_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:104b893ba36e1359f8468450653c0e00a7b2faeb1a42d3797747f0d0c06ccd0a
+oid sha256:3773b931314cc4356c69abfae833a9e948b3092b58d436a73d1cff8a36a74376
 size 4463672488
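
In the pointer diff above only the `oid` line changes: the weights file has new contents but exactly the same byte size. A minimal sketch of reading the two pointers and comparing them follows. `parse_lfs_pointer` is a hypothetical helper written for this illustration and is not part of Git LFS or of this repository.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:104b893ba36e1359f8468450653c0e00a7b2faeb1a42d3797747f0d0c06ccd0a
size 4463672488
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3773b931314cc4356c69abfae833a9e948b3092b58d436a73d1cff8a36a74376
size 4463672488
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
same_size = old["size"] == new["size"]
weights_changed = old["digest"] != new["digest"]
print(f"same size: {same_size}, weights changed: {weights_changed}")
```

An unchanged size with a different digest is what one would expect when a checkpoint of the same architecture and dtype is overwritten with newly trained weights.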
train.py CHANGED
@@ -34,6 +34,7 @@ base_learning_rate = 4e-5 #2.7e-5
 min_learning_rate = 9e-6 #2.7e-5
 num_epochs = 10
 sample_interval_share = 20
+cfg_dropout = 0.5
 max_length = 192
 use_wandb = True
 use_comet_ml = False
@@ -95,8 +96,8 @@ lora_alpha = 64
 print("init")
 
 loss_ratios = {
-    "mse": 0.6,
-    "mae": 0.4,
+    "mse": 0.8,
+    "mae": 0.2,
 }
 median_coeff_steps = 256
 
@@ -104,8 +105,9 @@ median_coeff_steps = 256
 class MedianLossNormalizer:
     def __init__(self, desired_ratios: dict, window_steps: int):
         # normalize the ratios in case the sum != 1
-        s = sum(desired_ratios.values())
-        self.ratios = {k: (v / s) for k, v in desired_ratios.items()}
+        #s = sum(desired_ratios.values())
+        #self.ratios = {k: (v / s) for k, v in desired_ratios.items()}
+        self.ratios = {k: float(v) for k, v in desired_ratios.items()}
         self.buffers = {k: deque(maxlen=window_steps) for k in self.ratios.keys()}
         self.window = window_steps
 
@@ -358,7 +360,7 @@ def collate_fn_simple(batch):
     raw_texts = [item["text"] for item in batch]
     texts = [
         "" if t.lower().startswith("zero")
-        else "" if random.random() < 0.1
+        else "" if random.random() < cfg_dropout
         else t[1:].lstrip() if t.startswith(".")
         else t.replace("The image shows ", "").replace("The image is ", "").replace("This image captures ","").strip()
         for t in raw_texts
@@ -480,7 +482,7 @@ fixed_samples = get_fixed_samples_by_resolution(dataset)
 # --- [UPDATED] Function for the negative embedding (returns 3 elements) ---
 def get_negative_embedding(neg_prompt="", batch_size=1):
     if not neg_prompt:
-        hidden_dim = 2048
+        hidden_dim = 1024
         seq_len = max_length
         empty_emb = torch.zeros((batch_size, seq_len, hidden_dim), dtype=dtype, device=device)
         empty_mask = torch.ones((batch_size, seq_len), dtype=torch.int64, device=device)
@@ -567,6 +569,8 @@ def generate_and_save_samples(fixed_samples_cpu, uncond_data, step):
     latents = scheduler.step(flow, t, latents).prev_sample
 
     current_latents = latents
+    if step==0:
+        current_latents = sample_latents
 
     latent_for_vae = current_latents.detach() / scaling_factor + shift_factor
     decoded = vae.decode(latent_for_vae.to(torch.float32)).sample
@@ -667,9 +671,9 @@ for epoch in range(start_epoch, start_epoch + num_epochs):
     # noise
     noise = torch.randn_like(latents, dtype=latents.dtype)
     # sample t from [0, 1]
-    #t = torch.rand(latents.shape[0], device=latents.device, dtype=latents.dtype)
-    u = torch.rand(latents.shape[0], device=latents.device, dtype=latents.dtype)
-    t = torch.sigmoid(torch.randn_like(u))
+    t = torch.rand(latents.shape[0], device=latents.device, dtype=latents.dtype)
+    #u = torch.rand(latents.shape[0], device=latents.device, dtype=latents.dtype)
+    #t = torch.sigmoid(torch.randn_like(u))
 
     # interpolate between x0 and noise
     noisy_latents = (1.0 - t.view(-1, 1, 1, 1)) * latents + t.view(-1, 1, 1, 1) * noise
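
The last hunk switches the timestep sampler from a sigmoid of a standard normal (a logit-normal draw) back to a plain uniform draw, and then mixes the clean latents with noise at that `t`. The sketch below mirrors both samplers and the interpolation using only the Python standard library, so it runs without PyTorch. All function names here are hypothetical stand-ins for the tensor operations in train.py, and scalars and lists stand in for tensors.

```python
import math
import random


def sample_t_uniform(rng: random.Random) -> float:
    """Uniform t in [0, 1): scalar analogue of torch.rand(...)."""
    return rng.random()


def sample_t_logit_normal(rng: random.Random) -> float:
    """sigmoid(z), z ~ N(0, 1): scalar analogue of torch.sigmoid(torch.randn_like(u))."""
    z = rng.gauss(0.0, 1.0)
    return 1.0 / (1.0 + math.exp(-z))


def interpolate(x0, noise, t):
    """Linear mix (1 - t) * x0 + t * noise, element by element."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, noise)]


def middle_fraction(samples, lo=0.25, hi=0.75):
    """Share of samples that fall in the middle band [lo, hi]."""
    return sum(lo <= s <= hi for s in samples) / len(samples)


rng = random.Random(0)
n = 10_000
uniform_ts = [sample_t_uniform(rng) for _ in range(n)]
logit_normal_ts = [sample_t_logit_normal(rng) for _ in range(n)]

print("uniform, share in [0.25, 0.75]:     ", middle_fraction(uniform_ts))
print("logit-normal, share in [0.25, 0.75]:", middle_fraction(logit_normal_ts))
print("midpoint sample:", interpolate([1.0, 2.0], [3.0, 4.0], 0.5))
```

In expectation about half of the uniform draws land in [0.25, 0.75], against roughly 73% of the logit-normal draws, so the new sampler spends more training steps near t = 0 (almost clean latents) and t = 1 (almost pure noise) than the one it replaces.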