Retrained: batch 16, lambda_cycle=lambda_id=0.125

Files changed (5)

README.md +28 -44
config.json +16 -7
full_checkpoint.pth +1 -1
generator.pth +1 -1
training_log.csv +0 -0

README.md CHANGED

@@ -12,47 +12,37 @@ library_name: pytorch
 pipeline_tag: image-to-image
 ---
-# CIMP Style Transfer Generator (ResNet-18 CIMP, crop 512)
-A metadata-conditioned style-transfer generator that translates HAADF-STEM images between acquisition settings. Given an input image $x$ and a pair of CIMP metadata embeddings $(e_{\text{id}}, e_{\text{tgt}})$, the network produces an image that preserves the content of $x$ but matches the style associated with $e_{\text{tgt}}$ (target dwell time, beam current, detector gain, offset, inner collection angle, or convergence angle).
-Conditioning runs on top of [Stemson-AI/cmmp-resnet18-512](https://huggingface.co/Stemson-AI/cmmp-resnet18-512), our best CIMP visual-metadata encoder. The generator was trained against four objectives: an LSGAN adversarial loss, LPIPS cycle-consistency, LPIPS identity, and a CIMP-space embedding-alignment term.
 ## Architecture
-- **Generator**: `StyleUNet`, 4-level U-Net with FiLM conditioning at every convolutional block.
-  - `base_filters = 32`, `embed_dim = 256` (concatenated target + identity CIMP embeddings of 128-d each).
-  - ~9.9M parameters.
-- **Discriminator**: `NoisePatchGAN`, PatchGAN-style with metadata conditioning tiled spatially.
-  - `base_filters = 64`, `meta_embed_dim = 128`.
-  - ~0.66M parameters.
 ## Training Configuration
 | Parameter | Value |
 |---|---|
 | CIMP encoder | [Stemson-AI/cmmp-resnet18-512](https://huggingface.co/Stemson-AI/cmmp-resnet18-512) (frozen) |
-| Crop size | 512 $\times$ 512 |
-| Batch size | 8 |
-| Epochs | 250 (best checkpoint by composite val loss) |
 | Optimizer | Adam, $\text{lr}_G = \text{lr}_D = 2 \cdot 10^{-4}$ |
-| Adversarial loss | LSGAN ($\mathcal{L}_{\text{LSGAN}}$) |
-| Cycle loss | LPIPS round-trip ($\mathcal{L}_{\text{cyc}}$), $\lambda_1 = 0.1$ |
-| Identity loss | LPIPS ($\mathcal{L}_{\text{id}}$), $\lambda_2 = 0.1$ |
-| Embedding alignment | MSE in CIMP visual-embedding space ($\mathcal{L}_{\text{emb}}$), $\lambda_3 = 0.5$ |
 | Hardware | 1 $\times$ H100 |
-## Best-Epoch Validation Metrics
-| Metric | Best epoch | Value |
-|---|---|---|
-| Embedding alignment | 234 | 0.00638 |
-| Cycle consistency | 40 | 0.00432 |
-| Identity stability | 118 | 0.00183 |
-| Composite validation | **176** | **0.00391** |
-The uploaded `generator.pth` / `full_checkpoint.pth` correspond to the composite-best checkpoint (epoch 176).
 ## Files
 - `generator.pth` - inference-ready state dict for the `StyleUNet` generator.
@@ -70,37 +60,31 @@ import torch.nn.functional as F
 device = "cuda"
-# 1. Load the CIMP encoder (provides the metadata embeddings)
 cimp = CMMP(
-    meta_input_dim=7,
-    embed_dim=128,
-    image_encoder="resnet18",
-    image_size=512,
-    meta_hidden_dim=256,
-    meta_num_layers=3,
 ).to(device)
-cimp_path = hf_hub_download("Stemson-AI/cmmp-resnet18-512", "model.pth")
-cimp.load_state_dict(torch.load(cimp_path, map_location=device))
 cimp.eval()
-# 2. Load the style-transfer generator
 gen = StyleUNet(embed_dim=256, in_channels=1, out_channels=1, base_filters=32, use_FiLM=True).to(device)
-gen_path = hf_hub_download("Stemson-AI/cimp-style-transfer-512", "generator.pth")
-gen.load_state_dict(torch.load(gen_path, map_location=device))
 gen.eval()
-# 3. Generate: x is a (1, 1, 512, 512) grayscale image in [0, 1]
-#    src_meta and tgt_meta are 7-d z-scored metadata vectors (1, 7)
 with torch.no_grad():
     e_id  = F.normalize(cimp.meta(src_meta.to(device)), p=2, dim=-1)
     e_tgt = F.normalize(cimp.meta(tgt_meta.to(device)), p=2, dim=-1)
-    y = gen(x.to(device), e_tgt, e_id)  # (1, 1, 512, 512) in [-1, 1] (Tanh output)
 ```
 ## Related Models
-- [Stemson-AI/cmmp-resnet18-512](https://huggingface.co/Stemson-AI/cmmp-resnet18-512) - the CIMP encoder this generator conditions on (required for inference).
-- [Stemson-AI/cmmp-resnet18-256](https://huggingface.co/Stemson-AI/cmmp-resnet18-256) - earlier ResNet-18 CIMP variant at crop 256.
 ## Citation

 pipeline_tag: image-to-image
 ---
+# CIMP Style Transfer Generator (ResNet-18 CIMP, crop 512, batch 16)
+A metadata-conditioned style-transfer generator that translates HAADF-STEM images between acquisition settings. Given an input image $x$ and a pair of CIMP metadata embeddings $(e_{\text{id}}, e_{\text{tgt}})$, the network produces an image that preserves the content of $x$ but matches the style associated with $e_{\text{tgt}}$.
+Conditioning runs on top of [Stemson-AI/cmmp-resnet18-512](https://huggingface.co/Stemson-AI/cmmp-resnet18-512). Training used four objectives: LSGAN adversarial, LPIPS cycle-consistency, LPIPS identity, and a CIMP-space embedding-alignment term.
+## What's new vs. the previous upload
+- `batch_size` increased from 8 to 16.
+- `lambda_cycle` and `lambda_id` increased from 0.1 to 0.125 to put slightly more weight on content preservation.
 ## Architecture
+- **Generator**: `StyleUNet` with FiLM conditioning at every convolutional block. `base_filters=32`, `embed_dim=256` (concatenated 128-d target + identity CIMP embeddings). ~9.9M params.
+- **Discriminator**: `NoisePatchGAN`, `base_filters=64`, `meta_embed_dim=128`. ~0.66M params.
 ## Training Configuration
 | Parameter | Value |
 |---|---|
 | CIMP encoder | [Stemson-AI/cmmp-resnet18-512](https://huggingface.co/Stemson-AI/cmmp-resnet18-512) (frozen) |
+| Crop size | $512 \times 512$ |
+| Batch size | 16 |
+| Epochs | 250 |
 | Optimizer | Adam, $\text{lr}_G = \text{lr}_D = 2 \cdot 10^{-4}$ |
+| LSGAN | $\mathcal{L}_{\text{LSGAN}}$ |
+| Cycle | LPIPS, $\lambda_1 = 0.125$ |
+| Identity | LPIPS, $\lambda_2 = 0.125$ |
+| Emb alignment | MSE in CIMP visual-embedding space, $\lambda_3 = 0.5$ |
 | Hardware | 1 $\times$ H100 |
 ## Files
 - `generator.pth` - inference-ready state dict for the `StyleUNet` generator.
 device = "cuda"
 cimp = CMMP(
+    meta_input_dim=7, embed_dim=128,
+    image_encoder="resnet18", image_size=512,
+    meta_hidden_dim=256, meta_num_layers=3,
 ).to(device)
+cimp.load_state_dict(torch.load(hf_hub_download("Stemson-AI/cmmp-resnet18-512", "model.pth"),
+                                map_location=device))
 cimp.eval()
 gen = StyleUNet(embed_dim=256, in_channels=1, out_channels=1, base_filters=32, use_FiLM=True).to(device)
+gen.load_state_dict(torch.load(hf_hub_download("Stemson-AI/cimp-style-transfer-512", "generator.pth"),
+                               map_location=device))
 gen.eval()
+# x: (1, 1, 512, 512) in [0, 1]; src_meta, tgt_meta: (1, 7) z-scored
 with torch.no_grad():
     e_id  = F.normalize(cimp.meta(src_meta.to(device)), p=2, dim=-1)
     e_tgt = F.normalize(cimp.meta(tgt_meta.to(device)), p=2, dim=-1)
+    y = gen(x.to(device), e_tgt, e_id)
 ```
 ## Related Models
+- [Stemson-AI/cmmp-resnet18-512](https://huggingface.co/Stemson-AI/cmmp-resnet18-512) - the CIMP encoder.
+- [Stemson-AI/cmmp-resnet18-256](https://huggingface.co/Stemson-AI/cmmp-resnet18-256) - earlier CIMP variant.
 ## Citation

config.json CHANGED

@@ -22,19 +22,28 @@
   },
   "training": {
     "crop_size": 512,
-    "batch_size": 8,
     "epochs": 250,
-    "lr_G": 2e-4,
-    "lr_D": 2e-4,
     "lambda_GAN": 1.0,
-    "lambda_cycle": 0.1,
     "lambda_emb": 0.5,
-    "lambda_id": 0.1,
     "id_loss_fn": "lpips",
     "cycle_loss_fn": "lpips",
     "optimizer": "Adam",
     "adversarial_loss": "LSGAN"
   },
   "meta_dim": 7,
-  "meta_names": ["pixel_size", "dwell_time", "convergence_angle", "beam_current", "gain", "offset", "inner_coll_angle"]
-}

   },
   "training": {
     "crop_size": 512,
+    "batch_size": 16,
     "epochs": 250,
+    "lr_G": 0.0002,
+    "lr_D": 0.0002,
     "lambda_GAN": 1.0,
+    "lambda_cycle": 0.125,
     "lambda_emb": 0.5,
+    "lambda_id": 0.125,
     "id_loss_fn": "lpips",
     "cycle_loss_fn": "lpips",
     "optimizer": "Adam",
     "adversarial_loss": "LSGAN"
   },
   "meta_dim": 7,
+  "meta_names": [
+    "pixel_size",
+    "dwell_time",
+    "convergence_angle",
+    "beam_current",
+    "gain",
+    "offset",
+    "inner_coll_angle"
+  ],
+  "best_epoch_composite_val": 216
+}
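
For readers comparing the old and new weights, the following is a minimal sketch of how the four generator loss terms might be combined. It assumes a plain weighted sum, which the README and config imply but do not state. The function name `generator_objective` and its arguments are hypothetical, not taken from the repository. The weights are copied from the updated `config.json`.

```python
import json

# Training weights copied from the updated config.json in this commit.
CONFIG_TRAINING = json.loads("""
{
  "lambda_GAN": 1.0,
  "lambda_cycle": 0.125,
  "lambda_emb": 0.5,
  "lambda_id": 0.125
}
""")


def generator_objective(l_gan, l_cycle, l_id, l_emb, cfg=CONFIG_TRAINING):
    """Return a weighted sum of the four generator loss terms.

    Assumption: the terms are combined linearly. The actual training
    script may combine or schedule them differently.
    """
    return (
        cfg["lambda_GAN"] * l_gan
        + cfg["lambda_cycle"] * l_cycle
        + cfg["lambda_id"] * l_id
        + cfg["lambda_emb"] * l_emb
    )
```

Under this assumption, raising `lambda_cycle` and `lambda_id` from 0.1 to 0.125 increases the share of the two LPIPS terms relative to the adversarial and embedding terms, whose weights are unchanged.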

full_checkpoint.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fa0cda61783480aa28d683fb7d388d6950bdace9cf8f53eea63f6e50c1a0c410
 size 42249339

 version https://git-lfs.github.com/spec/v1
+oid sha256:e687972a154b1f8a765e5ebd7dda8dde706cac83e13305295bcd0728004c8426
 size 42249339

generator.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9632cd2d5a858200654e95cefd29ed3870dbd7bc360259cbcd59bfb1eb596555
 size 39597771

 version https://git-lfs.github.com/spec/v1
+oid sha256:633756be32f683595ce9f61a2c91ed04a90de4a052b0005ed94083a9660a34a5
 size 39597771
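
Because the file size is unchanged between the two uploads, size alone cannot tell the retrained weights from the previous ones. The sketch below shows one way to check a downloaded `generator.pth` against the new pointer, relying on the fact that a Git LFS `oid sha256:` is the SHA-256 digest of the file contents. The helper names `sha256_of_file` and `matches_lfs_pointer` are hypothetical.

```python
import hashlib
import os

# Values copied from the new Git LFS pointer for generator.pth in this commit.
EXPECTED_OID = "633756be32f683595ce9f61a2c91ed04a90de4a052b0005ed94083a9660a34a5"
EXPECTED_SIZE = 39597771


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_pointer(path, expected_oid=EXPECTED_OID, expected_size=EXPECTED_SIZE):
    """Return True if the file's size and SHA-256 both match the pointer."""
    return (
        os.path.getsize(path) == expected_size
        and sha256_of_file(path) == expected_oid
    )
```

A call such as `matches_lfs_pointer("generator.pth")` returning `False` would suggest a stale cache or an incomplete download. The same check applies to `full_checkpoint.pth` with its own oid and size.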

training_log.csv CHANGED

The diff for this file is too large to render.