	---
	license: mit
	library_name: pytorch
	tags:
	- diffusion
	- ddpm
	- ddim
	- cosmology
	- astrophysics
	- camels
	- emulator
	- conditional-generation
	pipeline_tag: unconditional-image-generation
	---

	# DDPM HI Emulator — 2 Parameter (CAMELS LH)

	A conditional Denoising Diffusion Probabilistic Model (DDPM) that emulates
	neutral-hydrogen (HI) 2D maps from the CAMELS Latin-Hypercube (LH)
	simulation suite, conditioned on two cosmological parameters
	(e.g. Ωm, σ8). Sampling supports both full DDPM and accelerated DDIM.

	This checkpoint is epoch 200 of the training run carried out under
	`DDPM_HI_Emulation_improved/outputs_conditional_2label_20260408_125646/`.

	## Files in this repo

	Top level

	| File | Purpose |
	|------|---------|
	| `model.pt` | PyTorch checkpoint; its `model_state_dict` entry holds the `ConditionalDiffusionModel` weights |
	| `args.json` / `args.txt` | Training hyper-parameters and U-Net configuration |
	| `config.json` | Architecture summary (for Hub discoverability) |
	| `inference_example.py` | Runnable example: downloads weights and generates a sample |

	`src/` — per-model Python

	| File | Purpose |
	|------|---------|
	| `train_conditional.py` | Training entry point (`label_dim=2`) |
	| `evaluate_conditional.py` | Held-out evaluation: samples + metrics |
	| `ddim_investigation_2param.py` | DDIM-vs-DDPM sampler comparison study |
	| `unet_conditional.py` | `ConditionalUNet` module |
	| `diffusion_conditional.py` | `GaussianDiffusion` (DDPM + DDIM) and the wrapping `ConditionalDiffusionModel` |
	| `dataset_conditional.py` | CAMELS LH dataset loader + label normalisation |

	`scripts/shell/` — SLURM launchers

	| File | Purpose |
	|------|---------|
	| `train_conditional.sh` | Submit a training job (`label_dim=2`) |
	| `evaluate_conditional.sh` | Submit evaluation against the held-out test split |
	| `run_ddim_investigation_2param.sh` | Launch the DDIM sampler study |

	`cross_model/` — posterior + comparison scripts that use BOTH models

	| File | Purpose |
	|------|---------|
	| `compare_posterior_inference.py` (+ `run_compare_posterior.sh`) | End-to-end posterior comparison between 2-param and 6-param emulators |
	| `ddpm_posterior_corrected.py` (+ `scripts/run_ddpm_posterior_corrected.sh`) | Corrected DDPM posterior inference |
	| `poster.py` / `check_poster_env.py` (+ `scripts/run_poster.sh`) | Posterior orchestration and environment check |
	| `submit_vlb_1000grid.py` / `run_vlb_inference_*.sh` | Variational-lower-bound grid inference (200 / 1000 grid) |
	| `scripts/compare_ddpm_models.py` (+ `run_ddpm_comparison.sh`) | DDPM-2 vs DDPM-6 comparison figures |
	| `scripts/ddpm_posterior_six_anchors.py` (+ `run_ddpm_posterior_six_anchors.sh`) | Six-anchor posterior visualisation |
	| `scripts/ddpm_figure6_integration.py`, `figure6_2409_style.py`, `run_ddpm_figure6_suite.py` (+ `run_ddpm_figure6.sh`) | Figure 6 generation pipeline |
	| `scripts/ddpm_triangle_integration.py`, `triangle_plot_posterior.py` (+ `run_triangle_ddpm_both.sh`) | Triangle-plot posterior figures |
	| `scripts/sigma_contour_utils.py` | Confidence-contour helper used by the figure scripts |
	| `scripts/compare_ddpm_training_curves.py` | Parses SLURM logs for combined train/val loss plots |
	| `cross_model/README.md` | How to point these scripts at locally-downloaded weights/data |

	These cross-model scripts default to the original cluster paths (e.g.
	`<CAMELS_LH_DATA_DIR>/params_2`). After downloading this repo, pass
	`--bundle-2param`, `--bundle-6param`, `--data-2param`, and `--data-6param`
	to override those defaults.

	## Architecture

	Conditional U-Net + Gaussian diffusion process. Hyper-parameters (taken from
	`args.json`):

	| Field | Value |
	|-------|-------|
	| `label_dim` | 2 |
	| `base_channels` | 64 |
	| `channel_multipliers` | [1, 2, 4, 8] |
	| `attention_levels` | [2, 3] |
	| `dropout` | 0.1 |
	| `timesteps` | 1500 (linear β schedule: 1e-4 → 0.02) |
	| EMA decay | 0.9999 |
	| Sampler | DDIM, 50 steps (DDPM also supported) |
	| Image size | 256 × 256, single channel |
	| Image range | [-1, 1] (training data is rescaled by `x * 2 - 1`) |
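
To make the schedule numbers concrete, here is a minimal NumPy sketch of a linear β schedule with the values from the table, plus one common way to pick 50 DDIM steps out of 1500. The function names and the evenly strided spacing are assumptions for illustration; `GaussianDiffusion` in `src/diffusion_conditional.py` may build its schedule and DDIM subsequence differently.

```python
import numpy as np


def linear_beta_schedule(timesteps=1500, beta_start=1e-4, beta_end=0.02):
    """Return `timesteps` noise variances spaced linearly between the two ends."""
    return np.linspace(beta_start, beta_end, timesteps, dtype=np.float64)


def ddim_timesteps(timesteps=1500, ddim_steps=50):
    """Evenly strided subset of the training timesteps, in descending order.

    Assumption: uniform striding; the repo's sampler may use another spacing.
    """
    stride = timesteps // ddim_steps
    return np.arange(0, timesteps, stride)[::-1]


betas = linear_beta_schedule()
# Cumulative product of (1 - beta_t): the fraction of signal variance
# that survives after t forward-diffusion steps.
alphas_cumprod = np.cumprod(1.0 - betas)
steps = ddim_timesteps()

print("signal fraction at the final step:", alphas_cumprod[-1])
print("first and last DDIM timesteps:", steps[0], steps[-1])
```

With these settings the surviving signal fraction at the last step is very close to zero, so sampling can start from essentially pure Gaussian noise.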

	Labels are z-scored using the training-split mean and standard deviation.
	`inference_example.py` shows how to recover this normalisation from the
	CAMELS LH `params_2` dataset; alternatively, pass already-normalised
	conditioning values directly.
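
The z-scoring described above can be sketched as follows. This is an illustration rather than the code in `src/dataset_conditional.py`: the helper names are hypothetical, the label values are synthetic stand-ins for `train_labels_LH.npy`, and the population standard deviation (`ddof=0`) is an assumption.

```python
import numpy as np


def fit_label_stats(train_labels):
    """Per-parameter mean and standard deviation of the training labels."""
    return train_labels.mean(axis=0), train_labels.std(axis=0)


def normalise_labels(labels, mean, std):
    """Z-score labels with statistics fitted on the training split only."""
    return (labels - mean) / std


# Synthetic stand-in for the training labels (two parameters per map);
# the sampling ranges below are illustrative only.
rng = np.random.default_rng(0)
train_labels = rng.uniform([0.1, 0.6], [0.5, 1.0], size=(1000, 2))

mean, std = fit_label_stats(train_labels)
# Conditioning vector for one new parameter pair, ready to pass to the model.
z = normalise_labels(np.array([[0.3, 0.8]]), mean, std)
print(z)
```

The same `mean` and `std` should be reused for validation, test, and inference inputs, so that the model sees conditioning values on the scale it was trained with.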

	## Quick start

	```python
	from huggingface_hub import hf_hub_download
	import sys, torch, json
	from pathlib import Path

	# 1) Download all needed files
	repo = "collins909/DDPM-2param"
	ckpt_path = hf_hub_download(repo, "model.pt")
	args_path = hf_hub_download(repo, "args.json")
	# Pull the bundled source files so we can import the model classes.
	for name in ("unet_conditional.py", "diffusion_conditional.py", "__init__.py"):
	    hf_hub_download(repo, f"src/{name}")
	sys.path.insert(0, str(Path(ckpt_path).parent / "src"))

	from unet_conditional import ConditionalUNet
	from diffusion_conditional import GaussianDiffusion, ConditionalDiffusionModel

	# 2) Rebuild the model from args.json
	args = json.loads(Path(args_path).read_text())
	unet = ConditionalUNet(
	    in_channels=1, out_channels=1,
	    label_dim=args["label_dim"],
	    base_channels=args["base_channels"],
	    channel_multipliers=tuple(args["channel_multipliers"]),
	    attention_levels=tuple(args["attention_levels"]),
	    dropout=args["dropout"],
	)
	diffusion = GaussianDiffusion(
	    timesteps=args["timesteps"],
	    beta_start=args["beta_start"],
	    beta_end=args["beta_end"],
	    schedule_type=args["schedule_type"],
	)
	model = ConditionalDiffusionModel(unet, diffusion)

	# 3) Load the checkpoint and sample
	ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
	model.load_state_dict(ckpt["model_state_dict"])
	model.eval()

	# Conditioning vector must be z-scored using training-split label statistics.
	labels = torch.tensor([[0.0, 0.0]])  # placeholder; see inference_example.py
	sample = model.sample(labels, channels=1, height=256, width=256,
	                      device="cpu", use_ddim=True, ddim_steps=50)
	# sample is in [-1, 1]; rescale to physical HI units as needed.
	```

	For an end-to-end runnable example (including label normalisation, GPU usage,
	and image saving), see `inference_example.py` in this repo.
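
As a small follow-up to the quick start, the sketch below inverts the `x * 2 - 1` rescaling listed in the architecture table, mapping a sample from [-1, 1] back to [0, 1]. The helper name `to_unit_range` is hypothetical, and clipping slight overshoots is a design choice made here, not something the repo is known to do. Any earlier preprocessing that produced the [0, 1] training images is not described in this card and would need to be inverted separately to reach physical HI units.

```python
import numpy as np


def to_unit_range(sample):
    """Invert x * 2 - 1: map values from [-1, 1] back to [0, 1].

    Values slightly outside [-1, 1] (possible for sampler outputs)
    are clipped to the valid range.
    """
    return np.clip((np.asarray(sample, dtype=np.float64) + 1.0) / 2.0, 0.0, 1.0)


# Example on a tiny array standing in for a generated map.
fake_sample = np.array([[-1.0, 0.0], [0.5, 1.0]])
print(to_unit_range(fake_sample))
```

For a PyTorch tensor returned by the sampler, convert it first, for example with `sample.detach().cpu().numpy()`.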

	## Training data

	Trained on CAMELS LH HI maps with 2-label conditioning. The exact data
	layout used by `src/dataset_conditional.py` is:

	```
	<data_dir>/
	    train_LH_2.npy, val_LH_2.npy, test_LH_2.npy
	    train_labels_LH.npy, val_labels_LH.npy, test_labels_LH.npy
	```

	Images are rescaled to `[-1, 1]`; labels are z-scored using train-split
	statistics. Point your training/eval scripts at the local directory that contains those
	files (e.g. via `--data_dir <CAMELS_LH_DATA_DIR>/params_2`).
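
Before launching a job, it can help to confirm that a local directory matches this layout. The sketch below is a hypothetical helper, not part of `src/dataset_conditional.py`: it loads one split by the file names listed above and checks that images and labels have the same number of entries. The demo uses tiny synthetic arrays in a temporary directory; the array shapes there are placeholders, not the real data shapes.

```python
import tempfile
from pathlib import Path

import numpy as np

SPLITS = ("train", "val", "test")


def load_split(data_dir, split):
    """Load the image and label arrays for one split and check they align."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    data_dir = Path(data_dir)
    images = np.load(data_dir / f"{split}_LH_2.npy")
    labels = np.load(data_dir / f"{split}_labels_LH.npy")
    if len(images) != len(labels):
        raise ValueError(
            f"{split}: {len(images)} images but {len(labels)} label rows"
        )
    return images, labels


# Demo: build a throwaway directory with the expected file names.
with tempfile.TemporaryDirectory() as tmp:
    for split in SPLITS:
        np.save(Path(tmp) / f"{split}_LH_2.npy",
                np.zeros((4, 8, 8), dtype=np.float32))
        np.save(Path(tmp) / f"{split}_labels_LH.npy",
                np.zeros((4, 2), dtype=np.float32))
    images, labels = load_split(tmp, "train")

print(images.shape, labels.shape)
```

Pointing `load_split` at the real `params_2` directory gives a quick sanity check before passing the same path through `--data_dir`.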

	## Intended use & limitations

	- Intended for research on diffusion emulators for cosmological fields.
	- The 2-label setup is a simplified subset of the full CAMELS LH parameter
	space; see the companion 6-parameter model
	(`collins909/DDPM-6param`) for the full conditioning.
	- Outputs are 256 × 256 single-channel maps in the model's normalised range.
	Apply the inverse of any data-pipeline preprocessing before physical
	interpretation.

	## Citation

	If you use this checkpoint, please cite the CAMELS project and the upstream
	DDPM HI emulation work. (Citation block to be filled in once the
	accompanying paper is published.)