FrankCCCCC committed · Commit 9c2cc1a · verified · 1 Parent(s): 9fd0334

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,130 @@
+ ---
+ license: apache-2.0
+ tags:
+ - diffusers
+ - image-generation
+ - unconditional-image-generation
+ - diffusion-models
+ - ddpm
+ - ema
+ - cifar10
+ datasets:
+ - cifar10
+ pipeline_tag: image-generation
+ ---
+
+ # DDPM EMA CIFAR-10
+
+ ## Model Description
+
+ This model is an EMA (Exponential Moving Average) version of DDPM (Denoising Diffusion Probabilistic Models) trained on the CIFAR-10 dataset. It is based on the original [DDPM](https://github.com/hojonathanho/diffusion) model but uses exponential moving averages of the model parameters for improved stability and sample quality.
+
+ **Model Type**: Unconditional Image Generation
+ **Architecture**: DDPM
+ **Training Dataset**: CIFAR-10
+ **Image Resolution**: 32×32 pixels
+ **License**: Apache-2.0
+
+ ## Model Details
+
+ This model implements the DDPM approach described in the paper ["Denoising Diffusion Probabilistic Models"](https://arxiv.org/abs/2006.11239) by Jonathan Ho, Ajay Jain, and Pieter Abbeel. The EMA version maintains an exponentially weighted average of the model parameters, which often gives better sample quality than the raw training weights.
+
+ ### Key Features:
+ - **EMA Weights**: Uses exponential moving averages of the model parameters for improved stability
+ - **High Quality Generation**: Produces high-quality 32×32 pixel images
+ - **CIFAR-10 Classes**: Sampling is unconditional, so images can resemble any of the 10 CIFAR-10 categories (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck), but the class cannot be selected
+ - **Diffusers Compatible**: Compatible with the Hugging Face Diffusers library (loads as a `DDPMPipeline`)
+
+ ## Usage
+
+ ### Basic Usage
+
+ ```python
+ from diffusers import DDPMPipeline
+
+ # Load the model
+ model_id = "FrankCCCCC/ddpm-ema-cifar10"  # Replace with actual repo ID
+ pipeline = DDPMPipeline.from_pretrained(model_id)
+
+ # Generate an image
+ image = pipeline().images[0]
+ image.save("generated_cifar10.png")
+ ```
+
+ ### Generate Multiple Images
+
+ ```python
+ from diffusers import DDPMPipeline
+
+ pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/ddpm-ema-cifar10")
+
+ # Generate a batch of images
+ images = pipeline(batch_size=4).images
+
+ # Save images
+ for i, image in enumerate(images):
+     image.save(f"generated_cifar10_{i}.png")
+ ```
+
+ ### Advanced Usage with Different Schedulers
+
+ ```python
+ from diffusers import DDPMPipeline, DDIMScheduler
+
+ pipeline = DDPMPipeline.from_pretrained("FrankCCCCC/ddpm-ema-cifar10")
+
+ # Use DDIM scheduler for faster inference (built from the existing scheduler config)
+ ddim_scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
+ pipeline.scheduler = ddim_scheduler
+
+ # Generate with fewer inference steps
+ image = pipeline(num_inference_steps=50).images[0]
+ image.save("generated_ddim.png")
+ ```
+
+ ## Training Details
+
+ - **Dataset**: CIFAR-10 (50,000 training images, 32×32 RGB)
+ - **Training Procedure**: EMA version of standard DDPM training
+ - **Model Architecture**: U-Net
+ - **Parameter Updates**: Exponential moving averages applied to model weights
+ - **Training Objective**: Variational bound on negative log-likelihood, optimized through noise (epsilon) prediction as in the DDPM paper
+
+ ## Model Performance
+
+ The EMA version typically provides:
+ - **Improved Stability**: Averaged weights vary less from step to step than the raw training weights
+ - **Better Sample Quality**: Often achieves better FID scores compared to non-EMA versions
+ - **Reduced Mode Collapse**: More diverse sample generation
+
+ Expected performance metrics (approximate):
+ - **FID Score**:
+   - 4.5216 (50K `.png` samples generated with DDIM, 100 sampling steps)
+   - 6.5398 (10K `.png` samples generated with DDIM, 100 sampling steps)
+
+ ## Inference Examples
+
+ The model generates diverse samples across all CIFAR-10 categories:
+ - Airplanes, automobiles, birds, cats, deer
+ - Dogs, frogs, horses, ships, trucks
+
+ All generated images are 32×32 pixels in RGB format.
+
+ ## Citation
+
+ If you use this model, please cite the original DDPM paper:
+
+ ```bibtex
+ @article{ho2020denoising,
+   title={Denoising Diffusion Probabilistic Models},
+   author={Ho, Jonathan and Jain, Ajay and Abbeel, Pieter},
+   journal={Advances in Neural Information Processing Systems},
+   volume={33},
+   pages={6840--6851},
+   year={2020}
+ }
+ ```
+
+ ## License
+
+ This model is released under the Apache 2.0 License.
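
Note on README.md: the README describes the EMA weights only in words. The update it refers to can be sketched as below. This is a minimal illustration in plain Python, not the training code behind this checkpoint; the function name `ema_update` is hypothetical, and the default `decay=0.9999` is an assumption (the decay actually used is not stated in the commit).

```python
def ema_update(ema_params, params, decay=0.9999):
    """One EMA step per parameter: ema <- decay * ema + (1 - decay) * param.

    `decay` is an assumed default; the commit does not state the real value.
    """
    if len(ema_params) != len(params):
        raise ValueError("parameter lists must have the same length")
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


# Toy run: the raw weights sit at [1.0, 2.0]; the average starts at zero
# and moves toward them. decay=0.5 is used only to make the effect visible.
ema = [0.0, 0.0]
for _ in range(3):
    ema = ema_update(ema, [1.0, 2.0], decay=0.5)
```

With a decay close to 1, the averaged weights change slowly, which is why they are usually kept alongside the raw weights during training and exported separately.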
model_index.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "_class_name": "DDPMPipeline",
+   "_diffusers_version": "0.34.0",
+   "scheduler": [
+     "diffusers",
+     "DDPMScheduler"
+   ],
+   "unet": [
+     "diffusers",
+     "UNet2DModel"
+   ]
+ }
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "_class_name": "DDPMScheduler",
+   "_diffusers_version": "0.34.0",
+   "beta_end": 0.02,
+   "beta_schedule": "linear",
+   "beta_start": 0.0001,
+   "clip_sample": true,
+   "clip_sample_range": 1.0,
+   "dynamic_thresholding_ratio": 0.995,
+   "num_train_timesteps": 1000,
+   "prediction_type": "epsilon",
+   "rescale_betas_zero_snr": false,
+   "sample_max_value": 1.0,
+   "steps_offset": 0,
+   "thresholding": false,
+   "timestep_spacing": "leading",
+   "trained_betas": null,
+   "variance_type": "fixed_large"
+ }
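
Note on scheduler/scheduler_config.json: the values above define a linear noise schedule over 1000 steps. The sketch below reproduces that schedule in plain Python to show what the numbers mean. The helper names are hypothetical, and `leading_timesteps` reflects my understanding of how "leading" spacing picks a subset of steps; it is an assumption, not a copy of the library code.

```python
BETA_START, BETA_END, NUM_TRAIN_TIMESTEPS = 0.0001, 0.02, 1000
STEPS_OFFSET = 0


def linear_betas(start, end, n):
    """Evenly spaced betas from `start` to `end`, inclusive."""
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def leading_timesteps(num_inference_steps, num_train=NUM_TRAIN_TIMESTEPS,
                      offset=STEPS_OFFSET):
    """Assumed 'leading' spacing: multiples of the step ratio, highest first."""
    ratio = num_train // num_inference_steps
    return [i * ratio + offset for i in range(num_inference_steps)][::-1]


betas = linear_betas(BETA_START, BETA_END, NUM_TRAIN_TIMESTEPS)
alpha_bar = cumulative_alphas(betas)
timesteps_100 = leading_timesteps(100)
```

`alpha_bar` falls from just under 1 toward 0, so the final training timestep is close to pure noise. Under the assumed spacing, 100 inference steps visit every tenth training timestep.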
unet/config.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "_class_name": "UNet2DModel",
+   "_diffusers_version": "0.34.0",
+   "act_fn": "silu",
+   "add_attention": true,
+   "attention_head_dim": null,
+   "attn_norm_num_groups": null,
+   "block_out_channels": [
+     128,
+     256,
+     256,
+     256
+   ],
+   "center_input_sample": false,
+   "class_embed_type": null,
+   "down_block_types": [
+     "DownBlock2D",
+     "AttnDownBlock2D",
+     "DownBlock2D",
+     "DownBlock2D"
+   ],
+   "downsample_padding": 0,
+   "downsample_type": "conv",
+   "dropout": 0.0,
+   "flip_sin_to_cos": false,
+   "freq_shift": 1,
+   "in_channels": 3,
+   "layers_per_block": 2,
+   "mid_block_scale_factor": 1,
+   "mid_block_type": "UNetMidBlock2D",
+   "norm_eps": 1e-06,
+   "norm_num_groups": 32,
+   "num_class_embeds": null,
+   "num_train_timesteps": null,
+   "out_channels": 3,
+   "resnet_time_scale_shift": "default",
+   "sample_size": 32,
+   "time_embedding_dim": null,
+   "time_embedding_type": "positional",
+   "up_block_types": [
+     "UpBlock2D",
+     "UpBlock2D",
+     "AttnUpBlock2D",
+     "UpBlock2D"
+   ],
+   "upsample_type": "conv"
+ }
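
Note on unet/config.json: to read the encoder layout out of this config, the sketch below walks the down blocks and records the resolution and channel width each one works at. It assumes that every down block except the last halves the spatial resolution, which is my understanding of this block layout rather than something the config states; `encoder_stages` is a hypothetical helper.

```python
SAMPLE_SIZE = 32
BLOCK_OUT_CHANNELS = [128, 256, 256, 256]
DOWN_BLOCK_TYPES = ["DownBlock2D", "AttnDownBlock2D", "DownBlock2D", "DownBlock2D"]


def encoder_stages(sample_size, channels, block_types):
    """List each down block with the resolution it operates at.

    Assumption: all blocks except the final one downsample by a factor of 2.
    """
    stages, resolution = [], sample_size
    last = len(channels) - 1
    for i, (width, kind) in enumerate(zip(channels, block_types)):
        stages.append({
            "block": kind,
            "channels": width,
            "resolution": resolution,
            "attention": kind.startswith("Attn"),
        })
        if i != last:
            resolution //= 2
    return stages


stages = encoder_stages(SAMPLE_SIZE, BLOCK_OUT_CHANNELS, DOWN_BLOCK_TYPES)
for stage in stages:
    print(stage)
```

Under that assumption the only attention block in the encoder runs on 16×16 feature maps, and the deepest block works at 4×4.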
unet/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2fd1376952ca4403185abb572190bdc54797444b41d98dd26ee0c1e6fc970c55
+ size 143020060
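
Note on unet/diffusion_pytorch_model.safetensors: the three lines above are a Git LFS pointer, not the weights themselves. As a rough sanity check, the sketch below parses the pointer and turns the byte size into an approximate parameter count. It assumes the weights are stored as 32-bit floats and ignores the safetensors header, so the result is an estimate only; `parse_lfs_pointer` is a hypothetical helper.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:2fd1376952ca4403185abb572190bdc54797444b41d98dd26ee0c1e6fc970c55
size 143020060"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    return dict(line.split(" ", 1) for line in text.strip().splitlines())


fields = parse_lfs_pointer(POINTER)
size_bytes = int(fields["size"])

# Assumption: float32 weights (4 bytes each), header size ignored.
BYTES_PER_PARAM = 4
approx_params = size_bytes // BYTES_PER_PARAM
print(f"{size_bytes / 1e6:.1f} MB, roughly {approx_params / 1e6:.1f}M parameters")
```

If the checkpoint were stored in half precision instead, the same file size would correspond to about twice as many parameters.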