---
license: mit
datasets:
  - FlameF0X/Lime
pipeline_tag: unconditional-image-generation
tags:
  - Lime
---

# Stable-Lime-v1.1


Stable-Lime-v1.1 is an unconditional diffusion model based on the Denoising Diffusion Probabilistic Models (DDPM) architecture. It has been trained specifically to generate images representing the "essence of Lime."

## Model Details

- Model Type: Unconditional Image Generation (Diffusion)
- Architecture: `UNet2DModel` with `DDPMScheduler`
- Framework: PyTorch & Hugging Face Diffusers
- Resolution: $128 \times 128$ pixels
- Channels: 3 (RGB)
- License: MIT (declared in the metadata above)

## Intended Use

This model is designed for:

- Generating $128 \times 128$ images of limes (or lime-like textures).
- Educational purposes regarding the implementation of DDPM loops.
- Low-resolution, "retro" aesthetic generation.
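A minimal inference sketch using the Diffusers `DDPMPipeline`. The repo id `FlameF0X/Stable-Lime-v1.1` and an environment with `diffusers` and `torch` installed are assumptions, not confirmed by this card:

```python
# Sketch: unconditional sampling with Hugging Face Diffusers.
# Assumes the model is published under the (hypothetical) repo id below.
from diffusers import DDPMPipeline

pipe = DDPMPipeline.from_pretrained("FlameF0X/Stable-Lime-v1.1")
pipe.to("cuda")  # optional; CPU works but is slow for 1000 denoising steps

# One 128x128 RGB sample; num_inference_steps matches the scheduler's
# 1000 training timesteps.
image = pipe(batch_size=1, num_inference_steps=1000).images[0]
image.save("lime.png")
```

Since the model is unconditional, there is no prompt argument; each call draws a fresh sample from noise.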

Out of Scope:

- Text-to-Image generation (this model does not accept text prompts).
- High-resolution photorealism (limited by the 128 px architecture).

## Training Data

The model was trained on the `FlameF0X/Lime` dataset (listed in the metadata above), preprocessed locally into `dataset_lime/processed`.

- Preprocessing: Images were resized to $128 \times 128$ and normalized to the range $[-1, 1]$.
- Augmentation: Random horizontal flips were applied during training to improve generalization.
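The normalization step maps 8-bit channel values into the model's $[-1, 1]$ range. A minimal per-pixel sketch of that mapping and its inverse (the full torchvision-style transform pipeline is an assumption not detailed in this card):

```python
def normalize_pixel(v: int) -> float:
    """Map an 8-bit channel value in [0, 255] to the model's [-1, 1] range."""
    return v / 127.5 - 1.0

def denormalize_pixel(x: float) -> int:
    """Inverse mapping, clamped back to a valid 8-bit channel value."""
    return max(0, min(255, round((x + 1.0) * 127.5)))
```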

## Training Procedure

### Hyperparameters

The model was trained using the following configuration ("The Lime Settings"):

| Parameter | Value | Description |
|---|---|---|
| Batch Size | 16 | Small batch size suitable for consumer GPUs. |
| Learning Rate | $1 \times 10^{-4}$ | Optimizer step size (AdamW). |
| Epochs | 70 | Total passes over the training set. |
| Timesteps | 1000 | Number of diffusion noise steps. |
| Image Size | 128 | Output resolution. |
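The 1000-timestep schedule usually refers to a linearly spaced variance schedule. A pure-Python sketch of it, assuming the standard DDPM endpoint values $10^{-4}$ and $0.02$, which this card does not state explicitly:

```python
def linear_beta_schedule(timesteps: int = 1000,
                         beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> list[float]:
    """Linearly spaced noise variances beta_1..beta_T (standard DDPM defaults)."""
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]

betas = linear_beta_schedule()
```

In Diffusers this corresponds to `DDPMScheduler(num_train_timesteps=1000)` with its default linear schedule.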

### Architecture Specification

The U-Net architecture uses a deep structure with an attention block in the low-resolution stages near the bottleneck:

- Block Output Channels: (128, 128, 256, 256, 512, 512)
- Downsampling: 4× `DownBlock2D`, 1× `AttnDownBlock2D`, 1× `DownBlock2D`
- Upsampling: mirror of the downsampling path (1× `UpBlock2D`, 1× `AttnUpBlock2D`, 4× `UpBlock2D`).
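The layout above can be written out as the constructor arguments for `UNet2DModel`. A sketch of the configuration as plain data (exact values of other kwargs such as `layers_per_block` are assumptions):

```python
# UNet2DModel configuration matching the spec above. In Diffusers this would
# be passed as UNet2DModel(sample_size=128, in_channels=3, out_channels=3,
#                          block_out_channels=block_out_channels,
#                          down_block_types=down_block_types,
#                          up_block_types=up_block_types).
block_out_channels = (128, 128, 256, 256, 512, 512)
down_block_types = (
    "DownBlock2D", "DownBlock2D", "DownBlock2D", "DownBlock2D",
    "AttnDownBlock2D",  # attention at low resolution, near the bottleneck
    "DownBlock2D",
)
# The upsampling path mirrors the downsampling path in reverse order.
up_block_types = tuple(
    t.replace("Down", "Up") for t in reversed(down_block_types)
)
```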

### Loss Function

The model optimizes the Mean Squared Error (MSE) between the actual noise added and the predicted noise:

$$L = \text{MSE}\big(\epsilon, \epsilon_\theta(x_t, t)\big)$$

Where $\epsilon$ is the Gaussian noise and $\epsilon_\theta$ is the model's prediction at timestep $t$.
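In code, the training pair $(x_t, \epsilon)$ comes from the closed-form forward process $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$. A per-pixel scalar sketch, using the illustrative linear-schedule constants from the standard DDPM setup (not stated in this card):

```python
import math
import random

def alpha_bar(t: int, timesteps: int = 1000,
              beta_start: float = 1e-4, beta_end: float = 0.02) -> float:
    """Cumulative product of (1 - beta_s) for s = 1..t (linear schedule)."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + s * (beta_end - beta_start) / (timesteps - 1)
        prod *= 1.0 - beta
    return prod

def noisy_sample(x0: float, t: int, eps: float) -> float:
    """Closed-form forward process: x_t from x_0 and Gaussian noise eps."""
    ab = alpha_bar(t)
    return math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * eps

# During training the network sees (x_t, t) and is regressed onto eps with
# an MSE loss -- exactly the L in the formula above.
eps = random.gauss(0.0, 1.0)
x_t = noisy_sample(0.5, t=500, eps=eps)
```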