---
license: mit
datasets:
- FlameF0X/Lime
pipeline_tag: unconditional-image-generation
tags:
- Lime
---
# Stable-Lime-v1.1
Stable-Lime-v1.1 is an unconditional diffusion model based on the Denoising Diffusion Probabilistic Models (DDPM) architecture. It has been trained specifically to generate images representing the "essence of Lime."
## Model Details
- Model Type: Unconditional Image Generation (Diffusion)
- Architecture: UNet2DModel with DDPMScheduler
- Framework: PyTorch & Hugging Face Diffusers
- Resolution: $128 \times 128$ pixels
- Channels: 3 (RGB)
- License: MIT
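
As a quick start, the snippet below sketches how a DDPM checkpoint like this is typically loaded and sampled with the Diffusers `DDPMPipeline`. The repository id `FlameF0X/Stable-Lime-v1.1` is an assumption based on the model name and dataset namespace; adjust it to the actual hosting path.

```python
from diffusers import DDPMPipeline

# Repository id is assumed; change it to wherever the checkpoint lives.
pipeline = DDPMPipeline.from_pretrained("FlameF0X/Stable-Lime-v1.1")

# Unconditional sampling: no prompt, just the reverse diffusion loop.
image = pipeline(batch_size=1, num_inference_steps=1000).images[0]
image.save("lime.png")
```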
## Intended Use
This model is designed for:
- Generating $128 \times 128$ images of limes (or lime-like textures).
- Educational purposes, illustrating how DDPM training and sampling loops are implemented.
- Low-resolution, "retro" aesthetic generation.
**Out of Scope:**
- Text-to-Image generation (this model does not accept text prompts).
- High-resolution photorealism (limited by the 128px architecture).
## Training Data
The model was trained on the `FlameF0X/Lime` dataset (listed in the metadata above), preprocessed locally under `dataset_lime/processed`.
- Preprocessing: Images were resized to $128 \times 128$ and normalized to the range $[-1, 1]$.
- Augmentation: Random horizontal flips were applied during training to improve generalization.
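
A minimal sketch of that preprocessing with `torchvision.transforms`; the exact transform stack used in training is an assumption, but it matches the resize, flip, and $[-1, 1]$ normalization described above.

```python
from torchvision import transforms

# Resize, augment, and map pixel values from [0, 1] to [-1, 1].
preprocess = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),                                    # [0, 1]
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),   # [-1, 1]
])
```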
## Training Procedure
### Hyperparameters
The model was trained using the following configuration ("The Lime Settings"):
| Parameter | Value | Description |
|---|---|---|
| Batch Size | 16 | Small batch size suitable for consumer GPUs. |
| Learning Rate | $1 \times 10^{-4}$ | Optimizer step size (AdamW). |
| Epochs | 70 | Number of full passes over the training set. |
| Timesteps | 1000 | Number of diffusion noise steps. |
| Image Size | 128 | Output resolution. |
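
Expressed as code, "The Lime Settings" would look roughly like the following; the `DDPMScheduler` usage is taken from the model details above, and the AdamW call is deferred until the U-Net from the next section exists.

```python
import torch
from diffusers import DDPMScheduler

# The 1000-step DDPM noise schedule from the table above.
noise_scheduler = DDPMScheduler(num_train_timesteps=1000)

batch_size = 16
learning_rate = 1e-4   # AdamW step size
num_epochs = 70
image_size = 128

# Once the U-Net from the next section is built:
# optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
```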
### Architecture Specification
The U-Net uses a deep structure with attention in the low-resolution layers near the bottleneck:
- Block Output Channels: `(128, 128, 256, 256, 512, 512)`
- Downsampling: 4x `DownBlock2D`, 1x `AttnDownBlock2D`, 1x `DownBlock2D`
- Upsampling: mirror of the downsampling blocks.
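
A plausible Diffusers instantiation of this layout is sketched below. The `layers_per_block` value and the exact position of the attention block are assumptions based on the common 128-px DDPM recipe, not confirmed details.

```python
from diffusers import UNet2DModel

model = UNet2DModel(
    sample_size=128,          # 128x128 RGB output
    in_channels=3,
    out_channels=3,
    layers_per_block=2,       # assumed; not stated in the spec
    block_out_channels=(128, 128, 256, 256, 512, 512),
    down_block_types=(
        "DownBlock2D", "DownBlock2D", "DownBlock2D",
        "DownBlock2D", "AttnDownBlock2D", "DownBlock2D",
    ),
    up_block_types=(
        "UpBlock2D", "AttnUpBlock2D", "UpBlock2D",
        "UpBlock2D", "UpBlock2D", "UpBlock2D",
    ),
)
```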
### Loss Function
The model optimizes the Mean Squared Error (MSE) between the actual noise added and the predicted noise:

$$
\mathcal{L}_{\text{simple}} = \mathbb{E}_{t,\, x_0,\, \epsilon}\left[ \left\lVert \epsilon - \epsilon_\theta(x_t, t) \right\rVert^2 \right]
$$

where $\epsilon$ is the Gaussian noise and $\epsilon_\theta(x_t, t)$ is the model's prediction at timestep $t$.
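
In a training loop this objective reduces to a few lines. The sketch below continues from the snippets above (`model`, `noise_scheduler`, `batch_size`, `image_size`); the random `clean_images` tensor stands in for a real batch from the dataloader.

```python
import torch
import torch.nn.functional as F

# Stand-in for a preprocessed batch: RGB images normalized to [-1, 1].
clean_images = torch.randn(batch_size, 3, image_size, image_size)

# Sample noise and random timesteps, corrupt the clean batch,
# then regress the U-Net's output onto the injected noise.
noise = torch.randn_like(clean_images)
timesteps = torch.randint(
    0, noise_scheduler.config.num_train_timesteps,
    (clean_images.shape[0],), device=clean_images.device,
)
noisy_images = noise_scheduler.add_noise(clean_images, noise, timesteps)

noise_pred = model(noisy_images, timesteps).sample
loss = F.mse_loss(noise_pred, noise)
loss.backward()
```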
