DDPO Sharpness Checkpoints
LoRA checkpoints from fine-tuning Stable Diffusion v1.4 with DDPO (Denoising Diffusion Policy Optimization), using a sharpness reward (variance of the image Laplacian).
Training Details
- Base model: CompVis/stable-diffusion-v1-4
- Method: DDPO with LoRA
- Reward: Sharpness (Laplacian variance)
- Prompts: ImageNet animals
- Epochs: 100
- Compute: 4x A100-SXM4-80GB (MIT ORCD)
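
For reference, a Laplacian-variance sharpness score can be sketched as below. This is a minimal illustration, not the training code: the function name `laplacian_variance`, the 4-neighbour kernel, and the channel-average grayscale conversion are all assumptions, and the reward used for these checkpoints may differ in kernel, normalization, or scaling.

```python
import numpy as np


def laplacian_variance(image: np.ndarray) -> float:
    """Variance of the 4-neighbour discrete Laplacian over interior pixels.

    Hypothetical helper: accepts an (H, W) or (H, W, C) array.
    """
    img = np.asarray(image, dtype=np.float64)
    if img.ndim == 3:
        # Assumption: plain channel average as the grayscale conversion.
        img = img.mean(axis=2)
    # Laplacian at each interior pixel: up + down + left + right - 4 * centre.
    lap = (
        img[:-2, 1:-1]
        + img[2:, 1:-1]
        + img[1:-1, :-2]
        + img[1:-1, 2:]
        - 4.0 * img[1:-1, 1:-1]
    )
    return float(lap.var())


if __name__ == "__main__":
    flat = np.zeros((8, 8))                   # no edges at all
    checker = np.indices((8, 8)).sum(0) % 2   # maximally high-frequency pattern
    print(laplacian_variance(flat), laplacian_variance(checker))
```

A flat image scores zero, and images with more high-frequency detail score higher, which is why this quantity is a common proxy for sharpness.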
Checkpoints
A checkpoint was uploaded every 10 epochs (checkpoint_0, checkpoint_10, ..., checkpoint_90), plus the final checkpoint, checkpoint_98.
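
To iterate over the checkpoints in a script, the names can be generated rather than typed out. The sketch below assumes the intermediate names follow the every-10-epochs pattern stated above; `checkpoint_names` and its parameters are hypothetical and not part of this repository.

```python
def checkpoint_names(every: int = 10, last_regular: int = 90, final: int = 98) -> list[str]:
    """Build the checkpoint folder names: one per `every` epochs, plus the final one."""
    steps = list(range(0, last_regular + 1, every))
    if final not in steps:
        steps.append(final)  # the final checkpoint falls off the regular grid
    return [f"checkpoint_{step}" for step in steps]


if __name__ == "__main__":
    for name in checkpoint_names():
        print(name)
```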