Model Card for kilbey1/stoner-psych-lora
A fine-tuned Stable Diffusion v1.5 UNet specialized for generating psychedelic, stoner-rock-style album cover art.
GitHub Repository
View the associated public GitHub scripts here: https://github.com/evemcgivern/KosmischeCovers
Model Details
Model Description
This model is a direct fine-tuning of the cross-attention layers in the UNet of Stable Diffusion v1.5, optimized on a custom dataset of album covers to produce trippy, cosmic artwork reminiscent of stoner rock aesthetics. During training, only the cross-attention (attn2) parameters were updated, leaving the bulk of the UNet frozen for efficiency.
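As an illustration of this selective fine-tuning, the parameter selection can be sketched as a name filter over the UNet's parameters. The parameter names below are illustrative only; in a real training script they would come from `unet.named_parameters()`:

```python
# Sketch: pick out only cross-attention ("attn2") parameters for training,
# freezing everything else, as described above. The example names are
# hypothetical stand-ins for real UNet parameter names.
def trainable_param_names(all_names):
    """Return only the names belonging to cross-attention (attn2) blocks."""
    return [n for n in all_names if ".attn2." in n]

names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight",
    "mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight",
]
print(trainable_param_names(names))  # only the two attn2 entries
```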
- Developed by: kilbey1
- Shared by: kilbey1
- Model type: Vision diffusion (Stable Diffusion)
- License: [More information needed]
- Fine-tuned from: `runwayml/stable-diffusion-v1-5`
Model Sources
- Repository: https://huggingface.co/kilbey1/stoner-psych-lora
- Base model card: runwayml/stable-diffusion-v1-5
Uses
Direct Use
You can use this model out of the box to generate stylistic, psychedelic album covers:
```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "kilbey1/stoner-psych-lora",
    safety_checker=None,
)
pipe = pipe.to("cuda")  # or "mps" / "cpu"

image = pipe(
    "psychedelic stoner rock album cover, cosmic artwork",
    num_inference_steps=30,
).images[0]
image.save("cover.png")
```
Downstream Use
- As a base for further fine-tuning on related artwork styles.
- Integration into music-themed creative tools or art generators.
Out-of-Scope Use
- Photorealistic portraits or non-artistic imagery (model is trained on album covers).
- Sensitive content generation; no safety checker included by default.
Bias, Risks, and Limitations
- Technical limitations: May produce artifacts outside training distribution (e.g., faces, text).
- Stylistic bias: Tends toward “psychedelic” color palettes; may not suit other art styles.
- Compute requirements: Best results on GPU with ≥ 12 GB VRAM (mixed precision).
Recommendations
- Monitor prompts for undesired patterns.
- Adjust `num_inference_steps` and `guidance_scale` to trade off quality vs. speed.
Getting Started
```shell
pip install diffusers accelerate transformers safetensors
```

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "kilbey1/stoner-psych-lora",
    safety_checker=None,
)
pipe = pipe.to("cuda")

output = pipe("trippy album cover, cosmic", num_inference_steps=25).images[0]
output.save("my_cover.png")
```
Training Details
Training Data
- Custom dataset of album cover images, curated for psychedelic and stoner-rock aesthetics.
- No public dataset card available.
Training Procedure
- Base model: `runwayml/stable-diffusion-v1-5`
- Approach: direct fine-tuning of cross-attention layers only
- Epochs: 10
- Learning rate: 1 × 10⁻⁴
- Batch size: auto-tuned by hardware (1–8)
- Precision: Mixed-precision (FP16) on CUDA, FP32 on MPS/CPU
- Scheduler: Linear warmup + decay (100 warmup steps)
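The warmup-plus-decay schedule above can be sketched as a learning-rate multiplier. The total step count here is a placeholder assumption for illustration, not a value from this card:

```python
def lr_multiplier(step, warmup_steps=100, total_steps=1000):
    """Linear warmup to 1.0 over warmup_steps, then linear decay to 0.0.

    total_steps is an assumed example value; the real run's length depends
    on dataset size, batch size, and the 10 training epochs.
    """
    if step < warmup_steps:
        return step / warmup_steps
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(lr_multiplier(50))    # halfway through warmup -> 0.5
print(lr_multiplier(1000))  # end of training -> 0.0
```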
Preprocessing
- Images resized and center-cropped via standard `AlbumCoversDataset` transforms.
- Latents scaled by a factor of 0.18215.
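A minimal sketch of these two preprocessing steps, assuming a square 512×512 target resolution (the actual transform parameters live in the training scripts):

```python
SCALING_FACTOR = 0.18215  # SD v1.x VAE latent scaling factor, as stated above

def center_crop_box(width, height, size=512):
    """Return the (left, top, right, bottom) box for a centered square crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return (left, top, left + size, top + size)

def scale_latents(latents):
    """Scale raw VAE latent values before feeding them to the UNet."""
    return [v * SCALING_FACTOR for v in latents]

print(center_crop_box(1024, 768))  # (256, 128, 768, 640)
```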
Training Hyperparameters
- Optimizer: AdamW
- Weight decay: 0.0
- Gradient accumulation: none
- Logging: every 20 steps
- Validation: MSE loss on held-out split each epoch
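The validation metric is plain MSE between predicted and ground-truth noise. A toy illustration over flat lists follows; a real script would use `torch.nn.functional.mse_loss` on tensors:

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    assert len(pred) == len(target)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Each prediction is off by 1.0, so the mean squared error is 1.0.
print(mse([0.0, 0.0], [1.0, 1.0]))  # 1.0
```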
Evaluation
Testing Data, Factors & Metrics
- Validation set: held-out portion of album cover dataset
- Metric: Mean Squared Error (MSE) on predicted noise vs. ground truth noise
Results
| Epoch | Val MSE Loss |
|---|---|
| 1 | 0.122710 |
| 2 | 0.266830 |
| 3 | 0.045715 |
| 4 | 0.096358 |
| 5 | 0.113189 |
| 6 | 0.069246 |
| 7 | 0.113489 |
| 8 | 0.033713 |
| 9 | 0.216780 |
| 10 | 0.107808 |
Environmental Impact
Estimated with ML CO₂ calculator (Lacoste et al., 2019):
- Hardware: NVIDIA GPU (≥ 12 GB), Apple MPS, or CPU
- Total training time: ~ X hours [replace with actual]
- Compute region/provider: [More information needed]
- Estimated CO₂ eq.: [More information needed]
Technical Specifications
- UNet: `UNet2DConditionModel` from Stable Diffusion v1.5
- VAE: `AutoencoderKL` (subfolder "vae")
- Scheduler: `DDPMScheduler`
- Objective: denoising diffusion MSE loss
Compute Infrastructure
- Platforms supported: CUDA (mixed precision), Apple MPS, CPU
- Software: PyTorch ≥ 1.13, diffusers ≥ 0.19.3, transformers, python-dotenv
Citation
If you use this model in your work, please cite:
```bibtex
@misc{kilbey1_stoner_psych_lora_2025,
  title        = {Stoner Psych Album Cover Model},
  author       = {kilbey1},
  year         = {2025},
  howpublished = {Hugging Face Model Hub},
  url          = {https://huggingface.co/kilbey1/stoner-psych-lora}
}
```
Authors & Contact
- Model Card Author: kilbey1
- Issues & Questions: https://huggingface.co/kilbey1/stoner-psych-lora/-/issues