Model Card for `kilbey1/stoner-psych-lora`

A fine-tuned Stable Diffusion v1.5 UNet specialized for generating album covers in a psychedelic stoner rock style.

GitHub Repository

View the associated public GitHub scripts here: https://github.com/evemcgivern/KosmischeCovers

Model Details

Model Description

This model is a direct fine-tune of the cross-attention layers in the Stable Diffusion v1.5 UNet, trained on a custom dataset of album covers to produce trippy, cosmic artwork in the stoner rock style. During training, only the cross-attention (attn2) parameters were updated (trainable parameter count: [More information needed]); the rest of the UNet stayed frozen for efficiency.
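
The freezing scheme described above can be sketched as follows. This is a minimal sketch, not the training script from the repository: it assumes cross-attention parameters can be recognized by an `attn2` segment in their names, which is how diffusers names them inside `UNet2DConditionModel`. The helper names `freeze_all_but_cross_attention` and `load_unet_with_frozen_backbone` are hypothetical.

```python
def freeze_all_but_cross_attention(model):
    """Leave only cross-attention (attn2) parameters trainable.

    Works on any object exposing PyTorch-style ``named_parameters()``.
    Returns (trainable_count, total_count) in numbers of scalar weights.
    """
    trainable, total = 0, 0
    for name, param in model.named_parameters():
        # Assumption: cross-attention weights carry an "attn2" name segment,
        # e.g. "...transformer_blocks.0.attn2.to_q.weight".
        is_cross_attn = "attn2" in name.split(".")
        param.requires_grad_(is_cross_attn)
        total += param.numel()
        if is_cross_attn:
            trainable += param.numel()
    return trainable, total


def load_unet_with_frozen_backbone(model_id="runwayml/stable-diffusion-v1-5"):
    """Load the base UNet and freeze everything except attn2 (downloads weights)."""
    from diffusers import UNet2DConditionModel  # imported lazily: heavy dependency

    unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
    trainable, total = freeze_all_but_cross_attention(unet)
    print(f"trainable: {trainable:,} of {total:,} parameters")
    return unet
```

Only parameters with `requires_grad=True` then need to be handed to the optimizer, which is where the efficiency gain comes from.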

Developed by: kilbey1
Shared by: kilbey1
Model type: Text-to-image latent diffusion (Stable Diffusion)
License: [More information needed]
Fine-tuned from: runwayml/stable-diffusion-v1-5

Model Sources

Repository: https://huggingface.co/kilbey1/stoner-psych-lora
Base model card: runwayml/stable-diffusion-v1-5

Uses

Direct Use

You can use this model out of the box to generate stylized, psychedelic album covers:

from diffusers import StableDiffusionPipeline

# Load the fine-tuned pipeline; safety_checker=None disables the safety checker.
pipe = StableDiffusionPipeline.from_pretrained(
    "kilbey1/stoner-psych-lora",
    safety_checker=None,
)
pipe = pipe.to("cuda")  # or "mps" / "cpu"

# Generate one image and save it to disk.
image = pipe(
    "psychedelic stoner rock album cover, cosmic artwork",
    num_inference_steps=30,
).images[0]
image.save("cover.png")

Downstream Use

As a base for further fine-tuning on related artwork styles.
Integration into music-themed creative tools or art generators.

Out-of-Scope Use

Photorealistic portraits or non-artistic imagery (model is trained on album covers).
Sensitive content generation; no safety checker included by default.

Bias, Risks, and Limitations

Technical limitations: May produce artifacts on content outside the training distribution (e.g., faces, legible text).
Stylistic bias: Tends toward “psychedelic” color palettes; may not suit other art styles.
Compute requirements: Best results on GPU with ≥ 12 GB VRAM (mixed precision).

Recommendations

Monitor prompts for undesired patterns.
Adjust num_inference_steps and guidance_scale to trade off quality vs. speed.
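
One way to explore the quality/speed trade-off is to render the same prompt over a small grid of settings. The sketch below is illustrative only: the helper names `sweep_settings` and `save_sweep` are hypothetical, and the grid values are arbitrary starting points rather than settings recommended by the author. `pipe` is assumed to be a loaded `StableDiffusionPipeline`.

```python
from itertools import product


def sweep_settings(pipe, prompt, steps_options=(20, 30, 50),
                   guidance_options=(5.0, 7.5, 10.0)):
    """Render one image per (num_inference_steps, guidance_scale) pair.

    ``pipe`` is expected to behave like a diffusers StableDiffusionPipeline.
    Returns a dict mapping (steps, guidance) to the generated image.
    """
    results = {}
    for steps, guidance in product(steps_options, guidance_options):
        output = pipe(prompt, num_inference_steps=steps, guidance_scale=guidance)
        results[(steps, guidance)] = output.images[0]
    return results


def save_sweep(results, prefix="cover"):
    """Save each image under a file name that encodes its settings."""
    paths = []
    for (steps, guidance), image in sorted(results.items()):
        path = f"{prefix}_steps{steps}_cfg{guidance}.png"
        image.save(path)
        paths.append(path)
    return paths
```

Fewer steps run faster, while the guidance scale controls how strongly the image follows the prompt. Passing a seeded `generator` to the pipeline call would keep the starting noise identical across the grid, so that differences come from the settings alone.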

Getting Started

Install the dependencies:

pip install torch diffusers accelerate transformers safetensors

Then load the pipeline and generate an image:

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "kilbey1/stoner-psych-lora",
    safety_checker=None,
)
pipe = pipe.to("cuda")  # or "mps" / "cpu"
output = pipe("trippy album cover, cosmic", num_inference_steps=25).images[0]
output.save("my_cover.png")

Training Details

Training Data

Custom dataset of album cover images, curated for psychedelic and stoner-rock aesthetics.
No public dataset card available.

Training Procedure

Base model: runwayml/stable-diffusion-v1-5
Approach: Direct fine-tuning of cross-attention layers only
Epochs: 10
Learning rate: 1 × 10⁻⁴
Batch size: auto-tuned by hardware (1–8)
Precision: Mixed-precision (FP16) on CUDA, FP32 on MPS/CPU
Scheduler: Linear warmup + decay (100 warmup steps)
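
The learning-rate schedule listed above can be illustrated with a few lines of plain Python. This is a sketch under assumptions: the card gives the peak rate (1 × 10⁻⁴) and the warmup length (100 steps), but not the total number of steps or the final rate, so the decay-to-zero endpoint and the `total_steps` value used below are hypothetical, as is the function name.

```python
def linear_warmup_decay_lr(step, total_steps, base_lr=1e-4, warmup_steps=100):
    """Learning rate at ``step`` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = 1000  # hypothetical run length; the card does not state it
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_warmup_decay_lr(step, total))
```

The rate peaks exactly at the end of warmup and falls back to half its peak midway through the decay phase.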

Preprocessing

Images resized and center-cropped via standard AlbumCoversDataset transforms.
Latents scaled by a factor of 0.18215.
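
The two preprocessing steps can be sketched as below. This is not the `AlbumCoversDataset` code, which is not reproduced in this card; the helper names are hypothetical, and `vae` is assumed to behave like diffusers' `AutoencoderKL`, whose `encode` returns an object with a `latent_dist` and whose `decode` returns an object with a `sample`.

```python
SCALING_FACTOR = 0.18215  # latent scale stated in this card


def center_crop_box(width, height):
    """Return the (left, top, right, bottom) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def encode_to_latents(vae, pixel_values, scaling_factor=SCALING_FACTOR):
    """Encode an image batch to scaled latents, as is usual in SD training loops.

    ``pixel_values`` is expected to be a float tensor normalized to [-1, 1].
    """
    latents = vae.encode(pixel_values).latent_dist.sample()
    return latents * scaling_factor


def decode_from_latents(vae, latents, scaling_factor=SCALING_FACTOR):
    """Undo the scaling before decoding latents back to image space."""
    return vae.decode(latents / scaling_factor).sample
```

The crop box can be passed to `PIL.Image.Image.crop` before resizing. The scaling brings the latents to roughly unit variance, which is what the UNet was trained to expect.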

Training Hyperparameters

Optimizer: AdamW
Weight decay: 0.0
Gradient accumulation: none
Logging: every 20 steps
Validation: MSE loss on held-out split each epoch

Evaluation

Testing Data, Factors & Metrics

Validation set: held-out portion of album cover dataset
Metric: Mean Squared Error (MSE) on predicted noise vs. ground truth noise

Results

Epoch	Val MSE Loss
1	0.122710
2	0.266830
3	0.045715
4	0.096358
5	0.113189
6	0.069246
7	0.113489
8	0.033713
9	0.216780
10	0.107808
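
A short script makes the table easier to read at a glance. It only summarizes the numbers reported above and assumes nothing beyond them.

```python
from statistics import mean

# Validation MSE per epoch, copied from the results table.
val_mse = {
    1: 0.122710, 2: 0.266830, 3: 0.045715, 4: 0.096358, 5: 0.113189,
    6: 0.069246, 7: 0.113489, 8: 0.033713, 9: 0.216780, 10: 0.107808,
}

best_epoch = min(val_mse, key=val_mse.get)
worst_epoch = max(val_mse, key=val_mse.get)
average = mean(val_mse.values())

print(f"best epoch:  {best_epoch} (MSE {val_mse[best_epoch]:.6f})")
print(f"worst epoch: {worst_epoch} (MSE {val_mse[worst_epoch]:.6f})")
print(f"mean MSE over all epochs: {average:.6f}")
```

The lowest validation loss is at epoch 8 (0.033713) and the highest at epoch 2 (0.266830). The values do not decrease monotonically, so the final epoch is not necessarily the best checkpoint by this metric.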

Environmental Impact

Carbon emissions can be estimated with the Machine Learning Impact calculator (Lacoste et al., 2019):

Hardware: NVIDIA GPU (≥ 12 GB), Apple MPS, or CPU
Total training time: [More information needed]
Compute region/provider: [More information needed]
Estimated CO₂ eq.: [More information needed]

Technical Specifications

UNet: UNet2DConditionModel from Stable Diffusion v1.5
VAE: AutoencoderKL, loaded from the “vae” subfolder
Scheduler: DDPMScheduler
Objective: Denoising diffusion MSE loss
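
For reference, the standard noise-prediction objective that goes with a DDPM noise schedule can be written as below. This is the textbook form, given on the assumption that training followed it. The symbols are notation introduced here rather than taken from the training code: $z_0$ is the scaled VAE latent, $c$ the text embedding, $\bar{\alpha}_t$ the cumulative noise schedule, and $\theta_{\text{attn2}}$ the trainable cross-attention weights.

```latex
z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

\mathcal{L}(\theta_{\text{attn2}}) =
\mathbb{E}_{z_0,\, c,\, \epsilon,\, t}
\left[ \left\lVert \epsilon - \epsilon_\theta(z_t, t, c) \right\rVert_2^2 \right]
```

The validation metric reported in the Evaluation section is the same squared error between sampled and predicted noise, averaged over the held-out split.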

Compute Infrastructure

Platforms supported: CUDA (mixed precision), Apple MPS, CPU
Software: PyTorch ≥ 1.13, diffusers ≥ 0.19.3, transformers, python-dotenv
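
The platform and precision policy above can be expressed as a small selection helper. This is a sketch for inference-time loading, with hypothetical function names. It assumes the FP16-on-CUDA, FP32-elsewhere rule from the training setup carries over to inference.

```python
def pick_device_and_dtype(cuda_available, mps_available):
    """Map hardware availability to a (device, dtype_name) pair.

    Mirrors the precision policy listed in this card: FP16 on CUDA,
    FP32 on Apple MPS and on CPU.
    """
    if cuda_available:
        return "cuda", "float16"
    if mps_available:
        return "mps", "float32"
    return "cpu", "float32"


def detect_device_and_dtype():
    """Query PyTorch for the available backend (requires torch to be installed)."""
    import torch  # imported lazily so the helper above stays dependency-free

    device, dtype_name = pick_device_and_dtype(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
    return device, getattr(torch, dtype_name)
```

The returned pair can be used as `StableDiffusionPipeline.from_pretrained(..., torch_dtype=dtype).to(device)`.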

Citation

If you use this model in your work, please cite:

@misc{kilbey1_stoner_psych_lora_2025,
  title        = {Stoner Psych Album Cover Model},
  author       = {kilbey1},
  year         = {2025},
  howpublished = {Hugging Face Model Hub},
  url          = {https://huggingface.co/kilbey1/stoner-psych-lora}
}

Authors & Contact

Model card author: kilbey1
Issues and questions: https://huggingface.co/kilbey1/stoner-psych-lora/-/issues
