Model Card for kilbey1/stoner-psych-lora

A Stable Diffusion v1.5 model whose UNet was fine-tuned to generate psychedelic, stoner-rock-style album covers.

GitHub Repository

View the associated public GitHub scripts here: https://github.com/evemcgivern/KosmischeCovers

Model Details

Model Description

This model is a direct fine-tuning of the cross-attention layers in the UNet of Stable Diffusion v1.5, optimized on a custom dataset of album covers to produce trippy, cosmic artwork reminiscent of stoner rock aesthetics. During training, only the cross-attention (attn2) parameters were updated (the exact trainable-parameter count is not reported), leaving the bulk of the UNet frozen for efficiency. Despite the "lora" in the repository name, the weights were updated directly rather than through LoRA adapters.

  • Developed by: kilbey1
  • Shared by: kilbey1
  • Model type: Text-to-image latent diffusion (Stable Diffusion)
  • License: [More information needed]
  • Fine-tuned from: runwayml/stable-diffusion-v1-5
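
The cross-attention-only setup described above can be sketched as follows. This is a minimal illustration, not the author's training script: the helper name freeze_all_but_cross_attention is hypothetical, and it assumes cross-attention parameters can be recognized by an attn2 component in their dotted names, as in the diffusers UNet2DConditionModel.

```python
from types import SimpleNamespace


def freeze_all_but_cross_attention(named_params, marker="attn2"):
    """Enable gradients only for parameters whose dotted name contains `marker`.

    `named_params` is any iterable of (name, parameter) pairs, such as the
    result of `unet.named_parameters()` in PyTorch. Returns the names that
    remain trainable.
    """
    trainable = []
    for name, param in named_params:
        # Compare against whole name components so "attn2" must be a module name.
        is_cross_attention = marker in name.split(".")
        param.requires_grad = is_cross_attention
        if is_cross_attention:
            trainable.append(name)
    return trainable


# Stand-in parameters so the sketch runs without PyTorch installed.
# The names are illustrative and only mimic the diffusers naming pattern.
demo_params = [
    ("down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight",
     SimpleNamespace(requires_grad=True)),
    ("down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight",
     SimpleNamespace(requires_grad=True)),
    ("down_blocks.0.resnets.0.conv1.weight",
     SimpleNamespace(requires_grad=True)),
]
trainable_names = freeze_all_but_cross_attention(demo_params)
print(trainable_names)
```

With a real model, the same call would take unet.named_parameters(), and the optimizer would then be given only the parameters that still have requires_grad set to True.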

Model Sources

Uses

Direct Use

You can use this model out of the box to generate stylized, psychedelic album covers:

from diffusers import StableDiffusionPipeline

# Load the fine-tuned pipeline; safety_checker=None disables the safety checker.
pipe = StableDiffusionPipeline.from_pretrained(
  "kilbey1/stoner-psych-lora",
  safety_checker=None
)
pipe = pipe.to("cuda")  # or "mps" / "cpu"

# Generate one image from a text prompt and save it to disk.
image = pipe(
  "psychedelic stoner rock album cover, cosmic artwork",
  num_inference_steps=30
).images[0]
image.save("cover.png")

Downstream Use

  • As a base for further fine-tuning on related artwork styles.
  • Integration into music-themed creative tools or art generators.

Out-of-Scope Use

  • Photorealistic portraits or non-artistic imagery (model is trained on album covers).
  • Sensitive content generation; no safety checker included by default.

Bias, Risks, and Limitations

  • Technical limitations: May produce artifacts for content outside the training distribution (e.g., faces, text).
  • Stylistic bias: Tends toward “psychedelic” color palettes; may not suit other art styles.
  • Compute requirements: Best results on GPU with ≥ 12 GB VRAM (mixed precision).

Recommendations

  • Review generated images for undesired patterns and adjust prompts accordingly.
  • Adjust num_inference_steps and guidance_scale to trade off quality vs. speed.
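
To make the guidance_scale recommendation more concrete, the sketch below shows the classifier-free guidance formula that Stable Diffusion pipelines apply at each denoising step. It uses plain Python lists in place of tensors, and the function name apply_guidance and the toy values are illustrative assumptions, not code from this model's repository.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Combine unconditional and prompt-conditioned noise predictions.

    Implements: uncond + guidance_scale * (text - uncond), element by element.
    """
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


# Toy predictions, invented for illustration only.
uncond = [0.0, 1.0, -1.0]
text = [1.0, 1.0, 0.0]

# A scale of 1.0 reproduces the prompt-conditioned prediction unchanged.
print(apply_guidance(uncond, text, 1.0))
# Larger scales push the result further along the (text - uncond) direction.
print(apply_guidance(uncond, text, 7.5))
```

Higher guidance_scale values generally make images follow the prompt more closely at the cost of variety, while num_inference_steps sets how many denoising steps are run, so it is the main lever for generation time.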

Getting Started

Install the dependencies:

pip install diffusers accelerate transformers safetensors

Then load the pipeline and generate an image:

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
  "kilbey1/stoner-psych-lora",
  safety_checker=None
)
pipe = pipe.to("cuda")
output = pipe("trippy album cover, cosmic", num_inference_steps=25).images[0]
output.save("my_cover.png")

Training Details

Training Data

  • Custom dataset of album cover images, curated for psychedelic and stoner-rock aesthetics.
  • No public dataset card available.

Training Procedure

  • Base model: runwayml/stable-diffusion-v1-5
  • Approach: Direct fine-tuning of cross-attention layers only
  • Epochs: 10
  • Learning rate: 1 × 10⁻⁴
  • Batch size: auto-tuned by hardware (1–8)
  • Precision: Mixed-precision (FP16) on CUDA, FP32 on MPS/CPU
  • Learning-rate schedule: Linear warmup + decay (100 warmup steps)
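
The warmup-and-decay schedule listed above can be sketched in a few lines. This version assumes the decay is linear down to zero and uses a hypothetical total of 1000 optimizer steps, since the card does not state the real step count; the function name is also illustrative.

```python
def linear_warmup_decay_lr(step, total_steps, base_lr=1e-4, warmup_steps=100):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # assumption: the actual number of training steps is not reported

for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_decay_lr(step, TOTAL_STEPS))
```

The learning rate peaks at the base value of 1 × 10⁻⁴ when warmup ends at step 100 and then falls steadily for the rest of training.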

Preprocessing

  • Images resized and center-cropped via standard AlbumCoversDataset transforms.
  • Latents scaled by a factor of 0.18215.
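
As a rough illustration of the two preprocessing steps above, the sketch below computes a centered square crop box and applies the 0.18215 latent scaling. The helper names are hypothetical, and the actual AlbumCoversDataset transforms (including the target resolution) may differ.

```python
LATENT_SCALE = 0.18215  # scaling factor stated in this card


def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def scale_latents(latents):
    """Scale VAE latents before they are passed to the UNet."""
    return [LATENT_SCALE * z for z in latents]


def unscale_latents(scaled):
    """Undo the scaling before latents are decoded back to pixels."""
    return [z / LATENT_SCALE for z in scaled]


print(center_crop_box(1000, 600))
print(scale_latents([1.0, -2.0]))
```

The box uses the same (left, upper, right, lower) ordering that Pillow's Image.crop expects, so it could be passed to that method before resizing.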

Training Hyperparameters

  • Optimizer: AdamW
  • Weight decay: 0.0
  • Gradient accumulation: none
  • Logging: every 20 steps
  • Validation: MSE loss on held-out split each epoch

Evaluation

Testing Data, Factors & Metrics

  • Validation set: held-out portion of album cover dataset
  • Metric: Mean Squared Error (MSE) on predicted noise vs. ground truth noise
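
The metric above can be sketched as follows, with flat Python lists standing in for noise tensors. Averaging per-batch losses into one number per epoch is an assumption about how the validation figure is aggregated, and both function names are illustrative.

```python
def mse(predicted, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def epoch_validation_loss(batches):
    """Average the per-batch MSE over an iterable of (predicted, target) pairs."""
    losses = [mse(pred, tgt) for pred, tgt in batches]
    return sum(losses) / len(losses)


# Toy flattened noise tensors, invented for illustration only.
demo_batches = [
    ([0.0, 2.0], [0.0, 0.0]),  # MSE = (0 + 4) / 2
    ([1.0, 1.0], [1.0, 1.0]),  # perfect prediction, MSE = 0
]
print(epoch_validation_loss(demo_batches))
```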

Results

Epoch   Val MSE Loss
1       0.122710
2       0.266830
3       0.045715
4       0.096358
5       0.113189
6       0.069246
7       0.113489
8       0.033713
9       0.216780
10      0.107808
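
The losses above do not decrease monotonically, which may reflect the noise in this metric (for example, from randomly sampled timesteps) rather than real differences in quality. A small sketch for picking the epoch with the lowest validation loss follows; the card does not say which epoch the published weights come from.

```python
# Validation MSE per epoch, copied from the results above.
val_mse = {
    1: 0.122710,
    2: 0.266830,
    3: 0.045715,
    4: 0.096358,
    5: 0.113189,
    6: 0.069246,
    7: 0.113489,
    8: 0.033713,
    9: 0.216780,
    10: 0.107808,
}

# Select the epoch whose validation loss is smallest.
best_epoch = min(val_mse, key=val_mse.get)
print(best_epoch, val_mse[best_epoch])
```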

Environmental Impact

Carbon emissions can be estimated with the ML CO₂ calculator (Lacoste et al., 2019). No estimate has been reported yet:

  • Hardware: NVIDIA GPU (≥ 12 GB), Apple MPS, or CPU
  • Total training time: [More information needed]
  • Compute region/provider: [More information needed]
  • Estimated CO₂ eq.: [More information needed]

Technical Specifications

  • UNet: UNet2DConditionModel from Stable Diffusion v1.5
  • VAE: AutoencoderKL, loaded from the “vae” subfolder
  • Noise scheduler: DDPMScheduler
  • Objective: Denoising diffusion MSE loss
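
For reference, the standard noise-prediction objective used in DDPM-style training of latent diffusion models can be written as below. This is the textbook form, assumed to correspond to the "denoising diffusion MSE loss" listed above. The symbols are introduced here and are not taken from the training code: z_0 is the scaled VAE latent, c the text conditioning, t a sampled timestep, \bar{\alpha}_t the cumulative noise-schedule product, and \epsilon_\theta the UNet.

```latex
z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

\mathcal{L} = \mathbb{E}_{z_0,\, c,\, \epsilon,\, t}
\left[ \left\lVert \epsilon - \epsilon_\theta(z_t, t, c) \right\rVert_2^2 \right]
```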

Compute Infrastructure

  • Platforms supported: CUDA (mixed precision), Apple MPS, CPU
  • Software: PyTorch ≥ 1.13, diffusers ≥ 0.19.3, transformers, python-dotenv
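
The platform and precision pairing described in this card (FP16 mixed precision on CUDA, FP32 on MPS and CPU) can be expressed as a small selection helper. The function name and the string labels are hypothetical; the sketch takes availability flags as arguments so it runs without PyTorch installed.

```python
def choose_device_and_precision(cuda_available, mps_available):
    """Pick a device and precision label following the card's stated policy."""
    if cuda_available:
        return "cuda", "fp16"  # mixed precision on NVIDIA GPUs
    if mps_available:
        return "mps", "fp32"   # full precision on Apple Silicon
    return "cpu", "fp32"       # full precision fallback


print(choose_device_and_precision(cuda_available=False, mps_available=True))
```

With PyTorch, the two flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().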

Citation

If you use this model in your work, please cite:

@misc{kilbey1_stoner_psych_lora_2025,
  title        = {Stoner Psych Album Cover Model},
  author       = {kilbey1},
  year         = {2025},
  howpublished = {Hugging Face Model Hub},
  url          = {https://huggingface.co/kilbey1/stoner-psych-lora}
}

Authors & Contact

  • Model Card Author: kilbey1
  • Issues & Questions: https://huggingface.co/kilbey1/stoner-psych-lora/-/issues