PixArt-Σ LoRA Fine-tuned for Lego Image Generation
This model is a LoRA fine-tune of PixArt-alpha/PixArt-Sigma-XL-2-1024-MS, trained on the Norod78/lego-blip-captions-512 dataset to generate Lego-style images.
Model Details
- Base Model: PixArt-alpha/PixArt-Sigma-XL-2-1024-MS
- Training Method: LoRA (Low-Rank Adaptation)
- Domain: Lego
- Dataset: Norod78/lego-blip-captions-512
- LoRA Rank: 16
- LoRA Alpha: 32
- Task: Text-to-Image Generation
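The rank and alpha values above determine how strongly the adapter's update is applied. The following is a minimal sketch, assuming the standard LoRA formulation in which the low-rank update is multiplied by alpha / rank; the card does not state which variant was used. The helper names and the 1152 x 1152 layer size are hypothetical and chosen only for illustration.

```python
# Illustrative only: helper names and the layer size are assumptions,
# not taken from this model's training code.

def lora_scaling(rank: int, alpha: float) -> float:
    """Multiplier applied to the low-rank update in the standard LoRA formulation."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters in one adapted linear layer: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


scaling = lora_scaling(rank=16, alpha=32)
print(f"LoRA scaling factor: {scaling}")  # 32 / 16 = 2.0

# Hypothetical square projection, used only to show the size reduction.
adapter_params = lora_param_count(1152, 1152, rank=16)
full_params = 1152 * 1152
print(f"Adapter params per layer: {adapter_params} of {full_params} in the full matrix")
```

With rank 16 and alpha 32, the update is scaled by 2.0 under this assumption.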
Training Details
- Epochs: 50
- Batch Size: 1
- Gradient Accumulation Steps: 4
- Learning Rate: 1e-4
- Training Steps: 1500
- Mixed Precision: FP16
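The batch size and gradient accumulation settings combine into an effective batch size. The sketch below works through that arithmetic, assuming single-device training and assuming that "Training Steps" counts optimizer updates rather than forward passes; the card confirms neither, and the function name is hypothetical.

```python
# Values copied from the Training Details list above.
batch_size = 1
grad_accum_steps = 4
training_steps = 1500


def effective_batch_size(per_device_batch: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


eff = effective_batch_size(batch_size, grad_accum_steps)

# Assumption: each training step is one optimizer update on a single device.
samples_seen = eff * training_steps

print(f"Effective batch size: {eff}")
print(f"Samples processed under this assumption: {samples_seen}")
```

If the steps instead count individual forward passes, the number of samples processed would be a quarter of the figure computed here.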
Usage
import torch
from diffusers import PixArtSigmaPipeline

# Load the base pipeline in half precision and move it to the GPU
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    torch_dtype=torch.float16,
).to("cuda")

# Load LoRA weights (assumes the installed diffusers version
# exposes load_lora_weights on this pipeline)
pipe.load_lora_weights("matthew816/pixart-lora-lego")

# Generate an image from a text prompt
prompt = "lego style, a cat sitting on a chair"
image = pipe(
    prompt=prompt,
    num_inference_steps=20,
    guidance_scale=4.5,
).images[0]

# Save the result to disk
image.save("generated_lego_image.png")
Examples
The model generates Lego-style images from text descriptions.
Example prompts:
- "lego style, a dragon flying over mountains"
- "lego style, a robot playing guitar"
- "lego style, a sunset over the ocean"
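All of the example prompts above begin with the phrase "lego style". The sketch below shows one way to apply that prefix consistently and derive an output filename per prompt. The helper names are hypothetical, and whether the prefix is strictly required is not stated in this card.

```python
import re

# Prefix shared by the example prompts above.
TRIGGER = "lego style"


def build_prompt(subject: str, trigger: str = TRIGGER) -> str:
    """Prepend the trigger phrase unless the text already starts with it."""
    subject = subject.strip()
    if subject.lower().startswith(trigger.lower()):
        return subject
    return f"{trigger}, {subject}"


def output_filename(prompt: str) -> str:
    """Turn a prompt into a filesystem-friendly PNG filename."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return f"{slug}.png"


subjects = [
    "a dragon flying over mountains",
    "a robot playing guitar",
    "lego style, a sunset over the ocean",  # already prefixed, left unchanged
]

for subject in subjects:
    prompt = build_prompt(subject)
    print(prompt, "->", output_filename(prompt))
```

Each resulting prompt can be passed as the `prompt` argument of the pipeline call shown in the Usage section.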
Citation
If you use this model, please cite the original PixArt-Σ model and the dataset.
@article{chen2024pixart,
  title={PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation},
  author={Chen, Junsong and others},
  journal={arXiv preprint arXiv:2403.04692},
  year={2024}
}