Pixel Art LoRA: Stuffed Animal Sprite Converter

SDXL DreamBooth LoRA trained to convert stuffed animal photos into 8-bit pixel art sprites.

Developed as part of the 몽글마을 (Mongle Village) project, an AI-based stuffed animal persona app.


What it does

Converts a photo of a stuffed animal into a 16x16-style pixel art character sprite with a white background, bold outlines, and flat colors.

Input photo → Background removal (rembg) → Edge detection (Canny, fed to ControlNet) → Pixel art (this LoRA)
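The stage order above can be sketched as a plain function chain. This is a minimal illustration, not the code in pipeline.py: `remove_bg`, `detect_edges`, and `generate` are hypothetical callables standing in for rembg, the edge-detection step, and the SDXL + LoRA generation. It also assumes that a failed background removal raises an exception, and shows the fallback to the original image mentioned in the Tips section.

```python
from typing import Any, Callable, Dict


def run_stages(
    image: Any,
    remove_bg: Callable[[Any], Any],
    detect_edges: Callable[[Any], Any],
    generate: Callable[[Any, Any], Any],
) -> Dict[str, Any]:
    """Chain the stages in order; all three callables are hypothetical stand-ins."""
    try:
        subject = remove_bg(image)
        rembg_ok = True
    except Exception:
        # Assumed fallback: continue with the unmodified input photo.
        subject = image
        rembg_ok = False

    edges = detect_edges(subject)      # conditioning input for ControlNet
    sprite = generate(subject, edges)  # SDXL + LoRA generation
    return {"image": sprite, "rembg_ok": rembg_ok}
```

Passing the stages in as arguments keeps the sketch independent of any particular library; the real pipeline may structure this differently.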

Model Details

| Item | Value |
|---|---|
| Base model | stabilityai/stable-diffusion-xl-base-1.0 |
| Training method | DreamBooth LoRA |
| LoRA rank | 32 |
| Training steps | 1,500 |
| Learning rate | 5e-5 |
| Dataset | 236 images (7 categories: animals, food, characters, objects, etc.) |
| Training time | ~17 min on RTX 3060 (12GB VRAM) |
| File size | 177.4 MB |

Performance (20 test images, ranked against 6 other models)

| Metric | Score | Rank |
|---|---|---|
| SSIM ↑ | 0.5986 | 🥈 2nd |
| LPIPS ↓ (AlexNet) | 0.6450 | 🥉 3rd |
| CLIP Score ↑ | 27.92 | 4th |
| Color count ↓ | 19,304 | 🥈 2nd |
| Generation success rate | 100% | 🥇 1st |

How to use

Intended deployment

This repository is designed to be used as a HuggingFace Hub package for a RunPod GPU server.

RunPod server
  -> download this HuggingFace repo
  -> load pipeline.py
  -> run rembg + Canny + ControlNet + SDXL + LoRA + quantization
  -> expose the result through an API

HuggingFace stores the LoRA weights and pipeline code. The actual inference runs on RunPod.

Requirements

pip install -r requirements.txt

Download from HuggingFace and run locally or on RunPod

import sys

from huggingface_hub import snapshot_download
from PIL import Image

# Download the repo (LoRA weights and pipeline code) into the local cache.
repo_dir = snapshot_download("Hadimeeee/pixel-art-lora-sdxl")

# Make pipeline.py from the downloaded repo importable.
sys.path.insert(0, repo_dir)
from pipeline import load_pipeline

pipe = load_pipeline(repo_dir)
image = Image.open("your_image.jpg").convert("RGB")
result = pipe(image)["image"]
result.save("pixel_art_result.png")

RunPod serverless handler

Use runpod_handler.py as the serverless entrypoint. The handler expects a base64-encoded image:

{
  "input": {
    "image": "<base64 png or jpeg>",
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "controlnet_conditioning_scale": 0.8,
    "strength": 0.75,
    "quantize": true,
    "n_colors": 32
  }
}
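A client can assemble this request body with the standard library alone. The sketch below is a minimal example, assuming the field names shown above; `build_request` is a hypothetical helper name, and the values it fills in are copied from the example request (whether they are also the handler's defaults is not stated here).

```python
import base64
import json


def build_request(image_bytes: bytes, **overrides) -> str:
    """Return a JSON request body with the image base64-encoded."""
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        # Values copied from the example request above.
        "num_inference_steps": 50,
        "guidance_scale": 7.5,
        "controlnet_conditioning_scale": 0.8,
        "strength": 0.75,
        "quantize": True,
        "n_colors": 32,
    }
    payload.update(overrides)  # e.g. n_colors=16
    return json.dumps({"input": payload})
```

For example, `build_request(open("your_image.jpg", "rb").read(), n_colors=16)` produces a body with a smaller palette. How the body is sent depends on the RunPod endpoint configuration and is not shown.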

The response returns a base64-encoded PNG:

{
  "image": "<base64 png>",
  "rembg_ok": true
}
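On the client side, the response can be decoded back into PNG bytes as follows. This is a sketch that assumes the response dict has exactly the shape shown above (if the endpoint wraps it in an envelope, unwrap it first) and that `rembg_ok` reports whether background removal succeeded. `decode_response` is a hypothetical helper name.

```python
import base64
from typing import Tuple

# Every valid PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_response(response: dict) -> Tuple[bytes, bool]:
    """Return (png_bytes, rembg_ok) from a handler response dict."""
    png_bytes = base64.b64decode(response["image"])
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("response image is not a PNG")
    # Treat a missing flag as False (an assumption of this sketch).
    return png_bytes, bool(response.get("rembg_ok", False))
```

The returned bytes can be written straight to a `.png` file or opened with `Image.open(io.BytesIO(png_bytes))`.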

Pipeline breakdown

| Step | Tool | Role |
|---|---|---|
| Background removal | rembg | Isolates the subject on white background |
| Edge detection | OpenCV Canny (low=80, high=180) | Extracts silhouette for ControlNet |
| Shape preservation | diffusers/controlnet-canny-sdxl-1.0 | Locks the original shape during generation |
| Style transfer | This LoRA | Applies pixel art style |

Note: rembg and ControlNet are not included in the LoRA weights file. They are separate open-source tools loaded at inference time.


Tips

  • Works best on stuffed animals and character-shaped objects with clear silhouettes
  • If background removal fails, the pipeline automatically falls back to the original image
  • For more pixel-art-like results, apply color quantization after generation:
    result = result.quantize(colors=32, method=Image.Quantize.MEDIANCUT, dither=Image.Dither.NONE).convert("RGB")
    