---
license: apache-2.0
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
  - stable-diffusion-xl
  - lora
  - dreambooth
  - pixel-art
  - controlnet
  - image-to-image
language:
  - en
---

# Mongle 32-bit Pixel Art LoRA

An SDXL DreamBooth LoRA that converts stuffed-animal photos into modern 32-bit-style pixel art characters.

This repository is a Hugging Face Hub package meant to be run on RunPod. The Hub stores the LoRA weights and pipeline code; inference runs on a RunPod GPU server.

```text
input image
  -> rembg background removal
  -> OpenCV Canny edge extraction
  -> SDXL ControlNet Img2Img
  -> Mongle 32-bit LoRA
  -> output image
```

No color quantization or pixelation post-processing is applied in this version.

## Model Details

| Item | Value |
|---|---|
| Base model | stabilityai/stable-diffusion-xl-base-1.0 |
| Training method | DreamBooth LoRA |
| LoRA rank | 32 |
| Training steps | 2,000 |
| Learning rate | 1e-4 |
| Dataset | 243 images after copyright-risk exclusion |
| Style | modern 32-bit pixel art, chibi proportions, soft shading |

## Runtime Components

| Component | Role |
|---|---|
| rembg | removes photo background |
| OpenCV Canny | extracts silhouette edges |
| diffusers/controlnet-canny-sdxl-1.0 | preserves input shape |
| stabilityai/stable-diffusion-xl-base-1.0 | base image generation model |
| this LoRA | applies Mongle 32-bit pixel art style |

## RunPod Usage

Install dependencies:

```bash
pip install -r requirements.txt
```

Set cache paths to `/workspace`:

```bash
source setup_runpod_env.sh
```
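The contents of `setup_runpod_env.sh` are not reproduced here. As a rough Python equivalent, the sketch below points the Hugging Face cache at `/workspace`. It assumes the script sets the documented `HF_HOME` variable; the helper name `runpod_cache_env` and the `hf_cache` subdirectory are hypothetical.

```python
import os


def runpod_cache_env(base="/workspace"):
    """Return cache-related environment variables rooted at `base`."""
    # HF_HOME is the documented root for Hugging Face caches.
    # The "hf_cache" subdirectory name is an illustrative assumption.
    return {"HF_HOME": f"{base}/hf_cache"}


# Apply before importing huggingface_hub or diffusers so the path takes effect.
os.environ.update(runpod_cache_env())
```

Setting the variables before any Hugging Face import matters because the libraries read cache locations at import time.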

RunPod serverless entrypoint:

```bash
python runpod_handler.py
```

The handler expects a JSON payload containing a base64-encoded image and the generation parameters:

```json
{
  "input": {
    "image": "<base64 png or jpeg>",
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "controlnet_conditioning_scale": 0.8,
    "strength": 0.6
  }
}
```
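A minimal sketch of building that payload on the client side, using only the standard library. The helper name `build_payload` is hypothetical, and the placeholder bytes stand in for a real image file; how the request is sent to the endpoint is left out.

```python
import base64
import json


def build_payload(image_bytes, steps=50, guidance=7.5, cn_scale=0.8, strength=0.6):
    """Wrap raw PNG/JPEG bytes in the JSON structure shown above."""
    return {
        "input": {
            # JSON cannot carry raw bytes, so the image is base64-encoded text.
            "image": base64.b64encode(image_bytes).decode("ascii"),
            "num_inference_steps": steps,
            "guidance_scale": guidance,
            "controlnet_conditioning_scale": cn_scale,
            "strength": strength,
        }
    }


# Placeholder bytes; in practice use open("your_image.jpg", "rb").read().
payload = build_payload(b"\x89PNG placeholder")
body = json.dumps(payload)
```

The defaults mirror the values in the example request, so overriding a single parameter (for instance `strength=0.45`) leaves the rest unchanged.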

The response contains the result as a base64-encoded PNG, plus a `rembg_ok` flag:

```json
{
  "image": "<base64 png>",
  "rembg_ok": true
}
```
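Decoding the response is the reverse step. The sketch below assumes the response has already been parsed into a Python dict with the two fields shown; `decode_response` is a hypothetical helper, and the stand-in dict replaces a real handler reply.

```python
import base64


def decode_response(response):
    """Return (png_bytes, rembg_ok) from a parsed handler response."""
    png_bytes = base64.b64decode(response["image"])
    # Treat a missing flag as False rather than raising.
    return png_bytes, bool(response.get("rembg_ok", False))


# Stand-in for a real reply; "demo" is not actual PNG data.
fake_response = {"image": base64.b64encode(b"demo").decode("ascii"), "rembg_ok": True}
png_bytes, rembg_ok = decode_response(fake_response)
```

With a real reply, `png_bytes` can be written to disk with `open("result.png", "wb").write(png_bytes)`.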

## Local/RunPod Python Example

```python
import sys

from huggingface_hub import snapshot_download
from PIL import Image

# Download the LoRA weights and pipeline code from the Hub.
repo_dir = snapshot_download("Hadimeeee/mongle-lora-v3-32bit")

# The pipeline module ships inside the repo, so make it importable.
sys.path.insert(0, repo_dir)
from pipeline import load_pipeline

pipe = load_pipeline(repo_dir)
image = Image.open("your_image.jpg").convert("RGB")
result = pipe(image)["image"]
result.save("mongle_32bit_result.png")
```

## Parameter Grid Testing

Use `test_grid.py` on RunPod to compare prompts and generation settings. This script does not apply color quantization or pixelation post-processing.

```bash
python test_grid.py \
  --input image \
  --output outputs/grid_test \
  --strengths 0.45,0.55,0.65,0.75 \
  --controlnet-scales 0.6,0.8,1.0 \
  --guidance-scales 7.5 \
  --steps 30 \
  --prompt-presets reference \
  --limit 0
```

By default, only `*_grid.png` comparison files are saved. Add `--save-individual` if each generated image should also be saved.

Recommended first pass:

| Parameter | Values |
|---|---|
| `strength` | 0.45, 0.55, 0.65, 0.75 |
| `controlnet_conditioning_scale` | 0.6, 0.8, 1.0 |
| `guidance_scale` | 7.5 |
| `steps` | 30 for search, 50 for final candidates |
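To gauge the cost of this first pass, the sketch below enumerates the combinations in the table with `itertools.product`. It only illustrates the size of the grid; it is not how `test_grid.py` is implemented, and the variable names are illustrative.

```python
from itertools import product

strengths = [0.45, 0.55, 0.65, 0.75]
controlnet_scales = [0.6, 0.8, 1.0]
guidance_scales = [7.5]

# One dict per generation: every strength paired with every ControlNet scale.
grid = [
    {"strength": s, "controlnet_conditioning_scale": c, "guidance_scale": g}
    for s, c, g in product(strengths, controlnet_scales, guidance_scales)
]

print(len(grid))  # 12 generations per input image and prompt preset
```

Adding a second `guidance_scale` value would double the count, which is why the first pass holds it fixed at 7.5.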

## Notes

The LoRA file does not bundle the SDXL base model, the ControlNet weights, or rembg. The pipeline code loads those components at inference time.