---
license: apache-2.0
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
- stable-diffusion-xl
- lora
- dreambooth
- pixel-art
- controlnet
- image-to-image
language:
- en
---
# Mongle 32-bit Pixel Art LoRA
An SDXL DreamBooth LoRA that converts photos of stuffed animals into modern 32-bit-style pixel art characters.
This repository is a Hugging Face Hub package meant to be run on RunPod: the Hub stores the LoRA weights and pipeline code, and inference runs on a RunPod GPU server.
```text
input image
-> rembg background removal
-> OpenCV Canny edge extraction
-> SDXL ControlNet Img2Img
-> Mongle 32-bit LoRA
-> output image
```
No color quantization or pixelation post-processing is applied in this version.
## Model Details
| Item | Value |
|---|---|
| Base model | stabilityai/stable-diffusion-xl-base-1.0 |
| Training method | DreamBooth LoRA |
| LoRA rank | 32 |
| Training steps | 2,000 |
| Learning rate | 1e-4 |
| Dataset | 243 images after copyright-risk exclusion |
| Style | modern 32-bit pixel art, chibi proportions, soft shading |
## Runtime Components
| Component | Role |
|---|---|
| rembg | removes photo background |
| OpenCV Canny | extracts silhouette edges |
| diffusers/controlnet-canny-sdxl-1.0 | preserves input shape |
| stabilityai/stable-diffusion-xl-base-1.0 | base image generation model |
| this LoRA | applies Mongle 32-bit pixel art style |
## RunPod Usage
Install dependencies:
```bash
pip install -r requirements.txt
```
Set cache paths to `/workspace`:
```bash
source setup_runpod_env.sh
```
RunPod serverless entrypoint:
```bash
python runpod_handler.py
```
The handler expects a base64-encoded image:
```json
{
  "input": {
    "image": "<base64 png or jpeg>",
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "controlnet_conditioning_scale": 0.8,
    "strength": 0.6
  }
}
```
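A request body in this shape can be assembled with the Python standard library. The sketch below is only an illustration: the helper name `build_payload` is hypothetical, and the default values are simply the ones from the example request above, not documented handler defaults.

```python
import base64
import json


def build_payload(image_bytes: bytes, strength: float = 0.6) -> str:
    """Return a JSON request body with the image base64-encoded.

    Hypothetical helper: field names follow the example request above.
    """
    payload = {
        "input": {
            # base64 produces bytes; decode to ASCII so it is JSON-serializable
            "image": base64.b64encode(image_bytes).decode("ascii"),
            "num_inference_steps": 50,
            "guidance_scale": 7.5,
            "controlnet_conditioning_scale": 0.8,
            "strength": strength,
        }
    }
    return json.dumps(payload)
```

For a file on disk, the raw bytes can be read with `pathlib.Path("your_image.jpg").read_bytes()` and passed in directly; the image does not need to be decoded first.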
The response returns a base64-encoded PNG:
```json
{
  "image": "<base64 png>",
  "rembg_ok": true
}
```
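Turning that response back into PNG bytes is the reverse step. This is a minimal sketch: `decode_response` is a hypothetical helper, it assumes the JSON object has exactly the shape shown above, and reading `rembg_ok` as "background removal succeeded" is an interpretation of the field name rather than something this README states.

```python
import base64
import json


def decode_response(response_json: str) -> tuple[bytes, bool]:
    """Return (png_bytes, rembg_ok) from a response shaped like the example."""
    response = json.loads(response_json)
    png_bytes = base64.b64decode(response["image"])
    # Treat a missing flag as False rather than raising (an assumption).
    rembg_ok = bool(response.get("rembg_ok", False))
    return png_bytes, rembg_ok
```

The returned bytes can be written straight to a `.png` file opened in binary mode.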
## Local/RunPod Python Example
```python
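import sys

from huggingface_hub import snapshot_download
from PIL import Image

# Download the LoRA weights and pipeline code from the Hub (cached after the first call).
repo_dir = snapshot_download("Hadimeeee/mongle-lora-v3-32bit")

# The pipeline module ships inside the repository, so make it importable.
sys.path.insert(0, repo_dir)
from pipeline import load_pipeline

pipe = load_pipeline(repo_dir)

image = Image.open("your_image.jpg").convert("RGB")
result = pipe(image)["image"]  # the output image is stored under the "image" key
result.save("mongle_32bit_result.png")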
```
## Parameter Grid Testing
Use `test_grid.py` on RunPod to compare prompts and generation settings. This script does not apply color quantization or pixelation post-processing.
```bash
python test_grid.py \
  --input image \
  --output outputs/grid_test \
  --strengths 0.45,0.55,0.65,0.75 \
  --controlnet-scales 0.6,0.8,1.0 \
  --guidance-scales 7.5 \
  --steps 30 \
  --prompt-presets reference \
  --limit 0
```
By default, only `*_grid.png` comparison files are saved. Add `--save-individual` if each generated image should also be saved.
Recommended first pass:
| Parameter | Values |
|---|---|
| `strength` | 0.45, 0.55, 0.65, 0.75 |
| `controlnet_conditioning_scale` | 0.6, 0.8, 1.0 |
| `guidance_scale` | 7.5 |
| `steps` | 30 for search, 50 for final candidates |
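To get a feel for the size of this first pass, the combinations can be enumerated as a Cartesian product. The sketch below only counts and lists the settings from the table; it is not how `test_grid.py` is implemented, and the dictionary keys are borrowed from the handler request format for illustration.

```python
from itertools import product

strengths = [0.45, 0.55, 0.65, 0.75]
controlnet_scales = [0.6, 0.8, 1.0]
guidance_scales = [7.5]
search_steps = 30  # the table suggests 30 steps while searching

# One dict per combination: 4 strengths x 3 ControlNet scales x 1 guidance scale
grid = [
    {
        "strength": s,
        "controlnet_conditioning_scale": c,
        "guidance_scale": g,
        "num_inference_steps": search_steps,
    }
    for s, c, g in product(strengths, controlnet_scales, guidance_scales)
]

print(len(grid))  # 12 combinations per input image and prompt preset
```

Each extra value added to any one list multiplies the total, so it is usually cheaper to widen one parameter at a time.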
## Notes
The LoRA file does not include SDXL, ControlNet, or rembg. Those components are loaded at inference time by the pipeline code.