metadata
title: sd-image-gen-toolkit
app_file: src/sdgen/main.py
sdk: gradio
sdk_version: 6.0.2

Stable Diffusion Image Generation Toolkit


Live Demo


Overview

A modular, lightweight image generation toolkit built on Hugging Face Diffusers, designed for CPU-friendly deployment, clean architecture, and practical usability.

It supports Text → Image, Image → Image, and Upscaling, with a preset system, optional LoRA adapters, and a local metadata history for reproducibility.


Features

Text → Image

  • Stable Diffusion 1.5 and Turbo
  • Configurable prompt parameters:
    • prompt / negative prompt
    • steps
    • guidance (CFG)
    • resolution
    • seed (optional)
  • JSON metadata output
  • Style presets for quick experimentation
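The parameters above map naturally onto keyword arguments of a Diffusers `StableDiffusionPipeline` call. The sketch below is a minimal illustration, not the toolkit's actual code: `Txt2ImgConfig`, `to_pipeline_kwargs`, and the default values are hypothetical, and only the returned keyword names (`prompt`, `negative_prompt`, `num_inference_steps`, `guidance_scale`, `width`, `height`) come from Diffusers.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Txt2ImgConfig:
    """Hypothetical container for the text-to-image parameters listed above."""

    prompt: str
    negative_prompt: str = ""
    steps: int = 20          # illustrative default, not taken from the repo
    guidance: float = 7.5    # illustrative default, not taken from the repo
    width: int = 512
    height: int = 512
    seed: Optional[int] = None  # None means "let the runtime pick a seed"

    def validate(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.steps < 1:
            raise ValueError("steps must be >= 1")
        # Stable Diffusion pipelines expect dimensions divisible by 8.
        if self.width % 8 or self.height % 8:
            raise ValueError("width and height must be multiples of 8")


def to_pipeline_kwargs(cfg: Txt2ImgConfig) -> Dict[str, Any]:
    """Translate the config into keyword arguments for a Diffusers pipeline call."""
    cfg.validate()
    return {
        "prompt": cfg.prompt,
        # An empty negative prompt is passed as None rather than "".
        "negative_prompt": cfg.negative_prompt or None,
        "num_inference_steps": cfg.steps,
        "guidance_scale": cfg.guidance,
        "width": cfg.width,
        "height": cfg.height,
    }
```

The seed is left out of the returned dict on purpose: Diffusers takes it through a `torch.Generator` passed as `generator`, which would be created separately.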

Image → Image

  • Modify existing images via the SD Img2Img pipeline
  • Denoising strength control
  • Full parameter configuration
  • Shared preset system
  • History saved for reproducibility
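Denoising strength determines how much of the sampling schedule is actually run on the input image. In the Diffusers img2img pipeline, the number of denoising steps executed is `int(steps * strength)`, capped at `steps`. The helper below is a hypothetical illustration of that arithmetic and is not code from this repository.

```python
def effective_img2img_steps(steps: int, strength: float) -> int:
    """Approximate how many denoising steps an img2img run executes.

    Mirrors the arithmetic used by the Diffusers img2img pipeline:
    min(int(steps * strength), steps).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    if steps < 1:
        raise ValueError("steps must be >= 1")
    return min(int(steps * strength), steps)
```

With 20 steps and a strength of 0.5, only 10 steps run, which is why a low strength combined with few steps changes the input image very little.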

Upscaling (Real-ESRGAN NCNN)

  • 2× and 4× upscaling
  • NCNN backend (no GPU required)
  • Minimal dependencies
  • Fast on CPU environments (HF Spaces)

LoRA Adapter Support

  • Runtime loading of .safetensors adapters
  • Up to two adapters with independent weights
  • Alpha range -2 → +2 per adapter
  • Automatic discovery under:

src/assets/loras/
  • LoRA UI is disabled for Turbo, since Turbo does not benefit from LoRA injection
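The adapter rules above (at most two adapters, alpha between -2 and +2, nothing applied for Turbo) can be sketched as a small validation step. All names here (`LoraSelection`, `normalize_selection`, `MAX_ADAPTERS`) are hypothetical, and clamping out-of-range weights instead of rejecting them is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List

MAX_ADAPTERS = 2                   # "up to two adapters"
ALPHA_MIN, ALPHA_MAX = -2.0, 2.0   # documented alpha range


@dataclass(frozen=True)
class LoraSelection:
    name: str     # adapter name as shown in the UI (hypothetical)
    alpha: float  # independent weight for this adapter


def normalize_selection(
    selections: List[LoraSelection], model_is_turbo: bool
) -> List[LoraSelection]:
    """Return the adapters that should actually be applied."""
    if model_is_turbo:
        # LoRA is disabled for Turbo, so any selection is ignored.
        return []
    if len(selections) > MAX_ADAPTERS:
        raise ValueError(f"at most {MAX_ADAPTERS} adapters are supported")
    # Assumption: out-of-range weights are clamped rather than rejected.
    return [
        LoraSelection(s.name, max(ALPHA_MIN, min(ALPHA_MAX, s.alpha)))
        for s in selections
    ]
```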

Metadata History

Every generation stores:

  • model id
  • prompt + negative prompt
  • steps, cfg, resolution
  • seed
  • LoRA names + weights
  • timestamp

All generated data is stored in a tree structure under:


src/assets/history/
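As a rough sketch of how such a record could be written safely, the example below stores one JSON file per generation and uses a write-then-rename step so a crash never leaves a half-written file. The function name `save_history_entry`, the per-day directory layout, and the file naming are assumptions; the README only states that history lives in a tree structure under the directory above.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def save_history_entry(root: Path, entry: Dict[str, Any]) -> Path:
    """Write one metadata record to root/<YYYY-MM-DD>/<time>.json (assumed layout)."""
    now = datetime.now(timezone.utc)
    record = dict(entry, timestamp=now.isoformat())

    day_dir = root / now.strftime("%Y-%m-%d")
    day_dir.mkdir(parents=True, exist_ok=True)
    target = day_dir / f"{now.strftime('%H%M%S_%f')}.json"

    # Write to a temp file in the same directory, then rename it into place.
    # os.replace is atomic when source and destination share a filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=day_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(record, fh, indent=2)
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return target
```

A call such as `save_history_entry(Path("src/assets/history"), {"model_id": ..., "prompt": ..., "seed": ...})` would then produce one self-describing file per generation.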

Architecture


src/
└── sdgen/
    ├── sd/                     # Stable Diffusion runtime
    │   ├── pipeline.py         # model loading, device config
    │   ├── generator.py        # text-to-image inference
    │   ├── img2img.py          # image-to-image inference
    │   ├── lora_loader.py      # LoRA discovery & injection
    │   └── models.py           # typed config & metadata objects
    │
    ├── ui/                     # Gradio UI components
    │   ├── layout.py           # composition root for UI
    │   └── tabs/               # modular tabs
    │       ├── txt2img_tab.py
    │       ├── img2img_tab.py
    │       ├── upscaler_tab.py
    │       ├── presets_tab.py
    │       └── history_tab.py
    │
    ├── presets/                # curated basic presets
    │   └── styles.py           # preset registry
    │
    ├── upscaler/               # Real-ESRGAN NCNN backend
    │   ├── upscaler.py         # interface + metadata
    │   └── realesrgan.py       # NCNN wrapper
    │
    ├── utils/                  # shared utilities
    │   ├── history.py          # atomic storage format
    │   ├── common.py           # PIL helpers
    │   └── logger.py           # structured logging
    │
    └── config/                 # static configuration
        ├── paths.py            # resolved directories
        └── settings.py         # environment settings

Presets (Included)

The project includes four style presets, each defining:

  • prompt
  • negative prompt

These presets are neutral and work with both SD1.5 and Turbo:

| Name              | Style                    |
|-------------------|--------------------------|
| Realistic Photo   | 35mm, photorealistic     |
| Anime             | clean anime illustration |
| Cinematic / Moody | cinematic lighting/grain |
| Oil Painting      | classical oil painting   |

Presets do not include LoRA parameters.
Users may manually combine presets with LoRA adapters.
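A preset that defines only a prompt and a negative prompt can be modeled as a small registry plus a merge step. The sketch below is hypothetical: the registry contents are placeholders rather than the strings shipped in `presets/styles.py`, and appending the preset text to the user's text (instead of replacing it) is an assumption.

```python
from typing import Dict, Tuple

# Hypothetical registry; the prompt fragments are illustrative placeholders.
PRESETS: Dict[str, Dict[str, str]] = {
    "Realistic Photo": {
        "prompt": "35mm, photorealistic",
        "negative_prompt": "cartoon, illustration",
    },
    "Oil Painting": {
        "prompt": "classical oil painting",
        "negative_prompt": "photo",
    },
}


def apply_preset(name: str, user_prompt: str, user_negative: str = "") -> Tuple[str, str]:
    """Combine the user's prompts with a preset's prompt and negative prompt."""
    preset = PRESETS[name]  # raises KeyError for an unknown preset name

    def join(*parts: str) -> str:
        # Skip empty fragments so no stray commas appear.
        return ", ".join(p.strip() for p in parts if p.strip())

    return (
        join(user_prompt, preset["prompt"]),
        join(user_negative, preset["negative_prompt"]),
    )
```

Because presets carry no LoRA parameters, the result of `apply_preset` can be combined with any adapter selection made separately.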


Installation

Clone

git clone https://github.com/sanskarmodi8/stable-diffusion-image-generator
cd stable-diffusion-image-generator

Environment

python -m venv .venv
source .venv/bin/activate

Install Dependencies (CPU)

pip install -r requirements.txt
pip install -e .

GPU (optional)

pip install torch torchvision torchaudio \
  --index-url https://download.pytorch.org/whl/cu121

Run

python src/sdgen/main.py

Open in browser:

http://127.0.0.1:7860

Adding LoRA Models

Place .safetensors files here:

src/assets/loras/

They will be automatically detected and displayed in the UI (SD1.5 only).

This repository does not include LoRA files.
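Automatic detection of this kind usually amounts to scanning the directory for `.safetensors` files. The following sketch shows one way to do that; `discover_loras` is a hypothetical name and does not necessarily match what `lora_loader.py` does.

```python
from pathlib import Path
from typing import Dict


def discover_loras(lora_dir: Path) -> Dict[str, Path]:
    """Map adapter names (file stems) to their .safetensors paths."""
    if not lora_dir.is_dir():
        # A missing directory simply means no adapters are available.
        return {}
    return {
        path.stem: path
        for path in sorted(lora_dir.glob("*.safetensors"))
        if path.is_file()
    }
```

The returned names could populate a dropdown in the UI, with the paths handed to the loader once the user picks an adapter.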


Third-Party LoRA Models

The app supports optional LoRA adapters. LoRA weights are not included and are the property of their respective authors.

If you choose to download LoRA files automatically (see lora_urls.py), they are fetched directly from their original sources (Civitai).

This project does not redistribute LoRA weights. Refer to each model's license on Civitai.
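For illustration, an automatic download step might look like the sketch below. It assumes a mapping from file name to source URL, which is a guess at the kind of data `lora_urls.py` holds; `download_missing_loras`, the `.part` suffix, and the injectable `fetch` parameter are all hypothetical, and any URLs passed in are the caller's responsibility.

```python
import shutil
import urllib.request
from pathlib import Path
from typing import Callable, Dict, List


def _http_fetch(url: str, dest: Path) -> None:
    """Stream a URL to a local file using only the standard library."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)


def download_missing_loras(
    urls: Dict[str, str],
    lora_dir: Path,
    fetch: Callable[[str, Path], None] = _http_fetch,
) -> List[Path]:
    """Download every listed adapter that is not already present in lora_dir."""
    lora_dir.mkdir(parents=True, exist_ok=True)
    downloaded: List[Path] = []
    for filename, url in urls.items():
        target = lora_dir / filename
        if target.exists():
            continue  # never re-download a file that is already there
        # Download under a temporary name so a partial file is never
        # mistaken for a complete adapter.
        partial = target.with_name(target.name + ".part")
        fetch(url, partial)
        partial.replace(target)
        downloaded.append(target)
    return downloaded
```

Passing `fetch` as a parameter keeps the function easy to test without network access, since a stub can stand in for the real download.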


Development

The repo uses pre-commit hooks for consistency:

pre-commit install

Tools:

  • ruff
  • black
  • isort

Lint and format:

ruff check .
black .

License

This project is licensed under the MIT License. See the LICENSE file.


Author

Sanskar Modi