title: sd-image-gen-toolkit
app_file: src/sdgen/main.py
sdk: gradio
sdk_version: 6.0.2
Stable Diffusion Image Generation Toolkit
Overview
A modular, lightweight image generation toolkit built on Hugging Face Diffusers, designed for CPU-friendly deployment, clean architecture, and practical usability.
It supports Text → Image, Image → Image, and Upscaling, with a preset system, optional LoRA adapters, and a local metadata history for reproducibility.
Features
Text → Image
- Stable Diffusion 1.5 and Turbo
- Configurable prompt parameters:
  - prompt / negative prompt
  - steps
  - guidance (CFG)
  - resolution
  - seed (optional)
- JSON metadata output
- Style presets for quick experimentation
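As a rough illustration of how the parameters above could be bundled and turned into the JSON metadata output, here is a minimal sketch. The class name `Txt2ImgConfig`, the helper `to_metadata_json`, and all default values are assumptions made for this example; they are not taken from the project's `models.py`. The divisible-by-8 check reflects the constraint Stable Diffusion pipelines place on image dimensions.

```python
import json
import random
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Txt2ImgConfig:
    """Hypothetical container for the text-to-image parameters."""

    prompt: str
    negative_prompt: str = ""
    steps: int = 20              # assumed default, not the project's
    guidance_scale: float = 7.5  # assumed default, not the project's
    width: int = 512
    height: int = 512
    seed: Optional[int] = None   # None means "pick a random seed"

    def validate(self) -> None:
        if self.steps < 1:
            raise ValueError("steps must be at least 1")
        # Stable Diffusion pipelines expect dimensions divisible by 8.
        if self.width % 8 or self.height % 8:
            raise ValueError("width and height must be multiples of 8")

    def resolved_seed(self) -> int:
        # Fix the seed up front so it can be recorded and reused later.
        return self.seed if self.seed is not None else random.randrange(2**32)


def to_metadata_json(cfg: Txt2ImgConfig) -> str:
    """Validate the config and serialize it, always with a concrete seed."""
    cfg.validate()
    data = asdict(cfg)
    data["seed"] = cfg.resolved_seed()
    return json.dumps(data, indent=2, sort_keys=True)
```

Resolving the seed before serialization means that even a "random" run records the exact value needed to reproduce it.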
Image → Image
- Modify existing images via the SD Img2Img pipeline
- Denoising strength control
- Full parameter configuration
- Shared preset system
- History saved for reproducibility
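Denoising strength interacts with the step count: Diffusers img2img pipelines skip the early part of the noise schedule, so only a fraction of the requested steps actually run. The sketch below mirrors that arithmetic in isolation. The function name `effective_denoising_steps` is made up for this example and is not part of the project or of Diffusers.

```python
def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate how many scheduler steps img2img runs for a given strength.

    strength = 0.0 keeps the input image untouched (no steps run);
    strength = 1.0 runs the full schedule, largely ignoring the input.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return min(int(num_inference_steps * strength), num_inference_steps)
```

For example, 30 steps at strength 0.5 run roughly 15 denoising steps, which is why low-strength edits finish faster than a full text-to-image pass with the same step setting.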
Upscaling (Real-ESRGAN NCNN)
- 2× and 4× upscaling
- NCNN backend (no GPU required)
- Minimal dependencies
- Fast on CPU environments (HF Spaces)
LoRA Adapter Support
- Runtime loading of .safetensors adapters
- Up to two adapters with independent weights
- Alpha range -2 → +2 per adapter
- Automatic discovery under: src/assets/loras/
- LoRA UI is disabled for Turbo, since Turbo does not benefit from LoRA injection
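The discovery and weighting rules above can be sketched without touching any model code. This is an illustration only: `discover_loras`, `select_adapters`, and the constants are hypothetical names, not the contents of `lora_loader.py`, and clamping out-of-range alphas (rather than rejecting them) is an assumption.

```python
from pathlib import Path
from typing import Dict, List, Tuple

MAX_ADAPTERS = 2                   # "up to two adapters"
ALPHA_MIN, ALPHA_MAX = -2.0, 2.0   # per-adapter alpha range


def discover_loras(lora_dir: Path) -> Dict[str, Path]:
    """Map adapter name (file stem) to path for every .safetensors file."""
    if not lora_dir.is_dir():
        return {}
    return {p.stem: p for p in sorted(lora_dir.glob("*.safetensors"))}


def select_adapters(
    available: Dict[str, Path],
    requested: List[Tuple[str, float]],
) -> List[Tuple[Path, float]]:
    """Resolve requested (name, alpha) pairs to (path, clamped alpha) pairs."""
    if len(requested) > MAX_ADAPTERS:
        raise ValueError(f"at most {MAX_ADAPTERS} adapters can be active")
    selected = []
    for name, alpha in requested:
        if name not in available:
            raise KeyError(f"unknown LoRA adapter: {name}")
        # Assumption: out-of-range weights are clamped, not rejected.
        clamped = max(ALPHA_MIN, min(ALPHA_MAX, float(alpha)))
        selected.append((available[name], clamped))
    return selected
```

A call such as `select_adapters(discover_loras(Path("src/assets/loras")), [("ink", 0.8)])` would return the resolved file path and weight for a hypothetical `ink.safetensors` file placed in that directory.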
Metadata History
Every generation stores:
- model id
- prompt + negative prompt
- steps, cfg, resolution
- seed
- LoRA names + weights
- timestamp
All generated data is stored in a tree structure under:
src/assets/history/
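One way to store such a record safely is to write it to a temporary file and then rename it into place, so a crash never leaves a half-written entry. The sketch below assumes one flat JSON file per generation, named after its timestamp; the function name `save_history_entry` and this layout are illustrative and may differ from the project's `history.py`.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_history_entry(history_dir: Path, entry: dict) -> Path:
    """Atomically write one generation record as JSON and return its path."""
    history_dir.mkdir(parents=True, exist_ok=True)
    record = dict(entry)
    record.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
    # Colons are not valid in file names on some platforms.
    target = history_dir / (record["timestamp"].replace(":", "-") + ".json")

    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=history_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(record, handle, indent=2)
        os.replace(tmp_path, target)  # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target
```

Two records with an identical timestamp would overwrite each other in this sketch; a real implementation would likely add a unique suffix or a per-entry directory.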
Architecture
src/
└── sdgen/
    ├── sd/                   # Stable Diffusion runtime
    │   ├── pipeline.py       # model loading, device config
    │   ├── generator.py      # text-to-image inference
    │   ├── img2img.py        # image-to-image inference
    │   ├── lora_loader.py    # LoRA discovery & injection
    │   └── models.py         # typed config & metadata objects
    │
    ├── ui/                   # Gradio UI components
    │   ├── layout.py         # composition root for UI
    │   └── tabs/             # modular tabs
    │       ├── txt2img_tab.py
    │       ├── img2img_tab.py
    │       ├── upscaler_tab.py
    │       ├── presets_tab.py
    │       └── history_tab.py
    │
    ├── presets/              # curated basic presets
    │   └── styles.py         # preset registry
    │
    ├── upscaler/             # Real-ESRGAN NCNN backend
    │   ├── upscaler.py       # interface + metadata
    │   └── realesrgan.py     # NCNN wrapper
    │
    ├── utils/                # shared utilities
    │   ├── history.py        # atomic storage format
    │   ├── common.py         # PIL helpers
    │   └── logger.py         # structured logging
    │
    └── config/               # static configuration
        ├── paths.py          # resolved directories
        └── settings.py       # environment settings
Presets (Included)
The project includes four style presets, each defining:
- prompt
- negative prompt
These presets are model-neutral and work with both SD1.5 and Turbo:
| Name | Style |
|---|---|
| Realistic Photo | 35mm, photorealistic |
| Anime | clean anime illustration |
| Cinematic / Moody | cinematic lighting/grain |
| Oil Painting | classical oil painting |
Presets do not include LoRA parameters.
Users may manually combine presets with LoRA adapters.
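A preset registry of this shape can be as simple as a dictionary of prompt pairs. The sketch below is an assumption about how `styles.py` might look, not its actual contents: the prompt fragments are copied from the Style column of the table, the negative prompt is a placeholder, and `StylePreset`, `PRESETS`, and `apply_preset` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class StylePreset:
    prompt: str
    negative_prompt: str


# Placeholder negative prompt; the real presets define their own.
_PLACEHOLDER_NEGATIVE = "blurry, low quality"

PRESETS: Dict[str, StylePreset] = {
    "Realistic Photo": StylePreset("35mm, photorealistic", _PLACEHOLDER_NEGATIVE),
    "Anime": StylePreset("clean anime illustration", _PLACEHOLDER_NEGATIVE),
    "Cinematic / Moody": StylePreset("cinematic lighting/grain", _PLACEHOLDER_NEGATIVE),
    "Oil Painting": StylePreset("classical oil painting", _PLACEHOLDER_NEGATIVE),
}


def _join(*parts: str) -> str:
    """Join non-empty prompt fragments with commas."""
    return ", ".join(p.strip() for p in parts if p and p.strip())


def apply_preset(name: str, user_prompt: str, user_negative: str = "") -> Tuple[str, str]:
    """Append the preset's fragments to the user's prompt and negative prompt."""
    preset = PRESETS[name]
    return (
        _join(user_prompt, preset.prompt),
        _join(user_negative, preset.negative_prompt),
    )
```

Because a preset carries only prompt text, the result can be passed to either model and combined with any LoRA selection made separately.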
Installation
Clone
git clone https://github.com/sanskarmodi8/stable-diffusion-image-generator
cd stable-diffusion-image-generator
Environment
python -m venv .venv
source .venv/bin/activate
Install Dependencies (CPU)
pip install -r requirements.txt
pip install -e .
GPU (optional)
pip install torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu121
Run
python src/sdgen/main.py
Open in browser:
http://127.0.0.1:7860
Adding LoRA Models
Place .safetensors files here:
src/assets/loras/
They will be automatically detected and displayed in the UI (SD1.5 only).
This repository does not include LoRA files.
Third-Party LoRA Models
The app supports optional LoRA adapters. LoRA weights are not included and are the property of their respective authors.
If you choose to download LoRA files automatically (see lora_urls.py), they are fetched directly from their original sources (Civitai).
This project does not redistribute LoRA weights. Refer to each model's license on Civitai.
Development
The repo uses pre-commit hooks for consistency:
pre-commit install
Tools:
- ruff
- black
- isort
Lint and format:
ruff check .
black .
License
This project is licensed under the MIT License.
See the LICENSE file.