---
title: sd-image-gen-toolkit
app_file: src/sdgen/main.py
sdk: gradio
sdk_version: 6.0.2
---
# Stable Diffusion Image Generation Toolkit
![appdemo](https://drive.google.com/uc?export=view&id=1dO2bnYmEEj3fNU0-dV692icUPSwyP93G)
[**Live Demo**](https://huggingface.co/spaces/SanskarModi/sd-image-gen-toolkit)
---
## Overview
A modular, lightweight image generation toolkit built on **Hugging Face Diffusers**, designed for **CPU-friendly deployment**, clean architecture, and practical usability.
It supports **Text → Image**, **Image → Image**, and **Upscaling**, with a **preset system**, optional **LoRA adapters**, and a local **metadata history** for reproducibility.
---
## Features
### Text → Image
- Stable Diffusion **1.5** and **Turbo**
- Configurable prompt parameters:
  - prompt / negative prompt
  - steps
  - guidance (CFG)
  - resolution
  - seed (optional)
- JSON metadata output
- Style presets for quick experimentation
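
The parameters above map closely onto the keyword arguments of Diffusers' `StableDiffusionPipeline`. The following is a minimal sketch, not the project's actual `generator.py`: `Txt2ImgParams`, `build_call_kwargs`, `generate`, and the default values are hypothetical names and numbers chosen for illustration, and the model id is left to the caller.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Txt2ImgParams:
    """Hypothetical container for the parameters listed above."""

    prompt: str
    negative_prompt: str = ""
    steps: int = 20               # illustrative default, not the project's
    guidance_scale: float = 7.5   # illustrative default, not the project's
    width: int = 512
    height: int = 512
    seed: Optional[int] = None    # None: let the pipeline use a random seed


def build_call_kwargs(params: Txt2ImgParams) -> Dict[str, Any]:
    """Translate the parameters into StableDiffusionPipeline call keywords."""
    return {
        "prompt": params.prompt,
        "negative_prompt": params.negative_prompt or None,
        "num_inference_steps": params.steps,
        "guidance_scale": params.guidance_scale,
        "width": params.width,
        "height": params.height,
    }


def generate(model_id: str, params: Txt2ImgParams, device: str = "cpu"):
    """Run one text-to-image generation and return a PIL image."""
    # Imported lazily so the helpers above work without torch/diffusers.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id).to(device)
    kwargs = build_call_kwargs(params)
    if params.seed is not None:
        # A seeded generator makes the result reproducible.
        kwargs["generator"] = torch.Generator(device=device).manual_seed(params.seed)
    return pipe(**kwargs).images[0]
```

Keeping the keyword mapping in a separate function means the same dictionary can be written to the JSON metadata alongside the image.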
### Image → Image
- Modify existing images via the SD Img2Img pipeline
- Denoising strength control
- Full parameter configuration
- Shared preset system
- History saved for reproducibility
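
Denoising strength interacts with the step count. In Diffusers' img2img pipelines, the input image is noised only part of the way, so roughly `steps * strength` denoising steps actually run. The helper below is a small sketch of that arithmetic; `effective_steps` is a hypothetical name and is not part of this project or of Diffusers.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate how many denoising steps an img2img run performs."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    # Only the last `strength` fraction of the schedule is executed.
    return min(int(num_inference_steps * strength), num_inference_steps)
```

For example, 30 steps at strength 0.5 gives 15 executed steps, which is why low strengths run faster and stay closer to the source image.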
### Upscaling (Real-ESRGAN NCNN)
- **2× and 4×** upscaling
- NCNN backend (no GPU required)
- Minimal dependencies
- Fast on CPU environments (HF Spaces)
### LoRA Adapter Support
- Runtime loading of `.safetensors` adapters
- Up to **two adapters** with independent weights
- Alpha range `-2 → +2` per adapter
- Automatic discovery under:
```
src/assets/loras/
```
- LoRA UI is **disabled for Turbo**, since Turbo does not benefit from LoRA injection
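
The discovery and validation rules above can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not the contents of `lora_loader.py`: `discover_loras` and `select_adapters` are hypothetical names, and clamping out-of-range alphas (rather than rejecting them) is an assumed policy.

```python
from pathlib import Path
from typing import Dict, List, Tuple

LORA_DIR = Path("src/assets/loras")  # directory named in this README
MAX_ADAPTERS = 2                      # "up to two adapters"
ALPHA_MIN, ALPHA_MAX = -2.0, 2.0      # alpha range per adapter


def discover_loras(lora_dir: Path = LORA_DIR) -> Dict[str, Path]:
    """Map adapter names (file stems) to their .safetensors paths."""
    if not lora_dir.is_dir():
        return {}
    return {p.stem: p for p in sorted(lora_dir.glob("*.safetensors"))}


def select_adapters(
    choices: List[Tuple[str, float]], available: Dict[str, Path]
) -> List[Tuple[Path, float]]:
    """Validate (name, alpha) picks: known names, at most two, alpha clamped."""
    if len(choices) > MAX_ADAPTERS:
        raise ValueError(f"at most {MAX_ADAPTERS} adapters are supported")
    selected = []
    for name, alpha in choices:
        if name not in available:
            raise KeyError(f"unknown LoRA adapter: {name}")
        clamped = max(ALPHA_MIN, min(ALPHA_MAX, alpha))
        selected.append((available[name], clamped))
    return selected
```

The returned `(path, alpha)` pairs are what a loader would then hand to the pipeline.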
### Metadata History
Every generation stores:
- model id
- prompt + negative prompt
- steps, cfg, resolution
- seed
- LoRA names + weights
- timestamp
All generated data is stored in a tree structure under:
```
src/assets/history/
```
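
As a rough illustration of what one history entry might look like, the sketch below stores the fields listed above as JSON and writes the file atomically. `GenerationRecord`, `save_record`, the field names, and the flat one-file-per-entry layout are assumptions made for the example; the project's actual format lives in `utils/history.py`.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Optional


@dataclass
class GenerationRecord:
    """Hypothetical record mirroring the fields listed above."""

    model_id: str
    prompt: str
    negative_prompt: str
    steps: int
    cfg: float
    width: int
    height: int
    seed: Optional[int]
    loras: Dict[str, float] = field(default_factory=dict)  # name -> weight
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def save_record(record: GenerationRecord, history_dir: Path, entry_id: str) -> Path:
    """Write the record as JSON and return the final path."""
    history_dir.mkdir(parents=True, exist_ok=True)
    target = history_dir / f"{entry_id}.json"
    # Write to a temp file in the same directory, then rename it over the
    # target, so readers never observe a half-written entry.
    fd, tmp_name = tempfile.mkstemp(dir=str(history_dir), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(asdict(record), handle, indent=2)
        os.replace(tmp_name, target)
    except Exception:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```
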
---
## Architecture
```
src/
└── sdgen/
    ├── sd/                     # Stable Diffusion runtime
    │   ├── pipeline.py         # model loading, device config
    │   ├── generator.py        # text-to-image inference
    │   ├── img2img.py          # image-to-image inference
    │   ├── lora_loader.py      # LoRA discovery & injection
    │   └── models.py           # typed config & metadata objects
    │
    ├── ui/                     # Gradio UI components
    │   ├── layout.py           # composition root for UI
    │   └── tabs/               # modular tabs
    │       ├── txt2img_tab.py
    │       ├── img2img_tab.py
    │       ├── upscaler_tab.py
    │       ├── presets_tab.py
    │       └── history_tab.py
    │
    ├── presets/                # curated basic presets
    │   └── styles.py           # preset registry
    │
    ├── upscaler/               # Real-ESRGAN NCNN backend
    │   ├── upscaler.py         # interface + metadata
    │   └── realesrgan.py       # NCNN wrapper
    │
    ├── utils/                  # shared utilities
    │   ├── history.py          # atomic storage format
    │   ├── common.py           # PIL helpers
    │   └── logger.py           # structured logging
    │
    └── config/                 # static configuration
        ├── paths.py            # resolved directories
        └── settings.py         # environment settings
```
---
## Presets (Included)
The project includes **four style presets**, each defining:
- prompt
- negative prompt
These presets are neutral and work with both **SD1.5** and **Turbo**:
| Name | Style |
|--------------------|----------------------------|
| Realistic Photo | 35mm, photorealistic |
| Anime | clean anime illustration |
| Cinematic / Moody | cinematic lighting/grain |
| Oil Painting | classical oil painting |
Presets do **not include LoRA parameters**.
Users may manually combine presets with LoRA adapters.
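
One plausible way to combine a preset with a user prompt is to append the preset's fragments to whatever the user typed. The sketch below assumes that strategy; `PRESETS`, `apply_preset`, and the negative-prompt strings are hypothetical and do not reproduce the registry in `presets/styles.py`.

```python
from typing import Dict, Tuple

# Hypothetical registry: names and style fragments follow the table above,
# the negative prompts are invented for illustration.
PRESETS: Dict[str, Dict[str, str]] = {
    "Realistic Photo": {
        "prompt": "35mm, photorealistic",
        "negative_prompt": "cartoon, illustration",
    },
    "Oil Painting": {
        "prompt": "classical oil painting",
        "negative_prompt": "photo",
    },
}


def apply_preset(name: str, user_prompt: str, user_negative: str = "") -> Tuple[str, str]:
    """Return (prompt, negative_prompt) with the preset fragments appended."""
    preset = PRESETS[name]
    prompt = ", ".join(
        part for part in (user_prompt.strip(), preset["prompt"]) if part
    )
    negative = ", ".join(
        part for part in (user_negative.strip(), preset["negative_prompt"]) if part
    )
    return prompt, negative
```

Because presets carry no LoRA parameters, adapter selection stays a separate step after this merge.
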
---
## Installation
### Clone
```bash
git clone https://github.com/sanskarmodi8/stable-diffusion-image-generator
cd stable-diffusion-image-generator
```
### Environment
```bash
python -m venv .venv
source .venv/bin/activate
```
### Install Dependencies (CPU)
```bash
pip install -r requirements.txt
pip install -e .
```
### GPU (optional)
```bash
pip install torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu121
```
---
## Run
```bash
python src/sdgen/main.py
```
Open in browser:
```
http://127.0.0.1:7860
```
---
## Adding LoRA Models
Place `.safetensors` files here:
```
src/assets/loras/
```
They will be automatically detected and displayed in the UI (SD1.5 only).
This repository **does not include** LoRA files.
---
## Third-Party LoRA Models
The app supports optional LoRA adapters.
LoRA weights are **not included** and are **the property of their respective authors**.
If you choose to download LoRA files automatically (see `lora_urls.py`), they are fetched directly from their original sources (**Civitai**).
This project does **not** redistribute LoRA weights.
Refer to each model's license on Civitai.
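
An automatic download step can be as simple as fetching only the files that are not already present. The following is a hedged sketch using the standard library: `missing_loras` and `download_missing` are hypothetical names, the filename-to-URL mapping is an assumed shape for what `lora_urls.py` provides, and any authentication a host may require is not handled.

```python
from pathlib import Path
from typing import Dict, List
from urllib.request import urlretrieve


def missing_loras(urls: Dict[str, str], lora_dir: Path) -> Dict[str, str]:
    """Return the filename -> URL entries whose files are not on disk yet."""
    return {
        name: url for name, url in urls.items() if not (lora_dir / name).exists()
    }


def download_missing(urls: Dict[str, str], lora_dir: Path) -> List[Path]:
    """Download every missing file into lora_dir and return the new paths."""
    lora_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for filename, url in missing_loras(urls, lora_dir).items():
        target = lora_dir / filename
        urlretrieve(url, str(target))  # fetched straight from the source URL
        downloaded.append(target)
    return downloaded
```

Skipping files that already exist keeps restarts cheap and avoids re-fetching weights from the original host.
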
---
## Development
The repo uses `pre-commit` hooks for consistency:
```bash
pre-commit install
```
Tools:
* ruff
* black
* isort
Lint and format:
```bash
ruff check .
black .
```
---
## License
This project is licensed under the **MIT License**.
See the [`LICENSE`](LICENSE) file.
---
## Author
[**Sanskar Modi**](https://github.com/sanskarmodi8)