---
title: sd-image-gen-toolkit
app_file: src/sdgen/main.py
sdk: gradio
sdk_version: 6.0.2
---

# Stable Diffusion Image Generation Toolkit

[**Live Demo**](https://huggingface.co/spaces/SanskarModi/sd-image-gen-toolkit)

---

## Overview

A modular, lightweight image generation toolkit built on **Hugging Face Diffusers**, designed for **CPU-friendly deployment**, clean architecture, and practical usability.

It supports **Text → Image**, **Image → Image**, and **Upscaling**, with a **preset system**, optional **LoRA adapters**, and a local **metadata history** for reproducibility.

---

## Features

### Text → Image

- Stable Diffusion **1.5** and **Turbo**
- Configurable prompt parameters:
  - prompt / negative prompt
  - steps
  - guidance (CFG)
  - resolution
  - seed (optional)
- JSON metadata output
- Style presets for quick experimentation

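
The parameters listed above can be gathered into a small validated config object. The following is a minimal sketch, not the project's actual code: the `Txt2ImgConfig` name, its fields, and its default values are hypothetical. It assumes the usual Stable Diffusion constraint that width and height are multiples of 8, and shows one way to treat the optional seed: when none is given, draw one at random and keep it so the run can be reproduced from the metadata.

```python
import random
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Txt2ImgConfig:
    """Hypothetical container for the text-to-image parameters listed above."""

    prompt: str
    negative_prompt: str = ""
    steps: int = 20
    guidance_scale: float = 7.5  # CFG; illustrative default
    width: int = 512
    height: int = 512
    seed: Optional[int] = None  # None means "pick one for me"

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.steps < 1:
            raise ValueError("steps must be >= 1")
        # Stable Diffusion pipelines expect dimensions divisible by 8.
        if self.width % 8 or self.height % 8:
            raise ValueError("width and height must be multiples of 8")
        if self.seed is None:
            # Resolve the seed up front so it can be written to the metadata.
            self.seed = random.randrange(2**32)


cfg = Txt2ImgConfig(prompt="a lighthouse at dusk", seed=1234)
metadata = asdict(cfg)  # plain dict, ready for JSON serialization
```

Resolving the seed before generation means the stored metadata always contains a concrete value, even when the user left the field empty.
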
### Image → Image

- Modify existing images via the SD Img2Img pipeline
- Denoising strength control
- Full parameter configuration
- Shared preset system
- History saved for reproducibility

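
Denoising strength controls how far the input image is pushed toward noise before it is denoised again. In the Diffusers img2img pipeline, strength also determines how many of the scheduled steps actually run, roughly `int(steps * strength)`. The helper below is an illustrative sketch of that arithmetic; `effective_steps` is a hypothetical name, not a function from this project or from Diffusers.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate number of denoising steps an img2img run executes."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)


# Compare how many of 30 scheduled steps run at different strengths.
for strength in (0.25, 0.5, 1.0):
    print(strength, effective_steps(30, strength))
```

A low strength keeps the result close to the input and runs only a few steps, while a strength of 1.0 runs the full schedule. With very low strength and a small step count, the product can round down to zero, so such settings may need more steps.
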
### Upscaling (Real-ESRGAN NCNN)

- **2× and 4×** upscaling
- NCNN backend (no GPU required)
- Minimal dependencies
- Fast on CPU environments (HF Spaces)

### LoRA Adapter Support

- Runtime loading of `.safetensors` adapters
- Up to **two adapters** with independent weights
- Alpha range `-2 → +2` per adapter
- Automatic discovery under:

  ```
  src/assets/loras/
  ```

- LoRA UI is **disabled for Turbo**, since Turbo does not benefit from LoRA injection

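
The discovery and limits described above could be sketched as follows. This is an assumption-laden illustration, not the contents of `lora_loader.py`: the function names `discover_loras` and `select_adapters` and the choice to clamp out-of-range weights (rather than reject them) are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

MAX_ADAPTERS = 2
ALPHA_MIN, ALPHA_MAX = -2.0, 2.0


def discover_loras(directory: Path) -> List[str]:
    """Return the names (file stems) of all .safetensors files in directory."""
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.safetensors"))


def select_adapters(requested: Dict[str, float], available: List[str]) -> Dict[str, float]:
    """Validate a {name: alpha} selection against the discovered adapters."""
    if len(requested) > MAX_ADAPTERS:
        raise ValueError(f"at most {MAX_ADAPTERS} adapters are supported")
    selected = {}
    for name, alpha in requested.items():
        if name not in available:
            raise KeyError(f"unknown LoRA adapter: {name}")
        # Clamp each weight into the supported alpha range.
        selected[name] = max(ALPHA_MIN, min(ALPHA_MAX, alpha))
    return selected


# Returns an empty list if the directory does not exist yet.
available = discover_loras(Path("src/assets/loras"))
```

Keeping discovery separate from validation lets the UI populate its dropdowns from `available` and check the user's selection only at generation time.
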
### Metadata History

Every generation stores:

- model id
- prompt + negative prompt
- steps, cfg, resolution
- seed
- LoRA names + weights
- timestamp

All generated data is stored in a tree structure under:

```
src/assets/history/
```

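
One way to persist a record like this is a JSON file per generation, written atomically so that an interrupted run never leaves a half-written entry. The sketch below is a minimal illustration under stated assumptions: the `save_entry` name, the date-based directory layout, and the file naming are hypothetical, and the project's real storage format in `utils/history.py` may differ.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_entry(root: Path, entry: dict) -> Path:
    """Write one metadata record as JSON under root/<YYYY-MM-DD>/ and return its path."""
    now = datetime.now(timezone.utc)
    record = {**entry, "timestamp": now.isoformat()}
    day_dir = root / now.strftime("%Y-%m-%d")
    day_dir.mkdir(parents=True, exist_ok=True)
    target = day_dir / f"{now.strftime('%H%M%S_%f')}.json"

    # Write to a temp file in the same directory, then rename it into place.
    # os.replace is atomic when source and target are on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=day_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2)
    os.replace(tmp_name, target)
    return target
```

A call such as `save_entry(Path("src/assets/history"), {"prompt": "...", "seed": 1234})` would then produce one JSON file per generation, grouped by day.
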

---

## Architecture

```
src/
└── sdgen/
    ├── sd/                  # Stable Diffusion runtime
    │   ├── pipeline.py      # model loading, device config
    │   ├── generator.py     # text-to-image inference
    │   ├── img2img.py       # image-to-image inference
    │   ├── lora_loader.py   # LoRA discovery & injection
    │   └── models.py        # typed config & metadata objects
    │
    ├── ui/                  # Gradio UI components
    │   ├── layout.py        # composition root for UI
    │   └── tabs/            # modular tabs
    │       ├── txt2img_tab.py
    │       ├── img2img_tab.py
    │       ├── upscaler_tab.py
    │       ├── presets_tab.py
    │       └── history_tab.py
    │
    ├── presets/             # curated basic presets
    │   └── styles.py        # preset registry
    │
    ├── upscaler/            # Real-ESRGAN NCNN backend
    │   ├── upscaler.py      # interface + metadata
    │   └── realesrgan.py    # NCNN wrapper
    │
    ├── utils/               # shared utilities
    │   ├── history.py       # atomic storage format
    │   ├── common.py        # PIL helpers
    │   └── logger.py        # structured logging
    │
    └── config/              # static configuration
        ├── paths.py         # resolved directories
        └── settings.py      # environment settings
```

---

## Presets (Included)

The project includes **four style presets**, each defining:

- prompt
- negative prompt

These presets are neutral and work with both **SD1.5** and **Turbo**:

| Name              | Style                    |
|-------------------|--------------------------|
| Realistic Photo   | 35mm, photorealistic     |
| Anime             | clean anime illustration |
| Cinematic / Moody | cinematic lighting/grain |
| Oil Painting      | classical oil painting   |

Presets do **not include LoRA parameters**.
Users may manually combine presets with LoRA adapters.

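
A preset registry of this shape can be modeled as a mapping from name to a prompt / negative-prompt pair. The sketch below is hypothetical: the `Preset` and `apply_preset` names are invented for illustration, the preset strings are placeholders rather than the project's real text, and it assumes presets are merged by appending their fragments to whatever the user typed.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Preset:
    prompt: str
    negative_prompt: str


# Illustrative text only; the project's real preset strings are not shown here.
PRESETS: Dict[str, Preset] = {
    "Realistic Photo": Preset("35mm photo, photorealistic", "cartoon, illustration"),
    "Anime": Preset("clean anime illustration", "photo, realistic"),
}


def apply_preset(name: str, user_prompt: str, user_negative: str = "") -> Tuple[str, str]:
    """Append the preset's fragments to the user's prompt and negative prompt."""
    preset = PRESETS[name]
    prompt = ", ".join(part for part in (user_prompt.strip(), preset.prompt) if part)
    negative = ", ".join(part for part in (user_negative.strip(), preset.negative_prompt) if part)
    return prompt, negative


prompt, negative = apply_preset("Anime", "a fox in the snow")
```

Because a preset carries only text, it stays independent of LoRA selection, which matches the note above that the two are combined manually.
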

---

## Installation

### Clone

```bash
git clone https://github.com/sanskarmodi8/stable-diffusion-image-generator
cd stable-diffusion-image-generator
```

### Environment

```bash
python -m venv .venv
source .venv/bin/activate
```

### Install Dependencies (CPU)

```bash
pip install -r requirements.txt
pip install -e .
```

### GPU (optional)

```bash
pip install torch torchvision torchaudio \
  --index-url https://download.pytorch.org/whl/cu121
```

---

## Run

```bash
python src/sdgen/main.py
```

Open in browser:

```
http://127.0.0.1:7860
```

---

## Adding LoRA Models

Place `.safetensors` files here:

```
src/assets/loras/
```

They will be automatically detected and displayed in the UI (SD1.5 only).

This repository **does not include** LoRA files.

---

## Third-Party LoRA Models

The app supports optional LoRA adapters.
LoRA weights are **not included** and are **the property of their respective authors**.

If you choose to download LoRA files automatically (see `lora_urls.py`), they are fetched directly from their original sources (**Civitai**).

This project does **not** redistribute LoRA weights.
Refer to each model's license on Civitai.

---

## Development

The repo uses `pre-commit` hooks for consistency:

```bash
pre-commit install
```

Tools:

* ruff
* black
* isort

Lint and format:

```bash
ruff check .
black .
```

---

## License

This project is licensed under the **MIT License**.
See the [`LICENSE`](LICENSE) file.

---

## Author

[**Sanskar Modi**](https://github.com/sanskarmodi8)