SanskarModi committed on
Commit b2d52ea · 1 Parent(s): bb38244

updated readme

Files changed (1): README.md +125 -31
README.md CHANGED
@@ -166,51 +166,145 @@ http://127.0.0.1:7860
 
 ---
 
- # Roadmap (LoRA, QLoRA, and Training)
-
- **Update planned**: full LoRA loading and fine-tuning support.
-
- Scope includes:
-
- ### 1. LoRA Runtime Inference
-
- * Load LoRA weights into existing UNet
- * Adjustable LoRA alpha/scaling
- * UI selector for LoRA checkpoints
- * Enable mixing multiple LoRAs
-
- Implementation plan:
-
- * Attach `lora_attn_procs` to model
- * Discover `.safetensors` in `/assets/lora`
- * Store LoRA metadata in history
- * Persist alpha value and presets
-
- ### 2. QLoRA Fine-Tuning
-
- * Train lightweight LoRA modules on GPUs (11GB VRAM OK)
- * Use parameter-efficient training
- * Merge adapters for export
- * Allow user fine-tuning via command line
-
- Stack:
-
- * accelerate
- * peft
- * bitsandbytes (if GPU available)
-
- UI tab planned:
-
- * dataset upload
- * config builder
- * start training
- * track loss, sample outputs
-
- **Why LoRA?**
-
- * Enables personal styles without training the full model
- * Reduces VRAM and compute cost by 50–200×
- * Industry-standard for SD customization
-
+ # Roadmap (Focused, High-Impact Features)
+
+ This project is under active development. The next milestones focus on **practical model customization and multi-model support**, optimized for **CPU-only deployment environments** such as Hugging Face Spaces.
+
+ The roadmap is intentionally **lean** to maximize value within limited compute constraints.
+
+ ---
+
+ ## 1. LoRA Runtime Inference (Core Feature)
+
+ Add lightweight **Low-Rank Adaptation** support for Stable Diffusion pipelines without modifying base model weights.
+
+ ### Scope
+ - Load external **`.safetensors` LoRA adapters** into UNet
+ - Apply LoRA modules dynamically at inference
+ - **Alpha (weight) slider** to control influence
+ - **UI dropdown** for selecting LoRA adapters
+ - **Automatic discovery** of LoRAs under:
+ ```
+ src/assets/loras/
+ ```
+
+ ### Deliverables
+ - `lora_loader.py` utility
+ - integration into existing `load_pipeline()`
+ - UI: LoRA selector + alpha parameter
+ - history metadata with:
+   - `lora_paths`
+   - `lora_weights`
+
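The automatic-discovery step above can be sketched as follows. This is a minimal sketch, not code from this repo: the `discover_loras` name is hypothetical, the folder layout is the `src/assets/loras/` path from this section, and the closing comment refers to diffusers' standard `load_lora_weights` loader.

```python
from pathlib import Path

def discover_loras(root: str = "src/assets/loras") -> dict:
    """Map adapter names to .safetensors files found under `root`.

    Returns an empty dict when the folder does not exist yet, so a UI
    dropdown can simply show no LoRA choices.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    return {p.stem: p for p in sorted(root_path.glob("*.safetensors"))}

# Inside load_pipeline(), the selected adapter could then be applied with
# diffusers' built-in loader, e.g.:
#   pipe.load_lora_weights(str(loras["my_style"]))
```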
+ ---
+
+ ## 2. Multi-LoRA Mixing (2 adapters)
+
+ Support mixing **two LoRA adapters** with independent weights.
+
+ ### Scope
+ - Simple weighted merge at attention processors
+ - UI:
+   - LoRA A dropdown + alpha
+   - LoRA B dropdown + alpha
+ - Conflict handling for overlapping layers
+
+ ### Deliverables
+ - `apply_lora_mix()` utility
+ - metadata persistence
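The weighted merge at the attention processors amounts to summing each adapter's low-rank update into the base weight. A minimal numpy sketch, reusing the `apply_lora_mix()` name from the deliverables above (the matrix shapes are assumptions):

```python
import numpy as np

def apply_lora_mix(W, adapters):
    """Merge LoRA adapters into one base weight matrix.

    W        : base weight of shape (out, in)
    adapters : iterable of (A, B, alpha) with A shaped (r, in) and
               B shaped (out, r); each contributes alpha * (B @ A).
    Overlapping layers are handled by simple addition: two adapters
    touching the same weight just sum their low-rank updates.
    """
    merged = W.copy()
    for A, B, alpha in adapters:
        merged += alpha * (B @ A)
    return merged
```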
+
+ ---
+
+ ## 3. SDXL-Turbo Pipeline Support
+
+ Add a **third runtime model**:
+ ```
+ stabilityai/stable-diffusion-xl-base
+ stabilityai/sdxl-turbo
+ ```
+
+ ### Scope
+ - instantiate SDXL-Turbo pipeline
+ - auto-configure:
+   - steps (1-4)
+   - CFG (0-1)
+ - model selection integrated in UI
+ - reproducible metadata
+
+ ### Notes
+ SDXL-Turbo is optimized for **fast generation** and works well in constrained environments with reduced steps.
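One way to wire the auto-configuration into model selection is a small registry of per-model sampler defaults. This is a hypothetical sketch: only `stabilityai/sdxl-turbo` is named in this section, and the other entry and all default values are illustrative, not taken from this repo.

```python
# Hypothetical registry: each runtime model pairs a Hub model id with the
# sampler defaults this roadmap recommends for it. SD1.5 values are
# illustrative; SDXL-Turbo gets the reduced steps/CFG from the scope above.
MODEL_REGISTRY = {
    "SD1.5": {"model_id": "runwayml/stable-diffusion-v1-5",
              "steps": 25, "cfg": 7.5},
    "SDXL-Turbo": {"model_id": "stabilityai/sdxl-turbo",
                   "steps": 2, "cfg": 0.0},
}

def sampler_defaults(name: str) -> dict:
    """Return a copy of the recommended settings for a model choice."""
    if name not in MODEL_REGISTRY:
        raise KeyError(f"unknown model: {name!r}")
    return dict(MODEL_REGISTRY[name])
```

Returning a copy keeps UI-side tweaks (e.g. the user raising the step count) from mutating the registry itself.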
 
 
+
+ ---
+
+ ## 4. Enhanced Presets
+
+ Presets currently define only prompts. Extend them to define **full recommended parameter sets** per use case.
+
+ ### Scope
+ Each preset can define:
+ - prompt
+ - negative prompt
+ - inference steps
+ - CFG scale
+ - resolution
+ - recommended model
+ - recommended LoRA (+alpha)
+
+ ### Example
+ ```json
+ {
+   "preset": "Anime Portrait",
+   "prompt": "...",
+   "negative": "...",
+   "steps": 15,
+   "cfg": 6,
+   "width": 512,
+   "height": 768,
+   "model": "SD1.5",
+   "lora": {
+     "path": "anime_face.safetensors",
+     "alpha": 0.8
+   }
+ }
+ ```
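Applying such a preset over the UI defaults can be sketched as below; `apply_preset` and the `DEFAULTS` values are hypothetical, while the field names mirror the example schema above.

```python
import json

# Hypothetical UI defaults; a preset overrides any subset of them.
DEFAULTS = {"prompt": "", "negative": "", "steps": 25, "cfg": 7.5,
            "width": 512, "height": 512, "model": "SD1.5", "lora": None}

def apply_preset(preset: dict, defaults: dict = DEFAULTS) -> dict:
    """Overlay preset fields onto the defaults.

    Only keys known to the defaults are applied (the "preset" display
    name is ignored), and keys missing from the preset keep their
    default, so existing prompt-only presets keep working unchanged.
    """
    settings = dict(defaults)
    settings.update({k: v for k, v in preset.items() if k in defaults})
    return settings

# Example using fields from the schema above:
anime = json.loads('{"preset": "Anime Portrait", "steps": 15, "cfg": 6, '
                   '"height": 768}')
settings = apply_preset(anime)
```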
+
+ ---
+
+ ## 5. Metadata Improvements
+
+ Enhance metadata tracking for **reproducibility**.
+
+ ### Added Fields
+
+ * `model_id`
+ * `lora_names`
+ * `lora_alphas`
+ * `preset_used`
+ * `resolution`
+ * provenance timestamp
+
+ This enables exact replication of generated images.
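Assembling these fields into one record might look like the following; the `build_metadata` helper is hypothetical, but the field names are the ones listed above.

```python
from datetime import datetime, timezone

def build_metadata(model_id, lora_names=(), lora_alphas=(),
                   preset_used=None, resolution=(512, 512)):
    """Collect the reproducibility fields above into one history record."""
    return {
        "model_id": model_id,
        "lora_names": list(lora_names),
        "lora_alphas": list(lora_alphas),
        "preset_used": preset_used,
        "resolution": f"{resolution[0]}x{resolution[1]}",
        # provenance timestamp, in UTC so records sort consistently
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```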
+
+ ---
+
+ ## 6. Example LoRA & Training Scripts (No UI)
+
+ Provide a **self-contained example** to demonstrate training:
+
+ * a Colab notebook for **LoRA fine-tuning**
+ * a small 20-image dataset
+ * training duration < 45 minutes on a free GPU
+ * export of a `.safetensors` file
+ * use of the result in presets
+
+ ### Deliverables
+
+ * `examples/train_lora.ipynb`
+ * resulting LoRA stored at `assets/loras/example.safetensors`
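The small training budget above is plausible because LoRA trains only the low-rank factors: for one `d_out x d_in` weight, that is `rank * (d_in + d_out)` parameters instead of `d_out * d_in`. A quick illustration (the 768-wide projection is an assumed example, not a measurement from this repo):

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight:
    factor A is (rank, d_in) and factor B is (d_out, rank)."""
    return rank * (d_in + d_out)

# One 768x768 attention projection:
full = 768 * 768                           # 589,824 weights if trained directly
lora = lora_param_count(768, 768, rank=8)  # 12,288 with a rank-8 adapter
```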
 
 
 ---