SD.Next includes *experimental* support for additional model pipelines.
This includes support for additional models such as:
- **Stable Diffusion XL**
- **Kandinsky**
- **DeepFloyd IF**

And coming soon:
- **Shap-E**, **UniDiffuser**, **Consistency Models**, **DiffEdit Zero-Shot**
- **Text2Video**, **Video2Video**, etc.

*This has been made possible by integration of the [Hugging Face diffusers](https://huggingface.co/docs/diffusers/index) library, with the help of the Hugging Face team!*
## How to

Moved to [Installation](https://github.com/vladmandic/automatic/wiki/Installation) and [SDXL](https://github.com/vladmandic/automatic/wiki/SDXL)
## Integration

### Standard workflows

- **txt2img**
- **img2img**
- **inpaint**
- **process**
### Model Access

- For standard **SD 1.5** and **SD 2.1** models, you can use either
  standard *safetensors* models (single file) or *diffusers* models (folder structure)
- For additional models, you can use *diffusers* models only
- You can download diffusers models directly from the [Hugging Face hub](https://huggingface.co/)
  or use the built-in model search & download in SD.Next: **UI -> Models -> Huggingface**
- Note that access to some models is gated,
  in which case you need to accept the model EULA and provide your Hugging Face token
- When loading safetensors models, you must specify the model pipeline type in
  **UI -> Settings -> Diffusers -> Pipeline**
  When loading Hugging Face models, the pipeline type is detected automatically
- If you get the error `Diffuser model downloaded error: model=stabilityai/stable-diffusion-etc [Errno 2] No such file or directory:`,
  you need to go to the Hugging Face page for that model and accept its EULA
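The two on-disk formats above are easy to tell apart: a classic model is a single `*.safetensors` (or `*.ckpt`) file, while a diffusers folder-format model is a directory containing a `model_index.json` at its root. A minimal sketch of such a check (`detect_model_format` is a hypothetical helper, not part of SD.Next):

```python
from pathlib import Path

def detect_model_format(path: str) -> str:
    """Classify a local model path by on-disk layout (illustrative sketch)."""
    p = Path(path)
    # single-file models: one safetensors/ckpt checkpoint
    if p.is_file() and p.suffix in {".safetensors", ".ckpt"}:
        return "single-file"
    # diffusers folder-format models carry a model_index.json at the root
    if p.is_dir() and (p / "model_index.json").exists():
        return "diffusers"
    return "unknown"
```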
### Extra Networks

- Lora networks
- Textual inversions (embeddings)

Note that Lora and TI are still model-specific, so you cannot use a Lora trained on SD 1.5 with SD-XL
(just as you couldn't use it with an SD 2.1 model) - each needs to be trained for a specific base model
Support for SD-XL training is expected shortly
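The base-model restriction above amounts to a simple compatibility check. A minimal sketch, assuming hypothetical family tags like `"sd15"`, `"sd21"`, `"sdxl"` (not SD.Next's actual logic):

```python
def check_network_compatible(network_base: str, model_base: str) -> None:
    """Illustrative guard: a Lora/TI only applies to the base model family
    it was trained on (e.g. 'sd15', 'sd21', 'sdxl')."""
    if network_base != model_base:
        raise ValueError(
            f"network trained for {network_base} cannot be applied to a {model_base} model"
        )
```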
### Diffuser Settings

- **UI -> Settings -> Diffuser Settings**
  contains additional tunable parameters
### Samplers

- Samplers (schedulers) are pipeline-specific, so when running with the diffusers backend, you'll see a different list of samplers
- **UI -> Settings -> Sampler Settings** shows different configurable parameters depending on the backend
- The recommended sampler for diffusers is **DEIS**
### Other

- Updated **System Info** tab with additional information
- Support for `lowvram` and `medvram` modes - both work extremely well
  Additional tunables are available in **UI -> Settings -> Diffuser Settings**
- Support for both the default **SDP** and **xFormers** cross-optimizations
  Other cross-optimization methods are not available
- **Extra Networks UI** will show available diffusers models
- **CUDA model compile**
  **UI -> Settings -> Compute Settings**
  Requires a GPU with high VRAM
  Diffusers recommends the `reduce-overhead` compile mode, but other modes are available as well
  Fullgraph compile is possible (with sufficient VRAM) when using diffusers
- Note that some CUDA compile modes only work on Linux
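The compile constraints above can be summarized as a small decision helper. This is an illustrative sketch, not SD.Next's actual logic; the 12 GB threshold for "high VRAM" is an assumption, while `"reduce-overhead"` and `"default"` are real `torch.compile` mode names:

```python
import platform
from typing import Optional

def pick_compile_mode(vram_gb: float, os_name: Optional[str] = None) -> Optional[str]:
    """Choose a torch.compile mode per the constraints noted above (sketch)."""
    os_name = os_name or platform.system()
    if vram_gb < 12:           # assumption: treat < 12 GB as "not high VRAM"
        return None            # skip model compile entirely
    if os_name != "Linux":
        return "default"       # some compile modes only work on Linux
    return "reduce-overhead"   # mode recommended by diffusers
```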
## SD-XL Notes

- [SD-XL Technical Report](https://github.com/Stability-AI/generative-models/blob/main/assets/sdxl_report.pdf)
- The SD-XL model is designed as a two-stage model:
  you can run the SD-XL pipeline using just the `base` model, or load both the `base` and `refiner` models
  - `base`: trained on images with a variety of aspect ratios; uses OpenCLIP-ViT/G and CLIP-ViT/L for text encoding
  - `refiner`: trained to denoise small noise levels of high-quality data; uses the OpenCLIP model
- Having both the `base` and `refiner` models loaded can require significant VRAM
- If you want to use the `refiner` model, it is advised to add `sd_model_refiner` to **quicksettings**
  in **UI -> Settings -> User Interface**
- The SD-XL model was trained on **1024px** images
  You can use it with smaller sizes, but you will likely get better results with SD 1.5 models
- The SD-XL model's NSFW filter has been turned off
### Download SD-XL 1.0

1. Enter `stabilityai/stable-diffusion-xl-base-1.0` in *Select Model* and press *Download*
2. Enter `stabilityai/stable-diffusion-xl-refiner-1.0` in *Select Model* and press *Download*
## Limitations

- Any extension that requires access to model internals will likely not work when using the diffusers backend
  This includes standard extensions such as `ControlNet` and `MultiDiffusion`
  *Note: the application will auto-disable incompatible built-in extensions when running in diffusers mode*
- Explicit `refiner` as a postprocessing step is not yet implemented
- Hypernetworks are not supported
- Limited callbacks support for scripts/extensions: additional callbacks will be added as needed
## Performance

Comparison of the original stable diffusion pipeline and the diffusers pipeline using a standard SD 1.5 model
Performance is measured at `batch-size` 1, 2, 4, 8, 16

| pipeline | performance (it/s) | memory cpu/gpu (GB) |
| --- | --- | --- |
| original | 7.99 / 7.93 / 8.83 / 9.14 / 9.2 | 6.7 / 7.2 |
| original medvram | 6.23 / 7.16 / 8.41 / 9.24 / 9.68 | 8.4 / 6.8 |
| original lowvram | 1.05 / 1.94 / 3.2 / 4.81 / 6.46 | 8.8 / 5.2 |
| diffusers | 9.0 / 7.4 / 8.2 / 8.4 / 7.0 | 4.3 / 9.0 |
| diffusers medvram | 7.5 / 6.7 / 7.5 / 7.8 / 7.2 | 6.6 / 8.2 |
| diffusers lowvram | 7.0 / 7.0 / 7.4 / 7.7 / 7.8 | 4.3 / 7.2 |
| diffusers with safetensors | 8.9 / 7.3 / 8.1 / 8.4 / 7.1 | 5.9 / 9.0 |
Notes:

- Test environment: nVidia RTX 3060 GPU, Torch 2.1-nightly with CUDA 12.1, cross-optimization: SDP
- All else being equal, diffusers seem to:
  - Use slightly less RAM and more VRAM
  - Have highly efficient medvram/lowvram equivalents which don't lose much performance
  - Run faster at smaller batch sizes, slower at larger batch sizes
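To make the batch-size trend concrete, the it/s figures from the table can be compared directly (plain arithmetic on the numbers above, no benchmarking code):

```python
# it/s from the performance table, per batch size
BATCHES   = [1, 2, 4, 8, 16]
ORIGINAL  = [7.99, 7.93, 8.83, 9.14, 9.2]
DIFFUSERS = [9.0, 7.4, 8.2, 8.4, 7.0]

# diffusers/original speed ratio: > 1 means diffusers is faster
ratios = {b: round(d / o, 2) for b, o, d in zip(BATCHES, ORIGINAL, DIFFUSERS)}
# at batch 1 diffusers leads (~1.13x), at batch 16 it trails (~0.76x)
```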