---
library_name: spandrel
license: other
license_name: mixed
license_link: LICENSE
tags:
- image-super-resolution
- super-resolution
- upscaling
- real-esrgan
- hat
- swinir
- comfyui
- ffmpega
- video-processing
pipeline_tag: image-to-image
---

# AI Upscale Models for FFMPEGA

Pre-trained super-resolution models for use with [ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA)'s AI Upscale feature.

Models are downloaded automatically on first use; no manual setup is required.

## Models

| File | Architecture | Scale | Size | VRAM | Best For |
|------|-------------|-------|------|------|----------|
| `RealESRGAN_x4plus.pth` | RRDBNet (GAN) | 4× | 67 MB | ~2 GB | General real-world photos |
| `RealESRGAN_x4plus_anime_6B.pth` | RRDBNet (compact) | 4× | 18 MB | ~1 GB | Anime, cartoon, illustration |
| `Real_HAT_GAN_SRx4.pth` | HAT (hybrid attention) | 4× | 170 MB | ~4 GB | SOTA quality, fine detail |
| `003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN.pth` | SwinIR-Large | 4× | 48 MB | ~3 GB | Clean images, classical SR |

All models output 4× resolution. For 2× output, the upscaler runs at 4× and then applies high-quality Lanczos downscaling.

## Usage in FFMPEGA

1. Set `llm_model` → `none`
2. Set `no_llm_mode` → `ai_upscale`
3. Choose `upscale_model` (e.g. `hat_x4` for best quality)
4. Choose `upscale_scale` (`4` or `2`)
5. Connect an image or video input and run

## Model Loading

Models are loaded via [spandrel](https://github.com/chaiNNer-org/spandrel), which auto-detects the architecture from the checkpoint file. No additional dependencies are needed beyond what ComfyUI already provides.

## Credits

- **Real-ESRGAN**: [xinntao/Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) (BSD-3-Clause)
- **HAT**: [XPixelGroup/HAT](https://github.com/XPixelGroup/HAT) (MIT)
- **SwinIR**: [JingyunLiang/SwinIR](https://github.com/JingyunLiang/SwinIR) (Apache 2.0)
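
## Example: 2× Output Sizing

The 2× path described under Models (run the model at 4×, then downscale) can be illustrated with a small sizing sketch. The function name `plan_upscale` and its return shape are hypothetical and are not part of FFMPEGA; the sketch only shows the arithmetic, not how FFMPEGA implements the resize internally.

```python
def plan_upscale(width: int, height: int, target_scale: int, model_scale: int = 4):
    """Return (model_size, final_size, needs_downscale) for one frame.

    Hypothetical helper: every model listed here produces 4x output, so a
    2x request implies a 4x pass followed by a downscale to the 2x size.
    """
    if target_scale not in (2, model_scale):
        raise ValueError(f"target_scale must be 2 or {model_scale}")
    # Size produced by the super-resolution model itself.
    model_size = (width * model_scale, height * model_scale)
    # Size the user actually asked for.
    final_size = (width * target_scale, height * target_scale)
    # A downscale step is needed only when the two sizes differ.
    return model_size, final_size, final_size != model_size


print(plan_upscale(640, 360, 2))  # ((2560, 1440), (1280, 720), True)
print(plan_upscale(640, 360, 4))  # ((2560, 1440), (2560, 1440), False)
```

If the 4× result were held as a Pillow image, the downscale step could be written as `image.resize(final_size, Image.Resampling.LANCZOS)`. That is one possible way to apply a Lanczos filter, offered as an assumption rather than a description of FFMPEGA's code.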