---
library_name: spandrel
license: other
license_name: mixed
license_link: LICENSE
tags:
  - image-super-resolution
  - super-resolution
  - upscaling
  - real-esrgan
  - hat
  - swinir
  - comfyui
  - ffmpega
  - video-processing
pipeline_tag: image-to-image
---

# AI Upscale Models for FFMPEGA

Pre-trained super-resolution models for use with [ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA)'s AI Upscale feature.

Models are automatically downloaded on first use — no manual setup required.

## Models

| File | Architecture | Scale | Size | VRAM | Best For |
|------|-------------|-------|------|------|----------|
| `RealESRGAN_x4plus.pth` | RRDBNet (GAN) | 4× | 67 MB | ~2 GB | General real-world photos |
| `RealESRGAN_x4plus_anime_6B.pth` | RRDBNet (compact) | 4× | 18 MB | ~1 GB | Anime, cartoon, illustration |
| `Real_HAT_GAN_SRx4.pth` | HAT (hybrid attention) | 4× | 170 MB | ~4 GB | Best quality of the four, fine detail |
| `003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN.pth` | SwinIR-Large | 4× | 48 MB | ~3 GB | Clean images, classical SR |

All models upscale by 4× natively. For 2× output, the upscaler still runs the model at 4×, then halves the result with high-quality Lanczos downscaling, so a 2× run costs about as much VRAM and model time as a 4× run.
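The size arithmetic behind the 2× path can be sketched as follows. This is a minimal illustration, not FFMPEGA's own code: the function name `plan_upscale` and its parameters are hypothetical, and only the "run at 4×, then downscale" behaviour comes from the description above.

```python
def plan_upscale(width, height, requested_scale, native_scale=4):
    """Return (model_output_size, final_size) as (width, height) tuples.

    Hypothetical helper: the model always runs at its native 4x scale,
    and a 2x request is met by downscaling the 4x result afterwards.
    """
    if requested_scale not in (2, 4):
        raise ValueError("requested_scale must be 2 or 4")
    # Size of the frame produced by the model itself.
    model_size = (width * native_scale, height * native_scale)
    # Size of the frame after the optional Lanczos downscale.
    final_size = (width * requested_scale, height * requested_scale)
    return model_size, final_size


model_size, final_size = plan_upscale(1280, 720, requested_scale=2)
print(model_size, final_size)  # (5120, 2880) (2560, 1440)
```

With Pillow, the downscale step itself could be written as `image.resize(final_size, Image.LANCZOS)`; which library FFMPEGA actually uses for that step is not stated here.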

## Usage in FFMPEGA

1. Set `llm_model` → `none`
2. Set `no_llm_mode` → `ai_upscale`
3. Choose `upscale_model` (e.g. `hat_x4` for best quality)
4. Choose `upscale_scale` (`4` or `2`)
5. Connect an image or video input and run

## Model Loading

Models are loaded via [spandrel](https://github.com/chaiNNer-org/spandrel), which auto-detects the architecture from the checkpoint file. No additional dependencies are needed beyond what ComfyUI already provides.
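For loading one of these checkpoints outside FFMPEGA, a minimal sketch using spandrel's `ModelLoader` might look like the following. The helper name `load_upscaler` and the local file path are assumptions for illustration, and this is not necessarily how FFMPEGA calls spandrel internally.

```python
from pathlib import Path


def load_upscaler(checkpoint_path, device="cpu"):
    """Load a super-resolution checkpoint with spandrel (hypothetical helper)."""
    path = Path(checkpoint_path)
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")

    # Imported lazily so the path check above works without spandrel installed.
    from spandrel import ImageModelDescriptor, ModelLoader

    # spandrel inspects the checkpoint and picks the matching architecture.
    model = ModelLoader().load_from_file(str(path))
    if not isinstance(model, ImageModelDescriptor):
        raise TypeError("checkpoint is not a single-image model")

    model.to(device)
    model.eval()
    return model


# Example usage, assuming the file has already been downloaded locally:
# model = load_upscaler("RealESRGAN_x4plus.pth")
# print(model.architecture.name, model.scale)
```

The returned descriptor is callable on a float image tensor in `(batch, channels, height, width)` layout with values in the 0 to 1 range; wrapping the call in `torch.no_grad()` avoids keeping gradient state during inference.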

## Credits

- **Real-ESRGAN**: [xinntao/Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) — BSD-3-Clause
- **HAT**: [XPixelGroup/HAT](https://github.com/XPixelGroup/HAT) — MIT
- **SwinIR**: [JingyunLiang/SwinIR](https://github.com/JingyunLiang/SwinIR) — Apache 2.0