---
license: apache-2.0
library_name: realesrgan
pipeline_tag: image-to-image
tags:
- image-upscaling
- super-resolution
- realesrgan
- esrgan
- post-processing
- image-enhancement
---

<!-- README Version: v1.2 -->

# FLUX Upscale Models Collection v1.2

This repository contains Real-ESRGAN upscale models for post-processing and enhancing generated images. These models can upscale images by 2x or 4x while adding fine details and improving sharpness.

## Model Description

This collection provides Real-ESRGAN (Real Enhanced Super-Resolution Generative Adversarial Networks) models for high-quality image upscaling. They are commonly used as a post-processing step for AI-generated images to increase resolution and enhance details.

**Key Capabilities**:
- 2x and 4x image upscaling
- Detail enhancement and sharpening
- Noise reduction and artifact removal
- Optimized for AI-generated images
- CPU and GPU compatible

## Repository Contents

**Total Size**: ~192MB

### Upscale Models
- `upscale_models/4x-UltraSharp.pth` (64MB) - 4x upscaling with ultra-sharp detail enhancement
- `upscale_models/RealESRGAN_x2plus.pth` (64MB) - 2x upscaling model
- `upscale_models/RealESRGAN_x4plus.pth` (64MB) - 4x upscaling model

## Hardware Requirements

- **VRAM**: 4GB+ recommended for GPU inference
- **Disk Space**: 192MB
- **Memory**: 8GB+ system RAM recommended
- **Compatible with**: GPU (CUDA or ROCm) or CPU inference

## Usage Examples

### Basic Usage with Real-ESRGAN

```python
import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# Build the RRDB network that the checkpoint weights are loaded into
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)

upsampler = RealESRGANer(
    scale=4,
    # Adjust this path to wherever the repository was downloaded
    model_path="E:\\huggingface\\flux-upscale\\upscale_models\\4x-UltraSharp.pth",
    model=model,
    tile=0,       # 0 disables tiling
    tile_pad=10,
    pre_pad=0,
    half=True     # Use FP16 for faster inference on GPU; set to False on CPU
)

# Load the image (OpenCV reads it as BGR, the layout enhance() expects)
img = cv2.imread("input.png", cv2.IMREAD_COLOR)
if img is None:
    raise FileNotFoundError("Could not read input.png")

# Upscale and save the result
output, _ = upsampler.enhance(img, outscale=4)
cv2.imwrite("output_upscaled.png", output)
```

### Using with FLUX Pipeline

```python
import cv2
import numpy as np
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
from diffusers import FluxPipeline
from realesrgan import RealESRGANer

# Generate image with FLUX
pipe = FluxPipeline.from_pretrained(
    "E:\\huggingface\\flux-dev-fp16",
    torch_dtype=torch.float16
)
pipe.to("cuda")

image = pipe(
    prompt="a beautiful landscape with mountains",
    num_inference_steps=30
).images[0]

# Convert the PIL image (RGB) to the BGR array layout used by OpenCV and enhance()
img_array = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)

# Initialize upscaler
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
upsampler = RealESRGANer(
    scale=4,
    model_path="E:\\huggingface\\flux-upscale\\upscale_models\\4x-UltraSharp.pth",
    model=model,
    half=True
)

# Upscale the generated image
upscaled, _ = upsampler.enhance(img_array, outscale=4)

# Save result (cv2.imwrite expects BGR, which enhance() returns)
cv2.imwrite("flux_upscaled_4x.png", upscaled)
```

### Tiled Processing for Large Images

```python
import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# Configure for large images with tiling
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)

upsampler = RealESRGANer(
    scale=4,
    model_path="E:\\huggingface\\flux-upscale\\upscale_models\\RealESRGAN_x4plus.pth",
    model=model,
    tile=512,     # Process in 512x512 tiles
    tile_pad=10,  # Padding to avoid seams
    pre_pad=0,
    half=True
)

# Process large image
img = cv2.imread("large_image.png", cv2.IMREAD_COLOR)
output, _ = upsampler.enhance(img, outscale=4)
cv2.imwrite("large_upscaled.png", output)
```

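To make the effect of `tile` and `tile_pad` more concrete, the sketch below computes a tile grid for a given image size. It assumes the image is cut into a `ceil(width / tile)` by `ceil(height / tile)` grid and that each tile is extended by `tile_pad` pixels on every side, clamped at the image border. `tile_boxes` is a hypothetical helper written for illustration, not part of the `realesrgan` API, and the library's internal logic may differ in detail.

```python
import math
from typing import Iterator, Tuple

# (left, top, right, bottom), with right and bottom exclusive
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int = 512,
               tile_pad: int = 10) -> Iterator[Tuple[Box, Box]]:
    """Yield (core_box, padded_box) pairs that cover a width x height image."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    tiles_x = math.ceil(width / tile)
    tiles_y = math.ceil(height / tile)
    for row in range(tiles_y):
        for col in range(tiles_x):
            # Core region: the part of the output this tile is responsible for
            left, top = col * tile, row * tile
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            # Padded region: extra context around the core, clamped to the image
            padded = (
                max(left - tile_pad, 0),
                max(top - tile_pad, 0),
                min(right + tile_pad, width),
                min(bottom + tile_pad, height),
            )
            yield (left, top, right, bottom), padded


if __name__ == "__main__":
    boxes = list(tile_boxes(1920, 1080))
    print(f"{len(boxes)} tiles")
    print("first tile:", boxes[0])
    print("last tile:", boxes[-1])
```

Under these assumptions, a 1920x1080 input with the default settings is split into a 4 x 3 grid of 12 tiles, each processed separately.
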
## Model Comparison

| Model | Scale | Best For | File Size | Speed |
|-------|-------|----------|-----------|-------|
| 4x-UltraSharp | 4x | Sharp details, AI-generated images | 64MB | Moderate |
| RealESRGAN_x2plus | 2x | Moderate upscaling, faster processing | 64MB | Fast |
| RealESRGAN_x4plus | 4x | General purpose 4x upscaling | 64MB | Moderate |

**Model Selection Guide**:
- **4x-UltraSharp**: Best for AI-generated images needing maximum sharpness
- **RealESRGAN_x2plus**: Quick 2x upscaling with balanced quality
- **RealESRGAN_x4plus**: General-purpose 4x upscaling for various image types

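The selection guide can be captured as a small lookup table so that the network scale and the `RealESRGANer` scale always match the chosen checkpoint. This is a sketch under assumptions: `MODEL_CONFIGS` and `upscaler_kwargs` are hypothetical names, the file names come from the Repository Contents list, and the `scale=2` network setting for `RealESRGAN_x2plus` follows the upstream Real-ESRGAN inference script rather than anything stated in this README.

```python
from pathlib import Path

# Checkpoint file name and native scale for each model in this repository
MODEL_CONFIGS = {
    "4x-UltraSharp": {"file": "4x-UltraSharp.pth", "scale": 4},
    "RealESRGAN_x2plus": {"file": "RealESRGAN_x2plus.pth", "scale": 2},
    "RealESRGAN_x4plus": {"file": "RealESRGAN_x4plus.pth", "scale": 4},
}


def upscaler_kwargs(name: str, models_dir: str = "upscale_models") -> dict:
    """Return the scale, checkpoint path, and RRDBNet arguments for a model name."""
    try:
        cfg = MODEL_CONFIGS[name]
    except KeyError:
        raise ValueError(
            f"Unknown model {name!r}; choose from {sorted(MODEL_CONFIGS)}"
        ) from None
    return {
        "scale": cfg["scale"],
        "model_path": str(Path(models_dir) / cfg["file"]),
        # Keyword arguments for basicsr's RRDBNet; scale must match the checkpoint
        "rrdbnet": dict(num_in_ch=3, num_out_ch=3, num_feat=64,
                        num_block=23, num_grow_ch=32, scale=cfg["scale"]),
    }


if __name__ == "__main__":
    print(upscaler_kwargs("RealESRGAN_x2plus"))
```

The returned values can then be passed on as `RRDBNet(**kw["rrdbnet"])` and `RealESRGANer(scale=kw["scale"], model_path=kw["model_path"], model=...)`, with `outscale` set to the same scale when calling `enhance`.
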
## Model Specifications

- **Architecture**: RRDB (Residual in Residual Dense Block)
- **Input Channels**: 3 (RGB)
- **Output Channels**: 3 (RGB)
- **Feature Dimensions**: 64
- **Network Blocks**: 23 (standard configuration)
- **Growth Channels**: 32
- **Format**: PyTorch `.pth` files
- **Precision**: FP32 (supports FP16 inference)

## Performance Tips

- **GPU Acceleration**: Use `half=True` for FP16 inference on compatible GPUs (approximately 2x faster)
- **Tiling for VRAM**: Enable tiling with `tile=512` to reduce VRAM usage for large images
- **Tile Padding**: Use `tile_pad=10` to minimize visible seams between tiles
- **Batch Processing**: Reuse one `RealESRGANer` instance across multiple images to amortize model loading time
- **CPU Fallback**: Models work on CPU but will be significantly slower (roughly 10-20x)
- **Optimal Scale**: Use 2x for faster processing, 4x for maximum detail enhancement
- **Input Quality**: Better input images produce better upscaling results
- **File Formats**: Use lossless formats (PNG) for best quality preservation

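As a rough guide for choosing between 2x and 4x, the arithmetic below estimates the output dimensions and the size of the uncompressed 8-bit RGB result. `output_size` is a hypothetical helper; it only counts the raw pixel buffer (width x height x 3 bytes) and does not model the VRAM used by the network itself.

```python
def output_size(width: int, height: int, scale: int) -> tuple:
    """Return (out_width, out_height, raw_bytes) for an 8-bit RGB image."""
    out_w, out_h = width * scale, height * scale
    return out_w, out_h, out_w * out_h * 3  # 3 bytes per pixel (R, G, B)


if __name__ == "__main__":
    for scale in (2, 4):
        w, h, raw = output_size(1024, 1024, scale)
        print(f"{scale}x: {w}x{h}, {raw / 2**20:.0f} MiB uncompressed")
```

For a 1024x1024 input this gives 2048x2048 (12 MiB) at 2x and 4096x4096 (48 MiB) at 4x, so a 4x upscale produces four times as many pixels as a 2x upscale of the same image.
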
## Use Cases

- Post-processing AI-generated images from FLUX.1, Stable Diffusion, etc.
- Enhancing FLUX.1-dev outputs for high-resolution prints
- Increasing resolution of generated artwork for commercial use
- Adding fine details to synthetic images
- Print preparation for generated images (posters, canvas prints)
- Upscaling video frames for AI video generation pipelines
- Restoring and enhancing low-resolution generated content

## Installation

```bash
pip install realesrgan basicsr
```

**Dependencies**:
- Python 3.8+
- PyTorch 1.7+
- basicsr
- realesrgan
- opencv-python
- numpy

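After installing, a quick check like the following can confirm that the packages can be found in the current environment. It is a minimal sketch: `missing_modules` is a hypothetical helper, and the import names used (`torch`, `basicsr`, `realesrgan`, `cv2` for opencv-python, `numpy`) are the usual module names for the dependencies listed above. It only checks that each module can be located, not that it imports without errors.

```python
import importlib.util

# Import names for the dependencies listed above (opencv-python imports as cv2)
REQUIRED_MODULES = ["torch", "basicsr", "realesrgan", "cv2", "numpy"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print(f"Missing: {', '.join(missing)}")
    else:
        print("All dependencies found")
```
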
## License

These models are released under the Apache 2.0 license.

## Citation

```bibtex
@InProceedings{wang2021realesrgan,
    author    = {Xintao Wang and Liangbin Xie and Chao Dong and Ying Shan},
    title     = {Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data},
    booktitle = {International Conference on Computer Vision Workshops (ICCVW)},
    year      = {2021}
}
```

## Links and Resources

- **Real-ESRGAN Paper**: [arXiv:2107.10833](https://arxiv.org/abs/2107.10833)
- **Official Repository**: [xinntao/Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN)
- **BasicSR Library**: [xinntao/BasicSR](https://github.com/xinntao/BasicSR)
- **Hugging Face**: [Real-ESRGAN Models](https://huggingface.co/models?other=real-esrgan)
- **Model Downloads**: Available through official Real-ESRGAN releases

## Model Card Contact

For questions about Real-ESRGAN models, refer to the official Real-ESRGAN repository and documentation at https://github.com/xinntao/Real-ESRGAN