metadata
pipeline_tag: text-to-image
library_name: diffusers
license: mit
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models (CVPR 2025)
This repository contains the model introduced in the paper "Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models".
For the official code implementation, please visit https://github.com/zhang0jhon/Diffusion-4K.
Usage Example (Flux-12B)
The following code snippet demonstrates image generation using the Flux-12B model:
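import torch
from diffusers import FluxPipeline
# Load the pipeline in half precision; replace the model ID with your own if needed
pipe = FluxPipeline.from_pretrained("zhang0jhon/flux_wavelet", torch_dtype=torch.float16)
pipe.to("cuda")  # Assumes a CUDA GPU; with limited VRAM, pipe.enable_model_cpu_offload() (requires accelerate) is an alternative
prompt = "a photo of an astronaut riding a horse on mars"
# Generate a single 4096x4096 image
image = pipe(prompt, guidance_scale=5.0, height=4096, width=4096, num_inference_steps=50).images[0]
# Save the image to disk
image.save("astronaut_horse.png")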
Install the required libraries first (pip install diffusers transformers torch), then tune guidance_scale, height, width, and num_inference_steps to suit your prompt and hardware. See the GitHub repository for more details and examples.
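One way to explore these parameters is to generate the same prompt under a small grid of settings and compare the results. The sketch below only builds the keyword arguments and output file names; it does not load the model. The helper names sweep_settings and output_name, and the example values, are hypothetical and chosen for illustration, not recommended settings.

```python
from itertools import product


def sweep_settings(guidance_scales, step_counts, height=4096, width=4096):
    """Return one keyword-argument dict per (guidance_scale, steps) combination."""
    settings = []
    for scale, steps in product(guidance_scales, step_counts):
        settings.append(
            {
                "guidance_scale": scale,
                "num_inference_steps": steps,
                "height": height,
                "width": width,
            }
        )
    return settings


def output_name(setting, stem="astronaut_horse"):
    """Build a file name that records the settings used for an image."""
    return (
        f"{stem}_gs{setting['guidance_scale']}"
        f"_steps{setting['num_inference_steps']}.png"
    )


# Hypothetical values for illustration only.
settings = sweep_settings(guidance_scales=[3.5, 5.0], step_counts=[30, 50])
for s in settings:
    print(output_name(s), s)
    # With a loaded pipeline, each entry could be used as:
    #   pipe(prompt, **s).images[0].save(output_name(s))
```

Encoding the settings in the file name keeps each image traceable to the parameters that produced it, which helps when comparing outputs side by side.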
Citation
@inproceedings{zhang2025diffusion4k,
title={Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models},
author={Zhang, Jinjin and Huang, Qiuyu and Liu, Junjie and Guo, Xiefan and Huang, Di},
year={2025},
booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}