metadata
pipeline_tag: text-to-image
library_name: diffusers
license: mit

Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models (CVPR 2025)

This repository contains the model introduced in Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models.

For the official code implementation, please visit https://github.com/zhang0jhon/Diffusion-4K.

Usage Example (Flux-12B)

The following code snippet demonstrates image generation using the Flux-12B model:

import torch
from diffusers import FluxPipeline

# Load the checkpoint from the Hugging Face Hub in half precision.
pipe = FluxPipeline.from_pretrained("zhang0jhon/flux_wavelet", torch_dtype=torch.float16)
pipe = pipe.to("cuda")  # assumes a CUDA GPU with enough memory for 4096x4096 generation

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(
    prompt,
    guidance_scale=5.0,
    height=4096,
    width=4096,
    num_inference_steps=50,
).images[0]

# Save the generated image to disk
image.save("astronaut_horse.png")

Install the required libraries first (pip install diffusers transformers torch). The guidance_scale, height, width, and num_inference_steps values shown above are a starting point; adjust them to trade off prompt adherence, output resolution, and generation time. Refer to the GitHub repository for more details and examples.
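When changing height and width, it can help to sanity-check the values before starting a long 4K run. The sketch below is not part of the official repository: `snap_to_multiple` and `latent_token_count` are hypothetical helpers, and the factors they use (dimensions divisible by 16, 8x VAE downsampling, 2x2 latent patches) are assumptions about Flux-style pipelines rather than values stated in this model card.

```python
def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round `value` down to the nearest multiple of `multiple`.

    The default of 16 is an assumption about Flux-style pipelines.
    """
    if value < multiple:
        raise ValueError(f"{value} is smaller than the required multiple {multiple}")
    return (value // multiple) * multiple


def latent_token_count(height: int, width: int, vae_factor: int = 8, patch: int = 2) -> int:
    """Estimate the transformer sequence length for one image.

    Assumes the VAE downsamples by `vae_factor` per side and the latent is
    split into `patch` x `patch` tokens (both assumed defaults).
    """
    cell = vae_factor * patch
    return (height // cell) * (width // cell)


height = snap_to_multiple(4100)
width = snap_to_multiple(4096)
print(height, width)                      # 4096 4096
print(latent_token_count(1024, 1024))     # 4096
print(latent_token_count(height, width))  # 65536
```

Under these assumptions, a 4096x4096 image needs 16 times as many tokens as a 1024x1024 image, which is why memory use and generation time grow sharply at 4K.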

Citation

@inproceedings{zhang2025diffusion4k,
    title={Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models},
    author={Jinjin Zhang and Qiuyu Huang and Junjie Liu and Xiefan Guo and Di Huang},
    year={2025},
    booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}