Spaces:

elismasilva
/

z-image-panorama

elismasilva

update name

085e339 verified 9 days ago

metadata

title: Z Image Turbo - Panorama
emoji: 🏞️
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: Create stunning panoramas with Z-Image-Turbo.

Z Image Turbo - Panorama 🏞️✨

Create stunning, seamless panoramic images by combining multiple distinct scenes with the power of the Z-Image-Turbo model. This application uses an advanced "Mixture of Diffusers" tiling pipeline to generate high-resolution compositions from left, center, and right text prompts, even on memory-constrained hardware, thanks to GGUF model support.

What is Z Image Turbo - Panorama?

Z Image Turbo - Panorama is a creative tool that uses a tiling mechanism to generate a single, wide-format image from three separate text prompts. Instead of stretching a single concept across the canvas, you describe different but related scenes for the left, center, and right portions of the image. The pipeline then generates each part and blends them together into one seamless image.

This is ideal for:

Creating expansive cityscapes or landscapes: Describe a bustling street that transitions to a central plaza, which then leads to a quiet park.
Composing complex scenes: Place different characters or objects side-by-side in a shared, coherent environment.
Generating ultra-wide art: Create unique, high-resolution images perfect for wallpapers or digital art.
Multi-language creation: With built-in translation, you can write prompts in English, Korean, or Chinese.

The core technology uses a custom ZImageMoDTilingPipeline built on the Diffusers library. It has been meticulously adapted to the unique architecture of the ZImageTransformer2DModel, correctly handling its 16-channel latent space and specific input tensor format.
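
To make the tiling idea concrete, here is a minimal sketch of how overlapping tile columns could be laid out across a wide canvas. The function name `tile_columns`, the tile width, and the overlap are illustrative assumptions; they are not taken from ZImageMoDTilingPipeline, whose actual layout logic may differ.

```python
def tile_columns(total_width: int, tile_width: int, overlap: int) -> list[tuple[int, int]]:
    """Return (start, end) pixel ranges of overlapping tiles that cover total_width."""
    if not 0 <= overlap < tile_width:
        raise ValueError("overlap must be non-negative and smaller than tile_width")
    if tile_width >= total_width:
        # A single tile already covers the whole canvas.
        return [(0, total_width)]
    stride = tile_width - overlap  # distance between consecutive tile starts
    spans = []
    start = 0
    while True:
        end = start + tile_width
        if end >= total_width:
            # Shift the last tile left so it ends exactly at the canvas edge.
            spans.append((total_width - tile_width, total_width))
            break
        spans.append((start, end))
        start += stride
    return spans


# Hypothetical numbers: three 1280 px tiles that overlap by 256 px.
spans = tile_columns(total_width=3328, tile_width=1280, overlap=256)
for name, (start, end) in zip(("left", "center", "right"), spans):
    print(f"{name}: columns {start}..{end}")
```

With these example values, each prompt owns one tile and neighbouring tiles share a 256 px band, which is the region a blending step would need to smooth.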

Key Features

Multi-Prompt Composition: Control the left, center, and right of your image with unique prompts.
GGUF Transformer Support: Utilizes a quantized GGUF version of the transformer, significantly reducing VRAM usage and making it possible to run on consumer GPUs.
Seamless Stitching: Uses advanced blending methods (Cosine or Gaussian) to eliminate visible seams between tiles.
High-Resolution Output: Generates images far wider than what a standard pipeline can handle in a single pass.
Efficient Memory Management: Employs diffusers' standard CPU offloading to manage memory for non-transformer components.
Multi-Language Prompts: Supports on-the-fly translation for prompts written in Korean and Chinese.
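
The cosine and Gaussian blending options can be illustrated with a small, self-contained sketch. It builds two per-pixel weight windows and uses one of them to average two overlapping one-dimensional tiles. This is a simplified illustration of weighted tile blending in general, not the code used by this app; the function names and the `sigma_frac` parameter are assumptions made for the example.

```python
import math


def cosine_weights(n: int) -> list[float]:
    """Raised-cosine window sampled at pixel centers, so every weight is positive."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * (i + 0.5) / n)) for i in range(n)]


def gaussian_weights(n: int, sigma_frac: float = 0.25) -> list[float]:
    """Gaussian window centered on the tile; sigma is a fraction of the tile width."""
    mid = (n - 1) / 2.0
    sigma = sigma_frac * n
    return [math.exp(-((i - mid) ** 2) / (2.0 * sigma ** 2)) for i in range(n)]


def blend_row(tiles, starts, width, weights):
    """Weighted average of overlapping 1-D tiles placed at the given start offsets."""
    acc = [0.0] * width   # weighted sum of tile values per pixel
    norm = [0.0] * width  # sum of weights per pixel, used to normalize
    for tile, start in zip(tiles, starts):
        for i, value in enumerate(tile):
            acc[start + i] += weights[i] * value
            norm[start + i] += weights[i]
    return [a / n for a, n in zip(acc, norm)]


# Two constant tiles (all 0.0 and all 1.0), 8 px wide, overlapping by 4 px.
tiles = [[0.0] * 8, [1.0] * 8]
row = blend_row(tiles, starts=[0, 4], width=12, weights=cosine_weights(8))
print([round(v, 2) for v in row])
```

In the overlap, the weight of the left tile falls while the weight of the right tile rises, so the blended values move gradually from one tile to the other instead of jumping at a hard edge. Swapping in `gaussian_weights(8)` changes only the shape of that transition.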

Running the App Locally

Follow these steps to run the Gradio application on your own machine.

1. Prerequisites

Python 3.10+
Git and Git LFS installed.
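
If you want to confirm these prerequisites before continuing, a short check like the following may help. It is a convenience sketch and not part of the repository; the function name `check_prerequisites` is made up for this example.

```python
import shutil
import subprocess
import sys


def check_prerequisites(min_python=(3, 10)) -> dict[str, bool]:
    """Report whether Python, Git, and Git LFS appear to be available."""
    results = {
        "python": sys.version_info >= min_python,
        "git": shutil.which("git") is not None,
        "git-lfs": False,
    }
    if results["git"]:
        # `git lfs version` exits with a non-zero status when Git LFS is missing.
        proc = subprocess.run(["git", "lfs", "version"], capture_output=True, text=True)
        results["git-lfs"] = proc.returncode == 0
    return results


status = check_prerequisites()
for tool, ok in status.items():
    print(f"{tool}: {'ok' if ok else 'missing or too old'}")
```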

2. Clone the Repository

git clone https://huggingface.co/spaces/elismasilva/z-image-panorama
cd z-image-panorama

3. Set Up a Virtual Environment (Recommended)

# Windows
python -m venv venv
.\venv\Scripts\activate

# macOS / Linux
python3 -m venv venv
source venv/bin/activate

4. Install Dependencies

pip install -r requirements.txt

5. Configure Local Model Paths (Optional)

The app is configured to download models from the Hugging Face Hub by default. For faster startup and offline use, you can use local models by setting environment variables.

For example, if you have the GGUF model in F:\models\Z-Image-Turbo:

# Windows CMD
set GGUF_LOCAL_DIR=F:\models\Z-Image-Turbo

# Windows PowerShell
$env:GGUF_LOCAL_DIR="F:\models\Z-Image-Turbo"

# Linux/macOS
export GGUF_LOCAL_DIR=/path/to/your/models/Z-Image-Turbo
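
For readers curious how such a variable is typically consumed, here is a hedged sketch of an environment-variable lookup with a Hub fallback. The actual logic in app.py is not shown in this README, so the function name `resolve_model_source` and the placeholder `<hub-repo-id>` are assumptions for illustration only.

```python
import os
from pathlib import Path


def resolve_model_source(environ=os.environ, env_var="GGUF_LOCAL_DIR",
                         hub_fallback="<hub-repo-id>"):
    """Return (source, is_local).

    Uses the directory named by env_var when it is set and exists on disk;
    otherwise falls back to a Hub identifier (a placeholder in this sketch).
    """
    local_dir = environ.get(env_var, "").strip()
    if local_dir and Path(local_dir).is_dir():
        return local_dir, True
    return hub_fallback, False


source, is_local = resolve_model_source()
print("local directory" if is_local else "Hub download", "->", source)
```

Checking that the directory exists before using it means a mistyped path falls back to the download instead of failing later during model loading.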

6. Run the Gradio App

python app.py

The application will start and provide a local URL (usually http://127.0.0.1:7860) that you can open in your web browser.

Using the Command-Line Script (infer.py)

The infer.py script allows you to test the pipeline directly from the command line.

1. Configure the Script

Open infer.py and edit the parameters inside the main() function (for example prompt_grid, target_height, and target_width) to match your desired output.
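
As a rough guide, the settings might look like the sketch below. Only the names prompt_grid, target_height, and target_width come from this README; the nested-list shape of the grid, the example values, the `validate_settings` helper, and the divisibility rule are assumptions, so check infer.py for the real format.

```python
def validate_settings(prompt_grid, target_height, target_width, multiple=16):
    """Sanity-check hypothetical settings and return the grid shape as (rows, cols)."""
    if not prompt_grid or not all(prompt_grid):
        raise ValueError("prompt_grid needs at least one non-empty row")
    cols = len(prompt_grid[0])
    if any(len(row) != cols for row in prompt_grid):
        raise ValueError("every row must have the same number of prompts")
    if any(not str(p).strip() for row in prompt_grid for p in row):
        raise ValueError("prompts must not be empty")
    # Assumption: dimensions must be a positive multiple of `multiple`;
    # the real constraint depends on the pipeline.
    for name, value in (("target_height", target_height), ("target_width", target_width)):
        if value <= 0 or value % multiple:
            raise ValueError(f"{name} must be a positive multiple of {multiple}")
    return len(prompt_grid), cols


# One row with three columns: left, center, and right prompts.
prompt_grid = [[
    "a bustling street at dusk",
    "a central plaza with a fountain",
    "a quiet park with tall trees",
]]
shape = validate_settings(prompt_grid, target_height=1024, target_width=3072)
print(shape)
```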

2. Run the Script

Execute the script from your terminal:

python infer.py

The script will print its progress to the console, including the tqdm progress bar, and save the final image to the outputs/ directory.

Acknowledgements

Alibaba Tongyi-MAI Team for the powerful Z-Image Turbo model.
The original authors of the Mixture of Diffusers technique.
Hugging Face for the diffusers library and the Spaces platform.