Update README.md
README.md (CHANGED)
license_link: LICENSE
---

<div align="center">

<img src="./assets/logo.png" alt="HunyuanImage-3.0 Logo" width="600">

# 🎨 HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
<a href=https://hunyuan.tencent.com/image target="_blank"><img src=https://img.shields.io/badge/Official%20Site-333399.svg?logo=homepage height=22px></a>
<a href=https://huggingface.co/tencent/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
<a href=https://github.com/Tencent-Hunyuan/HunyuanImage-3.0 target="_blank"><img src=https://img.shields.io/badge/Page-bb8a2e.svg?logo=github height=22px></a>
<a href=./assets/HunyuanImage_3_0.pdf target="_blank"><img src=https://img.shields.io/badge/Report-b5212f.svg?logo=arxiv height=22px></a>
<a href=https://x.com/TencentHunyuan target="_blank"><img src=https://img.shields.io/badge/Hunyuan-black.svg?logo=x height=22px></a>
</div>

<p align="center">
Join our <a href="./assets/WECHAT.md" target="_blank">WeChat</a> and <a href="https://discord.gg/ehjWMqF5wY">Discord</a> |
💻 <a href="https://hunyuan.tencent.com/modelSquare/home/play?modelId=289&from=/visual">Official website (官网): try our model!</a>
</p>
```bash
# 1. First install PyTorch (CUDA 12.8 version)
pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128

# 2. Then install tencentcloud-sdk
pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-sdk-python

# 3. Then install the other dependencies
pip install -r requirements.txt
```
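After these steps, a quick import check confirms that the pinned packages actually resolved in the current environment; the helper below is a minimal, hypothetical sketch (not part of the repo):

```python
import importlib.util

def check_env(packages=("torch", "torchvision", "torchaudio")):
    """Report whether each pinned package is importable in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

for name, ok in check_env().items():
    print(f"{name}: {'ok' if ok else 'missing - rerun the matching pip step'}")
```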
#### 3️⃣ Run the Demo

The pretrain checkpoint does not automatically rewrite or enhance input prompts; for optimal results, we currently recommend that community partners use DeepSeek to rewrite prompts.

```bash
# Set the environment variables for DeepSeek prompt rewriting
export DEEPSEEK_KEY_ID="your_deepseek_key_id"
export DEEPSEEK_KEY_SECRET="your_deepseek_key_secret"

python3 run_image_gen.py --model-id ./HunyuanImage-3 --verbose 1 --sys-deepseek-prompt "universal" --prompt "A brown and white dog is running on the grass"
```
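A missing credential otherwise only surfaces once generation reaches the rewriting step, after the model has already loaded, so a fail-fast check can save a model load. The helper below is hypothetical; only the two variable names come from the export commands above:

```python
import os

def deepseek_credentials_present() -> bool:
    """Return True only when both DeepSeek credentials are set and non-empty."""
    required = ("DEEPSEEK_KEY_ID", "DEEPSEEK_KEY_SECRET")
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        print("Prompt rewriting unavailable; missing:", ", ".join(missing))
    return not missing

if not deepseek_credentials_present():
    print("Export the keys above or disable prompt rewriting before running.")
```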
#### 4️⃣ Command Line Arguments

| Argument                | Description                                                     | Default     |
|-------------------------|-----------------------------------------------------------------|-------------|
| `--prompt`              | Input prompt                                                    | (required)  |
| `--model-id`            | Model path                                                      | (required)  |
| `--attn-impl`           | Attention implementation: `sdpa` or `flash_attention_2`         | `sdpa`      |
| `--moe-impl`            | MoE implementation: `eager` or `flashinfer`                     | `eager`     |
| `--seed`                | Random seed for image generation                                | `None`      |
| `--diff-infer-steps`    | Number of diffusion inference steps                             | `50`        |
| `--image-size`          | Image resolution: `auto`, an explicit size such as `1280x768`, or an aspect ratio such as `16:9` | `auto` |
| `--save`                | Image save path                                                 | `image.png` |
| `--verbose`             | Verbosity level: `0` = no log, `1` = log inference information  | `0`         |
| `--rewrite`             | Whether to enable prompt rewriting                              | `True`      |
| `--sys-deepseek-prompt` | System prompt preset: `universal` or `text_rendering`           | `universal` |
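For readers scripting around `run_image_gen.py`, the table maps naturally onto `argparse`. The sketch below is a hypothetical mirror of the documented flags, not the repo's actual parser (`--rewrite` is omitted because its boolean parsing is not documented):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical re-creation of the documented CLI; the real
    # run_image_gen.py parser may differ in details.
    p = argparse.ArgumentParser(description="HunyuanImage-3.0 image generation")
    p.add_argument("--prompt", required=True, help="Input prompt")
    p.add_argument("--model-id", required=True, help="Model path")
    p.add_argument("--attn-impl", choices=["sdpa", "flash_attention_2"], default="sdpa")
    p.add_argument("--moe-impl", choices=["eager", "flashinfer"], default="eager")
    p.add_argument("--seed", type=int, default=None)
    p.add_argument("--diff-infer-steps", type=int, default=50)
    p.add_argument("--image-size", default="auto")
    p.add_argument("--save", default="image.png")
    p.add_argument("--verbose", type=int, choices=[0, 1], default=0)
    p.add_argument("--sys-deepseek-prompt",
                   choices=["universal", "text_rendering"], default="universal")
    return p

args = build_parser().parse_args(
    ["--prompt", "A brown and white dog is running on the grass",
     "--model-id", "./HunyuanImage-3"]
)
print(args.attn_impl, args.diff_infer_steps, args.save)
```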
### 🎨 Interactive Gradio Demo
* 🤗 [HuggingFace](https://huggingface.co/) - AI model hub and community
* ⚡ [FlashAttention](https://github.com/Dao-AILab/flash-attention) - Memory-efficient attention
* 🚀 [FlashInfer](https://github.com/flashinfer-ai/flashinfer) - Optimized inference engine