Update README.md
README.md
CHANGED
@@ -33,9 +33,10 @@ license: mit
 
 ## 📖 Introduction
 
-**UniPic2-Metaquery-9B** is an unified multimodal model built on Qwen2.5-VL-Instruct and SD3.5-Medium. It delivers end-to-end image understanding, text-to-image (T2I) generation, and image editing.
-<div align="center"> <img src="teaser.png" alt="Model Teaser" width="720"> </div>
-
+**UniPic2-Metaquery-9B** is a unified multimodal model built on Qwen2.5-VL-Instruct and SD3.5-Medium. It delivers end-to-end image understanding, text-to-image (T2I) generation, and image editing. It requires approximately 40 GB of VRAM; for NVIDIA RTX 40-series GPUs, we recommend [Skywork/UniPic2-Metaquery-Flash](https://huggingface.co/Skywork/UniPic2-Metaquery-Flash).
+
+<div align="center"> <img src="teaser.png" alt="Model Teaser" width="720"> </div>
+<div align="center"> <img src="understanding.png" alt="Understanding Examples" width="720"> </div>
 
 ## 📊 Benchmarks
 
@@ -61,6 +62,7 @@ cd UniPic-2
 
 ### 2. Set Up the Environment
 ```bash
+# Requires ~40 GB of VRAM; on NVIDIA RTX 40-series GPUs, use the Flash version instead
 conda create -n unipic python=3.10
 conda activate unipic
 pip install -r requirements.txt
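The commit's main point is the ~40 GB VRAM requirement of the full model. As an illustrative aside (not part of the commit), a small pre-flight check before creating the conda environment could report the GPU's total memory; the `check_gpu_vram` helper name is hypothetical, and the sketch assumes NVIDIA drivers may or may not be present:

```shell
# Hypothetical pre-flight check: report total GPU memory so you can confirm
# the ~40 GB the full UniPic2-Metaquery-9B pipeline needs before installing.
check_gpu_vram() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # memory.total is reported in MiB; ~40 GB corresponds to roughly 40960 MiB
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
  else
    echo "nvidia-smi not found; skipping VRAM check"
  fi
}

check_gpu_vram
```

If the reported total falls short of ~40 GB (as on RTX 40-series cards), the Flash variant linked above is the intended fallback.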