How to use AIDC-AI/Ovis-Image-7B with Diffusers:
    pip install -U diffusers transformers accelerate

    import torch
    from diffusers import DiffusionPipeline

    # switch to "mps" for apple devices
    pipe = DiffusionPipeline.from_pretrained("AIDC-AI/Ovis-Image-7B", dtype=torch.bfloat16, device_map="cuda")

    prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
    image = pipe(prompt).images[0]
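
The snippet above hard-codes `"cuda"` and leaves the switch to `"mps"` to the reader. A minimal sketch of making that choice at runtime follows; `pick_device` and `detect_device` are hypothetical helper names, not part of Diffusers or the Ovis-Image repository, and the CUDA-then-MPS-then-CPU preference order is an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Hypothetical helper: choose a device string from availability flags."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"  # Apple-silicon GPUs
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are usable; return "cpu" if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps may be absent on older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    return pick_device(torch.cuda.is_available(), bool(mps and mps.is_available()))
```

The result could then replace the literal `"cuda"` in the snippet, or the pipeline could be moved afterwards with `pipe.to(detect_device())`. Whether a 7B model in `bfloat16` runs acceptably on `"mps"` or `"cpu"` is not stated in the source and would need to be checked.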
Update README.md
README.md
CHANGED
@@ -21,6 +21,7 @@ tags:
 <img src=https://cdn-uploads.huggingface.co/production/uploads/636f4c6b5d2050767e4a1491/cfsnngElzYv8DbTKsLohl.png width="40%"/>
 </div>
 
+
 <p align="center">
 <a href="https://arxiv.org/abs/2511.22982"><img src="https://img.shields.io/badge/arXiv_paper-2511.22982-b31b1b.svg" alt="arxiv"></a>
 <a href="https://github.com/AIDC-AI/Ovis-Image/blob/main/docs/Ovis_Image_Technical_Report.pdf"><img src="https://img.shields.io/badge/Paper-PDF-b31b1b" alt="paper"></a>
@@ -36,12 +37,19 @@ constraints.
 
 
 
-<p align="center">
+<!-- <p align="center">
 <img src="https://cdn-uploads.huggingface.co/production/uploads/636f4c6b5d2050767e4a1491/10uxbIz-NJf1d716eFZ6O.png" width="95%">
 <br>
 <em>The overall architecture of Ovis-Image (cf. Fig.2 in our report).</em>
+</p> -->
+
+<p align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/636f4c6b5d2050767e4a1491/tHBZVUQg9rK3BoLpwCNSg.png" width="95%">
+<br>
+<em>The overall architecture of Ovis-Image.</em>
 </p>
 
+
 ## 🏆 Highlights
 
 * **Strong text rendering at a compact 7B scale**: Ovis-Image is a 7B text-to-image model that delivers text rendering quality comparable to much larger 20B-class systems such as Qwen-Image and competitive with leading closed-source models like GPT4o in text-centric scenarios, while remaining small enough to run on widely accessible hardware.