AIDC-AI/Ovis-Image-7B

image generation

Flourish committed 25 days ago

Commit 66ef018 · verified · 1 Parent(s): ac8fb10

Update README.md

Files changed (1)

README.md +9 -1

@@ -21,6 +21,7 @@ tags:
   <img src=https://cdn-uploads.huggingface.co/production/uploads/636f4c6b5d2050767e4a1491/cfsnngElzYv8DbTKsLohl.png width="40%"/>
 </div>
 <p align="center">
   <a href="https://arxiv.org/abs/2511.22982"><img src="https://img.shields.io/badge/arXiv_paper-2511.22982-b31b1b.svg" alt="arxiv"></a>
   <a href="https://github.com/AIDC-AI/Ovis-Image/blob/main/docs/Ovis_Image_Technical_Report.pdf"><img src="https://img.shields.io/badge/Paper-PDF-b31b1b" alt="paper"></a>
@@ -36,12 +37,19 @@ constraints.
-<p align="center">
+<!-- <p align="center">
   <img src="https://cdn-uploads.huggingface.co/production/uploads/636f4c6b5d2050767e4a1491/10uxbIz-NJf1d716eFZ6O.png" width="95%">
   <br>
   <em>The overall architecture of Ovis-Image (cf. Fig.2 in our report).</em>
+</p> -->
+<p align="center">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/636f4c6b5d2050767e4a1491/tHBZVUQg9rK3BoLpwCNSg.png" width="95%">
+  <br>
+  <em>The overall architecture of Ovis-Image.</em>
 </p>
 ## 🏆 Highlights
 *   **Strong text rendering at a compact 7B scale**: Ovis-Image is a 7B text-to-image model that delivers text rendering quality comparable to much larger 20B-class systems such as Qwen-Image and competitive with leading closed-source models like GPT4o in text-centric scenarios, while remaining small enough to run on widely accessible hardware.