OpenGVLab/InternVL-Chat-V1-1

Image-Text-to-Text

feature-extraction

czczup committed on Apr 20, 2024

Commit 1165597 · verified · 1 Parent(s): f92cfa6

Update README.md

Files changed (1)

README.md +10 -10

@@ -10,16 +10,19 @@ datasets:
 pipeline_tag: visual-question-answering
 ---
-# Model Card for InternVL-Chat-Chinese-V1.1
-<img src="https://cdn-uploads.huggingface.co/production/uploads/64119264f0f81eb569e0d569/-N5Kz3SQM2KOxN0m70ecj.webp" alt="Image Description" width="300" height="300">
-## What is InternVL?
-\[[Paper](https://arxiv.org/abs/2312.14238)\]  \[[GitHub](https://github.com/OpenGVLab/InternVL)\] \[[Chat Demo](https://internvl.opengvlab.com/)\]
-InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
 ## Model Details
 - **Model Type:** multimodal large language model (MLLM)
@@ -40,13 +43,10 @@ InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
 ## Model Usage
-We provide a minimum code example to run InternVL-Chat using only the `transformers` library.
 You also can use our [online demo](https://internvl.opengvlab.com/) for a quick experience of this model.
-Note: If you meet this error `ImportError: This modeling file requires the following packages that were not found in your environment: fastchat`, please run `pip install fschat`.
 ```python
 import torch
 from PIL import Image

 pipeline_tag: visual-question-answering
 ---
+# Model Card for InternVL-Chat-V1.1
+<img src="https://cdn-uploads.huggingface.co/production/uploads/64119264f0f81eb569e0d569/-N5Kz3SQM2KOxN0m70ecj.webp" alt="Image Description" width="300" height="300">
+\[[Paper](https://arxiv.org/abs/2312.14238)\]  \[[GitHub](https://github.com/OpenGVLab/InternVL)\] \[[Chat Demo](https://internvl.opengvlab.com/)\] \[[中文解读](https://zhuanlan.zhihu.com/p/675877376)]
+| Model                   | Date       | Download                                                                             | Note                               |
+| ----------------------- | ---------- | ------------------------------------------------------------------------------------ | ---------------------------------- |
+| InternVL-Chat-V1.5 | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5)       | support 4K image; super strong OCR; Approaching the performance of GPT-4V and Gemini Pro on various benchmarks like MMMU, DocVQA, ChartQA, MathVista, etc. (🔥new)|
+| InternVL-Chat-V1.2-Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-Chinese-V1-2-Plus)       | more SFT data and stronger  |
+| InternVL-Chat-V1.2      | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-Chinese-V1-2)            | scaling up LLM to 34B       |
+| InternVL-Chat-V1.1      | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-Chinese-V1-1)            | support Chinese and stronger OCR   |
 ## Model Details
 - **Model Type:** multimodal large language model (MLLM)
 ## Model Usage
+We provide an example code to run InternVL-Chat-V1.1 using only the `transformers` library.
 You also can use our [online demo](https://internvl.opengvlab.com/) for a quick experience of this model.
 ```python
 import torch
 from PIL import Image