---
inference: false
---
<br>
<br>

# ShareGPT4V-7B Model Card

## Model details

**Model type:**
ShareGPT4V-7B is an open-source chatbot trained by fine-tuning the CLIP vision tower and LLaMA/Vicuna on GPT4-Vision-assisted [ShareGPT4V](https://huggingface.co/datasets/Lin-Chen/ShareGPT4V) data and LLaVA instruction-tuning data.
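
The card itself ships no usage snippet; as a rough orientation, the sketch below outlines single-image inference. It assumes the linked code repository exposes a LLaVA-style Python API under a `share4v` package, so the module paths, the `eval_model` entry point, and the `example.jpg` image path are assumptions rather than a confirmed interface; consult the repository linked above for the actual entry points.

```python
# Minimal single-image inference sketch (not an official snippet).
# ASSUMPTION: the ShareGPT4V repo exposes a LLaVA-style API under a
# `share4v` package; the module paths and `eval_model` mirror upstream
# LLaVA and may differ -- check the linked repository.
from share4v.mm_utils import get_model_name_from_path  # assumed module path
from share4v.eval.run_share4v import eval_model        # assumed module path

model_path = "Lin-Chen/ShareGPT4V-7B"

args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": "Describe this image in detail.",
    "conv_mode": None,
    "image_file": "example.jpg",  # hypothetical local image path
    "sep": ",",
    "temperature": 0.2,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)
```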

**Model date:**
ShareGPT4V-7B was trained in Nov 2023.

**Paper or resources for more information:**
[[Project](https://ShareGPT4V.github.io/)] [[Paper](https://huggingface.co/papers/2311.12793)] [[Code](https://github.com/InternLM/InternLM-XComposer/tree/main/projects/ShareGPT4V)]

## License
Llama 2 is licensed under the LLAMA 2 Community License,
Copyright (c) Meta Platforms, Inc. All Rights Reserved.

## Intended use
**Primary intended uses:**
The primary use of ShareGPT4V-7B is research on large multimodal models and chatbots.

**Primary intended users:**
The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

## Training dataset
- 1.2M high-quality image-text pairs, i.e., the ShareGPT4V-PT data
- 100K GPT4-Vision-generated image-text pairs
- LLaVA instruction-tuning data
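
For readers who want to inspect this data, the sketch below lists and downloads annotation files from the linked dataset repo using standard `huggingface_hub` calls. It assumes the annotations are LLaVA-style JSON lists of records and deliberately hard-codes no filename, since the exact file names should be read off the repo listing.

```python
# Sketch: list and fetch ShareGPT4V annotation files from the Hub.
# Uses standard huggingface_hub calls; which JSON file holds which
# subset (pretraining vs. 100K GPT4-Vision captions) must be read off
# the listing -- no filename is hard-coded here on purpose.
import json
from huggingface_hub import hf_hub_download, list_repo_files

repo_id = "Lin-Chen/ShareGPT4V"
files = list_repo_files(repo_id, repo_type="dataset")
json_files = [f for f in files if f.endswith(".json")]
print(json_files)  # pick the annotation file you want from this listing

# ASSUMPTION: LLaVA-style annotations, i.e. each JSON file is a list of
# records; peek at the first one to check the schema.
path = hf_hub_download(repo_id, filename=json_files[0], repo_type="dataset")
with open(path) as fh:
    records = json.load(fh)
print(len(records), records[0])
```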

## Evaluation dataset
A collection of 11 benchmarks.