Tags: Image-Text-to-Text · Transformers · TensorBoard · Safetensors · feature-extraction · conversational · custom_code
xiangan committed · Commit 80719a5 · verified · 1 parent: 75a1758

Update README.md

Files changed (1): README.md (+2 −3)

README.md CHANGED
```diff
@@ -31,7 +31,6 @@ pipeline_tag: image-text-to-text
 
 ## Introduction
 
-Copilot said: LLaVA-OneVision-1.5 is a fully open-source family of
 LLaVA-OneVision-1.5 is a fully open-source family of large multimodal models (LMMs) built to democratize multimodal training. Trained on native‑resolution images, it delivers state‑of‑the‑art performance at substantially lower cost. The project also releases high‑quality pretraining and SFT data, a complete and efficient training framework with recipes and configs, and comprehensive logs to support transparent, reproducible research.
 #### **Superior Performance**
 - The model leads on multiple multimodal benchmarks and generally surpasses Qwen2.5-VL.
@@ -69,8 +68,8 @@ LLaVA-OneVision-1.5 is a fully open-source family of large multimodal models (LM
 
 | Description | Link | Status |
 |--------------------|--------------------------------------------------------------------------------------------------------|-------------|
-| OV-1.5-Mid-Training-85M | [🤗HF/85M](https://huggingface.co/datasets/lmms-lab/LLaVA-One-Vision-1.5-Mid-Training-85M) | Uploading… |
-| OV-1.5-Instruct | [🤗HF/Inst](https://huggingface.co/datasets/lmms-lab/LLaVA-OneVision-1.5-Insturct-Data) | Uploading… |
+| LLaVA-OneVision-1.5-Mid-Training-85M | [🤗HF / Mid-Training 85M](https://huggingface.co/datasets/mvp-lab/LLaVA-OneVision-1.5-Mid-Training-85M) | Uploading… |
+| LLaVA-OneVision-1.5-Instruct | [🤗HF / Instruct-Data](https://huggingface.co/datasets/mvp-lab/LLaVA-OneVision-1.5-Instruct-Data) | Available |
 
 
 ## Evaluation Results
```