Update README.md
#1
by jiangchengchengNLP - opened
README.md
CHANGED
@@ -8,7 +8,7 @@ base_model:
 ---
 # Visual Language Model Based on Qwen and CLIP

-This is a visual language multimodal model built upon the Qwen series language models and the CLIP visual encoder. It has been trained for 10 epochs on the LLaVA pre-training dataset and nearly 800K examples (150K instruction fine-tuning and 665K instruction mixed fine-tuning). However, due to data size is larger
+This is a visual language multimodal model built on the Qwen series language models and the CLIP vision encoder. It was trained for 10 epochs on the LLaVA pre-training dataset, nearly 800K examples in total (150K instruction fine-tuning and 665K mixed instruction fine-tuning). However, given the size of the data relative to the model, it can only perform simple question-answering tasks on images, and it currently supports only English.
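The card's architecture (a CLIP vision encoder feeding a Qwen language model) follows the LLaVA recipe: a projection layer maps CLIP patch features into the language model's embedding space, and the projected "visual tokens" are prepended to the text-token embeddings. The sketch below illustrates only that data flow; all dimensions, the `matmul` helper, and the random stand-in tensors are hypothetical, not the real model's weights or sizes.

```python
import random

def matmul(x, w):
    """Multiply an (n x d_in) list-of-lists by a (d_in x d_out) weight matrix."""
    return [[sum(xi[k] * w[k][j] for k in range(len(w)))
             for j in range(len(w[0]))] for xi in x]

random.seed(0)
d_vision, d_model = 8, 4            # toy sizes; real CLIP/Qwen dims are far larger
num_patches, num_text_tokens = 3, 5

# Stand-ins for CLIP patch features and Qwen text-token embeddings.
vision_feats = [[random.random() for _ in range(d_vision)] for _ in range(num_patches)]
text_embeds = [[random.random() for _ in range(d_model)] for _ in range(num_text_tokens)]

# The trainable projector: maps vision features into the LM embedding space.
projector = [[random.random() for _ in range(d_model)] for _ in range(d_vision)]
visual_tokens = matmul(vision_feats, projector)

# Visual tokens are prepended to the text sequence before the LM forward pass.
lm_input = visual_tokens + text_embeds
print(len(lm_input), len(lm_input[0]))  # 8 tokens, each of width d_model
```

In the real model the projector is what the LLaVA pre-training stage learns, while the vision encoder stays frozen; the instruction fine-tuning stages then update the language model on the combined sequence.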

 ## Training Details