Initial upload LLaVA-OneVision model

Files changed (6)

.gitignore CHANGED

@@ -1,3 +1,6 @@
 n1_*.bin
-n1_*.tar
-n1_*.txt
+# Ensure that no Git-related files are committed
+.git*
+!.gitattributes
+!.gitignore

README.md ADDED

+---
+library_name: pytorch
+---
+![llava_onevision_logo](resource/LLaVA_onevision.png)
+LLaVA-OneVision is a multimodal vision-language model that integrates a pretrained Qwen-2 language model with a visual encoder, enabling instruction-tuned understanding and reasoning across text and images.
+Original paper: [LLaVA-OneVision: Easy Visual Task Transfer](https://arxiv.org/abs/2408.03326)
+# LLaVA-OneVision-Qwen2-7B
+This model uses LLaVA-OneVision with Qwen-2 as the language backbone, allowing rich multimodal reasoning and generation capabilities. It is well suited for applications such as image-grounded question answering, multimodal dialogue, and tasks requiring aligned understanding of visual and textual information.
+Model Configuration:
+- Reference implementation: [LLaVA_OneVision](https://github.com/LLaVA-VL/LLaVA-NeXT)
+- Original Weight: [llava-onevision-qwen2-7b-ov-chat](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov-chat)
+- Vision Encoder: SO400M
+- Language Model: Qwen-2.0
+- Resolution: 3x384x384
+- Supported Cooper versions:
+    - Cooper SDK: [2.5.2]
+    - Cooper Foundry: [2.2]
+| Model | Device | Model Link |
+| :-----: | :-----: | :-----: |
+| LLaVA-OneVision | N1-655 | [Model_Link](https://huggingface.co/Ambarella/LLaVA-OneVision/blob/main/n1-655_llava_onevision_7B_1NVP.tar) |

n1-655_llava_onevision_7B_1NVP.tar_sha256.txt ADDED

@@ -0,0 +1 @@
+7df1c58d2d9f8920b52f1b04e081bf340741c155ac10dd6e555d8934f0d7bfba cooper_pro_prebuilt_llm_llava-onevision-7b_1NVP_HayPlus_1.2.1.36_Shepherd_1.5.0_20250530.tar

n1_llava_onevision_7B_6NVP.tar ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:32b3517b80151c6225b023201ad8936884c1ece7e3c6d1234422db094d190ffe
+size 7193415680
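
The three added lines above are a Git LFS pointer, not the archive itself: they record the pointer spec version, the SHA-256 object ID, and the byte size of the real file. A minimal sketch of reading those fields follows; the function name `parse_lfs_pointer` and the returned dictionary layout are illustrative assumptions, not part of this repository.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:32b3517b80151c6225b023201ad8936884c1ece7e3c6d1234422db094d190ffe
size 7193415680
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["digest"], info["size"])
```

The parsed `size` is the size in bytes of the real archive, so a local copy that matches neither the size nor the digest is most likely still the pointer file or an incomplete download.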

n1_llava_onevision_7B_6NVP.tar_sha256.txt ADDED

@@ -0,0 +1 @@
+32b3517b80151c6225b023201ad8936884c1ece7e3c6d1234422db094d190ffe cooper_max_prebuilt_llm_llava-onevision_6NVP_HayPlus_1.2.1.36_Shepherd_1.5.0_20250530.tar
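
The digest in this checksum file is the same value as the Git LFS `oid` of `n1_llava_onevision_7B_6NVP.tar`, so it can be used to check a downloaded copy. The sketch below shows one way to do that with Python's standard `hashlib`. The helper names are illustrative assumptions, and because the file name recorded in the checksum line differs from the name of the archive in the repository, only the digest is compared.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a multi-gigabyte archive never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_line(line):
    """Split a '<hex digest> <file name>' line into its two parts (hypothetical helper)."""
    expected, _, name = line.strip().partition(" ")
    return expected.lower(), name.strip()


def verify(archive_path, checksum_line):
    """Return True when the archive's SHA-256 equals the digest on the checksum line."""
    expected, _ = parse_checksum_line(checksum_line)
    return sha256_of_file(archive_path) == expected


if __name__ == "__main__":
    # Assumed local paths: both files downloaded into the current directory.
    archive = Path("n1_llava_onevision_7B_6NVP.tar")
    checksum_file = Path("n1_llava_onevision_7B_6NVP.tar_sha256.txt")
    if archive.exists() and checksum_file.exists():
        line = checksum_file.read_text().splitlines()[0]
        print("OK" if verify(archive, line) else "MISMATCH")
```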

resource/LLaVA_onevision.png ADDED