cooper_robot committed on
Commit fd151db · 1 Parent(s): 465ccb9

Initial upload LLaVA-OneVision model

.gitignore CHANGED
@@ -1,3 +1,6 @@
  n1_*.bin
- n1_*.tar
- n1_*.txt
+
+ # Ensure that no Git-related files are committed
+ .git*
+ !.gitattributes
+ !.gitignore
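
The effect of the new patterns can be illustrated with a small sketch. It is a deliberately simplified model of `.gitignore` matching, assuming only bare file names at the repository root and the "last matching pattern wins" rule; it is not Git's actual implementation, and the helper name `is_ignored` is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns from the updated .gitignore, in file order.
PATTERNS = ["n1_*.bin", ".git*", "!.gitattributes", "!.gitignore"]


def is_ignored(filename, patterns=PATTERNS):
    """Simplified check: the last pattern that matches decides the outcome.

    Assumption: only top-level file names are handled; directory rules,
    slashes, and other gitignore features are intentionally left out.
    """
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(filename, glob):
            # A negated pattern re-includes the file; a plain one ignores it.
            ignored = not negated
    return ignored


for name in [".gitignore", ".gitattributes", ".gitmodules",
             "n1_model.bin", "n1_llava_onevision_7B_6NVP.tar"]:
    print(f"{name}: {'ignored' if is_ignored(name) else 'tracked'}")
```

Under this model, `.gitignore` and `.gitattributes` stay tracked because the negated patterns come after `.git*`. The `.tar` archive is no longer ignored now that the `n1_*.tar` rule has been removed.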
README.md ADDED
@@ -0,0 +1,27 @@
+ ---
+ library_name: pytorch
+ ---
+
+ ![llava_onevision_logo](resource/LLaVA_onevision.png)
+
+ LLaVA-OneVision is a multimodal vision-language model that integrates a pretrained Qwen-2 language model with a visual encoder, enabling instruction-tuned understanding and reasoning across text and images.
+
+ Original paper: [LLaVA-OneVision: Easy Visual Task Transfer](https://arxiv.org/abs/2408.03326)
+
+ # LLaVA-OneVision-Qwen2-7B
+
+ This model uses LLaVA-OneVision with Qwen-2 as the language backbone, which supports multimodal reasoning and generation. It is well suited for applications such as image-grounded question answering, multimodal dialogue, and tasks requiring aligned understanding of visual and textual information.
+
+ Model Configuration:
+ - Reference implementation: [LLaVA_OneVision](https://github.com/LLaVA-VL/LLaVA-NeXT)
+ - Original weights: [llava-onevision-qwen2-7b-ov-chat](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov-chat)
+ - Vision Encoder: SO400M
+ - Language Model: Qwen-2.0
+ - Resolution: 3x384x384
+ - Supported Cooper versions:
+   - Cooper SDK: [2.5.2]
+   - Cooper Foundry: [2.2]
+
+ | Model | Device | Model Link |
+ | :-----: | :-----: | :-----: |
+ | LLaVA-OneVision | N1-655 | [Model_Link](https://huggingface.co/Ambarella/LLaVA-OneVision/blob/main/n1-655_llava_onevision_7B_1NVP.tar) |
n1-655_llava_onevision_7B_1NVP.tar_sha256.txt ADDED
@@ -0,0 +1 @@
+ 7df1c58d2d9f8920b52f1b04e081bf340741c155ac10dd6e555d8934f0d7bfba cooper_pro_prebuilt_llm_llava-onevision-7b_1NVP_HayPlus_1.2.1.36_Shepherd_1.5.0_20250530.tar
n1_llava_onevision_7B_6NVP.tar ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:32b3517b80151c6225b023201ad8936884c1ece7e3c6d1234422db094d190ffe
+ size 7193415680
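
The three lines above are a Git LFS pointer rather than the archive itself: they record the spec version, the SHA-256 object ID, and the size in bytes of the real file. A minimal sketch of reading such a pointer follows. The function name `parse_lfs_pointer` and the returned dictionary keys are hypothetical choices for illustration.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:32b3517b80151c6225b023201ad8936884c1ece7e3c6d1234422db094d190ffe
size 7193415680
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["digest"])
print(f"{info['size_bytes'] / 1e9:.2f} GB")
```

The parsed size shows that the archive is roughly 7.19 GB, so it is stored through LFS instead of directly in the Git history.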
n1_llava_onevision_7B_6NVP.tar_sha256.txt ADDED
@@ -0,0 +1 @@
+ 32b3517b80151c6225b023201ad8936884c1ece7e3c6d1234422db094d190ffe cooper_max_prebuilt_llm_llava-onevision_6NVP_HayPlus_1.2.1.36_Shepherd_1.5.0_20250530.tar
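
The digest in this checksum file is the same value as the `oid` in the LFS pointer for `n1_llava_onevision_7B_6NVP.tar`, so it can be used to check a downloaded archive. The sketch below shows one way to do that with Python's standard `hashlib`. The helper names are hypothetical. Because the file name recorded in the checksum line differs from the name in this repository, the sketch compares only the digest and leaves the recorded name unused.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a multi-GB archive never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_line(line):
    """Split a '<hex digest> <file name>' line into its two parts."""
    expected, _, name = line.strip().partition(" ")
    return expected.lower(), name.strip()


def verify(archive_path, checksum_path):
    """Return True when the archive's SHA-256 matches the recorded digest."""
    expected, _recorded_name = parse_checksum_line(Path(checksum_path).read_text())
    return sha256_of_file(archive_path) == expected
```

Assuming both files have been downloaded to the working directory, a call such as `verify("n1_llava_onevision_7B_6NVP.tar", "n1_llava_onevision_7B_6NVP.tar_sha256.txt")` would return `True` for an intact archive.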
resource/LLaVA_onevision.png ADDED

Git LFS Details

  • SHA256: 5a175faf0f0ad1fe3581a00305af8c44a15fcdebb81d2e504f0f161e0e89cad6
  • Pointer size: 131 Bytes
  • Size of remote file: 771 kB