lihongjie committed · Commit c0459a6 · 1 Parent(s): 911f76f
first commit
This view is limited to 50 files because it contains too many changes.
- .gitattributes +58 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/Qwen3-VL-8B-Instruct_vision.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/model.embed_tokens.weight.bfloat16.bin +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l0_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l10_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l11_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l12_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l13_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l14_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l15_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l16_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l17_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l18_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l19_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l1_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l20_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l21_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l22_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l23_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l24_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l25_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l26_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l27_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l28_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l29_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l2_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l30_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l31_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l32_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l33_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l34_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l35_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l3_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l4_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l5_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l6_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l7_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l8_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l9_together.axmodel +3 -0
- Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_post.axmodel +3 -0
- README.md +233 -3
- config.json +0 -0
- images/demo.jpg +3 -0
- images/demo1.jpg +3 -0
- images/recoAll_attractions_1.jpg +3 -0
- images/recoAll_attractions_2.jpg +3 -0
- images/recoAll_attractions_3.jpg +3 -0
- images/recoAll_attractions_4.jpg +3 -0
- images/ssd_car.jpg +3 -0
- images/ssd_horse.jpg +3 -0
.gitattributes
CHANGED
|
@@ -33,3 +33,61 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
|
| 33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l13_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
images/recoAll_attractions_1.jpg filter=lfs diff=lfs merge=lfs -text
|
| 38 |
+
images/recoAll_attractions_3.jpg filter=lfs diff=lfs merge=lfs -text
|
| 39 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l27_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 40 |
+
images/recoAll_attractions_4.jpg filter=lfs diff=lfs merge=lfs -text
|
| 41 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l26_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 42 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l32_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 43 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_post.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 44 |
+
video/frame_0016.jpg filter=lfs diff=lfs merge=lfs -text
|
| 45 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/model.embed_tokens.weight.bfloat16.bin filter=lfs diff=lfs merge=lfs -text
|
| 46 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l21_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 47 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l5_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 48 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l2_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 49 |
+
video/frame_0008.jpg filter=lfs diff=lfs merge=lfs -text
|
| 50 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/Qwen3-VL-8B-Instruct_vision.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 51 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l17_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 52 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l20_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 53 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l18_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 54 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l28_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 55 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l4_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 56 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l22_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 57 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l29_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 58 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l9_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 59 |
+
images/ssd_car.jpg filter=lfs diff=lfs merge=lfs -text
|
| 60 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l10_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 61 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l15_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 62 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l1_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 63 |
+
video/frame_0040.jpg filter=lfs diff=lfs merge=lfs -text
|
| 64 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l25_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 65 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l6_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 66 |
+
images/ssd_horse.jpg filter=lfs diff=lfs merge=lfs -text
|
| 67 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l0_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 68 |
+
video/frame_0000.jpg filter=lfs diff=lfs merge=lfs -text
|
| 69 |
+
video/frame_0032.jpg filter=lfs diff=lfs merge=lfs -text
|
| 70 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l23_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 71 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l12_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 72 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l34_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 73 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l33_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 74 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l35_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 75 |
+
images/demo.jpg filter=lfs diff=lfs merge=lfs -text
|
| 76 |
+
images/recoAll_attractions_2.jpg filter=lfs diff=lfs merge=lfs -text
|
| 77 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l16_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 78 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l31_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 79 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l3_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 80 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l8_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 81 |
+
images/demo1.jpg filter=lfs diff=lfs merge=lfs -text
|
| 82 |
+
video/frame_0024.jpg filter=lfs diff=lfs merge=lfs -text
|
| 83 |
+
video/frame_0056.jpg filter=lfs diff=lfs merge=lfs -text
|
| 84 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l14_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 85 |
+
video/frame_0048.jpg filter=lfs diff=lfs merge=lfs -text
|
| 86 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l30_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 87 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l7_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 88 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l24_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 89 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l11_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 90 |
+
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l19_together.axmodel filter=lfs diff=lfs merge=lfs -text
|
| 91 |
+
main_ax650 filter=lfs diff=lfs merge=lfs -text
|
| 92 |
+
main_axcl_aarch64 filter=lfs diff=lfs merge=lfs -text
|
| 93 |
+
main_axcl_x86 filter=lfs diff=lfs merge=lfs -text
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/Qwen3-VL-8B-Instruct_vision.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:dd6d98cb2bfb3f65992135f353226d3973040584083146142c78c0861cc1d0b9
|
| 3 |
+
size 650854262
|
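Each model file in this commit is stored as a Git LFS pointer with three fields: `version`, `oid` and `size`. The sketch below shows one way to check a downloaded file against such a pointer. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration, not part of any Git LFS tooling; the sample pointer is the one shown above for the vision axmodel.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and SHA-256 recorded in the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first: a size mismatch means a bad download
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so multi-hundred-megabyte model files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["digest"]


# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:dd6d98cb2bfb3f65992135f353226d3973040584083146142c78c0861cc1d0b9
size 650854262
"""

if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER))
```

Comparing the size before hashing keeps the check fast when a download was truncated.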
Qwen3-VL-8B-Instruct-AX650-c128_p1152/model.embed_tokens.weight.bfloat16.bin
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d6d8b73339dcddb4ecf93c0741c319af21e7f1c0f2a224b31a7cc7b0d42045c0
|
| 3 |
+
size 1244659712
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l0_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:881c7a46f42d0a83a85bd3fc0e737343e0d4a676536de83591fecdb95695f564
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l10_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:b794170719bca792ab4780c1064208076b16c114eb19d1adf0375758ff3c42ed
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l11_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:a3dac0a799288df45fa3739ca790c8833d16edbdac66c91afacd4bd250095d61
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l12_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c6ec5567b12dc45d667180c503d1f0c60acbb4806fe02596602e099ede7f5e11
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l13_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6e980c9c2af107dddd59340c077f90ff1c67416d343a43ed765ee279af31d51b
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l14_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:832013d6b7604c75597e8b10153afc9c7f3022a30066f9cf3507ae7437ad2790
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l15_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6dc9e3f43a91b30e18c693dba5ede65e5182d485933377d3d3a41fc52fdb20a4
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l16_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5c2f17fb16de54ba78d31d4a0df3e3a38a4136c4a418dd1ac2f50ce4144dc985
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l17_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d65b69df218f437570a0323c35409fb46484d8cac211299e10d0cd6c21ccdbd5
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l18_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:971ddf3a13da777660b6b4efc40b0d4bd6a64572253c81da629d983a28683224
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l19_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c1a10ebe6aa025b00080983cfec1cbcfa685b9c3d01845de69eb1720d5fa2fbe
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l1_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:448b28146ec968437d9501e1ef2564dcee314a9497c26bceece52db7a3b0b12e
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l20_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:944a05f0a75d37897f62fbe995b2122b93947d44a0c8fecd10e282955d33c419
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l21_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c4b27f4554dcfabd89d9a22a496ffe12aa84893310d845f44753784720aef13d
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l22_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:2943d8b0e22659885c088043d1853a65948ef60c00978a94f0138ba13133440b
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l23_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:38c0e6adb898fca2e7fb46a8a6df1b620151230e6f11674ec24615f0fdbcba1b
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l24_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7361ba4608d5b0987a7d151992b2666760796af510381a63d0c89f0ec2a66495
|
| 3 |
+
size 243172687
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l25_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:9988bd0f18617585ab765a85f197dae8b1c5a7028f1d787a88c9f507204649b6
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l26_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:9586a62e31b6b41ffeed3444f529439b7548bb5ba448d14a3d4a27dcd36cd74c
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l27_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7627715529c4a81e2a2c625e1a45d7219906952e87e8537e43977d84a66021dd
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l28_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d7010d677a1fdecac02496b10b0601da5fd31b8649cb308779069575fe8af06e
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l29_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:35f2e6ee683aa6ae8d6a506667fecd007a649f17457c346478b534ae4f031e4b
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l2_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6d2b018f47fee60f7508044365c4baebc178f7456d4de4f2c64db23bb61f288b
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l30_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4649aa8bf9897781bbafe1b943d1ce7204b79c42a0c8357092e827473721811f
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l31_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:87e78c853c777ea1aba94dcb6302d454635ac7c674a6fc76bbc86e871d3d7d60
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l32_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:91899cccad6559ac5af8bbfc95051ae7c7bc585a5f1545b4ef839e084a18d734
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l33_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:36a5df29bef7a521fa0cc484ead546d8e379ea95d6e781f4dbe2400c58058467
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l34_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:765ffa314132d7cffa105d64928cd656c598df4d69cdcc37bcf00ff0f60a9f7a
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l35_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d4273ef73fa4f5d7b0eb29d057146d471943f0273e3a0bdd9465983071509813
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l3_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:b2a102908ac2b60c29b82ff4b015d7e0df46c286d3e7e56d002d358c9d431866
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l4_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:fd140a8140492c985b3746adaf1bb3f9b2c1087a1d23091a3c65e2cf67500a06
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l5_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5a775257727d15f68f06b6ebac17b53aa59683a7c87b80f1e1b21f527d8dcb80
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l6_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:020ced14c5c98d934aaace2f5308689b6d8883878eef53b924c919b3eaf87df8
|
| 3 |
+
size 243172911
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l7_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:aea136f3260cd658226b69f4ca894114961853435af36995dcbb9f2e735047ac
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l8_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:12a542d2c9de16b8be544d79b6728389762da1432b0fa34c7782f85633fc6f81
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_p128_l9_together.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:a849b05cc25c44be854a13a0291652668cced55d9bca317a2f075917046a0ffc
|
| 3 |
+
size 243172527
|
Qwen3-VL-8B-Instruct-AX650-c128_p1152/qwen3_vl_text_post.axmodel
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:19c12b62141e847943c8983b1ffdde9555bffecd86b07dde325d6cbc6475cb21
|
| 3 |
+
size 678989672
|
README.md
CHANGED
|
@@ -1,3 +1,233 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
language:
|
| 4 |
+
- en
|
| 5 |
+
- zh
|
| 6 |
+
base_model:
|
| 7 |
+
- Qwen/Qwen3-VL-2B-Instruct
|
| 8 |
+
- Qwen/Qwen3-VL-4B-Instruct
|
| 9 |
+
- Qwen/Qwen3-VL-8B-Instruct
|
| 10 |
+
pipeline_tag: image-text-to-text
|
| 11 |
+
library_name: transformers
|
| 12 |
+
tags:
|
| 13 |
+
- Qwen3-VL
|
| 14 |
+
- Qwen3-VL-2B-Instruct
|
| 15 |
+
- Qwen3-VL-4B-Instruct
|
| 16 |
+
- Qwen3-VL-8B-Instruct
|
| 17 |
+
- Int8
|
| 18 |
+
- VLM
|
| 19 |
+
- GPTQ
|
| 20 |
+
---
|
| 21 |
+
|
| 22 |
+
# Qwen3-VL
|
| 23 |
+
|
| 24 |
+
This version of Qwen3-VL-2B-Instruct has been converted to run on the Axera NPU using **w8a16** quantization.
|
| 25 |
+
|
| 26 |
+
Compatible with Pulsar2 version: 5.0
|
| 27 |
+
|
| 28 |
+
## Conversion tool links
|
| 29 |
+
|
| 30 |
+
If you are interested in model conversion, you can export the axmodel yourself starting from the original repositories:
|
| 31 |
+
|
| 32 |
+
- https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct
|
| 33 |
+
- https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct
|
| 34 |
+
|
| 35 |
+
[Pulsar2 Link, How to Convert LLM from Huggingface to axmodel](https://pulsar2-docs.readthedocs.io/en/latest/appendix/build_llm.html)
|
| 36 |
+
|
| 37 |
+
[AXera NPU HOST LLM Runtime](https://github.com/AXERA-TECH/Qwen3-VL.AXERA)
|
| 38 |
+
|
| 39 |
+
|
| 40 |
+
## Supported Platforms
|
| 41 |
+
|
| 42 |
+
- AX650
|
| 43 |
+
- AX650N DEMO Board
|
| 44 |
+
- [M4N-Dock(爱芯派Pro)](https://wiki.sipeed.com/hardware/zh/maixIV/m4ndock/m4ndock.html)
|
| 45 |
+
- [M.2 Accelerator card](https://axcl-docs.readthedocs.io/zh-cn/latest/doc_guide_hardware.html)
|
| 46 |
+
|
| 47 |
+
**Image Process**
|
| 48 |
+
| Chip | Input size | Image num | Image encoder | TTFT (168 tokens) | w8a16 | CMM | Flash |
|
| 49 |
+
|--|--|--|--|--|--|--|--|
|
| 50 |
+
|AX650| 384*384 | 1 | ms | ms | tokens/sec| GiB | 14 GiB |
|
| 51 |
+
|
| 52 |
+
**Video Process**
|
| 53 |
+
| Chip | Input size | Image num | Image encoder | TTFT (600 tokens) | w8a16 | CMM | Flash |
|
| 54 |
+
|--|--|--|--|--|--|--|--|
|
| 55 |
+
|AX650| 384*384 | 8 | ms | ms | tokens/sec| GiB | 14 GiB |
|
| 56 |
+
|
| 57 |
+
The CMM column is the amount of CMM memory (allocated from DDR) that the model consumes. Ensure that the CMM allocation on the development board is larger than this value.
|
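The runtime prints `Total CMM` at the start of initialization and `Left CMM` once the models are loaded, as the demo logs later in this README show. A minimal sketch for estimating CMM consumption from such a log follows. The function name `cmm_used_mb` is hypothetical, and the difference is only an estimate that assumes nothing else allocated CMM during initialization.

```python
import re


def cmm_used_mb(log_text: str) -> int:
    """Estimate CMM consumed during init as 'Total CMM' minus 'Left CMM' (in MB)."""
    total = re.search(r"Total CMM:\s*(\d+)\s*MB", log_text)
    left = re.search(r"Left CMM:\s*(\d+)\s*MB", log_text)
    if total is None or left is None:
        raise ValueError("log does not contain both 'Total CMM' and 'Left CMM' lines")
    return int(total.group(1)) - int(left.group(1))


# Lines copied from the image demo log in this README.
SAMPLE_LOG = """[I][ Init][ 158]: Total CMM:4353 MB
[I][ Init][ 372]: LLM init ok
[I][ Init][ 374]: Left CMM:854 MB
"""

if __name__ == "__main__":
    print(cmm_used_mb(SAMPLE_LOG), "MB")  # 4353 - 854 = 3499
```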
| 58 |
+
|
| 59 |
+
## How to use
|
| 60 |
+
|
| 61 |
+
Download all files from this repository to the device.
|
| 62 |
+
|
| 63 |
+
**If you are using an AX650 board**
|
| 64 |
+
|
| 65 |
+
### Prepare tokenizer server
|
| 66 |
+
|
| 67 |
+
#### Install transformers
|
| 68 |
+
|
| 69 |
+
```
|
| 70 |
+
pip install -r requirements.txt
|
| 71 |
+
```
|
| 72 |
+
|
| 73 |
+
### Demo Run
|
| 74 |
+
|
| 75 |
+
#### Image understanding demo
|
| 76 |
+
|
| 77 |
+
##### Start the tokenizer server for the image understanding demo
|
| 78 |
+
|
| 79 |
+
```
|
| 80 |
+
python3 tokenizer_images.py --port 8080
|
| 81 |
+
```
|
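The demo binary connects to the tokenizer server at `http://127.0.0.1:8080` during initialization, so it can help to confirm the port is accepting connections before launching it. The sketch below is one way to do that with the Python standard library. The helper name `tokenizer_server_reachable` is hypothetical, and it only checks that a TCP connection succeeds, not that the server answers tokenizer requests correctly.

```python
import socket


def tokenizer_server_reachable(host: str = "127.0.0.1", port: int = 8080,
                               timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    if tokenizer_server_reachable():
        print("tokenizer server is accepting connections on port 8080")
    else:
        print("tokenizer server is not reachable; start it before running the demo")
```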
| 82 |
+
|
| 83 |
+
##### Run the image understanding demo
|
| 84 |
+
|
| 85 |
+
- input text
|
| 86 |
+
|
| 87 |
+
```
|
| 88 |
+
描述这张图片
|
| 89 |
+
```
|
| 90 |
+
|
| 91 |
+
- input image
|
| 92 |
+
|
| 93 |
+

|
| 94 |
+
|
| 95 |
+
```
|
| 96 |
+
root@ax650 ~/Qwen3-VL-2B-Instruct-GPTQ-Int4 # bash run_image_ax650.sh
|
| 97 |
+
[I][ Init][ 156]: LLM init start
|
| 98 |
+
[I][ Init][ 158]: Total CMM:4353 MB
|
| 99 |
+
[I][ Init][ 34]: connect http://127.0.0.1:8080 ok
|
| 100 |
+
bos_id: -1, eos_id: 151645
|
| 101 |
+
img_start_token: 151652
|
| 102 |
+
img_context_token: 151655
|
| 103 |
+
3% | ██ | 1 / 31 [0.01s<0.46s, 66.67 count/s] tokenizer init ok[I][ Init][ 26]: LLaMaEmbedSelector use mmap
|
| 104 |
+
6% | ███ | 2 / 31 [0.02s<0.34s, 90.91 count/s] embed_selector init ok[I][ Init][ 201]: attr.axmodel_num:28
|
| 105 |
+
103% | ██████████████████████████████████ | 32 / 31 [34.03s<32.96s, 0.94 count/s] init vpm axmodel ok,remain_cmm(854 MB)[I][ Init][ 266]: IMAGE_CONTEXT_TOKEN: 151655, IMAGE_START_TOKEN: 151652
|
| 106 |
+
[I][ Init][ 309]: image encoder output float32
|
| 107 |
+
|
| 108 |
+
[I][ Init][ 339]: max_token_len : 2047
|
| 109 |
+
[I][ Init][ 344]: kv_cache_size : 1024, kv_cache_num: 2047
|
| 110 |
+
[I][ Init][ 352]: prefill_token_num : 128
|
| 111 |
+
[I][ Init][ 356]: grp: 1, prefill_max_token_num : 1
|
| 112 |
+
[I][ Init][ 356]: grp: 2, prefill_max_token_num : 128
|
| 113 |
+
[I][ Init][ 356]: grp: 3, prefill_max_token_num : 256
|
| 114 |
+
[I][ Init][ 356]: grp: 4, prefill_max_token_num : 384
|
| 115 |
+
[I][ Init][ 356]: grp: 5, prefill_max_token_num : 512
|
| 116 |
+
[I][ Init][ 356]: grp: 6, prefill_max_token_num : 640
|
| 117 |
+
[I][ Init][ 356]: grp: 7, prefill_max_token_num : 768
|
| 118 |
+
[I][ Init][ 356]: grp: 8, prefill_max_token_num : 896
|
| 119 |
+
[I][ Init][ 356]: grp: 9, prefill_max_token_num : 1024
|
| 120 |
+
[I][ Init][ 356]: grp: 10, prefill_max_token_num : 1152
|
| 121 |
+
[I][ Init][ 360]: prefill_max_token_num : 1152
|
| 122 |
+
[I][ Init][ 372]: LLM init ok
|
| 123 |
+
[I][ Init][ 374]: Left CMM:854 MB
|
| 124 |
+
Type "q" to exit, Ctrl+c to stop current running
|
| 125 |
+
prompt >> 描述这张图片
|
| 126 |
+
image >> images/recoAll_attractions_1.jpg
|
| 127 |
+
[I][ EncodeImage][ 440]: pixel_values size 1
|
| 128 |
+
[I][ EncodeImage][ 441]: grid_h 24 grid_w 24
|
| 129 |
+
[I][ EncodeImage][ 489]: image encode time : 237.778000 ms, size : 1
|
| 130 |
+
[I][ Encode][ 532]: input_ids size:168
|
| 131 |
+
[I][ Encode][ 540]: offset 15
|
| 132 |
+
[I][ Encode][ 569]: img_embed.size:1, 294912
|
| 133 |
+
[I][ Encode][ 583]: out_embed size:344064
|
| 134 |
+
[I][ Encode][ 584]: input_ids size 168
|
| 135 |
+
[I][ Encode][ 586]: position_ids size:168
|
| 136 |
+
[I][ Run][ 607]: input token num : 168, prefill_split_num : 2
|
| 137 |
+
[I][ Run][ 641]: input_num_token:128
|
| 138 |
+
[I][ Run][ 641]: input_num_token:40
|
| 139 |
+
[I][ Run][ 865]: ttft: 313.60 ms
|
| 140 |
+
这是一张在埃及沙漠中拍摄的风景照片。画面中,三座巨大的金字塔在晴朗的天空下矗立,它们是古埃及文明的象征。这些金字塔由巨大的石块堆叠而成,表面因岁月侵蚀而显得斑驳。在金字塔的前方,有几个人影在沙地上行走,这为整个场景提供了比例感和尺度感。整个场景充满了历史的厚重感和神秘的氛围。
|
| 141 |
+
|
| 142 |
+
[N][ Run][ 992]: hit eos,avg 14.14 token/s
|
| 143 |
+
```
|
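The log above reports `prefill_token_num : 128`, `prefill_max_token_num : 1152`, and a 168-token input split into chunks of 128 and 40 tokens. The arithmetic can be sketched as below. This is an illustration of the chunk sizes seen in the log, not the runtime's actual implementation, and `prefill_chunks` is a hypothetical helper name.

```python
from typing import List


def prefill_chunks(num_tokens: int, prefill_token_num: int = 128,
                   prefill_max_token_num: int = 1152) -> List[int]:
    """Split a prompt of `num_tokens` into prefill chunks of at most `prefill_token_num`."""
    if num_tokens > prefill_max_token_num:
        raise ValueError("prompt is longer than the largest prefill group")
    full, rest = divmod(num_tokens, prefill_token_num)
    # All chunks are full-sized except possibly the last one.
    return [prefill_token_num] * full + ([rest] if rest else [])


if __name__ == "__main__":
    print(prefill_chunks(168))  # [128, 40], matching prefill_split_num : 2
```

The same helper gives five chunks for a 600-token input, which matches the `prefill_split_num : 5` reported by the video demo.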
| 144 |
+
|
| 145 |
+
#### Video understanding demo
|
| 146 |
+
|
| 147 |
+
##### Start the tokenizer server for the video understanding demo
|
| 148 |
+
|
| 149 |
+
```
|
| 150 |
+
python tokenizer_video.py --port 8080
|
| 151 |
+
```
|
| 152 |
+
|
| 153 |
+
##### Run the video understanding demo
|
| 154 |
+
- input text
|
| 155 |
+
|
| 156 |
+
```
|
| 157 |
+
描述这个视频
|
| 158 |
+
```
|
| 159 |
+
|
| 160 |
+
- input video
|
| 161 |
+
|
| 162 |
+
./video
|
| 163 |
+
|
| 164 |
+
```
|
| 165 |
+
root@ax650 ~/Qwen3-VL-2B-Instruct-GPTQ-Int4 # bash run_video_ax650.sh
|
| 166 |
+
[I][ Init][ 156]: LLM init start
|
| 167 |
+
[I][ Init][ 158]: Total CMM:7884 MB
|
| 168 |
+
[I][ Init][ 34]: connect http://127.0.0.1:8080 ok
|
| 169 |
+
bos_id: -1, eos_id: 151645
|
| 170 |
+
img_start_token: 151652
|
| 171 |
+
img_context_token: 151656
|
| 172 |
+
3% | ██ | 1 / 31 [0.01s<0.34s, 90.91 count/s] tokenizer init ok[I][ Init][ 26]: LLaMaEmbedSelector use mmap
|
| 173 |
+
6% | ███ | 2 / 31 [0.01s<0.23s, 133.33 count/s] embed_selector init ok[I][ Init][ 201]: attr.axmodel_num:28
|
| 174 |
+
103% | ██████████████████████████████████ | 32 / 31 [32.37s<31.36s, 0.99 count/s] init vpm axmodel ok,remain_cmm(4385 MB)[I][ Init][ 266]: IMAGE_CONTEXT_TOKEN: 151656, IMAGE_START_TOKEN: 151652
|
| 175 |
+
[I][ Init][ 309]: image encoder output float32
|
| 176 |
+
|
| 177 |
+
[I][ Init][ 339]: max_token_len : 2047
|
| 178 |
+
[I][ Init][ 344]: kv_cache_size : 1024, kv_cache_num: 2047
|
| 179 |
+
[I][ Init][ 352]: prefill_token_num : 128
|
| 180 |
+
[I][ Init][ 356]: grp: 1, prefill_max_token_num : 1
|
| 181 |
+
[I][ Init][ 356]: grp: 2, prefill_max_token_num : 128
|
| 182 |
+
[I][ Init][ 356]: grp: 3, prefill_max_token_num : 256
|
| 183 |
+
[I][ Init][ 356]: grp: 4, prefill_max_token_num : 384
|
| 184 |
+
[I][ Init][ 356]: grp: 5, prefill_max_token_num : 512
|
| 185 |
+
[I][ Init][ 356]: grp: 6, prefill_max_token_num : 640
|
| 186 |
+
[I][ Init][ 356]: grp: 7, prefill_max_token_num : 768
|
| 187 |
+
[I][ Init][ 356]: grp: 8, prefill_max_token_num : 896
|
| 188 |
+
[I][ Init][ 356]: grp: 9, prefill_max_token_num : 1024
|
| 189 |
+
[I][ Init][ 356]: grp: 10, prefill_max_token_num : 1152
|
| 190 |
+
[I][ Init][ 360]: prefill_max_token_num : 1152
|
| 191 |
+
[I][ Init][ 372]: LLM init ok
|
| 192 |
+
[I][ Init][ 374]: Left CMM:4385 MB
|
| 193 |
+
Type "q" to exit, Ctrl+c to stop current running
|
| 194 |
+
prompt >> 描述这个视频
|
| 195 |
+
video >> video
|
| 196 |
+
video/frame_0000.jpg
|
| 197 |
+
video/frame_0008.jpg
|
| 198 |
+
video/frame_0016.jpg
|
| 199 |
+
video/frame_0024.jpg
|
| 200 |
+
video/frame_0032.jpg
|
| 201 |
+
video/frame_0040.jpg
|
| 202 |
+
video/frame_0048.jpg
|
| 203 |
+
video/frame_0056.jpg
|
| 204 |
+
[I][ EncodeImage][ 440]: pixel_values size 4
|
| 205 |
+
[I][ EncodeImage][ 441]: grid_h 24 grid_w 24
|
| 206 |
+
[I][ EncodeImage][ 489]: image encode time : 751.481018 ms, size : 4
|
| 207 |
+
[I][ Encode][ 532]: input_ids size:600
|
| 208 |
+
[I][ Encode][ 540]: offset 15
|
| 209 |
+
[I][ Encode][ 569]: img_embed.size:4, 294912
|
| 210 |
+
[I][ Encode][ 574]: offset:159
|
| 211 |
+
[I][ Encode][ 574]: offset:303
|
| 212 |
+
[I][ Encode][ 574]: offset:447
|
| 213 |
+
[I][ Encode][ 583]: out_embed size:1228800
|
| 214 |
+
[I][ Encode][ 584]: input_ids size 600
|
| 215 |
+
[I][ Encode][ 586]: position_ids size:600
|
| 216 |
+
[I][ Run][ 607]: input token num : 600, prefill_split_num : 5
|
| 217 |
+
[I][ Run][ 641]: input_num_token:128
|
| 218 |
+
[I][ Run][ 641]: input_num_token:128
|
| 219 |
+
[I][ Run][ 641]: input_num_token:128
|
| 220 |
+
[I][ Run][ 641]: input_num_token:128
|
| 221 |
+
[I][ Run][ 641]: input_num_token:88
|
| 222 |
+
[I][ Run][ 865]: ttft: 843.36 ms
|
| 223 |
+
这是一段关于两只山地旱獭(也称“山地土拨鼠”)在山地环境中互动的视频。
|
| 224 |
+
|
| 225 |
+
在画面中,两只山地旱獭正站在布满碎石的山坡上,背景是连绵起伏的山脉和蓝天。它们的毛色以灰、棕、黑相间,脸部和耳朵周围有明显的黑白条纹,显得非常可爱。
|
| 226 |
+
|
| 227 |
+
这两只旱獭正在进行一场激烈的“拳击”或“格斗”游戏。它们的前爪高高举起,像在互相击打,但它们的姿势和动作表明它们可能是在进行一场激烈的“拳击”或“格斗”游戏。它们的嘴巴和前爪在空中挥舞,似乎在互相攻击或展示力量。
|
| 228 |
+
|
| 229 |
+
整个场景充满了动感和活力,展现了这些小动物在自然环境中充满活力和趣味的一面。
|
| 230 |
+
|
| 231 |
+
[N][ Run][ 992]: hit eos,avg 14.16 token/s
|
| 232 |
+
|
| 233 |
+
```
|
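The numbers in the video log are consistent with each 24 x 24 patch grid producing 144 embedding positions: the reported offsets 15, 159, 303 and 447 are 144 apart, and 4 x 144 plus 24 text tokens gives the 600 input tokens. The sketch below reproduces that arithmetic. The 2 x 2 merge factor is an assumption inferred from these numbers rather than something the log states, and both function names are hypothetical.

```python
from typing import List


def tokens_per_grid(grid_h: int, grid_w: int, merge: int = 2) -> int:
    """Embedding positions for one patch grid, assuming a merge x merge spatial merge."""
    return (grid_h // merge) * (grid_w // merge)


def embed_offsets(first_offset: int, num_grids: int, tokens_each: int) -> List[int]:
    """Start offsets of consecutive image-embedding blocks inside the input sequence."""
    return [first_offset + i * tokens_each for i in range(num_grids)]


if __name__ == "__main__":
    per_grid = tokens_per_grid(24, 24)       # grid_h 24, grid_w 24 from the log
    offsets = embed_offsets(15, 4, per_grid)  # first offset 15, pixel_values size 4
    text_tokens = 600 - 4 * per_grid          # tokens that are not image embeddings
    print(per_grid, offsets, text_tokens)
```

The image demo fits the same pattern: one grid of 144 positions plus 24 text tokens gives its 168 input tokens.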
config.json
ADDED
|
File without changes
|
images/demo.jpg
ADDED
|
Git LFS Details
|
images/demo1.jpg
ADDED
|
Git LFS Details
|
images/recoAll_attractions_1.jpg
ADDED
|
Git LFS Details
|
images/recoAll_attractions_2.jpg
ADDED
|
Git LFS Details
|
images/recoAll_attractions_3.jpg
ADDED
|
Git LFS Details
|
images/recoAll_attractions_4.jpg
ADDED
|
Git LFS Details
|
images/ssd_car.jpg
ADDED
|
Git LFS Details
|
images/ssd_horse.jpg
ADDED
|
Git LFS Details
|