stewiezhou committed on
Commit
5de562d
·
1 Parent(s): 624ac1f

Initial Commit

LICENSE.txt ADDED
@@ -0,0 +1,672 @@
+ Tencent is pleased to support the community by making DSR Suite available.
+
+ Copyright (C) 2025 Tencent. All rights reserved.
+
+ The open-source software and/or models and/or datasets included in this distribution may have been modified by Tencent (“Tencent Modifications”). All Tencent Modifications are Copyright (C) Tencent.
+
+ DSR Suite is licensed under License Term of DSR Suite, except for the third-party components listed below, which remain licensed under their respective original terms. DSR Suite does not impose any additional restrictions beyond those specified in the original licenses of these third-party components. Users are required to comply with all applicable terms and conditions of the original licenses and to ensure that the use of these third-party components conforms to all relevant laws and regulations.
+
+ For the avoidance of doubt, DSR Suite refers solely to inference code, parameters, and weights made publicly available by Tencent in accordance with License Term of DSR Suite.
+
+ Terms of License Term of DSR Suite:
+ --------------------------------------------------------------------
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+ 0. Additional Territorial Limitation
+
+ DSR Suite IS NOT INTENDED FOR USE WITHIN THE EUROPEAN UNION.
+ IN THE EVENT OF ANY CONFLICT, THIS CLAUSE SHALL PREVAIL.
+
+ You agree to use the DSR Suite only for academic purposes, and refrain from using it for any commercial or production purposes under any circumstances.
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
+
+ You must give any other recipients of the Work or Derivative Works a copy of this License; and
+
+ You must cause any modified files to carry prominent notices stating that You changed the files; and
+
+ You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
+
+ If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+
+ Dependencies and Licenses:
+
+ This open-source project, DSR Suite, builds upon the following open-source models and/or software components, each of which remains licensed under its original license. Certain models or software may include modifications made by Tencent (“Tencent Modifications”), which are Copyright (C) Tencent.
+
+ If you believe there are errors in the attribution below, you may submit your concerns to us for review and correction.
+
+ Open Source Model Licensed under the Apache-2.0:
+ --------------------------------------------------------------------
+ 1. Qwen/Qwen2.5-VL-7B-Instruct
+ Copyright (c) 2025 Qwen2.5-VL-7B-Instruct original author and authors
+
+ Terms of the Apache-2.0:
+ --------------------------------------------------------------------
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
+
+ Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
+
+ Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
+
+ Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
+
+ (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
+
+ Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
+
+ Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
+
+ Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
+
+ Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
+
+ Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ Open Source Software Licensed under the Apache-2.0:
+ --------------------------------------------------------------------
+ 1. transformers
+ Copyright 2018- The Hugging Face team. All rights reserved.
+
+ 2. deepspeed
+ Copyright (c) 2025 DeepSpeed original author and authors
+
+ 3. accelerate
+ Copyright (c) 2025 accelerate original author and authors
+
+ 4. huggingface-hub
+ Copyright (c) 2024 huggingface_hub original author and authors
+
+ Terms of the Apache-2.0:
+ --------------------------------------------------------------------
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
+
+ You must give any other recipients of the Work or Derivative Works a copy of this License; and
+ You must cause any modified files to carry prominent notices stating that You changed the files; and
+ You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
+ If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
+ You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
201
+
202
+
203
+
204
+
205
+
206
+ Open Source Software Licensed under the BSD-3-Clause:
+ --------------------------------------------------------------------
+ 1. torch
+ Copyright (c) 2016- Facebook, Inc (Adam Paszke); Copyright (c) 2014- Facebook, Inc (Soumith Chintala); Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert); Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu); Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu); Copyright (c) 2011-2013 NYU (Clement Farabet); Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston); Copyright (c) 2006 Idiap Research Institute (Samy Bengio); Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz); Copyright (c) 2016-present, Facebook Inc. All rights reserved.; Copyright (c) 2016 Facebook Inc.; Copyright (c) 2015 Google Inc.; Copyright (c) 2015 Yangqing Jia; Copyright 2019-2020 Kakao Brain; Copyright (c) 2022 Cruise LLC.; Copyright (c) 2021, 2023-2024 Arm Limited and/or its affiliates; Copyright(c) 2013, 2014, 2015, the respective contributors; Copyright(c) 2015, 2016 the respective contributors
+
+ 2. torchvision
+ Copyright (c) Soumith Chintala 2016,
+
+ 3. flash attention 2
+ Copyright (c) 2022, the respective contributors, as shown by the AUTHORS file.
+
+ 4. numpy
+ Copyright (c) 2005-2023, NumPy Developers.
+ Terms of the BSD-3-Clause:
+ --------------------------------------------------------------------
+ BSD 3-Clause License
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright notice, this
+ list of conditions and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright notice,
+ this list of conditions and the following disclaimer in the documentation
+ and/or other materials provided with the distribution.
+
+ 3. Neither the name of the copyright holder nor the names of its
+ contributors may be used to endorse or promote products derived from
+ this software without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+ FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+
+
+
252
+ Open Source Software Licensed under the MIT:
+ --------------------------------------------------------------------
+ 1. opencv-python
+ Copyright (c) Olli-Pekka Heinisuo
+
+ 2. hydra-core
+ Copyright (c) Facebook, Inc. and its affiliates.
+ Terms of the MIT:
+ --------------------------------------------------------------------
+ MIT License
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+
+
+
+
285
+ Open Source Software Licensed under the MPL-2.0 AND MIT:
+ --------------------------------------------------------------------
+ 1. tqdm
+ Copyright (c) 2013 noamraph
+ Terms of the MPL-2.0 AND MIT:
+ --------------------------------------------------------------------
+
+ Mozilla Public License, version 2.0
+
+ 1. Definitions
+
+ 1.1. “Contributor”
+
+ means each individual or legal entity that creates, contributes to the
+ creation of, or owns Covered Software.
+
+ 1.2. “Contributor Version”
+
+ means the combination of the Contributions of others (if any) used by a
+ Contributor and that particular Contributor’s Contribution.
+
+ 1.3. “Contribution”
+
+ means Covered Software of a particular Contributor.
+
+ 1.4. “Covered Software”
+
+ means Source Code Form to which the initial Contributor has attached the
+ notice in Exhibit A, the Executable Form of such Source Code Form, and
+ Modifications of such Source Code Form, in each case including portions
+ thereof.
+
+ 1.5. “Incompatible With Secondary Licenses”
+ means
+
+ a. that the initial Contributor has attached the notice described in
+ Exhibit B to the Covered Software; or
+
+ b. that the Covered Software was made available under the terms of version
+ 1.1 or earlier of the License, but not also under the terms of a
+ Secondary License.
+
+ 1.6. “Executable Form”
+
+ means any form of the work other than Source Code Form.
+
+ 1.7. “Larger Work”
+
+ means a work that combines Covered Software with other material, in a separate
+ file or files, that is not Covered Software.
+
+ 1.8. “License”
+
+ means this document.
+
+ 1.9. “Licensable”
+
+ means having the right to grant, to the maximum extent possible, whether at the
+ time of the initial grant or subsequently, any and all of the rights conveyed by
+ this License.
+
+ 1.10. “Modifications”
+
+ means any of the following:
+
+ a. any file in Source Code Form that results from an addition to, deletion
+ from, or modification of the contents of Covered Software; or
+
+ b. any new file in Source Code Form that contains any Covered Software.
+
+ 1.11. “Patent Claims” of a Contributor
+
+ means any patent claim(s), including without limitation, method, process,
+ and apparatus claims, in any patent Licensable by such Contributor that
+ would be infringed, but for the grant of the License, by the making,
+ using, selling, offering for sale, having made, import, or transfer of
+ either its Contributions or its Contributor Version.
+
+ 1.12. “Secondary License”
+
+ means either the GNU General Public License, Version 2.0, the GNU Lesser
+ General Public License, Version 2.1, the GNU Affero General Public
+ License, Version 3.0, or any later versions of those licenses.
+
+ 1.13. “Source Code Form”
+
+ means the form of the work preferred for making modifications.
+
+ 1.14. “You” (or “Your”)
+
+ means an individual or a legal entity exercising rights under this
+ License. For legal entities, “You” includes any entity that controls, is
+ controlled by, or is under common control with You. For purposes of this
+ definition, “control” means (a) the power, direct or indirect, to cause
+ the direction or management of such entity, whether by contract or
+ otherwise, or (b) ownership of more than fifty percent (50%) of the
+ outstanding shares or beneficial ownership of such entity.
+
+
+ 2. License Grants and Conditions
+
+ 2.1. Grants
+
+ Each Contributor hereby grants You a world-wide, royalty-free,
+ non-exclusive license:
+
+ a. under intellectual property rights (other than patent or trademark)
+ Licensable by such Contributor to use, reproduce, make available,
+ modify, display, perform, distribute, and otherwise exploit its
+ Contributions, either on an unmodified basis, with Modifications, or as
+ part of a Larger Work; and
+
+ b. under Patent Claims of such Contributor to make, use, sell, offer for
+ sale, have made, import, and otherwise transfer either its Contributions
+ or its Contributor Version.
+
+ 2.2. Effective Date
+
+ The licenses granted in Section 2.1 with respect to any Contribution become
+ effective for each Contribution on the date the Contributor first distributes
+ such Contribution.
+
+ 2.3. Limitations on Grant Scope
+
+ The licenses granted in this Section 2 are the only rights granted under this
+ License. No additional rights or licenses will be implied from the distribution
+ or licensing of Covered Software under this License. Notwithstanding Section
+ 2.1(b) above, no patent license is granted by a Contributor:
+
+ a. for any code that a Contributor has removed from Covered Software; or
+
+ b. for infringements caused by: (i) Your and any other third party’s
+ modifications of Covered Software, or (ii) the combination of its
+ Contributions with other software (except as part of its Contributor
+ Version); or
+
+ c. under Patent Claims infringed by Covered Software in the absence of its
+ Contributions.
+
+ This License does not grant any rights in the trademarks, service marks, or
+ logos of any Contributor (except as may be necessary to comply with the
+ notice requirements in Section 3.4).
+
+ 2.4. Subsequent Licenses
+
+ No Contributor makes additional grants as a result of Your choice to
+ distribute the Covered Software under a subsequent version of this License
+ (see Section 10.2) or under the terms of a Secondary License (if permitted
+ under the terms of Section 3.3).
+
+ 2.5. Representation
+
+ Each Contributor represents that the Contributor believes its Contributions
+ are its original creation(s) or it has sufficient rights to grant the
+ rights to its Contributions conveyed by this License.
+
+ 2.6. Fair Use
+
+ This License is not intended to limit any rights You have under applicable
+ copyright doctrines of fair use, fair dealing, or other equivalents.
+
+ 2.7. Conditions
+
+ Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted in
+ Section 2.1.
+
+
+ 3. Responsibilities
+
+ 3.1. Distribution of Source Form
+
+ All distribution of Covered Software in Source Code Form, including any
+ Modifications that You create or to which You contribute, must be under the
+ terms of this License. You must inform recipients that the Source Code Form
+ of the Covered Software is governed by the terms of this License, and how
+ they can obtain a copy of this License. You may not attempt to alter or
+ restrict the recipients’ rights in the Source Code Form.
+
+ 3.2. Distribution of Executable Form
+
+ If You distribute Covered Software in Executable Form then:
+
+ a. such Covered Software must also be made available in Source Code Form,
+ as described in Section 3.1, and You must inform recipients of the
+ Executable Form how they can obtain a copy of such Source Code Form by
+ reasonable means in a timely manner, at a charge no more than the cost
+ of distribution to the recipient; and
+
+ b. You may distribute such Executable Form under the terms of this License,
+ or sublicense it under different terms, provided that the license for
+ the Executable Form does not attempt to limit or alter the recipients’
+ rights in the Source Code Form under this License.
+
+ 3.3. Distribution of a Larger Work
+
+ You may create and distribute a Larger Work under terms of Your choice,
+ provided that You also comply with the requirements of this License for the
+ Covered Software. If the Larger Work is a combination of Covered Software
+ with a work governed by one or more Secondary Licenses, and the Covered
+ Software is not Incompatible With Secondary Licenses, this License permits
+ You to additionally distribute such Covered Software under the terms of
+ such Secondary License(s), so that the recipient of the Larger Work may, at
+ their option, further distribute the Covered Software under the terms of
+ either this License or such Secondary License(s).
+
+ 3.4. Notices
+
+ You may not remove or alter the substance of any license notices (including
+ copyright notices, patent notices, disclaimers of warranty, or limitations
+ of liability) contained within the Source Code Form of the Covered
+ Software, except that You may alter any license notices to the extent
+ required to remedy known factual inaccuracies.
+
+ 3.5. Application of Additional Terms
+
+ You may choose to offer, and to charge a fee for, warranty, support,
+ indemnity or liability obligations to one or more recipients of Covered
+ Software. However, You may do so only on Your own behalf, and not on behalf
+ of any Contributor. You must make it absolutely clear that any such
+ warranty, support, indemnity, or liability obligation is offered by You
+ alone, and You hereby agree to indemnify every Contributor for any
+ liability incurred by such Contributor as a result of warranty, support,
+ indemnity or liability terms You offer. You may include additional
+ disclaimers of warranty and limitations of liability specific to any
+ jurisdiction.
+
+ 4. Inability to Comply Due to Statute or Regulation
+
+ If it is impossible for You to comply with any of the terms of this License
+ with respect to some or all of the Covered Software due to statute, judicial
+ order, or regulation then You must: (a) comply with the terms of this License
+ to the maximum extent possible; and (b) describe the limitations and the code
+ they affect. Such description must be placed in a text file included with all
+ distributions of the Covered Software under this License. Except to the
+ extent prohibited by statute or regulation, such description must be
+ sufficiently detailed for a recipient of ordinary skill to be able to
+ understand it.
+
+ 5. Termination
+
+ 5.1. The rights granted under this License will terminate automatically if You
+ fail to comply with any of its terms. However, if You become compliant,
+ then the rights granted under this License from a particular Contributor
+ are reinstated (a) provisionally, unless and until such Contributor
+ explicitly and finally terminates Your grants, and (b) on an ongoing basis,
+ if such Contributor fails to notify You of the non-compliance by some
+ reasonable means prior to 60 days after You have come back into compliance.
+ Moreover, Your grants from a particular Contributor are reinstated on an
+ ongoing basis if such Contributor notifies You of the non-compliance by
+ some reasonable means, this is the first time You have received notice of
+ non-compliance with this License from such Contributor, and You become
+ compliant prior to 30 days after Your receipt of the notice.
+
+ 5.2. If You initiate litigation against any entity by asserting a patent
+ infringement claim (excluding declaratory judgment actions, counter-claims,
+ and cross-claims) alleging that a Contributor Version directly or
+ indirectly infringes any patent, then the rights granted to You by any and
+ all Contributors for the Covered Software under Section 2.1 of this License
+ shall terminate.
+
+ 5.3. In the event of termination under Sections 5.1 or 5.2 above, all end user
+ license agreements (excluding distributors and resellers) which have been
+ validly granted by You or Your distributors under this License prior to
+ termination shall survive termination.
+
+ 6. Disclaimer of Warranty
+
+ Covered Software is provided under this License on an “as is” basis, without
+ warranty of any kind, either expressed, implied, or statutory, including,
+ without limitation, warranties that the Covered Software is free of defects,
+ merchantable, fit for a particular purpose or non-infringing. The entire
+ risk as to the quality and performance of the Covered Software is with You.
+ Should any Covered Software prove defective in any respect, You (not any
+ Contributor) assume the cost of any necessary servicing, repair, or
+ correction. This disclaimer of warranty constitutes an essential part of this
+ License. No use of any Covered Software is authorized under this License
+ except under this disclaimer.
+
+ 7. Limitation of Liability
+
+ Under no circumstances and under no legal theory, whether tort (including
+ negligence), contract, or otherwise, shall any Contributor, or anyone who
+ distributes Covered Software as permitted above, be liable to You for any
+ direct, indirect, special, incidental, or consequential damages of any
+ character including, without limitation, damages for lost profits, loss of
+ goodwill, work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses, even if such party shall have been
+ informed of the possibility of such damages. This limitation of liability
+ shall not apply to liability for death or personal injury resulting from such
+ party’s negligence to the extent applicable law prohibits such limitation.
+ Some jurisdictions do not allow the exclusion or limitation of incidental or
+ consequential damages, so this exclusion and limitation may not apply to You.
+
+ 8. Litigation
+
+ Any litigation relating to this License may be brought only in the courts of
+ a jurisdiction where the defendant maintains its principal place of business
+ and such litigation shall be governed by laws of that jurisdiction, without
+ reference to its conflict-of-law provisions. Nothing in this Section shall
+ prevent a party’s ability to bring cross-claims or counter-claims.
+
+ 9. Miscellaneous
+
+ This License represents the complete agreement concerning the subject matter
+ hereof. If any provision of this License is held to be unenforceable, such
+ provision shall be reformed only to the extent necessary to make it
+ enforceable. Any law or regulation which provides that the language of a
+ contract shall be construed against the drafter shall not be used to construe
+ this License against a Contributor.
+
+
+ 10. Versions of the License
+
+ 10.1. New Versions
+
+ Mozilla Foundation is the license steward. Except as provided in Section
+ 10.3, no one other than the license steward has the right to modify or
+ publish new versions of this License. Each version will be given a
+ distinguishing version number.
+
+ 10.2. Effect of New Versions
+
+ You may distribute the Covered Software under the terms of the version of
+ the License under which You originally received the Covered Software, or
+ under the terms of any subsequent version published by the license
+ steward.
+
+ 10.3. Modified Versions
+
+ If you create software not governed by this License, and you want to
+ create a new license for such software, you may create and use a modified
+ version of this License if you rename the license and remove any
+ references to the name of the license steward (except to note that such
+ modified license differs from this License).
+
+ 10.4. Distributing Source Code Form that is Incompatible With Secondary Licenses
+ If You choose to distribute Source Code Form that is Incompatible With
+ Secondary Licenses under the terms of this version of the License, the
+ notice described in Exhibit B of this License must be attached.
+
+ Exhibit A - Source Code Form License Notice
+
+ This Source Code Form is subject to the
+ terms of the Mozilla Public License, v.
+ 2.0. If a copy of the MPL was not
+ distributed with this file, You can
+ obtain one at
+ http://mozilla.org/MPL/2.0/.
+
+ If it is not possible or desirable to put the notice in a particular file, then
+ You may include the notice in a location (such as a LICENSE file in a relevant
+ directory) where a recipient would be likely to look for such a notice.
+
+ You may add additional accurate notices of copyright ownership.
+
+ Exhibit B - “Incompatible With Secondary Licenses” Notice
+
+ This Source Code Form is “Incompatible
+ With Secondary Licenses”, as defined by
+ the Mozilla Public License, v. 2.0.
+
+
+ --------------------
+ And also:
+ --------------------
+
+ MIT License
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ ==================================================
+ End of the Attribution Notice of this project.
README.md CHANGED
@@ -1,3 +1,4 @@
  ---
  license: apache-2.0
+ extra_gated_eu_disallowed: true
  ---
added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+ "</tool_call>": 151658,
+ "<tool_call>": 151657,
+ "<|box_end|>": 151649,
+ "<|box_start|>": 151648,
+ "<|endoftext|>": 151643,
+ "<|file_sep|>": 151664,
+ "<|fim_middle|>": 151660,
+ "<|fim_pad|>": 151662,
+ "<|fim_prefix|>": 151659,
+ "<|fim_suffix|>": 151661,
+ "<|im_end|>": 151645,
+ "<|im_start|>": 151644,
+ "<|image_pad|>": 151655,
+ "<|object_ref_end|>": 151647,
+ "<|object_ref_start|>": 151646,
+ "<|quad_end|>": 151651,
+ "<|quad_start|>": 151650,
+ "<|repo_name|>": 151663,
+ "<|video_pad|>": 151656,
+ "<|vision_end|>": 151653,
+ "<|vision_pad|>": 151654,
+ "<|vision_start|>": 151652
+ }
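The 22 special tokens added above sit in one contiguous id block at the top of the vocabulary, directly after the 151,643 base BPE ids. A quick sanity check in plain Python (ids copied from `added_tokens.json` above):

```python
# Special tokens and ids as listed in added_tokens.json above.
added_tokens = {
    "</tool_call>": 151658, "<tool_call>": 151657,
    "<|box_end|>": 151649, "<|box_start|>": 151648,
    "<|endoftext|>": 151643, "<|file_sep|>": 151664,
    "<|fim_middle|>": 151660, "<|fim_pad|>": 151662,
    "<|fim_prefix|>": 151659, "<|fim_suffix|>": 151661,
    "<|im_end|>": 151645, "<|im_start|>": 151644,
    "<|image_pad|>": 151655, "<|object_ref_end|>": 151647,
    "<|object_ref_start|>": 151646, "<|quad_end|>": 151651,
    "<|quad_start|>": 151650, "<|repo_name|>": 151663,
    "<|video_pad|>": 151656, "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654, "<|vision_start|>": 151652,
}

ids = sorted(added_tokens.values())
# The ids form one gap-free run: 151643 .. 151664 (22 tokens).
assert ids == list(range(151643, 151665))
```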
chat_template.json ADDED
@@ -0,0 +1,3 @@
+ {
+ "chat_template": "{% set image_count = namespace(value=0) %}{% set video_count = namespace(value=0) %}{% for message in messages %}{% if loop.first and message['role'] != 'system' %}<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n{% endif %}<|im_start|>{{ message['role'] }}\n{% if message['content'] is string %}{{ message['content'] }}<|im_end|>\n{% else %}{% for content in message['content'] %}{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}{% set image_count.value = image_count.value + 1 %}{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}<|vision_start|><|image_pad|><|vision_end|>{% elif content['type'] == 'video' or 'video' in content %}{% set video_count.value = video_count.value + 1 %}{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}<|vision_start|><|video_pad|><|vision_end|>{% elif 'text' in content %}{{ content['text'] }}{% endif %}{% endfor %}<|im_end|>\n{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
+ }
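For text-only conversations, the Jinja template above reduces to the standard ChatML layout: inject a default system message if none is given, wrap each turn in `<|im_start|>`/`<|im_end|>`, and optionally open an assistant turn. A minimal re-implementation of that text-only path (illustrative sketch only; the real template also expands image/video content parts into `<|vision_start|>…<|vision_end|>` spans):

```python
def render_chat(messages, add_generation_prompt=True):
    """Mimic the text-only branch of the chat template above."""
    out = []
    for i, msg in enumerate(messages):
        # The template prepends a default system turn when the first
        # message is not a system message.
        if i == 0 and msg["role"] != "system":
            out.append("<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n")
        out.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn that the model is asked to complete.
        out.append("<|im_start|>assistant\n")
    return "".join(out)

prompt = render_chat([{"role": "user", "content": "Hi"}])
```

With the real tokenizer, the same string is produced by `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`.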
config.json ADDED
@@ -0,0 +1,68 @@
+ {
+ "architectures": [
+ "Qwen2_5_VLForConditionalGeneration_Spatial"
+ ],
+ "attention_dropout": 0.0,
+ "bos_token_id": 151643,
+ "eos_token_id": 151645,
+ "hidden_act": "silu",
+ "hidden_size": 3584,
+ "image_token_id": 151655,
+ "initializer_range": 0.02,
+ "intermediate_size": 18944,
+ "max_position_embeddings": 128000,
+ "max_window_layers": 28,
+ "model_type": "qwen2_5_vl",
+ "num_attention_heads": 28,
+ "num_hidden_layers": 28,
+ "num_key_value_heads": 4,
+ "rms_norm_eps": 1e-06,
+ "rope_scaling": {
+ "mrope_section": [
+ 16,
+ 24,
+ 24
+ ],
+ "rope_type": "default",
+ "type": "default"
+ },
+ "rope_theta": 1000000.0,
+ "sliding_window": 32768,
+ "tie_word_embeddings": false,
+ "torch_dtype": "bfloat16",
+ "transformers_version": "4.51.1",
+ "use_cache": true,
+ "use_sliding_window": false,
+ "video_token_id": 151656,
+ "vision_config": {
+ "config_all": {
+ "model_type": "qwen2_5_vl"
+ },
+ "depth": 32,
+ "fullatt_block_indexes": [
+ 7,
+ 15,
+ 23,
+ 31
+ ],
+ "hidden_act": "silu",
+ "hidden_size": 1280,
+ "in_channels": 3,
+ "in_chans": 3,
+ "intermediate_size": 3420,
+ "model_type": "qwen2_5_vl",
+ "num_heads": 16,
+ "out_hidden_size": 3584,
+ "patch_size": 14,
+ "spatial_merge_size": 2,
+ "spatial_patch_size": 14,
+ "temporal_patch_size": 2,
+ "tokens_per_second": 2,
+ "torch_dtype": "bfloat16",
+ "window_size": 112
+ },
+ "vision_end_token_id": 151653,
+ "vision_start_token_id": 151652,
+ "vision_token_id": 151654,
+ "vocab_size": 152064
+ }
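The config above describes a Qwen2.5-VL-7B-style stack (28 layers, hidden size 3584) with grouped-query attention and a ViT encoder whose 14-pixel patches are merged 2x2 before entering the language model. A few quantities implied by those fields, computed directly from values copied out of `config.json`:

```python
# Selected fields from config.json above.
cfg = {"hidden_size": 3584, "num_attention_heads": 28, "num_key_value_heads": 4}
vision = {"patch_size": 14, "spatial_merge_size": 2, "temporal_patch_size": 2}

# Per-head dimension of the language model's attention.
head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]          # 3584 / 28 = 128

# Grouped-query attention: query heads sharing each KV head.
gqa_groups = cfg["num_attention_heads"] // cfg["num_key_value_heads"]  # 28 / 4 = 7

# Pixel area covered by one merged vision token: (merge * patch)^2.
pixels_per_token = (vision["spatial_merge_size"] * vision["patch_size"]) ** 2  # 28*28
```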
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+ "attn_implementation": "flash_attention_2",
+ "bos_token_id": 151643,
+ "do_sample": true,
+ "eos_token_id": [
+ 151645,
+ 151643
+ ],
+ "pad_token_id": 151643,
+ "repetition_penalty": 1.05,
+ "temperature": 1e-06,
+ "transformers_version": "4.51.1"
+ }
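Note that `do_sample` is true but `temperature` is 1e-06, which makes sampling effectively greedy: dividing logits by so small a temperature collapses the softmax onto the argmax. A small sketch of that effect in plain Python:

```python
import math

def temperature_softmax(logits, temperature):
    """Softmax over logits / temperature, stabilized by subtracting the max."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# At temperature 1e-6 the top logit takes essentially all the mass,
# so sampling degenerates to argmax (greedy decoding).
probs = temperature_softmax([2.0, 1.0, 0.5], temperature=1e-6)
```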
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e71a377757ae98dfbf746b9c6f7a340f58bf597017c607c19e9fac516e56c9e
+ size 4973966448
model-00002-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:18b122ac4d0c9b0de4f801c0cb668d4c911591a8ebd4bd4293c6d7b2c1d746f4
+ size 4991495784
model-00003-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f435e3a20158ad73c487711c93ee887f1a59837b5a54ad225fd4629e0bab134e
+ size 4991495888
model-00004-of-00004.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:26be71b5938dadeebac043f56127fb9c5119fa219e6b2c58e8dfa482d44759fa
+ size 3361856688
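The four `model-0000N-of-00004.safetensors` entries above are Git LFS pointer files, not the weights themselves: each records the spec version, a SHA-256 digest, and the byte size of the real shard stored in LFS. A small parser for that pointer format (sketch; the example values are copied from the 00004 shard above):

```python
# A Git LFS pointer file is a short set of "key value" lines.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:26be71b5938dadeebac043f56127fb9c5119fa219e6b2c58e8dfa482d44759fa
size 3361856688
"""

def parse_lfs_pointer(text):
    """Split each line on the first space into a key/value pair."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

info = parse_lfs_pointer(pointer_text)
```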
model.safetensors.index.json ADDED
@@ -0,0 +1,1116 @@
+ {
+ "metadata": {
+ "total_size": 18318688256
+ },
+ "weight_map": {
+ "lm_head.weight": "model-00004-of-00004.safetensors",
+ "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
+ "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
+ "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.0.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
+ "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.1.input_layernorm.weight": "model-00002-of-00004.safetensors",
+ "model.layers.1.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.1.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+ "model.layers.1.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
+ "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.1.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
+ "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.1.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
+ "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
+ "model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
+ "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
+ "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.10.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
+ "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
+ "model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
+ "model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
46
+ "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
47
+ "model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
48
+ "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
49
+ "model.layers.11.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
50
+ "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
51
+ "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
52
+ "model.layers.11.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
53
+ "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
54
+ "model.layers.11.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
55
+ "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
56
+ "model.layers.12.input_layernorm.weight": "model-00003-of-00004.safetensors",
57
+ "model.layers.12.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
58
+ "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
59
+ "model.layers.12.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
60
+ "model.layers.12.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
61
+ "model.layers.12.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
62
+ "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
63
+ "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
64
+ "model.layers.12.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
65
+ "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
66
+ "model.layers.12.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
67
+ "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
68
+ "model.layers.13.input_layernorm.weight": "model-00003-of-00004.safetensors",
69
+ "model.layers.13.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
70
+ "model.layers.13.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
71
+ "model.layers.13.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
72
+ "model.layers.13.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
73
+ "model.layers.13.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
74
+ "model.layers.13.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
75
+ "model.layers.13.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
76
+ "model.layers.13.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
77
+ "model.layers.13.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
78
+ "model.layers.13.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
79
+ "model.layers.13.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
80
+ "model.layers.14.input_layernorm.weight": "model-00003-of-00004.safetensors",
81
+ "model.layers.14.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
82
+ "model.layers.14.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
83
+ "model.layers.14.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
84
+ "model.layers.14.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
85
+ "model.layers.14.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
86
+ "model.layers.14.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
87
+ "model.layers.14.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
88
+ "model.layers.14.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
89
+ "model.layers.14.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
90
+ "model.layers.14.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
91
+ "model.layers.14.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
92
+ "model.layers.15.input_layernorm.weight": "model-00003-of-00004.safetensors",
93
+ "model.layers.15.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
94
+ "model.layers.15.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
95
+ "model.layers.15.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
96
+ "model.layers.15.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
97
+ "model.layers.15.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
98
+ "model.layers.15.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
99
+ "model.layers.15.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
100
+ "model.layers.15.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
101
+ "model.layers.15.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
102
+ "model.layers.15.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
103
+ "model.layers.15.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
104
+ "model.layers.16.input_layernorm.weight": "model-00003-of-00004.safetensors",
105
+ "model.layers.16.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
106
+ "model.layers.16.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
107
+ "model.layers.16.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
108
+ "model.layers.16.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
109
+ "model.layers.16.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
110
+ "model.layers.16.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
111
+ "model.layers.16.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
112
+ "model.layers.16.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
113
+ "model.layers.16.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
114
+ "model.layers.16.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
115
+ "model.layers.16.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
116
+ "model.layers.17.input_layernorm.weight": "model-00003-of-00004.safetensors",
117
+ "model.layers.17.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
118
+ "model.layers.17.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
119
+ "model.layers.17.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
120
+ "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
121
+ "model.layers.17.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
122
+ "model.layers.17.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
123
+ "model.layers.17.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
124
+ "model.layers.17.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
125
+ "model.layers.17.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
126
+ "model.layers.17.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
127
+ "model.layers.17.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
128
+ "model.layers.18.input_layernorm.weight": "model-00003-of-00004.safetensors",
129
+ "model.layers.18.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
130
+ "model.layers.18.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
131
+ "model.layers.18.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
132
+ "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
133
+ "model.layers.18.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
134
+ "model.layers.18.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
135
+ "model.layers.18.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
136
+ "model.layers.18.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
137
+ "model.layers.18.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
138
+ "model.layers.18.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
139
+ "model.layers.18.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
140
+ "model.layers.19.input_layernorm.weight": "model-00003-of-00004.safetensors",
141
+ "model.layers.19.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
142
+ "model.layers.19.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
143
+ "model.layers.19.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
144
+ "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
145
+ "model.layers.19.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
146
+ "model.layers.19.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
147
+ "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
148
+ "model.layers.19.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
149
+ "model.layers.19.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
150
+ "model.layers.19.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
151
+ "model.layers.19.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
152
+ "model.layers.2.input_layernorm.weight": "model-00002-of-00004.safetensors",
153
+ "model.layers.2.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
154
+ "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
155
+ "model.layers.2.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
156
+ "model.layers.2.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
157
+ "model.layers.2.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
158
+ "model.layers.2.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
159
+ "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
160
+ "model.layers.2.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
161
+ "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
162
+ "model.layers.2.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
163
+ "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
164
+ "model.layers.20.input_layernorm.weight": "model-00003-of-00004.safetensors",
165
+ "model.layers.20.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
166
+ "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
167
+ "model.layers.20.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
168
+ "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
169
+ "model.layers.20.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
170
+ "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
171
+ "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
172
+ "model.layers.20.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
173
+ "model.layers.20.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
174
+ "model.layers.20.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
175
+ "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
176
+ "model.layers.21.input_layernorm.weight": "model-00003-of-00004.safetensors",
177
+ "model.layers.21.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
178
+ "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
179
+ "model.layers.21.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
180
+ "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
181
+ "model.layers.21.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
182
+ "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
183
+ "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
184
+ "model.layers.21.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
185
+ "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
186
+ "model.layers.21.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
187
+ "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
188
+ "model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
189
+ "model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
190
+ "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
191
+ "model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
192
+ "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
193
+ "model.layers.22.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
194
+ "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
195
+ "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
196
+ "model.layers.22.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
197
+ "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
198
+ "model.layers.22.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
199
+ "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
200
+ "model.layers.23.input_layernorm.weight": "model-00004-of-00004.safetensors",
201
+ "model.layers.23.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
202
+ "model.layers.23.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
203
+ "model.layers.23.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
204
+ "model.layers.23.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
205
+ "model.layers.23.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
206
+ "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
207
+ "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
208
+ "model.layers.23.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
209
+ "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
210
+ "model.layers.23.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
211
+ "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
212
+ "model.layers.24.input_layernorm.weight": "model-00004-of-00004.safetensors",
213
+ "model.layers.24.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
214
+ "model.layers.24.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
215
+ "model.layers.24.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
216
+ "model.layers.24.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
217
+ "model.layers.24.self_attn.k_proj.bias": "model-00004-of-00004.safetensors",
218
+ "model.layers.24.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
219
+ "model.layers.24.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
220
+ "model.layers.24.self_attn.q_proj.bias": "model-00004-of-00004.safetensors",
221
+ "model.layers.24.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
222
+ "model.layers.24.self_attn.v_proj.bias": "model-00004-of-00004.safetensors",
223
+ "model.layers.24.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
224
+ "model.layers.25.input_layernorm.weight": "model-00004-of-00004.safetensors",
225
+ "model.layers.25.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
226
+ "model.layers.25.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
227
+ "model.layers.25.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
228
+ "model.layers.25.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
229
+ "model.layers.25.self_attn.k_proj.bias": "model-00004-of-00004.safetensors",
230
+ "model.layers.25.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
231
+ "model.layers.25.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
232
+ "model.layers.25.self_attn.q_proj.bias": "model-00004-of-00004.safetensors",
233
+ "model.layers.25.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
234
+ "model.layers.25.self_attn.v_proj.bias": "model-00004-of-00004.safetensors",
235
+ "model.layers.25.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
236
+ "model.layers.26.input_layernorm.weight": "model-00004-of-00004.safetensors",
237
+ "model.layers.26.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
238
+ "model.layers.26.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
239
+ "model.layers.26.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
240
+ "model.layers.26.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
241
+ "model.layers.26.self_attn.k_proj.bias": "model-00004-of-00004.safetensors",
242
+ "model.layers.26.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
243
+ "model.layers.26.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
244
+ "model.layers.26.self_attn.q_proj.bias": "model-00004-of-00004.safetensors",
245
+ "model.layers.26.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
246
+ "model.layers.26.self_attn.v_proj.bias": "model-00004-of-00004.safetensors",
247
+ "model.layers.26.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
248
+ "model.layers.27.input_layernorm.weight": "model-00004-of-00004.safetensors",
249
+ "model.layers.27.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
250
+ "model.layers.27.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
251
+ "model.layers.27.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
252
+ "model.layers.27.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
253
+ "model.layers.27.self_attn.k_proj.bias": "model-00004-of-00004.safetensors",
254
+ "model.layers.27.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
255
+ "model.layers.27.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
256
+ "model.layers.27.self_attn.q_proj.bias": "model-00004-of-00004.safetensors",
257
+ "model.layers.27.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
258
+ "model.layers.27.self_attn.v_proj.bias": "model-00004-of-00004.safetensors",
259
+ "model.layers.27.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
260
+ "model.layers.3.input_layernorm.weight": "model-00002-of-00004.safetensors",
261
+ "model.layers.3.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
262
+ "model.layers.3.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
263
+ "model.layers.3.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
264
+ "model.layers.3.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
265
+ "model.layers.3.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
266
+ "model.layers.3.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
267
+ "model.layers.3.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
268
+ "model.layers.3.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
269
+ "model.layers.3.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
270
+ "model.layers.3.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
271
+ "model.layers.3.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
272
+ "model.layers.4.input_layernorm.weight": "model-00002-of-00004.safetensors",
273
+ "model.layers.4.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
274
+ "model.layers.4.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
275
+ "model.layers.4.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
276
+ "model.layers.4.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
277
+ "model.layers.4.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
278
+ "model.layers.4.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
279
+ "model.layers.4.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
280
+ "model.layers.4.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
281
+ "model.layers.4.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
282
+ "model.layers.4.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
283
+ "model.layers.4.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
284
+ "model.layers.5.input_layernorm.weight": "model-00002-of-00004.safetensors",
285
+ "model.layers.5.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
286
+ "model.layers.5.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
287
+ "model.layers.5.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
288
+ "model.layers.5.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
289
+ "model.layers.5.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
290
+ "model.layers.5.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
291
+ "model.layers.5.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
292
+ "model.layers.5.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
293
+ "model.layers.5.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
294
+ "model.layers.5.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
295
+ "model.layers.5.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
296
+ "model.layers.6.input_layernorm.weight": "model-00002-of-00004.safetensors",
297
+ "model.layers.6.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
298
+ "model.layers.6.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
299
+ "model.layers.6.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
300
+ "model.layers.6.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
301
+ "model.layers.6.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
302
+ "model.layers.6.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
303
+ "model.layers.6.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
304
+ "model.layers.6.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
305
+ "model.layers.6.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
306
+ "model.layers.6.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
307
+ "model.layers.6.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
308
+ "model.layers.7.input_layernorm.weight": "model-00002-of-00004.safetensors",
309
+ "model.layers.7.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
310
+ "model.layers.7.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
311
+ "model.layers.7.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
312
+ "model.layers.7.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
313
+ "model.layers.7.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
314
+ "model.layers.7.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
315
+ "model.layers.7.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
316
+ "model.layers.7.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
317
+ "model.layers.7.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
318
+ "model.layers.7.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
319
+ "model.layers.7.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
320
+ "model.layers.8.input_layernorm.weight": "model-00002-of-00004.safetensors",
321
+ "model.layers.8.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
322
+ "model.layers.8.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
323
+ "model.layers.8.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
324
+ "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
325
+ "model.layers.8.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
326
+ "model.layers.8.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
327
+ "model.layers.8.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
328
+ "model.layers.8.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
329
+ "model.layers.8.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
330
+ "model.layers.8.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
331
+ "model.layers.8.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
332
+ "model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
333
+ "model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
334
+ "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
335
+ "model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
336
+ "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
337
+ "model.layers.9.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
338
+ "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
339
+ "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
340
+ "model.layers.9.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
341
+ "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
342
+ "model.layers.9.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
343
+ "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
344
+ "model.norm.weight": "model-00004-of-00004.safetensors",
345
+ "visual.blocks.0.attn.proj.bias": "model-00001-of-00004.safetensors",
346
+ "visual.blocks.0.attn.proj.weight": "model-00001-of-00004.safetensors",
347
+ "visual.blocks.0.attn.qkv.bias": "model-00001-of-00004.safetensors",
348
+ "visual.blocks.0.attn.qkv.weight": "model-00001-of-00004.safetensors",
349
+ "visual.blocks.0.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
350
+ "visual.blocks.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
351
+ "visual.blocks.0.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
352
+ "visual.blocks.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
353
+ "visual.blocks.0.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
354
+ "visual.blocks.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
355
+ "visual.blocks.0.norm1.weight": "model-00001-of-00004.safetensors",
356
+ "visual.blocks.0.norm2.weight": "model-00001-of-00004.safetensors",
357
+ "visual.blocks.1.attn.proj.bias": "model-00001-of-00004.safetensors",
358
+ "visual.blocks.1.attn.proj.weight": "model-00001-of-00004.safetensors",
359
+ "visual.blocks.1.attn.qkv.bias": "model-00001-of-00004.safetensors",
360
+ "visual.blocks.1.attn.qkv.weight": "model-00001-of-00004.safetensors",
361
+ "visual.blocks.1.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
362
+ "visual.blocks.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
363
+ "visual.blocks.1.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
364
+ "visual.blocks.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
365
+ "visual.blocks.1.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
366
+ "visual.blocks.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
367
+ "visual.blocks.1.norm1.weight": "model-00001-of-00004.safetensors",
368
+ "visual.blocks.1.norm2.weight": "model-00001-of-00004.safetensors",
369
+ "visual.blocks.10.attn.proj.bias": "model-00001-of-00004.safetensors",
370
+ "visual.blocks.10.attn.proj.weight": "model-00001-of-00004.safetensors",
371
+ "visual.blocks.10.attn.qkv.bias": "model-00001-of-00004.safetensors",
372
+ "visual.blocks.10.attn.qkv.weight": "model-00001-of-00004.safetensors",
373
+ "visual.blocks.10.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
374
+ "visual.blocks.10.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
375
+ "visual.blocks.10.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
376
+ "visual.blocks.10.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
377
+ "visual.blocks.10.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
378
+ "visual.blocks.10.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
379
+ "visual.blocks.10.norm1.weight": "model-00001-of-00004.safetensors",
380
+ "visual.blocks.10.norm2.weight": "model-00001-of-00004.safetensors",
381
+ "visual.blocks.11.attn.proj.bias": "model-00001-of-00004.safetensors",
382
+ "visual.blocks.11.attn.proj.weight": "model-00001-of-00004.safetensors",
383
+ "visual.blocks.11.attn.qkv.bias": "model-00001-of-00004.safetensors",
384
+ "visual.blocks.11.attn.qkv.weight": "model-00001-of-00004.safetensors",
385
+ "visual.blocks.11.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
386
+ "visual.blocks.11.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
387
+ "visual.blocks.11.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
388
+ "visual.blocks.11.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
389
+ "visual.blocks.11.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
390
+ "visual.blocks.11.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
391
+ "visual.blocks.11.norm1.weight": "model-00001-of-00004.safetensors",
392
+ "visual.blocks.11.norm2.weight": "model-00001-of-00004.safetensors",
393
+ "visual.blocks.12.attn.proj.bias": "model-00001-of-00004.safetensors",
394
+ "visual.blocks.12.attn.proj.weight": "model-00001-of-00004.safetensors",
395
+ "visual.blocks.12.attn.qkv.bias": "model-00001-of-00004.safetensors",
396
+ "visual.blocks.12.attn.qkv.weight": "model-00001-of-00004.safetensors",
397
+ "visual.blocks.12.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
398
+ "visual.blocks.12.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
399
+ "visual.blocks.12.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
400
+ "visual.blocks.12.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
401
+ "visual.blocks.12.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
402
+ "visual.blocks.12.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
403
+ "visual.blocks.12.norm1.weight": "model-00001-of-00004.safetensors",
404
+ "visual.blocks.12.norm2.weight": "model-00001-of-00004.safetensors",
405
+ "visual.blocks.13.attn.proj.bias": "model-00001-of-00004.safetensors",
406
+ "visual.blocks.13.attn.proj.weight": "model-00001-of-00004.safetensors",
407
+ "visual.blocks.13.attn.qkv.bias": "model-00001-of-00004.safetensors",
408
+ "visual.blocks.13.attn.qkv.weight": "model-00001-of-00004.safetensors",
409
+ "visual.blocks.13.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
410
+ "visual.blocks.13.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
411
+ "visual.blocks.13.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
412
+ "visual.blocks.13.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
413
+ "visual.blocks.13.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
414
+ "visual.blocks.13.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
415
+ "visual.blocks.13.norm1.weight": "model-00001-of-00004.safetensors",
416
+ "visual.blocks.13.norm2.weight": "model-00001-of-00004.safetensors",
417
+ "visual.blocks.14.attn.proj.bias": "model-00001-of-00004.safetensors",
418
+ "visual.blocks.14.attn.proj.weight": "model-00001-of-00004.safetensors",
419
+ "visual.blocks.14.attn.qkv.bias": "model-00001-of-00004.safetensors",
420
+ "visual.blocks.14.attn.qkv.weight": "model-00001-of-00004.safetensors",
421
+ "visual.blocks.14.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
422
+ "visual.blocks.14.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
423
+ "visual.blocks.14.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
424
+ "visual.blocks.14.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
425
+ "visual.blocks.14.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
426
+ "visual.blocks.14.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
427
+ "visual.blocks.14.norm1.weight": "model-00001-of-00004.safetensors",
428
+ "visual.blocks.14.norm2.weight": "model-00001-of-00004.safetensors",
429
+ "visual.blocks.15.attn.proj.bias": "model-00001-of-00004.safetensors",
430
+ "visual.blocks.15.attn.proj.weight": "model-00001-of-00004.safetensors",
431
+ "visual.blocks.15.attn.qkv.bias": "model-00001-of-00004.safetensors",
432
+ "visual.blocks.15.attn.qkv.weight": "model-00001-of-00004.safetensors",
433
+ "visual.blocks.15.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
434
+ "visual.blocks.15.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
435
+ "visual.blocks.15.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
436
+ "visual.blocks.15.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
437
+ "visual.blocks.15.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
438
+ "visual.blocks.15.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
439
+ "visual.blocks.15.norm1.weight": "model-00001-of-00004.safetensors",
440
+ "visual.blocks.15.norm2.weight": "model-00001-of-00004.safetensors",
441
+ "visual.blocks.16.attn.proj.bias": "model-00001-of-00004.safetensors",
442
+ "visual.blocks.16.attn.proj.weight": "model-00001-of-00004.safetensors",
443
+ "visual.blocks.16.attn.qkv.bias": "model-00001-of-00004.safetensors",
444
+ "visual.blocks.16.attn.qkv.weight": "model-00001-of-00004.safetensors",
445
+ "visual.blocks.16.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
446
+ "visual.blocks.16.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
447
+ "visual.blocks.16.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
448
+ "visual.blocks.16.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
449
+ "visual.blocks.16.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
450
+ "visual.blocks.16.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
451
+ "visual.blocks.16.norm1.weight": "model-00001-of-00004.safetensors",
452
+ "visual.blocks.16.norm2.weight": "model-00001-of-00004.safetensors",
453
+ "visual.blocks.17.attn.proj.bias": "model-00001-of-00004.safetensors",
454
+ "visual.blocks.17.attn.proj.weight": "model-00001-of-00004.safetensors",
455
+ "visual.blocks.17.attn.qkv.bias": "model-00001-of-00004.safetensors",
456
+ "visual.blocks.17.attn.qkv.weight": "model-00001-of-00004.safetensors",
457
+ "visual.blocks.17.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
458
+ "visual.blocks.17.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
459
+ "visual.blocks.17.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
460
+ "visual.blocks.17.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
461
+ "visual.blocks.17.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
462
+ "visual.blocks.17.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
463
+ "visual.blocks.17.norm1.weight": "model-00001-of-00004.safetensors",
464
+ "visual.blocks.17.norm2.weight": "model-00001-of-00004.safetensors",
465
+ "visual.blocks.18.attn.proj.bias": "model-00001-of-00004.safetensors",
466
+ "visual.blocks.18.attn.proj.weight": "model-00001-of-00004.safetensors",
467
+ "visual.blocks.18.attn.qkv.bias": "model-00001-of-00004.safetensors",
468
+ "visual.blocks.18.attn.qkv.weight": "model-00001-of-00004.safetensors",
469
+ "visual.blocks.18.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
470
+ "visual.blocks.18.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
471
+ "visual.blocks.18.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
472
+ "visual.blocks.18.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
473
+ "visual.blocks.18.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
474
+ "visual.blocks.18.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
475
+ "visual.blocks.18.norm1.weight": "model-00001-of-00004.safetensors",
476
+ "visual.blocks.18.norm2.weight": "model-00001-of-00004.safetensors",
477
+ "visual.blocks.19.attn.proj.bias": "model-00001-of-00004.safetensors",
478
+ "visual.blocks.19.attn.proj.weight": "model-00001-of-00004.safetensors",
479
+ "visual.blocks.19.attn.qkv.bias": "model-00001-of-00004.safetensors",
480
+ "visual.blocks.19.attn.qkv.weight": "model-00001-of-00004.safetensors",
481
+ "visual.blocks.19.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
482
+ "visual.blocks.19.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
483
+ "visual.blocks.19.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
484
+ "visual.blocks.19.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
485
+ "visual.blocks.19.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
486
+ "visual.blocks.19.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
487
+ "visual.blocks.19.norm1.weight": "model-00001-of-00004.safetensors",
488
+ "visual.blocks.19.norm2.weight": "model-00001-of-00004.safetensors",
489
+ "visual.blocks.2.attn.proj.bias": "model-00001-of-00004.safetensors",
490
+ "visual.blocks.2.attn.proj.weight": "model-00001-of-00004.safetensors",
491
+ "visual.blocks.2.attn.qkv.bias": "model-00001-of-00004.safetensors",
492
+ "visual.blocks.2.attn.qkv.weight": "model-00001-of-00004.safetensors",
493
+ "visual.blocks.2.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
494
+ "visual.blocks.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
495
+ "visual.blocks.2.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
496
+ "visual.blocks.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
497
+ "visual.blocks.2.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
498
+ "visual.blocks.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
499
+ "visual.blocks.2.norm1.weight": "model-00001-of-00004.safetensors",
500
+ "visual.blocks.2.norm2.weight": "model-00001-of-00004.safetensors",
501
+ "visual.blocks.20.attn.proj.bias": "model-00001-of-00004.safetensors",
502
+ "visual.blocks.20.attn.proj.weight": "model-00001-of-00004.safetensors",
503
+ "visual.blocks.20.attn.qkv.bias": "model-00001-of-00004.safetensors",
504
+ "visual.blocks.20.attn.qkv.weight": "model-00001-of-00004.safetensors",
505
+ "visual.blocks.20.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
506
+ "visual.blocks.20.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
507
+ "visual.blocks.20.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
508
+ "visual.blocks.20.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
509
+ "visual.blocks.20.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
510
+ "visual.blocks.20.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
511
+ "visual.blocks.20.norm1.weight": "model-00001-of-00004.safetensors",
512
+ "visual.blocks.20.norm2.weight": "model-00001-of-00004.safetensors",
513
+ "visual.blocks.21.attn.proj.bias": "model-00001-of-00004.safetensors",
514
+ "visual.blocks.21.attn.proj.weight": "model-00001-of-00004.safetensors",
515
+ "visual.blocks.21.attn.qkv.bias": "model-00001-of-00004.safetensors",
516
+ "visual.blocks.21.attn.qkv.weight": "model-00001-of-00004.safetensors",
517
+ "visual.blocks.21.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
518
+ "visual.blocks.21.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
519
+ "visual.blocks.21.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
520
+ "visual.blocks.21.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
521
+ "visual.blocks.21.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
522
+ "visual.blocks.21.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
523
+ "visual.blocks.21.norm1.weight": "model-00001-of-00004.safetensors",
524
+ "visual.blocks.21.norm2.weight": "model-00001-of-00004.safetensors",
525
+ "visual.blocks.22.attn.proj.bias": "model-00001-of-00004.safetensors",
526
+ "visual.blocks.22.attn.proj.weight": "model-00001-of-00004.safetensors",
527
+ "visual.blocks.22.attn.qkv.bias": "model-00001-of-00004.safetensors",
528
+ "visual.blocks.22.attn.qkv.weight": "model-00001-of-00004.safetensors",
529
+ "visual.blocks.22.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
530
+ "visual.blocks.22.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
531
+ "visual.blocks.22.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
532
+ "visual.blocks.22.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
533
+ "visual.blocks.22.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
534
+ "visual.blocks.22.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
535
+ "visual.blocks.22.norm1.weight": "model-00001-of-00004.safetensors",
536
+ "visual.blocks.22.norm2.weight": "model-00001-of-00004.safetensors",
537
+ "visual.blocks.23.attn.proj.bias": "model-00001-of-00004.safetensors",
538
+ "visual.blocks.23.attn.proj.weight": "model-00001-of-00004.safetensors",
539
+ "visual.blocks.23.attn.qkv.bias": "model-00001-of-00004.safetensors",
540
+ "visual.blocks.23.attn.qkv.weight": "model-00001-of-00004.safetensors",
541
+ "visual.blocks.23.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
542
+ "visual.blocks.23.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
543
+ "visual.blocks.23.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
544
+ "visual.blocks.23.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
545
+ "visual.blocks.23.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
546
+ "visual.blocks.23.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
547
+ "visual.blocks.23.norm1.weight": "model-00001-of-00004.safetensors",
548
+ "visual.blocks.23.norm2.weight": "model-00001-of-00004.safetensors",
549
+ "visual.blocks.24.attn.proj.bias": "model-00001-of-00004.safetensors",
550
+ "visual.blocks.24.attn.proj.weight": "model-00001-of-00004.safetensors",
551
+ "visual.blocks.24.attn.qkv.bias": "model-00001-of-00004.safetensors",
552
+ "visual.blocks.24.attn.qkv.weight": "model-00001-of-00004.safetensors",
553
+ "visual.blocks.24.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
554
+ "visual.blocks.24.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
555
+ "visual.blocks.24.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
556
+ "visual.blocks.24.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
557
+ "visual.blocks.24.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
558
+ "visual.blocks.24.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
559
+ "visual.blocks.24.norm1.weight": "model-00001-of-00004.safetensors",
560
+ "visual.blocks.24.norm2.weight": "model-00001-of-00004.safetensors",
561
+ "visual.blocks.25.attn.proj.bias": "model-00001-of-00004.safetensors",
562
+ "visual.blocks.25.attn.proj.weight": "model-00001-of-00004.safetensors",
563
+ "visual.blocks.25.attn.qkv.bias": "model-00001-of-00004.safetensors",
564
+ "visual.blocks.25.attn.qkv.weight": "model-00001-of-00004.safetensors",
565
+ "visual.blocks.25.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
566
+ "visual.blocks.25.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
567
+ "visual.blocks.25.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
568
+ "visual.blocks.25.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
569
+ "visual.blocks.25.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
570
+ "visual.blocks.25.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
571
+ "visual.blocks.25.norm1.weight": "model-00001-of-00004.safetensors",
572
+ "visual.blocks.25.norm2.weight": "model-00001-of-00004.safetensors",
573
+ "visual.blocks.26.attn.proj.bias": "model-00001-of-00004.safetensors",
574
+ "visual.blocks.26.attn.proj.weight": "model-00001-of-00004.safetensors",
575
+ "visual.blocks.26.attn.qkv.bias": "model-00001-of-00004.safetensors",
576
+ "visual.blocks.26.attn.qkv.weight": "model-00001-of-00004.safetensors",
577
+ "visual.blocks.26.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
578
+ "visual.blocks.26.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
579
+ "visual.blocks.26.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
580
+ "visual.blocks.26.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
581
+ "visual.blocks.26.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
582
+ "visual.blocks.26.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
583
+ "visual.blocks.26.norm1.weight": "model-00001-of-00004.safetensors",
584
+ "visual.blocks.26.norm2.weight": "model-00001-of-00004.safetensors",
585
+ "visual.blocks.27.attn.proj.bias": "model-00001-of-00004.safetensors",
586
+ "visual.blocks.27.attn.proj.weight": "model-00001-of-00004.safetensors",
587
+ "visual.blocks.27.attn.qkv.bias": "model-00001-of-00004.safetensors",
588
+ "visual.blocks.27.attn.qkv.weight": "model-00001-of-00004.safetensors",
589
+ "visual.blocks.27.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
590
+ "visual.blocks.27.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
591
+ "visual.blocks.27.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
592
+ "visual.blocks.27.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
593
+ "visual.blocks.27.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
594
+ "visual.blocks.27.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
595
+ "visual.blocks.27.norm1.weight": "model-00001-of-00004.safetensors",
596
+ "visual.blocks.27.norm2.weight": "model-00001-of-00004.safetensors",
597
+ "visual.blocks.28.attn.proj.bias": "model-00001-of-00004.safetensors",
598
+ "visual.blocks.28.attn.proj.weight": "model-00001-of-00004.safetensors",
599
+ "visual.blocks.28.attn.qkv.bias": "model-00001-of-00004.safetensors",
600
+ "visual.blocks.28.attn.qkv.weight": "model-00001-of-00004.safetensors",
601
+ "visual.blocks.28.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
602
+ "visual.blocks.28.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
603
+ "visual.blocks.28.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
604
+ "visual.blocks.28.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
605
+ "visual.blocks.28.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
606
+ "visual.blocks.28.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
607
+ "visual.blocks.28.norm1.weight": "model-00001-of-00004.safetensors",
608
+ "visual.blocks.28.norm2.weight": "model-00001-of-00004.safetensors",
609
+ "visual.blocks.29.attn.proj.bias": "model-00001-of-00004.safetensors",
610
+ "visual.blocks.29.attn.proj.weight": "model-00001-of-00004.safetensors",
611
+ "visual.blocks.29.attn.qkv.bias": "model-00001-of-00004.safetensors",
612
+ "visual.blocks.29.attn.qkv.weight": "model-00001-of-00004.safetensors",
613
+ "visual.blocks.29.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
614
+ "visual.blocks.29.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
615
+ "visual.blocks.29.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
616
+ "visual.blocks.29.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
617
+ "visual.blocks.29.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
618
+ "visual.blocks.29.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
619
+ "visual.blocks.29.norm1.weight": "model-00001-of-00004.safetensors",
620
+ "visual.blocks.29.norm2.weight": "model-00001-of-00004.safetensors",
621
+ "visual.blocks.3.attn.proj.bias": "model-00001-of-00004.safetensors",
622
+ "visual.blocks.3.attn.proj.weight": "model-00001-of-00004.safetensors",
623
+ "visual.blocks.3.attn.qkv.bias": "model-00001-of-00004.safetensors",
624
+ "visual.blocks.3.attn.qkv.weight": "model-00001-of-00004.safetensors",
625
+ "visual.blocks.3.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
626
+ "visual.blocks.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
627
+ "visual.blocks.3.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
628
+ "visual.blocks.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
629
+ "visual.blocks.3.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
630
+ "visual.blocks.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
631
+ "visual.blocks.3.norm1.weight": "model-00001-of-00004.safetensors",
632
+ "visual.blocks.3.norm2.weight": "model-00001-of-00004.safetensors",
633
+ "visual.blocks.30.attn.proj.bias": "model-00001-of-00004.safetensors",
634
+ "visual.blocks.30.attn.proj.weight": "model-00001-of-00004.safetensors",
635
+ "visual.blocks.30.attn.qkv.bias": "model-00001-of-00004.safetensors",
636
+ "visual.blocks.30.attn.qkv.weight": "model-00001-of-00004.safetensors",
637
+ "visual.blocks.30.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
638
+ "visual.blocks.30.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
639
+ "visual.blocks.30.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
640
+ "visual.blocks.30.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
641
+ "visual.blocks.30.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
642
+ "visual.blocks.30.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
643
+ "visual.blocks.30.norm1.weight": "model-00001-of-00004.safetensors",
644
+ "visual.blocks.30.norm2.weight": "model-00001-of-00004.safetensors",
645
+ "visual.blocks.31.attn.proj.bias": "model-00001-of-00004.safetensors",
646
+ "visual.blocks.31.attn.proj.weight": "model-00001-of-00004.safetensors",
647
+ "visual.blocks.31.attn.qkv.bias": "model-00001-of-00004.safetensors",
648
+ "visual.blocks.31.attn.qkv.weight": "model-00001-of-00004.safetensors",
649
+ "visual.blocks.31.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
650
+ "visual.blocks.31.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
651
+ "visual.blocks.31.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
652
+ "visual.blocks.31.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
653
+ "visual.blocks.31.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
654
+ "visual.blocks.31.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
655
+ "visual.blocks.31.norm1.weight": "model-00001-of-00004.safetensors",
656
+ "visual.blocks.31.norm2.weight": "model-00001-of-00004.safetensors",
657
+ "visual.blocks.4.attn.proj.bias": "model-00001-of-00004.safetensors",
658
+ "visual.blocks.4.attn.proj.weight": "model-00001-of-00004.safetensors",
659
+ "visual.blocks.4.attn.qkv.bias": "model-00001-of-00004.safetensors",
660
+ "visual.blocks.4.attn.qkv.weight": "model-00001-of-00004.safetensors",
661
+ "visual.blocks.4.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
662
+ "visual.blocks.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
663
+ "visual.blocks.4.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
664
+ "visual.blocks.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
665
+ "visual.blocks.4.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
666
+ "visual.blocks.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
667
+ "visual.blocks.4.norm1.weight": "model-00001-of-00004.safetensors",
668
+ "visual.blocks.4.norm2.weight": "model-00001-of-00004.safetensors",
669
+ "visual.blocks.5.attn.proj.bias": "model-00001-of-00004.safetensors",
670
+ "visual.blocks.5.attn.proj.weight": "model-00001-of-00004.safetensors",
671
+ "visual.blocks.5.attn.qkv.bias": "model-00001-of-00004.safetensors",
672
+ "visual.blocks.5.attn.qkv.weight": "model-00001-of-00004.safetensors",
673
+ "visual.blocks.5.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
674
+ "visual.blocks.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
675
+ "visual.blocks.5.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
676
+ "visual.blocks.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
677
+ "visual.blocks.5.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
678
+ "visual.blocks.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
679
+ "visual.blocks.5.norm1.weight": "model-00001-of-00004.safetensors",
680
+ "visual.blocks.5.norm2.weight": "model-00001-of-00004.safetensors",
681
+ "visual.blocks.6.attn.proj.bias": "model-00001-of-00004.safetensors",
682
+ "visual.blocks.6.attn.proj.weight": "model-00001-of-00004.safetensors",
683
+ "visual.blocks.6.attn.qkv.bias": "model-00001-of-00004.safetensors",
684
+ "visual.blocks.6.attn.qkv.weight": "model-00001-of-00004.safetensors",
685
+ "visual.blocks.6.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
686
+ "visual.blocks.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
687
+ "visual.blocks.6.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
688
+ "visual.blocks.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
689
+ "visual.blocks.6.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
690
+ "visual.blocks.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
691
+ "visual.blocks.6.norm1.weight": "model-00001-of-00004.safetensors",
692
+ "visual.blocks.6.norm2.weight": "model-00001-of-00004.safetensors",
693
+ "visual.blocks.7.attn.proj.bias": "model-00001-of-00004.safetensors",
694
+ "visual.blocks.7.attn.proj.weight": "model-00001-of-00004.safetensors",
695
+ "visual.blocks.7.attn.qkv.bias": "model-00001-of-00004.safetensors",
696
+ "visual.blocks.7.attn.qkv.weight": "model-00001-of-00004.safetensors",
697
+ "visual.blocks.7.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
698
+ "visual.blocks.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
699
+ "visual.blocks.7.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
700
+ "visual.blocks.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
701
+ "visual.blocks.7.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
702
+ "visual.blocks.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
703
+ "visual.blocks.7.norm1.weight": "model-00001-of-00004.safetensors",
704
+ "visual.blocks.7.norm2.weight": "model-00001-of-00004.safetensors",
705
+ "visual.blocks.8.attn.proj.bias": "model-00001-of-00004.safetensors",
706
+ "visual.blocks.8.attn.proj.weight": "model-00001-of-00004.safetensors",
707
+ "visual.blocks.8.attn.qkv.bias": "model-00001-of-00004.safetensors",
708
+ "visual.blocks.8.attn.qkv.weight": "model-00001-of-00004.safetensors",
709
+ "visual.blocks.8.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
710
+ "visual.blocks.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
711
+ "visual.blocks.8.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
712
+ "visual.blocks.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
713
+ "visual.blocks.8.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
714
+ "visual.blocks.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
715
+ "visual.blocks.8.norm1.weight": "model-00001-of-00004.safetensors",
716
+ "visual.blocks.8.norm2.weight": "model-00001-of-00004.safetensors",
717
+ "visual.blocks.9.attn.proj.bias": "model-00001-of-00004.safetensors",
718
+ "visual.blocks.9.attn.proj.weight": "model-00001-of-00004.safetensors",
719
+ "visual.blocks.9.attn.qkv.bias": "model-00001-of-00004.safetensors",
720
+ "visual.blocks.9.attn.qkv.weight": "model-00001-of-00004.safetensors",
721
+ "visual.blocks.9.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
722
+ "visual.blocks.9.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
723
+ "visual.blocks.9.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
724
+ "visual.blocks.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
725
+ "visual.blocks.9.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
726
+ "visual.blocks.9.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
727
+ "visual.blocks.9.norm1.weight": "model-00001-of-00004.safetensors",
728
+ "visual.blocks.9.norm2.weight": "model-00001-of-00004.safetensors",
729
+ "visual.merger.ln_q.weight": "model-00001-of-00004.safetensors",
730
+ "visual.merger.mlp.0.bias": "model-00001-of-00004.safetensors",
731
+ "visual.merger.mlp.0.weight": "model-00001-of-00004.safetensors",
732
+ "visual.merger.mlp.2.bias": "model-00001-of-00004.safetensors",
733
+ "visual.merger.mlp.2.weight": "model-00001-of-00004.safetensors",
734
+ "visual.patch_embed.proj.weight": "model-00001-of-00004.safetensors",
735
+ "visual.q_former_1.attn.k_proj.bias": "model-00001-of-00004.safetensors",
736
+ "visual.q_former_1.attn.k_proj.weight": "model-00001-of-00004.safetensors",
737
+ "visual.q_former_1.attn.o_proj.weight": "model-00001-of-00004.safetensors",
738
+ "visual.q_former_1.attn.q_proj.bias": "model-00001-of-00004.safetensors",
739
+ "visual.q_former_1.attn.q_proj.weight": "model-00001-of-00004.safetensors",
740
+ "visual.q_former_1.attn.v_proj.bias": "model-00001-of-00004.safetensors",
741
+ "visual.q_former_1.attn.v_proj.weight": "model-00001-of-00004.safetensors",
742
+ "visual.q_former_1.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
743
+ "visual.q_former_1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
744
+ "visual.q_former_1.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
745
+ "visual.q_former_1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
746
+ "visual.q_former_1.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
747
+ "visual.q_former_1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
748
+ "visual.q_former_1.norm1.weight": "model-00001-of-00004.safetensors",
749
+ "visual.q_former_1.norm2.weight": "model-00001-of-00004.safetensors",
750
+ "visual.q_former_2.attn.k_proj.bias": "model-00001-of-00004.safetensors",
751
+ "visual.q_former_2.attn.k_proj.weight": "model-00001-of-00004.safetensors",
752
+ "visual.q_former_2.attn.o_proj.weight": "model-00001-of-00004.safetensors",
753
+ "visual.q_former_2.attn.q_proj.bias": "model-00001-of-00004.safetensors",
754
+ "visual.q_former_2.attn.q_proj.weight": "model-00001-of-00004.safetensors",
755
+ "visual.q_former_2.attn.v_proj.bias": "model-00001-of-00004.safetensors",
756
+ "visual.q_former_2.attn.v_proj.weight": "model-00001-of-00004.safetensors",
757
+ "visual.q_former_2.mlp.down_proj.bias": "model-00001-of-00004.safetensors",
758
+ "visual.q_former_2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
759
+ "visual.q_former_2.mlp.gate_proj.bias": "model-00001-of-00004.safetensors",
760
+ "visual.q_former_2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
761
+ "visual.q_former_2.mlp.up_proj.bias": "model-00001-of-00004.safetensors",
762
+ "visual.q_former_2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
763
+ "visual.q_former_2.norm1.weight": "model-00001-of-00004.safetensors",
764
+ "visual.q_former_2.norm2.weight": "model-00001-of-00004.safetensors",
765
+ "visual.q_former_norm.weight": "model-00001-of-00004.safetensors",
766
+ "visual.q_former_queries": "model-00001-of-00004.safetensors",
767
+ "visual.spatial_encoder.blocks.0.attn.proj.bias": "model-00001-of-00004.safetensors",
768
+ "visual.spatial_encoder.blocks.0.attn.proj.weight": "model-00001-of-00004.safetensors",
769
+ "visual.spatial_encoder.blocks.0.attn.qkv.bias": "model-00001-of-00004.safetensors",
770
+ "visual.spatial_encoder.blocks.0.attn.qkv.weight": "model-00001-of-00004.safetensors",
771
+ "visual.spatial_encoder.blocks.0.ls1.gamma": "model-00001-of-00004.safetensors",
772
+ "visual.spatial_encoder.blocks.0.ls2.gamma": "model-00001-of-00004.safetensors",
773
+ "visual.spatial_encoder.blocks.0.mlp.fc1.bias": "model-00001-of-00004.safetensors",
774
+ "visual.spatial_encoder.blocks.0.mlp.fc1.weight": "model-00001-of-00004.safetensors",
775
+ "visual.spatial_encoder.blocks.0.mlp.fc2.bias": "model-00001-of-00004.safetensors",
776
+ "visual.spatial_encoder.blocks.0.mlp.fc2.weight": "model-00001-of-00004.safetensors",
777
+ "visual.spatial_encoder.blocks.0.norm1.bias": "model-00001-of-00004.safetensors",
778
+ "visual.spatial_encoder.blocks.0.norm1.weight": "model-00001-of-00004.safetensors",
779
+ "visual.spatial_encoder.blocks.0.norm2.bias": "model-00001-of-00004.safetensors",
780
+ "visual.spatial_encoder.blocks.0.norm2.weight": "model-00001-of-00004.safetensors",
781
+ "visual.spatial_encoder.blocks.1.attn.proj.bias": "model-00001-of-00004.safetensors",
782
+ "visual.spatial_encoder.blocks.1.attn.proj.weight": "model-00001-of-00004.safetensors",
783
+ "visual.spatial_encoder.blocks.1.attn.qkv.bias": "model-00001-of-00004.safetensors",
784
+ "visual.spatial_encoder.blocks.1.attn.qkv.weight": "model-00001-of-00004.safetensors",
785
+ "visual.spatial_encoder.blocks.1.ls1.gamma": "model-00001-of-00004.safetensors",
786
+ "visual.spatial_encoder.blocks.1.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.1.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.10.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.11.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.12.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.13.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.14.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.15.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.16.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.17.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.18.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.19.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.2.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.20.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.21.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.22.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.23.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.3.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.4.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.5.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.6.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.7.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.8.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.attn.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.attn.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.attn.qkv.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.attn.qkv.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.ls1.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.ls2.gamma": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.mlp.fc1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.mlp.fc1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.mlp.fc2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.mlp.fc2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.norm1.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.norm1.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.norm2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.blocks.9.norm2.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.cls_token": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.norm.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.norm.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.patch_embed.proj.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.patch_embed.proj.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.pos_embed": "model-00001-of-00004.safetensors",
+ "visual.spatial_encoder.register_tokens": "model-00001-of-00004.safetensors",
+ "visual.spatial_merger.ln_q.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_merger.mlp.0.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_merger.mlp.0.weight": "model-00001-of-00004.safetensors",
+ "visual.spatial_merger.mlp.2.bias": "model-00001-of-00004.safetensors",
+ "visual.spatial_merger.mlp.2.weight": "model-00001-of-00004.safetensors"
+ }
+ }
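The weight_map above pins every `visual.spatial_encoder` and `visual.spatial_merger` tensor to the first of the four shards. A minimal sketch of how such an index can be sanity-checked before loading (the inline dict holds a few entries copied from the map above; the helper itself is hypothetical, not part of this repo):

```python
from collections import Counter

# A few entries copied from the weight_map above; the full index maps every
# tensor name in the checkpoint to the shard file that stores it.
index = {
    "weight_map": {
        "visual.spatial_encoder.cls_token": "model-00001-of-00004.safetensors",
        "visual.spatial_encoder.pos_embed": "model-00001-of-00004.safetensors",
        "visual.spatial_merger.mlp.2.weight": "model-00001-of-00004.safetensors",
    }
}

# Count how many tensors each shard holds -- a quick consistency check on a
# sharded safetensors checkpoint (e.g. no tensor mapped to a missing file).
per_shard = Counter(index["weight_map"].values())
print(per_shard["model-00001-of-00004.safetensors"])  # -> 3
```

Loaders such as transformers read `model.safetensors.index.json` in exactly this shape to decide which shard to open for each tensor.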
preprocessor_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+ "do_convert_rgb": true,
+ "do_normalize": true,
+ "do_rescale": true,
+ "do_resize": true,
+ "image_mean": [
+ 0.48145466,
+ 0.4578275,
+ 0.40821073
+ ],
+ "image_processor_type": "Qwen2VLImageProcessor",
+ "image_std": [
+ 0.26862954,
+ 0.26130258,
+ 0.27577711
+ ],
+ "max_pixels": 230400,
+ "merge_size": 2,
+ "min_pixels": 784,
+ "patch_size": 14,
+ "processor_class": "Qwen2_5_VLProcessor",
+ "resample": 3,
+ "rescale_factor": 0.00392156862745098,
+ "size": {
+ "longest_edge": 230400,
+ "shortest_edge": 784
+ },
+ "temporal_patch_size": 2
+ }
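The resize bounds in this config interact: Qwen2VL-style image processors snap each image side to a multiple of `patch_size * merge_size` (14 * 2 = 28 here) while keeping the total pixel count inside `[min_pixels, max_pixels]` = [784, 230400]. A rough paraphrase of that logic, using the values above (the authoritative implementation is transformers' Qwen2VL image processor; treat this as an assumption-laden sketch):

```python
import math

def smart_resize(height, width, factor=28, min_pixels=784, max_pixels=230400):
    # Snap both sides to the nearest multiple of factor (patch_size * merge_size).
    h_bar = round(height / factor) * factor
    w_bar = round(width / factor) * factor
    if h_bar * w_bar > max_pixels:
        # Too many pixels: scale down, then floor to a multiple of factor.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    elif h_bar * w_bar < min_pixels:
        # Too few pixels: scale up, then ceil to a multiple of factor.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor
    return h_bar, w_bar

print(smart_resize(1080, 1920))  # -> (336, 616), i.e. 206976 pixels <= 230400
```

With these bounds a 1080x1920 frame is shrunk roughly 3x so its token budget stays under `max_pixels / (28 * 28)` ≈ 294 merged patches.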
special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
+ {
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "eos_token": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
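The map above declares `<|im_end|>` as the eos token, which matches the ChatML-style turn format Qwen-family models are trained on: each turn is bracketed by `<|im_start|>`/`<|im_end|>`, and generation stops when `<|im_end|>` (the eos token) is emitted. A minimal illustrative sketch of that convention (the real template is the `chat_template` in tokenizer_config.json; this helper is hypothetical):

```python
def to_chatml(messages):
    # Wrap each turn in the <|im_start|>/<|im_end|> control tokens; <|im_end|>
    # doubles as the eos token per the special_tokens_map above.
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    )

print(to_chatml([{"role": "user", "content": "hi"}]))
# -> <|im_start|>user
#    hi<|im_end|>
```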
tokenizer_config.json ADDED
@@ -0,0 +1,209 @@
+ {
+ "add_bos_token": false,
+ "add_prefix_space": false,
+ "added_tokens_decoder": {
+ "151643": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151644": {
+ "content": "<|im_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151645": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151646": {
+ "content": "<|object_ref_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151647": {
+ "content": "<|object_ref_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151648": {
+ "content": "<|box_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151649": {
+ "content": "<|box_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151650": {
+ "content": "<|quad_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151651": {
+ "content": "<|quad_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151652": {
+ "content": "<|vision_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151653": {
+ "content": "<|vision_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151654": {
+ "content": "<|vision_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151655": {
+ "content": "<|image_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151656": {
+ "content": "<|video_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151657": {
+ "content": "<tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151658": {
+ "content": "</tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151659": {
+ "content": "<|fim_prefix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151660": {
+ "content": "<|fim_middle|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151661": {
+ "content": "<|fim_suffix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151662": {
+ "content": "<|fim_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151663": {
+ "content": "<|repo_name|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151664": {
+ "content": "<|file_sep|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
187
+ "<|box_start|>",
188
+ "<|box_end|>",
189
+ "<|quad_start|>",
190
+ "<|quad_end|>",
191
+ "<|vision_start|>",
192
+ "<|vision_end|>",
193
+ "<|vision_pad|>",
194
+ "<|image_pad|>",
195
+ "<|video_pad|>"
196
+ ],
197
+ "bos_token": null,
198
+ "chat_template": "{% set image_count = namespace(value=0) %}{% set video_count = namespace(value=0) %}{% for message in messages %}{% if loop.first and message['role'] != 'system' %}<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n{% endif %}<|im_start|>{{ message['role'] }}\n{% if message['content'] is string %}{{ message['content'] }}<|im_end|>\n{% else %}{% for content in message['content'] %}{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}{% set image_count.value = image_count.value + 1 %}{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}<|vision_start|><|image_pad|><|vision_end|>{% elif content['type'] == 'video' or 'video' in content %}{% set video_count.value = video_count.value + 1 %}{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}<|vision_start|><|video_pad|><|vision_end|>{% elif 'text' in content %}{{ content['text'] }}{% endif %}{% endfor %}<|im_end|>\n{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}",
199
+ "clean_up_tokenization_spaces": false,
200
+ "eos_token": "<|im_end|>",
201
+ "errors": "replace",
202
+ "extra_special_tokens": {},
203
+ "model_max_length": 8192,
204
+ "pad_token": "<|endoftext|>",
205
+ "padding_side": "right",
206
+ "split_special_tokens": false,
207
+ "tokenizer_class": "Qwen2Tokenizer",
208
+ "unk_token": null
209
+ }
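The `chat_template` above wraps each message in `<|im_start|>`/`<|im_end|>` markers and injects a default system prompt when the first message is not a system message. A minimal pure-Python sketch of the template's text-only branch (the `apply_chat_template` helper here is hypothetical, written for illustration and not part of this repository; the real template is rendered by the tokenizer via Jinja):

```python
def apply_chat_template(messages, add_generation_prompt=False):
    """Mirror the text-only path of the Jinja chat_template above."""
    out = []
    # The template inserts a default system prompt if the first
    # message is not a system message.
    if messages and messages[0]["role"] != "system":
        out.append("<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n")
    # Each message is wrapped in <|im_start|>role ... <|im_end|> markers.
    for m in messages:
        out.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # With add_generation_prompt, the prompt ends with an open assistant turn.
    if add_generation_prompt:
        out.append("<|im_start|>assistant\n")
    return "".join(out)

prompt = apply_chat_template(
    [{"role": "user", "content": "Hi"}], add_generation_prompt=True
)
# prompt == "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
#           "<|im_start|>user\nHi<|im_end|>\n<|im_start|>assistant\n"
```

Image and video content is handled by a separate branch of the template that emits `<|vision_start|><|image_pad|><|vision_end|>` (or the `<|video_pad|>` equivalent) in place of the raw media.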
vocab.json ADDED
The diff for this file is too large to render. See raw diff