Rnfudge committed · verified
Commit ce2afea · 1 Parent(s): 72edaa7

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 2560,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": true,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,843 @@
+ ---
+ tags:
+ - unsloth
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:223748
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: What is the significance of the IPv6 multicast address ff02::1?
+   sentences:
+   - Felt board for classroom activities
+   - In the provided network output, the frequent appearance of `ff020000000000000000000000000001`
+     across various interfaces like `lo`, `eth0`, and `eth1` indicates that these interfaces
+     are correctly configured for basic IPv6 operations. Every active IPv6 interface
+     on a segment must listen for messages sent to `ff02::1` to participate in essential
+     link-local protocols, making its presence a standard and expected entry.
+   - Not all customizations are supported across all snapd image types or models. For
+     example, certain customizations might be unsupported for UC20+ or classic models,
+     leading to errors. Additionally, if a gadget snap itself defines `defaults` in
+     its `meta/gadget.yaml`, these can be overridden or complemented by the `Customizations`
+     provided during the `SetupSeed` call, affecting system services like SSH.
+ - source_sentence: vein
+   sentences:
+   - blood vessel
+   - 'The `hkdf.Key` function requires several inputs: the underlying hash function
+     for HMAC (e.g., `sha256.New`), the master `secret` material, an optional `salt`
+     value, context-specific `info`, and the desired `keyLen` for the output derived
+     key. These parameters collectively guide the key derivation process.'
+   - egg-laying
+ - source_sentence: How are special file types determined in file status?
+   sentences:
+   - Integrated into the *ensure loop*, the `TaskRunner`'s `Ensure` method is invoked
+     periodically to manage task execution. It's responsible for spawning goroutines
+     to concurrently execute task handlers, whether for their primary 'do' logic or
+     their 'undo' logic in case of failures. High-level system parts can also trigger
+     its execution proactively using `State.EnsureBefore`.
+   - File type identification within the `fileStat` population involves a critical
+     step where the `fs.sys.Mode` value is masked with `syscall.S_IFMT`. This operation
+     allows the function to discern whether the file is a block device (`S_IFBLK`),
+     a character device (`S_IFCHR`), a named pipe (`S_IFIFO`), a socket (`S_IFSOCK`),
+     or a regular file (`S_IFREG`), applying the appropriate `FileMode` flags.
+   - Volatility acceptance
+ - source_sentence: mitre
+   sentences:
+   - ocean liner
+   - It becomes necessary because, during the initial `mmap` of an output buffer, no
+     code signature typically exists. After the signature is finally created, the kernel's
+     cached view might not reflect this change. Therefore, `purgeSignatureCache` explicitly
+     clears this cache to prevent problems related to stale signature information.
+   - Clerical cap
+ - source_sentence: craniofacial
+   sentences:
+   - head and face structure
+   - Planned destruction of structures using explosives or machinery
+   - Anchor-positive pairs are fundamental to contrastive learning, serving to define
+     what the model should consider as semantically similar data points, guiding it
+     to learn meaningful representations.
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+ 
+ # SentenceTransformer
+ 
+ This model was finetuned with [Unsloth](https://github.com/unslothai/unsloth).
+ 
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ 
+ This is a [sentence-transformers](https://www.SBERT.net) model. It maps sentences & paragraphs to a 2560-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ 
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Output Dimensionality:** 2560 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+ 
+ ### Full Model Architecture
+ 
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
+   (1): Pooling({'word_embedding_dimension': 2560, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+ 
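+ The `Pooling` module takes the hidden state of each sequence's final non-padding token (`pooling_mode_lasttoken: True`), and the `Normalize` module then L2-normalizes it, which is why cosine similarity reduces to a dot product. As a rough illustration only (not the library's internal code), last-token pooling amounts to:
+ 
+ ```python
+ import torch
+ import torch.nn.functional as F
+ 
+ def last_token_pool(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
+     """(batch, seq, dim) hidden states -> (batch, dim) embeddings, assuming right padding."""
+     last_idx = attention_mask.sum(dim=1) - 1                        # index of last real token per row
+     batch_idx = torch.arange(hidden_states.size(0), device=hidden_states.device)
+     emb = hidden_states[batch_idx, last_idx]                        # gather the last-token states
+     return F.normalize(emb, p=2, dim=1)                             # mirrors the Normalize() module
+ ```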
+ ## Usage
+ 
+ ### Direct Usage (Sentence Transformers)
+ 
+ First install the Sentence Transformers library:
+ 
+ ```bash
+ pip install -U sentence-transformers
+ ```
+ 
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     'craniofacial',
+     'head and face structure',
+     'Anchor-positive pairs are fundamental to contrastive learning, serving to define what the model should consider as semantically similar data points, guiding it to learn meaningful representations.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 2560]
+ 
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ # tensor([[1.0000, 0.7268, 0.0036],
+ #         [0.7268, 1.0000, 0.0179],
+ #         [0.0036, 0.0179, 1.0000]])
+ ```
+ 
+ <!--
+ ### Direct Usage (Transformers)
+ 
+ <details><summary>Click to see the direct usage in Transformers</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+ 
+ You can finetune this model on your own dataset.
+ 
+ <details><summary>Click to expand</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 223,748 training samples
+ * Columns: <code>anchor</code> and <code>positive</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | anchor                                                                           | positive                                                                           |
+   |:--------|:---------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+   | type    | string                                                                           | string                                                                             |
+   | details | <ul><li>min: 2 tokens</li><li>mean: 8.95 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 38.48 tokens</li><li>max: 124 tokens</li></ul> |
+ * Samples:
+   | anchor | positive |
+   |:-------|:---------|
+   | <code>groupthink</code> | <code>Psychological tendency for group conformity</code> |
+   | <code>customs and border protection</code> | <code>DHS component enforcing trade and immigration laws</code> |
+   | <code>What is the meaning and purpose of the `//go:noescape` directive in Go functions?</code> | <code>The `//go:noescape` comment is a hint to the Go compiler. It asserts that none of the pointer parameters of the decorated function will escape the function's stack frame. This is primarily used for performance tuning in low-level code, ensuring that objects pointed to by function arguments are not allocated on the heap, thus avoiding garbage collection cycles.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim",
+       "gather_across_devices": false,
+       "directions": [
+           "query_to_doc"
+       ],
+       "partition_mode": "joint",
+       "hardness_mode": null,
+       "hardness_strength": 0.0
+   }
+   ```
+ 
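+ For orientation, this is roughly how anchor-positive training with `MultipleNegativesRankingLoss` is wired up in Sentence Transformers; a minimal sketch with placeholder data (the base model id comes from this repository's `config.json`; nothing else here is the original training script):
+ 
+ ```python
+ from datasets import Dataset
+ from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
+ from sentence_transformers.losses import MultipleNegativesRankingLoss
+ 
+ model = SentenceTransformer("unsloth/Qwen3-Embedding-4B")
+ train_dataset = Dataset.from_dict({
+     "anchor": ["groupthink", "vein"],
+     "positive": ["Psychological tendency for group conformity", "blood vessel"],
+ })
+ # Every other positive in the batch serves as an in-batch negative for each anchor.
+ loss = MultipleNegativesRankingLoss(model, scale=20.0)
+ trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
+ trainer.train()
+ ```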
+ ### Training Hyperparameters
+ 
+ #### Non-Default Hyperparameters
+ 
+ - `per_device_train_batch_size`: 64
+ - `gradient_accumulation_steps`: 8
+ - `learning_rate`: 3e-05
+ - `num_train_epochs`: 1
+ - `lr_scheduler_type`: constant_with_warmup
+ - `warmup_ratio`: 0.03
+ - `bf16`: True
+ - `batch_sampler`: no_duplicates
+ 
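+ These map one-to-one onto `SentenceTransformerTrainingArguments`; a hedged sketch reproducing just the non-default values above (`output_dir` is a placeholder):
+ 
+ ```python
+ from sentence_transformers import SentenceTransformerTrainingArguments
+ from sentence_transformers.training_args import BatchSamplers
+ 
+ args = SentenceTransformerTrainingArguments(
+     output_dir="outputs",                       # placeholder path
+     per_device_train_batch_size=64,
+     gradient_accumulation_steps=8,              # effective batch size 512 per optimizer step
+     learning_rate=3e-5,
+     num_train_epochs=1,
+     lr_scheduler_type="constant_with_warmup",
+     warmup_ratio=0.03,
+     bf16=True,
+     batch_sampler=BatchSamplers.NO_DUPLICATES,  # keeps duplicate texts out of a batch, protecting in-batch negatives
+ )
+ ```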
+ #### All Hyperparameters
+ 
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 64
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 8
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 3e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: constant_with_warmup
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.03
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+ 
+ </details>
+ 
+ ### Training Logs
+ <details><summary>Click to expand</summary>
+ 
+ | Epoch | Step | Training Loss |
+ |:------:|:----:|:-------------:|
+ | 0.0023 | 1 | 0.5184 |
+ | 0.0046 | 2 | 0.5683 |
+ | 0.0069 | 3 | 0.5821 |
+ | 0.0092 | 4 | 0.4948 |
+ | 0.0114 | 5 | 0.4001 |
+ | 0.0137 | 6 | 0.3097 |
+ | 0.0160 | 7 | 0.257 |
+ | 0.0183 | 8 | 0.2752 |
+ | 0.0206 | 9 | 0.2311 |
+ | 0.0229 | 10 | 0.1433 |
+ | 0.0252 | 11 | 0.2507 |
+ | 0.0275 | 12 | 0.1944 |
+ | 0.0297 | 13 | 0.2052 |
+ | 0.0320 | 14 | 0.1044 |
+ | 0.0343 | 15 | 0.2027 |
+ | 0.0366 | 16 | 0.1969 |
+ | 0.0389 | 17 | 0.1833 |
+ | 0.0412 | 18 | 0.1641 |
+ | 0.0435 | 19 | 0.1629 |
+ | 0.0458 | 20 | 0.1702 |
+ | 0.0480 | 21 | 0.1855 |
+ | 0.0503 | 22 | 0.1697 |
+ | 0.0526 | 23 | 0.116 |
+ | 0.0549 | 24 | 0.1373 |
+ | 0.0572 | 25 | 0.1323 |
+ | 0.0595 | 26 | 0.1349 |
+ | 0.0618 | 27 | 0.1199 |
+ | 0.0641 | 28 | 0.1353 |
+ | 0.0663 | 29 | 0.143 |
+ | 0.0686 | 30 | 0.1305 |
+ | 0.0709 | 31 | 0.1088 |
+ | 0.0732 | 32 | 0.0908 |
+ | 0.0755 | 33 | 0.1502 |
+ | 0.0778 | 34 | 0.1139 |
+ | 0.0801 | 35 | 0.1311 |
+ | 0.0824 | 36 | 0.1291 |
+ | 0.0846 | 37 | 0.0977 |
+ | 0.0869 | 38 | 0.0962 |
+ | 0.0892 | 39 | 0.1166 |
+ | 0.0915 | 40 | 0.0965 |
+ | 0.0938 | 41 | 0.1242 |
+ | 0.0961 | 42 | 0.0705 |
+ | 0.0984 | 43 | 0.0813 |
+ | 0.1007 | 44 | 0.1545 |
+ | 0.1029 | 45 | 0.0868 |
+ | 0.1052 | 46 | 0.0987 |
+ | 0.1075 | 47 | 0.0938 |
+ | 0.1098 | 48 | 0.1086 |
+ | 0.1121 | 49 | 0.0982 |
+ | 0.1144 | 50 | 0.0817 |
+ | 0.1167 | 51 | 0.0527 |
+ | 0.1190 | 52 | 0.0986 |
+ | 0.1212 | 53 | 0.098 |
+ | 0.1235 | 54 | 0.1074 |
+ | 0.1258 | 55 | 0.1396 |
+ | 0.1281 | 56 | 0.1101 |
+ | 0.1304 | 57 | 0.0829 |
+ | 0.1327 | 58 | 0.1261 |
+ | 0.1350 | 59 | 0.048 |
+ | 0.1373 | 60 | 0.1215 |
+ | 0.1395 | 61 | 0.0981 |
+ | 0.1418 | 62 | 0.0739 |
+ | 0.1441 | 63 | 0.0525 |
+ | 0.1464 | 64 | 0.0757 |
+ | 0.1487 | 65 | 0.0543 |
+ | 0.1510 | 66 | 0.0878 |
+ | 0.1533 | 67 | 0.0791 |
+ | 0.1556 | 68 | 0.0816 |
+ | 0.1578 | 69 | 0.0999 |
+ | 0.1601 | 70 | 0.086 |
+ | 0.1624 | 71 | 0.0775 |
+ | 0.1647 | 72 | 0.1048 |
+ | 0.1670 | 73 | 0.0552 |
+ | 0.1693 | 74 | 0.0619 |
+ | 0.1716 | 75 | 0.0667 |
+ | 0.1739 | 76 | 0.0787 |
+ | 0.1762 | 77 | 0.1022 |
+ | 0.1784 | 78 | 0.0937 |
+ | 0.1807 | 79 | 0.0751 |
+ | 0.1830 | 80 | 0.0642 |
+ | 0.1853 | 81 | 0.0508 |
+ | 0.1876 | 82 | 0.1169 |
+ | 0.1899 | 83 | 0.09 |
+ | 0.1922 | 84 | 0.0725 |
+ | 0.1945 | 85 | 0.0476 |
+ | 0.1967 | 86 | 0.0737 |
+ | 0.1990 | 87 | 0.0968 |
+ | 0.2013 | 88 | 0.0988 |
+ | 0.2036 | 89 | 0.0575 |
+ | 0.2059 | 90 | 0.0629 |
+ | 0.2082 | 91 | 0.0627 |
+ | 0.2105 | 92 | 0.0565 |
+ | 0.2128 | 93 | 0.0696 |
+ | 0.2150 | 94 | 0.0413 |
+ | 0.2173 | 95 | 0.0625 |
+ | 0.2196 | 96 | 0.0593 |
+ | 0.2219 | 97 | 0.0511 |
+ | 0.2242 | 98 | 0.1168 |
+ | 0.2265 | 99 | 0.0601 |
+ | 0.2288 | 100 | 0.0919 |
+ | 0.2311 | 101 | 0.0471 |
+ | 0.2333 | 102 | 0.0701 |
+ | 0.2356 | 103 | 0.1032 |
+ | 0.2379 | 104 | 0.0823 |
+ | 0.2402 | 105 | 0.0825 |
+ | 0.2425 | 106 | 0.0626 |
+ | 0.2448 | 107 | 0.0821 |
+ | 0.2471 | 108 | 0.0532 |
+ | 0.2494 | 109 | 0.1171 |
+ | 0.2516 | 110 | 0.0814 |
+ | 0.2539 | 111 | 0.1167 |
+ | 0.2562 | 112 | 0.0918 |
+ | 0.2585 | 113 | 0.0704 |
+ | 0.2608 | 114 | 0.0726 |
+ | 0.2631 | 115 | 0.0522 |
+ | 0.2654 | 116 | 0.0628 |
+ | 0.2677 | 117 | 0.0716 |
+ | 0.2699 | 118 | 0.0676 |
+ | 0.2722 | 119 | 0.0616 |
+ | 0.2745 | 120 | 0.0505 |
+ | 0.2768 | 121 | 0.0653 |
+ | 0.2791 | 122 | 0.051 |
+ | 0.2814 | 123 | 0.0888 |
+ | 0.2837 | 124 | 0.1061 |
+ | 0.2860 | 125 | 0.104 |
+ | 0.2882 | 126 | 0.095 |
+ | 0.2905 | 127 | 0.0715 |
+ | 0.2928 | 128 | 0.0766 |
+ | 0.2951 | 129 | 0.076 |
+ | 0.2974 | 130 | 0.1154 |
+ | 0.2997 | 131 | 0.0463 |
+ | 0.3020 | 132 | 0.0596 |
+ | 0.3043 | 133 | 0.0705 |
+ | 0.3065 | 134 | 0.0654 |
+ | 0.3088 | 135 | 0.0802 |
+ | 0.3111 | 136 | 0.0882 |
+ | 0.3134 | 137 | 0.0872 |
+ | 0.3157 | 138 | 0.0853 |
+ | 0.3180 | 139 | 0.0661 |
+ | 0.3203 | 140 | 0.0633 |
+ | 0.3226 | 141 | 0.0784 |
+ | 0.3248 | 142 | 0.0832 |
+ | 0.3271 | 143 | 0.0799 |
+ | 0.3294 | 144 | 0.0954 |
+ | 0.3317 | 145 | 0.0744 |
+ | 0.3340 | 146 | 0.0559 |
+ | 0.3363 | 147 | 0.0892 |
+ | 0.3386 | 148 | 0.0424 |
+ | 0.3409 | 149 | 0.0742 |
+ | 0.3432 | 150 | 0.1025 |
+ | 0.3454 | 151 | 0.0814 |
+ | 0.3477 | 152 | 0.051 |
+ | 0.3500 | 153 | 0.1313 |
+ | 0.3523 | 154 | 0.0645 |
+ | 0.3546 | 155 | 0.1006 |
+ | 0.3569 | 156 | 0.0524 |
+ | 0.3592 | 157 | 0.0635 |
+ | 0.3615 | 158 | 0.0467 |
+ | 0.3637 | 159 | 0.0741 |
+ | 0.3660 | 160 | 0.0593 |
+ | 0.3683 | 161 | 0.0698 |
+ | 0.3706 | 162 | 0.0835 |
+ | 0.3729 | 163 | 0.0715 |
+ | 0.3752 | 164 | 0.0628 |
+ | 0.3775 | 165 | 0.0772 |
+ | 0.3798 | 166 | 0.1167 |
+ | 0.3820 | 167 | 0.0981 |
+ | 0.3843 | 168 | 0.0595 |
+ | 0.3866 | 169 | 0.041 |
+ | 0.3889 | 170 | 0.0728 |
+ | 0.3912 | 171 | 0.0937 |
+ | 0.3935 | 172 | 0.0757 |
+ | 0.3958 | 173 | 0.0603 |
+ | 0.3981 | 174 | 0.0542 |
+ | 0.4003 | 175 | 0.0701 |
+ | 0.4026 | 176 | 0.0372 |
+ | 0.4049 | 177 | 0.125 |
+ | 0.4072 | 178 | 0.0545 |
+ | 0.4095 | 179 | 0.0476 |
+ | 0.4118 | 180 | 0.0516 |
+ | 0.4141 | 181 | 0.1243 |
+ | 0.4164 | 182 | 0.0599 |
+ | 0.4186 | 183 | 0.1026 |
+ | 0.4209 | 184 | 0.077 |
+ | 0.4232 | 185 | 0.0732 |
+ | 0.4255 | 186 | 0.0798 |
+ | 0.4278 | 187 | 0.0538 |
+ | 0.4301 | 188 | 0.0679 |
+ | 0.4324 | 189 | 0.0759 |
+ | 0.4347 | 190 | 0.0761 |
+ | 0.4369 | 191 | 0.0557 |
+ | 0.4392 | 192 | 0.0534 |
+ | 0.4415 | 193 | 0.0747 |
+ | 0.4438 | 194 | 0.0672 |
+ | 0.4461 | 195 | 0.0376 |
+ | 0.4484 | 196 | 0.0466 |
+ | 0.4507 | 197 | 0.0783 |
+ | 0.4530 | 198 | 0.0864 |
+ | 0.4552 | 199 | 0.0423 |
+ | 0.4575 | 200 | 0.0708 |
+ | 0.4598 | 201 | 0.0429 |
+ | 0.4621 | 202 | 0.0718 |
+ | 0.4644 | 203 | 0.0802 |
+ | 0.4667 | 204 | 0.073 |
+ | 0.4690 | 205 | 0.0628 |
+ | 0.4713 | 206 | 0.055 |
+ | 0.4735 | 207 | 0.0468 |
+ | 0.4758 | 208 | 0.0536 |
+ | 0.4781 | 209 | 0.0429 |
+ | 0.4804 | 210 | 0.0388 |
+ | 0.4827 | 211 | 0.0962 |
+ | 0.4850 | 212 | 0.0475 |
+ | 0.4873 | 213 | 0.0589 |
+ | 0.4896 | 214 | 0.0606 |
+ | 0.4919 | 215 | 0.0512 |
+ | 0.4941 | 216 | 0.0836 |
+ | 0.4964 | 217 | 0.0659 |
+ | 0.4987 | 218 | 0.0924 |
+ | 0.5010 | 219 | 0.0711 |
+ | 0.5033 | 220 | 0.0676 |
+ | 0.5056 | 221 | 0.0393 |
+ | 0.5079 | 222 | 0.0668 |
+ | 0.5102 | 223 | 0.0511 |
+ | 0.5124 | 224 | 0.0575 |
+ | 0.5147 | 225 | 0.0594 |
+ | 0.5170 | 226 | 0.126 |
+ | 0.5193 | 227 | 0.0787 |
+ | 0.5216 | 228 | 0.0509 |
+ | 0.5239 | 229 | 0.0684 |
+ | 0.5262 | 230 | 0.0792 |
+ | 0.5285 | 231 | 0.0501 |
+ | 0.5307 | 232 | 0.0988 |
+ | 0.5330 | 233 | 0.0414 |
+ | 0.5353 | 234 | 0.0596 |
+ | 0.5376 | 235 | 0.0607 |
+ | 0.5399 | 236 | 0.0556 |
+ | 0.5422 | 237 | 0.0578 |
+ | 0.5445 | 238 | 0.0238 |
+ | 0.5468 | 239 | 0.0509 |
+ | 0.5490 | 240 | 0.0431 |
+ | 0.5513 | 241 | 0.0377 |
+ | 0.5536 | 242 | 0.0814 |
+ | 0.5559 | 243 | 0.0779 |
+ | 0.5582 | 244 | 0.0574 |
+ | 0.5605 | 245 | 0.0681 |
+ | 0.5628 | 246 | 0.0513 |
+ | 0.5651 | 247 | 0.0573 |
+ | 0.5673 | 248 | 0.0758 |
+ | 0.5696 | 249 | 0.0442 |
+ | 0.5719 | 250 | 0.0458 |
+ | 0.5742 | 251 | 0.0853 |
+ | 0.5765 | 252 | 0.0825 |
+ | 0.5788 | 253 | 0.065 |
+ | 0.5811 | 254 | 0.0429 |
+ | 0.5834 | 255 | 0.0438 |
+ | 0.5856 | 256 | 0.1028 |
+ | 0.5879 | 257 | 0.04 |
+ | 0.5902 | 258 | 0.0406 |
+ | 0.5925 | 259 | 0.0465 |
+ | 0.5948 | 260 | 0.068 |
+ | 0.5971 | 261 | 0.0532 |
+ | 0.5994 | 262 | 0.0503 |
+ | 0.6017 | 263 | 0.0421 |
+ | 0.6039 | 264 | 0.0663 |
+ | 0.6062 | 265 | 0.0621 |
+ | 0.6085 | 266 | 0.0845 |
+ | 0.6108 | 267 | 0.049 |
+ | 0.6131 | 268 | 0.0503 |
+ | 0.6154 | 269 | 0.0392 |
+ | 0.6177 | 270 | 0.0505 |
+ | 0.6200 | 271 | 0.0594 |
+ | 0.6222 | 272 | 0.0573 |
+ | 0.6245 | 273 | 0.0383 |
+ | 0.6268 | 274 | 0.0568 |
+ | 0.6291 | 275 | 0.0386 |
+ | 0.6314 | 276 | 0.0573 |
+ | 0.6337 | 277 | 0.0397 |
+ | 0.6360 | 278 | 0.0459 |
+ | 0.6383 | 279 | 0.0624 |
+ | 0.6405 | 280 | 0.0706 |
+ | 0.6428 | 281 | 0.0743 |
+ | 0.6451 | 282 | 0.0405 |
+ | 0.6474 | 283 | 0.0761 |
+ | 0.6497 | 284 | 0.0583 |
+ | 0.6520 | 285 | 0.0444 |
+ | 0.6543 | 286 | 0.0305 |
+ | 0.6566 | 287 | 0.0716 |
+ | 0.6589 | 288 | 0.041 |
+ | 0.6611 | 289 | 0.043 |
+ | 0.6634 | 290 | 0.0574 |
+ | 0.6657 | 291 | 0.0479 |
+ | 0.6680 | 292 | 0.062 |
+ | 0.6703 | 293 | 0.0441 |
+ | 0.6726 | 294 | 0.0657 |
+ | 0.6749 | 295 | 0.0515 |
+ | 0.6772 | 296 | 0.0718 |
+ | 0.6794 | 297 | 0.0839 |
+ | 0.6817 | 298 | 0.0751 |
+ | 0.6840 | 299 | 0.073 |
+ | 0.6863 | 300 | 0.0656 |
+ | 0.6886 | 301 | 0.0717 |
+ | 0.6909 | 302 | 0.0457 |
+ | 0.6932 | 303 | 0.0761 |
+ | 0.6955 | 304 | 0.0557 |
+ | 0.6977 | 305 | 0.0646 |
+ | 0.7000 | 306 | 0.0688 |
+ | 0.7023 | 307 | 0.0396 |
+ | 0.7046 | 308 | 0.0444 |
+ | 0.7069 | 309 | 0.0627 |
+ | 0.7092 | 310 | 0.0594 |
+ | 0.7115 | 311 | 0.0496 |
+ | 0.7138 | 312 | 0.0406 |
+ | 0.7160 | 313 | 0.0513 |
+ | 0.7183 | 314 | 0.0483 |
+ | 0.7206 | 315 | 0.0527 |
+ | 0.7229 | 316 | 0.0646 |
+ | 0.7252 | 317 | 0.0351 |
+ | 0.7275 | 318 | 0.0432 |
+ | 0.7298 | 319 | 0.06 |
+ | 0.7321 | 320 | 0.0487 |
+ | 0.7343 | 321 | 0.0398 |
+ | 0.7366 | 322 | 0.0279 |
+ | 0.7389 | 323 | 0.0594 |
+ | 0.7412 | 324 | 0.0808 |
+ | 0.7435 | 325 | 0.0461 |
+ | 0.7458 | 326 | 0.0452 |
+ | 0.7481 | 327 | 0.0887 |
+ | 0.7504 | 328 | 0.057 |
+ | 0.7526 | 329 | 0.082 |
+ | 0.7549 | 330 | 0.0693 |
+ | 0.7572 | 331 | 0.0245 |
+ | 0.7595 | 332 | 0.0476 |
+ | 0.7618 | 333 | 0.051 |
+ | 0.7641 | 334 | 0.0539 |
+ | 0.7664 | 335 | 0.0325 |
+ | 0.7687 | 336 | 0.0431 |
+ | 0.7709 | 337 | 0.0534 |
+ | 0.7732 | 338 | 0.0346 |
+ | 0.7755 | 339 | 0.0577 |
+ | 0.7778 | 340 | 0.086 |
+ | 0.7801 | 341 | 0.0705 |
+ | 0.7824 | 342 | 0.0412 |
+ | 0.7847 | 343 | 0.0426 |
+ | 0.7870 | 344 | 0.0829 |
+ | 0.7892 | 345 | 0.0767 |
+ | 0.7915 | 346 | 0.0702 |
+ | 0.7938 | 347 | 0.0662 |
+ | 0.7961 | 348 | 0.0436 |
+ | 0.7984 | 349 | 0.0292 |
+ | 0.8007 | 350 | 0.0586 |
+ | 0.8030 | 351 | 0.0416 |
+ | 0.8053 | 352 | 0.0874 |
+ | 0.8075 | 353 | 0.0378 |
+ | 0.8098 | 354 | 0.036 |
+ | 0.8121 | 355 | 0.0426 |
+ | 0.8144 | 356 | 0.0375 |
+ | 0.8167 | 357 | 0.0296 |
+ | 0.8190 | 358 | 0.0535 |
+ | 0.8213 | 359 | 0.0654 |
+ | 0.8236 | 360 | 0.0756 |
+ | 0.8259 | 361 | 0.0591 |
+ | 0.8281 | 362 | 0.0603 |
+ | 0.8304 | 363 | 0.0664 |
+ | 0.8327 | 364 | 0.0403 |
+ | 0.8350 | 365 | 0.0418 |
+ | 0.8373 | 366 | 0.047 |
+ | 0.8396 | 367 | 0.077 |
+ | 0.8419 | 368 | 0.0597 |
+ | 0.8442 | 369 | 0.0683 |
+ | 0.8464 | 370 | 0.0557 |
+ | 0.8487 | 371 | 0.0487 |
+ | 0.8510 | 372 | 0.0499 |
+ | 0.8533 | 373 | 0.0328 |
+ | 0.8556 | 374 | 0.0211 |
+ | 0.8579 | 375 | 0.0411 |
+ | 0.8602 | 376 | 0.0648 |
+ | 0.8625 | 377 | 0.0583 |
+ | 0.8647 | 378 | 0.0483 |
+ | 0.8670 | 379 | 0.0362 |
+ | 0.8693 | 380 | 0.0616 |
+ | 0.8716 | 381 | 0.0634 |
+ | 0.8739 | 382 | 0.0542 |
+ | 0.8762 | 383 | 0.053 |
+ | 0.8785 | 384 | 0.0436 |
+ | 0.8808 | 385 | 0.0426 |
+ | 0.8830 | 386 | 0.0503 |
+ | 0.8853 | 387 | 0.0522 |
+ | 0.8876 | 388 | 0.083 |
+ | 0.8899 | 389 | 0.0317 |
+ | 0.8922 | 390 | 0.0571 |
+ | 0.8945 | 391 | 0.0464 |
+ | 0.8968 | 392 | 0.0179 |
+ | 0.8991 | 393 | 0.0389 |
+ | 0.9013 | 394 | 0.0317 |
+ | 0.9036 | 395 | 0.0605 |
+ | 0.9059 | 396 | 0.0389 |
+ | 0.9082 | 397 | 0.0407 |
+ | 0.9105 | 398 | 0.0478 |
+ | 0.9128 | 399 | 0.0304 |
+ | 0.9151 | 400 | 0.0572 |
+ | 0.9174 | 401 | 0.037 |
+ | 0.9196 | 402 | 0.062 |
+ | 0.9219 | 403 | 0.0539 |
+ | 0.9242 | 404 | 0.039 |
+ | 0.9265 | 405 | 0.0265 |
+ | 0.9288 | 406 | 0.0398 |
+ | 0.9311 | 407 | 0.0369 |
+ | 0.9334 | 408 | 0.053 |
+ | 0.9357 | 409 | 0.0503 |
+ | 0.9379 | 410 | 0.0535 |
+ | 0.9402 | 411 | 0.0645 |
+ | 0.9425 | 412 | 0.0328 |
+ | 0.9448 | 413 | 0.0438 |
+ | 0.9471 | 414 | 0.0435 |
+ | 0.9494 | 415 | 0.1018 |
+ | 0.9517 | 416 | 0.0403 |
+ | 0.9540 | 417 | 0.0577 |
+ | 0.9562 | 418 | 0.0234 |
+ | 0.9585 | 419 | 0.041 |
+ | 0.9608 | 420 | 0.0226 |
+ | 0.9631 | 421 | 0.0497 |
+ | 0.9654 | 422 | 0.0493 |
+ | 0.9677 | 423 | 0.0223 |
+ | 0.9700 | 424 | 0.0192 |
+ | 0.9723 | 425 | 0.0322 |
+ | 0.9745 | 426 | 0.0483 |
+ | 0.9768 | 427 | 0.041 |
+ | 0.9791 | 428 | 0.0628 |
+ | 0.9814 | 429 | 0.0861 |
+ | 0.9837 | 430 | 0.0645 |
+ | 0.9860 | 431 | 0.0386 |
+ | 0.9883 | 432 | 0.0378 |
+ | 0.9906 | 433 | 0.0613 |
+ | 0.9929 | 434 | 0.067 |
+ | 0.9951 | 435 | 0.049 |
+ | 0.9974 | 436 | 0.0644 |
+ | 0.9997 | 437 | 0.02 |
+ | 1.0 | 438 | 0.0001 |
+ 
+ </details>
+ 
+ ### Framework Versions
+ - Python: 3.12.3
+ - Sentence Transformers: 5.3.0
+ - Transformers: 4.56.2
+ - PyTorch: 2.10.0+cu128
+ - Accelerate: 1.13.0
+ - Datasets: 4.3.0
+ - Tokenizers: 0.22.2
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{oord2019representationlearningcontrastivepredictive,
+     title={Representation Learning with Contrastive Predictive Coding},
+     author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
+     year={2019},
+     eprint={1807.03748},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG},
+     url={https://arxiv.org/abs/1807.03748},
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "</tool_call>": 151658,
+   "<tool_call>": 151657,
+   "<|box_end|>": 151649,
+   "<|box_start|>": 151648,
+   "<|endoftext|>": 151643,
+   "<|file_sep|>": 151664,
+   "<|fim_middle|>": 151660,
+   "<|fim_pad|>": 151662,
+   "<|fim_prefix|>": 151659,
+   "<|fim_suffix|>": 151661,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644,
+   "<|image_pad|>": 151655,
+   "<|object_ref_end|>": 151647,
+   "<|object_ref_start|>": 151646,
+   "<|quad_end|>": 151651,
+   "<|quad_start|>": 151650,
+   "<|repo_name|>": 151663,
+   "<|video_pad|>": 151656,
+   "<|vision_end|>": 151653,
+   "<|vision_pad|>": 151654,
+   "<|vision_start|>": 151652
+ }
chat_template.jinja ADDED
@@ -0,0 +1,54 @@
+ {%- if tools %}
+     {{- '<|im_start|>system\n' }}
+     {%- if messages[0]['role'] == 'system' %}
+         {{- messages[0]['content'] }}
+     {%- else %}
+         {{- 'You are a helpful assistant.' }}
+     {%- endif %}
+     {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+     {%- for tool in tools %}
+         {{- "\n" }}
+         {{- tool | tojson }}
+     {%- endfor %}
+     {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+ {%- else %}
+     {%- if messages[0]['role'] == 'system' %}
+         {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+     {%- else %}
+         {{- '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}
+     {%- endif %}
+ {%- endif %}
+ {%- for message in messages %}
+     {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+         {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+     {%- elif message.role == "assistant" %}
+         {{- '<|im_start|>' + message.role }}
+         {%- if message.content %}
+             {{- '\n' + message.content }}
+         {%- endif %}
+         {%- for tool_call in message.tool_calls %}
+             {%- if tool_call.function is defined %}
+                 {%- set tool_call = tool_call.function %}
+             {%- endif %}
+             {{- '\n<tool_call>\n{"name": "' }}
+             {{- tool_call.name }}
+             {{- '", "arguments": ' }}
+             {{- tool_call.arguments | tojson }}
+             {{- '}\n</tool_call>' }}
+         {%- endfor %}
+         {{- '<|im_end|>\n' }}
+     {%- elif message.role == "tool" %}
+         {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+             {{- '<|im_start|>user' }}
+         {%- endif %}
+         {{- '\n<tool_response>\n' }}
+         {{- message.content }}
+         {{- '\n</tool_response>' }}
+         {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+             {{- '<|im_end|>\n' }}
+         {%- endif %}
+     {%- endif %}
+ {%- endfor %}
+ {%- if add_generation_prompt %}
+     {{- '<|im_start|>assistant\n' }}
+ {%- endif %}
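
Note: this is the stock Qwen ChatML template shipped alongside the tokenizer; embedding inference does not use it, but it can be exercised through the tokenizer as usual. A minimal sketch, assuming the repository is available locally or on the Hub (the id below is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path-or-repo-id")  # placeholder id
text = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    tokenize=False,
    add_generation_prompt=True,  # appends '<|im_start|>assistant\n'
)
print(text)  # ChatML-formatted string with the default system prompt
```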
config.json ADDED
@@ -0,0 +1,71 @@
+ {
+   "architectures": [
+     "Qwen3ForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 151643,
+   "torch_dtype": "bfloat16",
+   "eos_token_id": 151645,
+   "head_dim": 128,
+   "hidden_act": "silu",
+   "hidden_size": 2560,
+   "initializer_range": 0.02,
+   "intermediate_size": 9728,
+   "layer_types": [
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention",
+     "full_attention"
+   ],
+   "max_position_embeddings": 40960,
+   "max_window_layers": 36,
+   "model_name": "unsloth/Qwen3-Embedding-4B",
+   "model_type": "qwen3",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 36,
+   "num_key_value_heads": 8,
+   "pad_token_id": 151643,
+   "rms_norm_eps": 1e-06,
+   "rope_scaling": null,
+   "rope_theta": 1000000,
+   "sliding_window": null,
+   "tie_word_embeddings": true,
+   "tokenizer_class": "Qwen2TokenizerFast",
+   "unsloth_version": "2026.3.8",
+   "use_cache": true,
+   "use_sliding_window": false,
+   "vocab_size": 151665
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "model_type": "SentenceTransformer",
+   "__version__": {
+     "sentence_transformers": "5.3.0",
+     "transformers": "4.56.2",
+     "pytorch": "2.10.0+cu128"
+   },
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
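
Note: both the `query` and `document` prompts are empty strings, so encoding with a prompt name prepends nothing and queries and documents are embedded identically. A small illustration (the model id is a placeholder):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("path-or-repo-id")  # placeholder id
q = model.encode(["vein"], prompt_name="query")            # empty prompt -> no prefix added
d = model.encode(["blood vessel"], prompt_name="document") # likewise
print(model.similarity(q, d))                              # cosine, per similarity_fn_name
```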
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:592e692c4f1ff6b613f02ea0a77535be028a142e149547834eb8a87c2ddb762d
+ size 4965826464
model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:10bc48f56d7dd975b3feefc30c9e457e56a30405c1ce01d152b1744905d89069
+ size 3077765624
model.safetensors.index.json ADDED
@@ -0,0 +1,405 @@
+ {
+   "metadata": {
+     "total_size": 8043548672
+   },
+   "weight_map": {
+     "embed_tokens.weight": "model-00001-of-00002.safetensors",
+     "layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.0.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.0.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.1.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.1.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.10.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.10.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.11.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.11.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.12.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.12.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.13.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.13.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.14.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.14.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.15.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.15.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.16.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.16.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.17.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.17.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.18.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.18.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.19.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.19.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.2.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.2.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.20.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.20.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.20.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.20.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.20.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.21.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.21.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.22.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.22.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.23.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.23.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.24.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.24.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.25.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.25.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.26.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.26.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.27.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.27.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.28.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.28.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.28.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.28.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.28.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.28.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.28.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.29.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.29.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.29.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+     "layers.29.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.29.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+     "layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+     "layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+     "layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+     "layers.3.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+     "layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
267
+ "layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
268
+ "layers.3.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
269
+ "layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
270
+ "layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
271
+ "layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors",
272
+ "layers.30.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
273
+ "layers.30.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
274
+ "layers.30.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
275
+ "layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
276
+ "layers.30.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
277
+ "layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
278
+ "layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
279
+ "layers.30.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
280
+ "layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
281
+ "layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
282
+ "layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
283
+ "layers.31.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
284
+ "layers.31.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
285
+ "layers.31.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
286
+ "layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
287
+ "layers.31.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
288
+ "layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
289
+ "layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
290
+ "layers.31.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
291
+ "layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
292
+ "layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
293
+ "layers.32.input_layernorm.weight": "model-00002-of-00002.safetensors",
294
+ "layers.32.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
295
+ "layers.32.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
296
+ "layers.32.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
297
+ "layers.32.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
298
+ "layers.32.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
299
+ "layers.32.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
300
+ "layers.32.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
301
+ "layers.32.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
302
+ "layers.32.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
303
+ "layers.32.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
304
+ "layers.33.input_layernorm.weight": "model-00002-of-00002.safetensors",
305
+ "layers.33.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
306
+ "layers.33.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
307
+ "layers.33.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
308
+ "layers.33.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
309
+ "layers.33.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
310
+ "layers.33.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
311
+ "layers.33.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
312
+ "layers.33.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
313
+ "layers.33.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
314
+ "layers.33.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
315
+ "layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors",
316
+ "layers.34.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
317
+ "layers.34.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
318
+ "layers.34.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
319
+ "layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
320
+ "layers.34.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
321
+ "layers.34.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
322
+ "layers.34.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
323
+ "layers.34.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
324
+ "layers.34.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
325
+ "layers.34.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
326
+ "layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors",
327
+ "layers.35.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
328
+ "layers.35.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
329
+ "layers.35.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
330
+ "layers.35.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
331
+ "layers.35.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
332
+ "layers.35.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
333
+ "layers.35.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
334
+ "layers.35.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
335
+ "layers.35.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
336
+ "layers.35.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
337
+ "layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
338
+ "layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
339
+ "layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
340
+ "layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
341
+ "layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
342
+ "layers.4.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
343
+ "layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
344
+ "layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
345
+ "layers.4.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
346
+ "layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
347
+ "layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
348
+ "layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
349
+ "layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
350
+ "layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
351
+ "layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
352
+ "layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
353
+ "layers.5.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
354
+ "layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
355
+ "layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
356
+ "layers.5.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
357
+ "layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
358
+ "layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
359
+ "layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
360
+ "layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
361
+ "layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
362
+ "layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
363
+ "layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
364
+ "layers.6.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
365
+ "layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
366
+ "layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
367
+ "layers.6.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
368
+ "layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
369
+ "layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
370
+ "layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
371
+ "layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
372
+ "layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
373
+ "layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
374
+ "layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
375
+ "layers.7.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
376
+ "layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
377
+ "layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
378
+ "layers.7.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
379
+ "layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
380
+ "layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
381
+ "layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
382
+ "layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
383
+ "layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
384
+ "layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
385
+ "layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
386
+ "layers.8.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
387
+ "layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
388
+ "layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
389
+ "layers.8.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
390
+ "layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
391
+ "layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
392
+ "layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
393
+ "layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
394
+ "layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
395
+ "layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
396
+ "layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
397
+ "layers.9.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
398
+ "layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
399
+ "layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
400
+ "layers.9.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
401
+ "layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
402
+ "layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
403
+ "norm.weight": "model-00002-of-00002.safetensors"
404
+ }
405
+ }
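The `weight_map` above is the routing table for sharded checkpoints: each tensor name points at the shard file that stores it, so a loader only has to open the shards it actually needs. A minimal sketch of consuming such an index directly, assuming the two shard files and `model.safetensors.index.json` sit in the current directory (`transformers` and `sentence-transformers` normally do this resolution automatically):

```python
import json
from safetensors import safe_open

# Read the index that maps every tensor name to its shard file.
with open("model.safetensors.index.json") as f:
    index = json.load(f)
weight_map = index["weight_map"]

# Load one tensor by opening only the shard that contains it.
name = "layers.23.self_attn.q_proj.weight"  # any key from weight_map
with safe_open(weight_map[name], framework="pt") as shard:
    tensor = shard.get_tensor(name)
print(name, tuple(tensor.shape))
```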
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ },
+ {
+ "idx": 2,
+ "name": "2",
+ "path": "2_Normalize",
+ "type": "sentence_transformers.models.Normalize"
+ }
+ ]
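`modules.json` is what lets `SentenceTransformer` reassemble the encoder: module 0 is the transformer backbone at the repo root (`path: ""`), module 1 applies the pooling configured in `1_Pooling/config.json`, and module 2 L2-normalizes the result. A rough hand-built equivalent, assuming a hypothetical local checkpoint directory `./checkpoint`:

```python
from sentence_transformers import SentenceTransformer, models

# Module 0: transformer backbone (modules.json path "" = repo root).
transformer = models.Transformer("./checkpoint", max_seq_length=8192)

# Module 1: last-token pooling, mirroring 1_Pooling/config.json.
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),
    pooling_mode="lasttoken",
)

# Module 2: L2-normalize so cosine similarity reduces to a dot product.
normalize = models.Normalize()

model = SentenceTransformer(modules=[transformer, pooling, normalize])
embeddings = model.encode(["example sentence"])
```

In practice, passing the repo id or local path straight to `SentenceTransformer(...)` reads `modules.json` and builds this pipeline for you.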
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 8192,
+ "do_lower_case": false
+ }
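This caps encoder input at 8192 tokens (longer texts are truncated, not split) and disables lowercasing. The limit is also exposed at runtime; a small sketch, assuming the same hypothetical `./checkpoint` path:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("./checkpoint")  # hypothetical local path
print(model.max_seq_length)  # 8192, from sentence_bert_config.json

# Can be lowered to trade context length for speed and memory.
model.max_seq_length = 4096
```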
special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
+ {
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "eos_token": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
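The map designates `<|im_end|>` (the ChatML turn terminator) as EOS and reuses `<|endoftext|>` for padding, while the vision/grounding markers are registered as additional special tokens so they survive tokenization intact. A quick check, again assuming a hypothetical local checkpoint path:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./checkpoint")  # hypothetical path
print(tok.eos_token, tok.eos_token_id)  # <|im_end|>
print(tok.pad_token, tok.pad_token_id)  # <|endoftext|>
print(tok.additional_special_tokens)
```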
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:17af9bd30dbbda177eda5d8835f90e4277910bedd0011f50077acee58008d28a
+ size 11423213
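These three lines are a standard Git LFS pointer (spec version, SHA-256 object id, byte size), not the tokenizer itself; the real ~11 MB `tokenizer.json` lives in LFS storage. Downloads through `huggingface_hub` resolve the pointer transparently; a minimal sketch with a placeholder repo id:

```python
from huggingface_hub import hf_hub_download

# Placeholder repo id; hf_hub_download follows the LFS pointer and
# returns a local path to the real ~11 MB tokenizer.json.
path = hf_hub_download(repo_id="<user>/<model>", filename="tokenizer.json")
print(path)
```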
tokenizer_config.json ADDED
@@ -0,0 +1,209 @@
+ {
+ "add_bos_token": false,
+ "add_prefix_space": false,
+ "added_tokens_decoder": {
+ "151643": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151644": {
+ "content": "<|im_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151645": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151646": {
+ "content": "<|object_ref_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151647": {
+ "content": "<|object_ref_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151648": {
+ "content": "<|box_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151649": {
+ "content": "<|box_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151650": {
+ "content": "<|quad_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151651": {
+ "content": "<|quad_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151652": {
+ "content": "<|vision_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151653": {
+ "content": "<|vision_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151654": {
+ "content": "<|vision_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151655": {
+ "content": "<|image_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151656": {
+ "content": "<|video_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151657": {
+ "content": "<tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151658": {
+ "content": "</tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151659": {
+ "content": "<|fim_prefix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151660": {
+ "content": "<|fim_middle|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151661": {
+ "content": "<|fim_suffix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151662": {
+ "content": "<|fim_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151663": {
+ "content": "<|repo_name|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151664": {
+ "content": "<|file_sep|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "bos_token": null,
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|im_end|>",
+ "errors": "replace",
+ "extra_special_tokens": {},
+ "model_max_length": 131072,
+ "pad_token": "<|endoftext|>",
+ "padding_side": "left",
+ "split_special_tokens": false,
+ "tokenizer_class": "Qwen2Tokenizer",
+ "unk_token": null,
+ "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are a helpful assistant.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are a helpful assistant.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n"
+ }
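Notable settings: `tokenizer_class` is `Qwen2Tokenizer`, `padding_side` is `left` (which keeps the final content token in a fixed position, as last-token pooling requires), and the embedded `chat_template` renders ChatML with optional tool-calling. A sketch of rendering the template, assuming a hypothetical local checkpoint path:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./checkpoint")  # hypothetical path
messages = [{"role": "user", "content": "Hello!"}]
text = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(text)
# <|im_start|>system
# You are a helpful assistant.<|im_end|>
# <|im_start|>user
# Hello!<|im_end|>
# <|im_start|>assistant
```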
vocab.json ADDED
The diff for this file is too large to render. See raw diff