---
tags:
- unsloth
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:223748
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: What is the significance of the IPv6 multicast address ff02::1?
  sentences:
  - Felt board for classroom activities
  - >-
    In the provided network output, the frequent appearance of
    `ff020000000000000000000000000001` across various interfaces like `lo`,
    `eth0`, and `eth1` indicates that these interfaces are correctly configured
    for basic IPv6 operations. Every active IPv6 interface on a segment must
    listen for messages sent to `ff02::1` to participate in essential
    link-local protocols, making its presence a standard and expected entry.
  - >-
    Not all customizations are supported across all snapd image types or
    models. For example, certain customizations might be unsupported for UC20+
    or classic models, leading to errors. Additionally, if a gadget snap itself
    defines `defaults` in its `meta/gadget.yaml`, these can be overridden or
    complemented by the `Customizations` provided during the `SetupSeed` call,
    affecting system services like SSH.
- source_sentence: vein
  sentences:
  - blood vessel
  - >-
    The `hkdf.Key` function requires several inputs: the underlying hash
    function for HMAC (e.g., `sha256.New`), the master `secret` material, an
    optional `salt` value, context-specific `info`, and the desired `keyLen`
    for the output derived key. These parameters collectively guide the key
    derivation process.
  - egg-laying
- source_sentence: How are special file types determined in file status?
  sentences:
  - >-
    Integrated into the *ensure loop*, the `TaskRunner`'s `Ensure` method is
    invoked periodically to manage task execution. It's responsible for
    spawning goroutines to concurrently execute task handlers, whether for
    their primary 'do' logic or their 'undo' logic in case of failures.
    High-level system parts can also trigger its execution proactively using
    `State.EnsureBefore`.
  - >-
    File type identification within the `fileStat` population involves a
    critical step where the `fs.sys.Mode` value is masked with
    `syscall.S_IFMT`. This operation allows the function to discern whether the
    file is a block device (`S_IFBLK`), a character device (`S_IFCHR`), a named
    pipe (`S_IFIFO`), a socket (`S_IFSOCK`), or a regular file (`S_IFREG`),
    applying the appropriate `FileMode` flags.
  - Volatility acceptance
- source_sentence: mitre
  sentences:
  - ocean liner
  - >-
    It becomes necessary because, during the initial `mmap` of an output
    buffer, no code signature typically exists. After the signature is finally
    created, the kernel's cached view might not reflect this change. Therefore,
    `purgeSignatureCache` explicitly clears this cache to prevent problems
    related to stale signature information.
  - Clerical cap
- source_sentence: craniofacial
  sentences:
  - head and face structure
  - Planned destruction of structures using explosives or machinery
  - >-
    Anchor-positive pairs are fundamental to contrastive learning, serving to
    define what the model should consider as semantically similar data points,
    guiding it to learn meaningful representations.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: gpl-3.0
language:
- en
base_model:
- unsloth/Qwen3-Embedding-4B
---

# SentenceTransformer

This model was finetuned with [Unsloth](https://github.com/unslothai/unsloth).

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from `unsloth/Qwen3-Embedding-4B`. It maps sentences & paragraphs to a 2560-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
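Because the model is trained for cosine similarity and its embeddings are unit-normalised, semantic search over a corpus reduces to a dot product followed by a top-k sort. A minimal sketch of that ranking step with stand-in 2-dimensional vectors (in practice the vectors would come from this model's `encode` call; the `top_k` helper is hypothetical):

```python
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length so dot product == cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def top_k(query_emb: np.ndarray, corpus_embs: np.ndarray, k: int = 2):
    """Return the k corpus indices most similar to the query, with scores."""
    scores = corpus_embs @ query_emb          # cosine scores, shape (n_docs,)
    order = np.argsort(-scores)[:k]           # highest score first
    return [(int(i), float(scores[i])) for i in order]

# toy stand-in embeddings: three "documents" and one "query"
corpus = normalize(np.array([[1.0, 0.1], [0.0, 1.0], [0.9, 0.2]]))
query = normalize(np.array([1.0, 0.0]))
hits = top_k(query, corpus)
print(hits)  # indices 0 and 2 point in nearly the same direction as the query
```

With real embeddings the only change is that `corpus` and `query` are the 2560-dimensional outputs of `model.encode(...)`, which the `Normalize()` module already returns unit-length.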
## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 2560 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
  (1): Pooling({'word_embedding_dimension': 2560, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
```

## Evaluation Highlights

### Pre-Post Train Relevancy

![snapd-embedder-v1_query_7](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/4wkDeKjVwnpbQgS3YZtPy.png)
![snapd-embedder-v1_query_8](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/2I7aPWU8V56BnGCZt2Ol0.png)
![snapd-embedder-v1_query_9](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/Mi9SH41b1ESmNJAaUY9kC.png)
![snapd-embedder-v1_query_10](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/tJqwCmILKEkzyd3mJozfW.png)

### Pre/Post Train Spread

![snapd-embedder-v1_spread_query_7](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/hbgKvPDvRiRzruRoirCqV.png)
![snapd-embedder-v1_spread_query_8](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/X74bM7_8nAqrZzeK91fPG.png)
![snapd-embedder-v1_spread_query_9](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/uJKMhyMgb8ri-eCjfqmct.png)
![snapd-embedder-v1_spread_query_10](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/mJUxe0Ux3c830MQj4EMkR.png)

### Spread Summary

![snapd-embedder-v1_spread_summary](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/L5XWK19GIRsVq1u0vF683.png)

### Training Summary

![snapd-embedder-v1_stats](https://cdn-uploads.huggingface.co/production/uploads/67baa513894904754186d3a2/56GqD98jyVGkclIP1KLAG.png)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'craniofacial',
    'head and face structure',
    'Anchor-positive pairs are fundamental to contrastive learning, serving to define what the model should consider as semantically similar data points, guiding it to learn meaningful representations.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 2560]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7268, 0.0036],
#         [0.7268, 1.0000, 0.0179],
#         [0.0036, 0.0179, 1.0000]])
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 223,748 training samples
* Columns: anchor and positive
* Approximate statistics based on the first 1000 samples:

  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details |        |          |
* Samples:

| anchor | positive |
|:-------|:---------|
| groupthink | Psychological tendency for group conformity |
| customs and border protection | DHS component enforcing trade and immigration laws |
| What is the meaning and purpose of the `//go:noescape` directive in Go functions? | The `//go:noescape` comment is a hint to the Go compiler. It asserts that none of the pointer parameters of the decorated function will escape the function's stack frame. This is primarily used for performance tuning in low-level code, ensuring that objects pointed to by function arguments are not allocated on the heap, thus avoiding garbage collection cycles. |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false,
      "directions": [
          "query_to_doc"
      ],
      "partition_mode": "joint",
      "hardness_mode": null,
      "hardness_strength": 0.0
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 64
- `gradient_accumulation_steps`: 8
- `learning_rate`: 3e-05
- `num_train_epochs`: 1
- `lr_scheduler_type`: constant_with_warmup
- `warmup_ratio`: 0.03
- `bf16`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 8
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: constant_with_warmup
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.03
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>
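The `MultipleNegativesRankingLoss` configured above treats, for each anchor in a batch, its paired positive as the correct class and every other in-batch positive as a negative: scaled cosine similarities (scale 20.0, `cos_sim`) go through a softmax cross-entropy whose target is the diagonal. A minimal NumPy sketch of that objective, under those assumptions (the real implementation lives in sentence-transformers and runs on the model's embeddings):

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch-negatives loss: row i's positive is column i; other columns are negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sim = scale * (a @ p.T)                       # (batch, batch) scaled cosine similarities
    # numerically stable softmax cross-entropy with the diagonal as the target class
    m = sim.max(axis=1, keepdims=True)
    logsumexp = np.log(np.exp(sim - m).sum(axis=1)) + m[:, 0]
    return float(np.mean(logsumexp - np.diag(sim)))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
positives = anchors + 0.05 * rng.normal(size=(4, 8))  # near-duplicate positives
print(mnr_loss(anchors, positives))                   # small: each diagonal dominates its row
print(mnr_loss(anchors, anchors[::-1]))               # large: targets point at mismatched rows
```

The `batch_sampler: no_duplicates` setting above complements this loss: it keeps duplicate positives out of a batch so no "negative" column is accidentally a correct answer.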
### Training Logs
<details><summary>Click to expand</summary>

| Epoch | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.0023 | 1 | 0.5184 |
| 0.0046 | 2 | 0.5683 |
| 0.0069 | 3 | 0.5821 |
| 0.0092 | 4 | 0.4948 |
| 0.0114 | 5 | 0.4001 |
| 0.0137 | 6 | 0.3097 |
| 0.0160 | 7 | 0.257 |
| 0.0183 | 8 | 0.2752 |
| 0.0206 | 9 | 0.2311 |
| 0.0229 | 10 | 0.1433 |
| 0.0252 | 11 | 0.2507 |
| 0.0275 | 12 | 0.1944 |
| 0.0297 | 13 | 0.2052 |
| 0.0320 | 14 | 0.1044 |
| 0.0343 | 15 | 0.2027 |
| 0.0366 | 16 | 0.1969 |
| 0.0389 | 17 | 0.1833 |
| 0.0412 | 18 | 0.1641 |
| 0.0435 | 19 | 0.1629 |
| 0.0458 | 20 | 0.1702 |
| 0.0480 | 21 | 0.1855 |
| 0.0503 | 22 | 0.1697 |
| 0.0526 | 23 | 0.116 |
| 0.0549 | 24 | 0.1373 |
| 0.0572 | 25 | 0.1323 |
| 0.0595 | 26 | 0.1349 |
| 0.0618 | 27 | 0.1199 |
| 0.0641 | 28 | 0.1353 |
| 0.0663 | 29 | 0.143 |
| 0.0686 | 30 | 0.1305 |
| 0.0709 | 31 | 0.1088 |
| 0.0732 | 32 | 0.0908 |
| 0.0755 | 33 | 0.1502 |
| 0.0778 | 34 | 0.1139 |
| 0.0801 | 35 | 0.1311 |
| 0.0824 | 36 | 0.1291 |
| 0.0846 | 37 | 0.0977 |
| 0.0869 | 38 | 0.0962 |
| 0.0892 | 39 | 0.1166 |
| 0.0915 | 40 | 0.0965 |
| 0.0938 | 41 | 0.1242 |
| 0.0961 | 42 | 0.0705 |
| 0.0984 | 43 | 0.0813 |
| 0.1007 | 44 | 0.1545 |
| 0.1029 | 45 | 0.0868 |
| 0.1052 | 46 | 0.0987 |
| 0.1075 | 47 | 0.0938 |
| 0.1098 | 48 | 0.1086 |
| 0.1121 | 49 | 0.0982 |
| 0.1144 | 50 | 0.0817 |
| 0.1167 | 51 | 0.0527 |
| 0.1190 | 52 | 0.0986 |
| 0.1212 | 53 | 0.098 |
| 0.1235 | 54 | 0.1074 |
| 0.1258 | 55 | 0.1396 |
| 0.1281 | 56 | 0.1101 |
| 0.1304 | 57 | 0.0829 |
| 0.1327 | 58 | 0.1261 |
| 0.1350 | 59 | 0.048 |
| 0.1373 | 60 | 0.1215 |
| 0.1395 | 61 | 0.0981 |
| 0.1418 | 62 | 0.0739 |
| 0.1441 | 63 | 0.0525 |
| 0.1464 | 64 | 0.0757 |
| 0.1487 | 65 | 0.0543 |
| 0.1510 | 66 | 0.0878 |
| 0.1533 | 67 | 0.0791 |
| 0.1556 | 68 | 0.0816 |
| 0.1578 | 69 | 0.0999 |
| 0.1601 | 70 | 0.086 |
| 0.1624 | 71 | 0.0775 |
| 0.1647 | 72 | 0.1048 |
| 0.1670 | 73 | 0.0552 |
| 0.1693 | 74 | 0.0619 |
| 0.1716 | 75 | 0.0667 |
| 0.1739 | 76 | 0.0787 |
| 0.1762 | 77 | 0.1022 |
| 0.1784 | 78 | 0.0937 |
| 0.1807 | 79 | 0.0751 |
| 0.1830 | 80 | 0.0642 |
| 0.1853 | 81 | 0.0508 |
| 0.1876 | 82 | 0.1169 |
| 0.1899 | 83 | 0.09 |
| 0.1922 | 84 | 0.0725 |
| 0.1945 | 85 | 0.0476 |
| 0.1967 | 86 | 0.0737 |
| 0.1990 | 87 | 0.0968 |
| 0.2013 | 88 | 0.0988 |
| 0.2036 | 89 | 0.0575 |
| 0.2059 | 90 | 0.0629 |
| 0.2082 | 91 | 0.0627 |
| 0.2105 | 92 | 0.0565 |
| 0.2128 | 93 | 0.0696 |
| 0.2150 | 94 | 0.0413 |
| 0.2173 | 95 | 0.0625 |
| 0.2196 | 96 | 0.0593 |
| 0.2219 | 97 | 0.0511 |
| 0.2242 | 98 | 0.1168 |
| 0.2265 | 99 | 0.0601 |
| 0.2288 | 100 | 0.0919 |
| 0.2311 | 101 | 0.0471 |
| 0.2333 | 102 | 0.0701 |
| 0.2356 | 103 | 0.1032 |
| 0.2379 | 104 | 0.0823 |
| 0.2402 | 105 | 0.0825 |
| 0.2425 | 106 | 0.0626 |
| 0.2448 | 107 | 0.0821 |
| 0.2471 | 108 | 0.0532 |
| 0.2494 | 109 | 0.1171 |
| 0.2516 | 110 | 0.0814 |
| 0.2539 | 111 | 0.1167 |
| 0.2562 | 112 | 0.0918 |
| 0.2585 | 113 | 0.0704 |
| 0.2608 | 114 | 0.0726 |
| 0.2631 | 115 | 0.0522 |
| 0.2654 | 116 | 0.0628 |
| 0.2677 | 117 | 0.0716 |
| 0.2699 | 118 | 0.0676 |
| 0.2722 | 119 | 0.0616 |
| 0.2745 | 120 | 0.0505 |
| 0.2768 | 121 | 0.0653 |
| 0.2791 | 122 | 0.051 |
| 0.2814 | 123 | 0.0888 |
| 0.2837 | 124 | 0.1061 |
| 0.2860 | 125 | 0.104 |
| 0.2882 | 126 | 0.095 |
| 0.2905 | 127 | 0.0715 |
| 0.2928 | 128 | 0.0766 |
| 0.2951 | 129 | 0.076 |
| 0.2974 | 130 | 0.1154 |
| 0.2997 | 131 | 0.0463 |
| 0.3020 | 132 | 0.0596 |
| 0.3043 | 133 | 0.0705 |
| 0.3065 | 134 | 0.0654 |
| 0.3088 | 135 | 0.0802 |
| 0.3111 | 136 | 0.0882 |
| 0.3134 | 137 | 0.0872 |
| 0.3157 | 138 | 0.0853 |
| 0.3180 | 139 | 0.0661 |
| 0.3203 | 140 | 0.0633 |
| 0.3226 | 141 | 0.0784 |
| 0.3248 | 142 | 0.0832 |
| 0.3271 | 143 | 0.0799 |
| 0.3294 | 144 | 0.0954 |
| 0.3317 | 145 | 0.0744 |
| 0.3340 | 146 | 0.0559 |
| 0.3363 | 147 | 0.0892 |
| 0.3386 | 148 | 0.0424 |
| 0.3409 | 149 | 0.0742 |
| 0.3432 | 150 | 0.1025 |
| 0.3454 | 151 | 0.0814 |
| 0.3477 | 152 | 0.051 |
| 0.3500 | 153 | 0.1313 |
| 0.3523 | 154 | 0.0645 |
| 0.3546 | 155 | 0.1006 |
| 0.3569 | 156 | 0.0524 |
| 0.3592 | 157 | 0.0635 |
| 0.3615 | 158 | 0.0467 |
| 0.3637 | 159 | 0.0741 |
| 0.3660 | 160 | 0.0593 |
| 0.3683 | 161 | 0.0698 |
| 0.3706 | 162 | 0.0835 |
| 0.3729 | 163 | 0.0715 |
| 0.3752 | 164 | 0.0628 |
| 0.3775 | 165 | 0.0772 |
| 0.3798 | 166 | 0.1167 |
| 0.3820 | 167 | 0.0981 |
| 0.3843 | 168 | 0.0595 |
| 0.3866 | 169 | 0.041 |
| 0.3889 | 170 | 0.0728 |
| 0.3912 | 171 | 0.0937 |
| 0.3935 | 172 | 0.0757 |
| 0.3958 | 173 | 0.0603 |
| 0.3981 | 174 | 0.0542 |
| 0.4003 | 175 | 0.0701 |
| 0.4026 | 176 | 0.0372 |
| 0.4049 | 177 | 0.125 |
| 0.4072 | 178 | 0.0545 |
| 0.4095 | 179 | 0.0476 |
| 0.4118 | 180 | 0.0516 |
| 0.4141 | 181 | 0.1243 |
| 0.4164 | 182 | 0.0599 |
| 0.4186 | 183 | 0.1026 |
| 0.4209 | 184 | 0.077 |
| 0.4232 | 185 | 0.0732 |
| 0.4255 | 186 | 0.0798 |
| 0.4278 | 187 | 0.0538 |
| 0.4301 | 188 | 0.0679 |
| 0.4324 | 189 | 0.0759 |
| 0.4347 | 190 | 0.0761 |
| 0.4369 | 191 | 0.0557 |
| 0.4392 | 192 | 0.0534 |
| 0.4415 | 193 | 0.0747 |
| 0.4438 | 194 | 0.0672 |
| 0.4461 | 195 | 0.0376 |
| 0.4484 | 196 | 0.0466 |
| 0.4507 | 197 | 0.0783 |
| 0.4530 | 198 | 0.0864 |
| 0.4552 | 199 | 0.0423 |
| 0.4575 | 200 | 0.0708 |
| 0.4598 | 201 | 0.0429 |
| 0.4621 | 202 | 0.0718 |
| 0.4644 | 203 | 0.0802 |
| 0.4667 | 204 | 0.073 |
| 0.4690 | 205 | 0.0628 |
| 0.4713 | 206 | 0.055 |
| 0.4735 | 207 | 0.0468 |
| 0.4758 | 208 | 0.0536 |
| 0.4781 | 209 | 0.0429 |
| 0.4804 | 210 | 0.0388 |
| 0.4827 | 211 | 0.0962 |
| 0.4850 | 212 | 0.0475 |
| 0.4873 | 213 | 0.0589 |
| 0.4896 | 214 | 0.0606 |
| 0.4919 | 215 | 0.0512 |
| 0.4941 | 216 | 0.0836 |
| 0.4964 | 217 | 0.0659 |
| 0.4987 | 218 | 0.0924 |
| 0.5010 | 219 | 0.0711 |
| 0.5033 | 220 | 0.0676 |
| 0.5056 | 221 | 0.0393 |
| 0.5079 | 222 | 0.0668 |
| 0.5102 | 223 | 0.0511 |
| 0.5124 | 224 | 0.0575 |
| 0.5147 | 225 | 0.0594 |
| 0.5170 | 226 | 0.126 |
| 0.5193 | 227 | 0.0787 |
| 0.5216 | 228 | 0.0509 |
| 0.5239 | 229 | 0.0684 |
| 0.5262 | 230 | 0.0792 |
| 0.5285 | 231 | 0.0501 |
| 0.5307 | 232 | 0.0988 |
| 0.5330 | 233 | 0.0414 |
| 0.5353 | 234 | 0.0596 |
| 0.5376 | 235 | 0.0607 |
| 0.5399 | 236 | 0.0556 |
| 0.5422 | 237 | 0.0578 |
| 0.5445 | 238 | 0.0238 |
| 0.5468 | 239 | 0.0509 |
| 0.5490 | 240 | 0.0431 |
| 0.5513 | 241 | 0.0377 |
| 0.5536 | 242 | 0.0814 |
| 0.5559 | 243 | 0.0779 |
| 0.5582 | 244 | 0.0574 |
| 0.5605 | 245 | 0.0681 |
| 0.5628 | 246 | 0.0513 |
| 0.5651 | 247 | 0.0573 |
| 0.5673 | 248 | 0.0758 |
| 0.5696 | 249 | 0.0442 |
| 0.5719 | 250 | 0.0458 |
| 0.5742 | 251 | 0.0853 |
| 0.5765 | 252 | 0.0825 |
| 0.5788 | 253 | 0.065 |
| 0.5811 | 254 | 0.0429 |
| 0.5834 | 255 | 0.0438 |
| 0.5856 | 256 | 0.1028 |
| 0.5879 | 257 | 0.04 |
| 0.5902 | 258 | 0.0406 |
| 0.5925 | 259 | 0.0465 |
| 0.5948 | 260 | 0.068 |
| 0.5971 | 261 | 0.0532 |
| 0.5994 | 262 | 0.0503 |
| 0.6017 | 263 | 0.0421 |
| 0.6039 | 264 | 0.0663 |
| 0.6062 | 265 | 0.0621 |
| 0.6085 | 266 | 0.0845 |
| 0.6108 | 267 | 0.049 |
| 0.6131 | 268 | 0.0503 |
| 0.6154 | 269 | 0.0392 |
| 0.6177 | 270 | 0.0505 |
| 0.6200 | 271 | 0.0594 |
| 0.6222 | 272 | 0.0573 |
| 0.6245 | 273 | 0.0383 |
| 0.6268 | 274 | 0.0568 |
| 0.6291 | 275 | 0.0386 |
| 0.6314 | 276 | 0.0573 |
| 0.6337 | 277 | 0.0397 |
| 0.6360 | 278 | 0.0459 |
| 0.6383 | 279 | 0.0624 |
| 0.6405 | 280 | 0.0706 |
| 0.6428 | 281 | 0.0743 |
| 0.6451 | 282 | 0.0405 |
| 0.6474 | 283 | 0.0761 |
| 0.6497 | 284 | 0.0583 |
| 0.6520 | 285 | 0.0444 |
| 0.6543 | 286 | 0.0305 |
| 0.6566 | 287 | 0.0716 |
| 0.6589 | 288 | 0.041 |
| 0.6611 | 289 | 0.043 |
| 0.6634 | 290 | 0.0574 |
| 0.6657 | 291 | 0.0479 |
| 0.6680 | 292 | 0.062 |
| 0.6703 | 293 | 0.0441 |
| 0.6726 | 294 | 0.0657 |
| 0.6749 | 295 | 0.0515 |
| 0.6772 | 296 | 0.0718 |
| 0.6794 | 297 | 0.0839 |
| 0.6817 | 298 | 0.0751 |
| 0.6840 | 299 | 0.073 |
| 0.6863 | 300 | 0.0656 |
| 0.6886 | 301 | 0.0717 |
| 0.6909 | 302 | 0.0457 |
| 0.6932 | 303 | 0.0761 |
| 0.6955 | 304 | 0.0557 |
| 0.6977 | 305 | 0.0646 |
| 0.7000 | 306 | 0.0688 |
| 0.7023 | 307 | 0.0396 |
| 0.7046 | 308 | 0.0444 |
| 0.7069 | 309 | 0.0627 |
| 0.7092 | 310 | 0.0594 |
| 0.7115 | 311 | 0.0496 |
| 0.7138 | 312 | 0.0406 |
| 0.7160 | 313 | 0.0513 |
| 0.7183 | 314 | 0.0483 |
| 0.7206 | 315 | 0.0527 |
| 0.7229 | 316 | 0.0646 |
| 0.7252 | 317 | 0.0351 |
| 0.7275 | 318 | 0.0432 |
| 0.7298 | 319 | 0.06 |
| 0.7321 | 320 | 0.0487 |
| 0.7343 | 321 | 0.0398 |
| 0.7366 | 322 | 0.0279 |
| 0.7389 | 323 | 0.0594 |
| 0.7412 | 324 | 0.0808 |
| 0.7435 | 325 | 0.0461 |
| 0.7458 | 326 | 0.0452 |
| 0.7481 | 327 | 0.0887 |
| 0.7504 | 328 | 0.057 |
| 0.7526 | 329 | 0.082 |
| 0.7549 | 330 | 0.0693 |
| 0.7572 | 331 | 0.0245 |
| 0.7595 | 332 | 0.0476 |
| 0.7618 | 333 | 0.051 |
| 0.7641 | 334 | 0.0539 |
| 0.7664 | 335 | 0.0325 |
| 0.7687 | 336 | 0.0431 |
| 0.7709 | 337 | 0.0534 |
| 0.7732 | 338 | 0.0346 |
| 0.7755 | 339 | 0.0577 |
| 0.7778 | 340 | 0.086 |
| 0.7801 | 341 | 0.0705 |
| 0.7824 | 342 | 0.0412 |
| 0.7847 | 343 | 0.0426 |
| 0.7870 | 344 | 0.0829 |
| 0.7892 | 345 | 0.0767 |
| 0.7915 | 346 | 0.0702 |
| 0.7938 | 347 | 0.0662 |
| 0.7961 | 348 | 0.0436 |
| 0.7984 | 349 | 0.0292 |
| 0.8007 | 350 | 0.0586 |
| 0.8030 | 351 | 0.0416 |
| 0.8053 | 352 | 0.0874 |
| 0.8075 | 353 | 0.0378 |
| 0.8098 | 354 | 0.036 |
| 0.8121 | 355 | 0.0426 |
| 0.8144 | 356 | 0.0375 |
| 0.8167 | 357 | 0.0296 |
| 0.8190 | 358 | 0.0535 |
| 0.8213 | 359 | 0.0654 |
| 0.8236 | 360 | 0.0756 |
| 0.8259 | 361 | 0.0591 |
| 0.8281 | 362 | 0.0603 |
| 0.8304 | 363 | 0.0664 |
| 0.8327 | 364 | 0.0403 |
| 0.8350 | 365 | 0.0418 |
| 0.8373 | 366 | 0.047 |
| 0.8396 | 367 | 0.077 |
| 0.8419 | 368 | 0.0597 |
| 0.8442 | 369 | 0.0683 |
| 0.8464 | 370 | 0.0557 |
| 0.8487 | 371 | 0.0487 |
| 0.8510 | 372 | 0.0499 |
| 0.8533 | 373 | 0.0328 |
| 0.8556 | 374 | 0.0211 |
| 0.8579 | 375 | 0.0411 |
| 0.8602 | 376 | 0.0648 |
| 0.8625 | 377 | 0.0583 |
| 0.8647 | 378 | 0.0483 |
| 0.8670 | 379 | 0.0362 |
| 0.8693 | 380 | 0.0616 |
| 0.8716 | 381 | 0.0634 |
| 0.8739 | 382 | 0.0542 |
| 0.8762 | 383 | 0.053 |
| 0.8785 | 384 | 0.0436 |
| 0.8808 | 385 | 0.0426 |
| 0.8830 | 386 | 0.0503 |
| 0.8853 | 387 | 0.0522 |
| 0.8876 | 388 | 0.083 |
| 0.8899 | 389 | 0.0317 |
| 0.8922 | 390 | 0.0571 |
| 0.8945 | 391 | 0.0464 |
| 0.8968 | 392 | 0.0179 |
| 0.8991 | 393 | 0.0389 |
| 0.9013 | 394 | 0.0317 |
| 0.9036 | 395 | 0.0605 |
| 0.9059 | 396 | 0.0389 |
| 0.9082 | 397 | 0.0407 |
| 0.9105 | 398 | 0.0478 |
| 0.9128 | 399 | 0.0304 |
| 0.9151 | 400 | 0.0572 |
| 0.9174 | 401 | 0.037 |
| 0.9196 | 402 | 0.062 |
| 0.9219 | 403 | 0.0539 |
| 0.9242 | 404 | 0.039 |
| 0.9265 | 405 | 0.0265 |
| 0.9288 | 406 | 0.0398 |
| 0.9311 | 407 | 0.0369 |
| 0.9334 | 408 | 0.053 |
| 0.9357 | 409 | 0.0503 |
| 0.9379 | 410 | 0.0535 |
| 0.9402 | 411 | 0.0645 |
| 0.9425 | 412 | 0.0328 |
| 0.9448 | 413 | 0.0438 |
| 0.9471 | 414 | 0.0435 |
| 0.9494 | 415 | 0.1018 |
| 0.9517 | 416 | 0.0403 |
| 0.9540 | 417 | 0.0577 |
| 0.9562 | 418 | 0.0234 |
| 0.9585 | 419 | 0.041 |
| 0.9608 | 420 | 0.0226 |
| 0.9631 | 421 | 0.0497 |
| 0.9654 | 422 | 0.0493 |
| 0.9677 | 423 | 0.0223 |
| 0.9700 | 424 | 0.0192 |
| 0.9723 | 425 | 0.0322 |
| 0.9745 | 426 | 0.0483 |
| 0.9768 | 427 | 0.041 |
| 0.9791 | 428 | 0.0628 |
| 0.9814 | 429 | 0.0861 |
| 0.9837 | 430 | 0.0645 |
| 0.9860 | 431 | 0.0386 |
| 0.9883 | 432 | 0.0378 |
| 0.9906 | 433 | 0.0613 |
| 0.9929 | 434 | 0.067 |
| 0.9951 | 435 | 0.049 |
| 0.9974 | 436 | 0.0644 |
| 0.9997 | 437 | 0.02 |
| 1.0 | 438 | 0.0001 |

</details>
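Per-step losses at this effective batch size (64 × 8 accumulation steps) are noisy, so the downward trend in the log above is easier to see through a short moving average. A throwaway sketch using the first few loss values from the table (any window size works):

```python
def moving_average(losses, k=3):
    """Trailing-window average to smooth noisy per-step training losses."""
    return [sum(losses[i:i + k]) / k for i in range(len(losses) - k + 1)]

# first five training-loss values from the log above (steps 1-5)
first_steps = [0.5184, 0.5683, 0.5821, 0.4948, 0.4001]
smoothed = moving_average(first_steps)
print(smoothed)  # three window averages; the last is the lowest
```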
### Framework Versions

- Python: 3.12.3
- Sentence Transformers: 5.3.0
- Transformers: 4.56.2
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.3.0
- Tokenizers: 0.22.2

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{oord2019representationlearningcontrastivepredictive,
    title={Representation Learning with Contrastive Predictive Coding},
    author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
    year={2019},
    eprint={1807.03748},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/1807.03748},
}
```