---
tags:
- unsloth
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:223748
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: What is the significance of the IPv6 multicast address ff02::1?
sentences:
- Felt board for classroom activities
- >-
In the provided network output, the frequent appearance of
`ff020000000000000000000000000001` across various interfaces like `lo`,
`eth0`, and `eth1` indicates that these interfaces are correctly
configured for basic IPv6 operations. Every active IPv6 interface on a
segment must listen for messages sent to `ff02::1` to participate in
essential link-local protocols, making its presence a standard and
expected entry.
- >-
Not all customizations are supported across all snapd image types or
models. For example, certain customizations might be unsupported for
UC20+ or classic models, leading to errors. Additionally, if a gadget
snap itself defines `defaults` in its `meta/gadget.yaml`, these can be
overridden or complemented by the `Customizations` provided during the
`SetupSeed` call, affecting system services like SSH.
- source_sentence: vein
sentences:
- blood vessel
- >-
The `hkdf.Key` function requires several inputs: the underlying hash
function for HMAC (e.g., `sha256.New`), the master `secret` material, an
optional `salt` value, context-specific `info`, and the desired `keyLen`
for the output derived key. These parameters collectively guide the key
derivation process.
- egg-laying
- source_sentence: How are special file types determined in file status?
sentences:
- >-
Integrated into the *ensure loop*, the `TaskRunner`'s `Ensure` method is
invoked periodically to manage task execution. It's responsible for
spawning goroutines to concurrently execute task handlers, whether for
their primary 'do' logic or their 'undo' logic in case of failures.
High-level system parts can also trigger its execution proactively using
`State.EnsureBefore`.
- >-
File type identification within the `fileStat` population involves a
critical step where the `fs.sys.Mode` value is masked with
`syscall.S_IFMT`. This operation allows the function to discern whether
the file is a block device (`S_IFBLK`), a character device (`S_IFCHR`),
a named pipe (`S_IFIFO`), a socket (`S_IFSOCK`), or a regular file
(`S_IFREG`), applying the appropriate `FileMode` flags.
- Volatility acceptance
- source_sentence: mitre
sentences:
- ocean liner
- >-
It becomes necessary because, during the initial `mmap` of an output
buffer, no code signature typically exists. After the signature is
finally created, the kernel's cached view might not reflect this change.
Therefore, `purgeSignatureCache` explicitly clears this cache to prevent
problems related to stale signature information.
- Clerical cap
- source_sentence: craniofacial
sentences:
- head and face structure
- Planned destruction of structures using explosives or machinery
- >-
Anchor-positive pairs are fundamental to contrastive learning, serving
to define what the model should consider as semantically similar data
points, guiding it to learn meaningful representations.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: gpl-3.0
language:
- en
base_model:
- unsloth/Qwen3-Embedding-4B
---

# SentenceTransformer

This model was finetuned with Unsloth.

This is a sentence-transformers model finetuned from unsloth/Qwen3-Embedding-4B. It maps sentences and paragraphs to a 2560-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description

- Model Type: Sentence Transformer
- Base Model: unsloth/Qwen3-Embedding-4B
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 2560 dimensions
- Similarity Function: Cosine Similarity
### Model Sources

- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
  (1): Pooling({'word_embedding_dimension': 2560, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
```
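The `Pooling` module above uses `pooling_mode_lasttoken`: the sentence embedding is the hidden state of the final attended (non-padding) token, which `Normalize()` then scales to unit length. A minimal plain-Python sketch of that pooling logic, with toy token embeddings standing in for the model's real 2560-dimensional hidden states:

```python
import math

def last_token_pool(token_embeddings, attention_mask):
    """Pick the embedding of the last attended (non-padding) token."""
    last_idx = max(i for i, m in enumerate(attention_mask) if m == 1)
    return token_embeddings[last_idx]

def l2_normalize(vec):
    """Scale a vector to unit length, as the Normalize() module does."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Toy sequence: three attended tokens followed by one padding token.
tokens = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

pooled = last_token_pool(tokens, mask)  # embedding of the third token
unit = l2_normalize(pooled)             # unit-length sentence embedding
print(unit)
```

Because the output is unit-length, downstream cosine similarity reduces to a dot product.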
## Evaluation Highlights

### Pre-Post Train Relevancy

### Pre/Post Train Spread

### Spread Summary

### Training Summary
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference:

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'craniofacial',
    'head and face structure',
    'Anchor-positive pairs are fundamental to contrastive learning, serving to define what the model should consider as semantically similar data points, guiding it to learn meaningful representations.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 2560]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7268, 0.0036],
#         [0.7268, 1.0000, 0.0179],
#         [0.0036, 0.0179, 1.0000]])
```
## Training Details

### Training Dataset

#### Unnamed Dataset

- Size: 223,748 training samples
- Columns: `anchor` and `positive`
- Approximate statistics based on the first 1000 samples:

| | anchor | positive |
|:---|:---|:---|
| type | string | string |
| details | min: 2 tokens, mean: 8.95 tokens, max: 33 tokens | min: 2 tokens, mean: 38.48 tokens, max: 124 tokens |

- Samples:

| anchor | positive |
|:---|:---|
| groupthink | Psychological tendency for group conformity |
| customs and border protection | DHS component enforcing trade and immigration laws |
| What is the meaning and purpose of the `//go:noescape` directive in Go functions? | The `//go:noescape` comment is a hint to the Go compiler. It asserts that none of the pointer parameters of the decorated function will escape the function's stack frame. This is primarily used for performance tuning in low-level code, ensuring that objects pointed to by function arguments are not allocated on the heap, thus avoiding garbage collection cycles. |

- Loss: `MultipleNegativesRankingLoss` with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
```
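MultipleNegativesRankingLoss treats every other in-batch positive as a negative for a given anchor: row *i* of the scaled similarity matrix is pushed, via softmax cross-entropy, toward its own positive at column *i*. A pure-Python sketch of that objective, using the `scale=20.0` from the parameters above and toy similarity values rather than real model scores:

```python
import math

def mnrl_loss(sim, scale=20.0):
    """In-batch MultipleNegativesRankingLoss: for each anchor i, a softmax
    cross-entropy over its similarities to all in-batch positives, with
    the true pair (i, i) as the target class."""
    n = len(sim)
    total = 0.0
    for i in range(n):
        logits = [scale * s for s in sim[i]]
        log_z = math.log(sum(math.exp(x) for x in logits))
        total += log_z - logits[i]  # -log softmax prob of the true pair
    return total / n

well_separated = [[1.0, 0.0], [0.0, 1.0]]  # positives clearly ranked first
confusable = [[1.0, 0.9], [0.9, 1.0]]      # negatives nearly as similar
print(mnrl_loss(well_separated))  # near zero
print(mnrl_loss(confusable))      # noticeably larger
```

The large `scale` sharpens the softmax, so even small similarity gaps between the true pair and in-batch negatives translate into a near-zero loss.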
### Training Hyperparameters

#### Non-Default Hyperparameters
- `per_device_train_batch_size`: 64
- `gradient_accumulation_steps`: 8
- `learning_rate`: 3e-05
- `num_train_epochs`: 1
- `lr_scheduler_type`: constant_with_warmup
- `warmup_ratio`: 0.03
- `bf16`: True
- `batch_sampler`: no_duplicates
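These settings compose: assuming a single training device (the device count is not stated on this card), `per_device_train_batch_size: 64` with `gradient_accumulation_steps: 8` yields 512 samples per optimizer step, which is consistent with the 438 steps logged for one epoch over 223,748 samples. A quick back-of-the-envelope check:

```python
import math

# Hyperparameters from the card; single device is an assumption.
dataset_size = 223_748
per_device_batch = 64
grad_accum = 8
warmup_ratio = 0.03

effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(dataset_size / effective_batch)
warmup_steps = math.ceil(warmup_ratio * steps_per_epoch)

print(effective_batch)   # 512
print(steps_per_epoch)   # 438, matching the final step in the training log
print(warmup_steps)      # roughly 14 warmup steps for constant_with_warmup
```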
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 8
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: constant_with_warmup
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.03
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>
### Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 0.0023 | 1 | 0.5184 |
| 0.0046 | 2 | 0.5683 |
| 0.0069 | 3 | 0.5821 |
| 0.0092 | 4 | 0.4948 |
| 0.0114 | 5 | 0.4001 |
| 0.0137 | 6 | 0.3097 |
| 0.0160 | 7 | 0.257 |
| 0.0183 | 8 | 0.2752 |
| 0.0206 | 9 | 0.2311 |
| 0.0229 | 10 | 0.1433 |
| 0.0252 | 11 | 0.2507 |
| 0.0275 | 12 | 0.1944 |
| 0.0297 | 13 | 0.2052 |
| 0.0320 | 14 | 0.1044 |
| 0.0343 | 15 | 0.2027 |
| 0.0366 | 16 | 0.1969 |
| 0.0389 | 17 | 0.1833 |
| 0.0412 | 18 | 0.1641 |
| 0.0435 | 19 | 0.1629 |
| 0.0458 | 20 | 0.1702 |
| 0.0480 | 21 | 0.1855 |
| 0.0503 | 22 | 0.1697 |
| 0.0526 | 23 | 0.116 |
| 0.0549 | 24 | 0.1373 |
| 0.0572 | 25 | 0.1323 |
| 0.0595 | 26 | 0.1349 |
| 0.0618 | 27 | 0.1199 |
| 0.0641 | 28 | 0.1353 |
| 0.0663 | 29 | 0.143 |
| 0.0686 | 30 | 0.1305 |
| 0.0709 | 31 | 0.1088 |
| 0.0732 | 32 | 0.0908 |
| 0.0755 | 33 | 0.1502 |
| 0.0778 | 34 | 0.1139 |
| 0.0801 | 35 | 0.1311 |
| 0.0824 | 36 | 0.1291 |
| 0.0846 | 37 | 0.0977 |
| 0.0869 | 38 | 0.0962 |
| 0.0892 | 39 | 0.1166 |
| 0.0915 | 40 | 0.0965 |
| 0.0938 | 41 | 0.1242 |
| 0.0961 | 42 | 0.0705 |
| 0.0984 | 43 | 0.0813 |
| 0.1007 | 44 | 0.1545 |
| 0.1029 | 45 | 0.0868 |
| 0.1052 | 46 | 0.0987 |
| 0.1075 | 47 | 0.0938 |
| 0.1098 | 48 | 0.1086 |
| 0.1121 | 49 | 0.0982 |
| 0.1144 | 50 | 0.0817 |
| 0.1167 | 51 | 0.0527 |
| 0.1190 | 52 | 0.0986 |
| 0.1212 | 53 | 0.098 |
| 0.1235 | 54 | 0.1074 |
| 0.1258 | 55 | 0.1396 |
| 0.1281 | 56 | 0.1101 |
| 0.1304 | 57 | 0.0829 |
| 0.1327 | 58 | 0.1261 |
| 0.1350 | 59 | 0.048 |
| 0.1373 | 60 | 0.1215 |
| 0.1395 | 61 | 0.0981 |
| 0.1418 | 62 | 0.0739 |
| 0.1441 | 63 | 0.0525 |
| 0.1464 | 64 | 0.0757 |
| 0.1487 | 65 | 0.0543 |
| 0.1510 | 66 | 0.0878 |
| 0.1533 | 67 | 0.0791 |
| 0.1556 | 68 | 0.0816 |
| 0.1578 | 69 | 0.0999 |
| 0.1601 | 70 | 0.086 |
| 0.1624 | 71 | 0.0775 |
| 0.1647 | 72 | 0.1048 |
| 0.1670 | 73 | 0.0552 |
| 0.1693 | 74 | 0.0619 |
| 0.1716 | 75 | 0.0667 |
| 0.1739 | 76 | 0.0787 |
| 0.1762 | 77 | 0.1022 |
| 0.1784 | 78 | 0.0937 |
| 0.1807 | 79 | 0.0751 |
| 0.1830 | 80 | 0.0642 |
| 0.1853 | 81 | 0.0508 |
| 0.1876 | 82 | 0.1169 |
| 0.1899 | 83 | 0.09 |
| 0.1922 | 84 | 0.0725 |
| 0.1945 | 85 | 0.0476 |
| 0.1967 | 86 | 0.0737 |
| 0.1990 | 87 | 0.0968 |
| 0.2013 | 88 | 0.0988 |
| 0.2036 | 89 | 0.0575 |
| 0.2059 | 90 | 0.0629 |
| 0.2082 | 91 | 0.0627 |
| 0.2105 | 92 | 0.0565 |
| 0.2128 | 93 | 0.0696 |
| 0.2150 | 94 | 0.0413 |
| 0.2173 | 95 | 0.0625 |
| 0.2196 | 96 | 0.0593 |
| 0.2219 | 97 | 0.0511 |
| 0.2242 | 98 | 0.1168 |
| 0.2265 | 99 | 0.0601 |
| 0.2288 | 100 | 0.0919 |
| 0.2311 | 101 | 0.0471 |
| 0.2333 | 102 | 0.0701 |
| 0.2356 | 103 | 0.1032 |
| 0.2379 | 104 | 0.0823 |
| 0.2402 | 105 | 0.0825 |
| 0.2425 | 106 | 0.0626 |
| 0.2448 | 107 | 0.0821 |
| 0.2471 | 108 | 0.0532 |
| 0.2494 | 109 | 0.1171 |
| 0.2516 | 110 | 0.0814 |
| 0.2539 | 111 | 0.1167 |
| 0.2562 | 112 | 0.0918 |
| 0.2585 | 113 | 0.0704 |
| 0.2608 | 114 | 0.0726 |
| 0.2631 | 115 | 0.0522 |
| 0.2654 | 116 | 0.0628 |
| 0.2677 | 117 | 0.0716 |
| 0.2699 | 118 | 0.0676 |
| 0.2722 | 119 | 0.0616 |
| 0.2745 | 120 | 0.0505 |
| 0.2768 | 121 | 0.0653 |
| 0.2791 | 122 | 0.051 |
| 0.2814 | 123 | 0.0888 |
| 0.2837 | 124 | 0.1061 |
| 0.2860 | 125 | 0.104 |
| 0.2882 | 126 | 0.095 |
| 0.2905 | 127 | 0.0715 |
| 0.2928 | 128 | 0.0766 |
| 0.2951 | 129 | 0.076 |
| 0.2974 | 130 | 0.1154 |
| 0.2997 | 131 | 0.0463 |
| 0.3020 | 132 | 0.0596 |
| 0.3043 | 133 | 0.0705 |
| 0.3065 | 134 | 0.0654 |
| 0.3088 | 135 | 0.0802 |
| 0.3111 | 136 | 0.0882 |
| 0.3134 | 137 | 0.0872 |
| 0.3157 | 138 | 0.0853 |
| 0.3180 | 139 | 0.0661 |
| 0.3203 | 140 | 0.0633 |
| 0.3226 | 141 | 0.0784 |
| 0.3248 | 142 | 0.0832 |
| 0.3271 | 143 | 0.0799 |
| 0.3294 | 144 | 0.0954 |
| 0.3317 | 145 | 0.0744 |
| 0.3340 | 146 | 0.0559 |
| 0.3363 | 147 | 0.0892 |
| 0.3386 | 148 | 0.0424 |
| 0.3409 | 149 | 0.0742 |
| 0.3432 | 150 | 0.1025 |
| 0.3454 | 151 | 0.0814 |
| 0.3477 | 152 | 0.051 |
| 0.3500 | 153 | 0.1313 |
| 0.3523 | 154 | 0.0645 |
| 0.3546 | 155 | 0.1006 |
| 0.3569 | 156 | 0.0524 |
| 0.3592 | 157 | 0.0635 |
| 0.3615 | 158 | 0.0467 |
| 0.3637 | 159 | 0.0741 |
| 0.3660 | 160 | 0.0593 |
| 0.3683 | 161 | 0.0698 |
| 0.3706 | 162 | 0.0835 |
| 0.3729 | 163 | 0.0715 |
| 0.3752 | 164 | 0.0628 |
| 0.3775 | 165 | 0.0772 |
| 0.3798 | 166 | 0.1167 |
| 0.3820 | 167 | 0.0981 |
| 0.3843 | 168 | 0.0595 |
| 0.3866 | 169 | 0.041 |
| 0.3889 | 170 | 0.0728 |
| 0.3912 | 171 | 0.0937 |
| 0.3935 | 172 | 0.0757 |
| 0.3958 | 173 | 0.0603 |
| 0.3981 | 174 | 0.0542 |
| 0.4003 | 175 | 0.0701 |
| 0.4026 | 176 | 0.0372 |
| 0.4049 | 177 | 0.125 |
| 0.4072 | 178 | 0.0545 |
| 0.4095 | 179 | 0.0476 |
| 0.4118 | 180 | 0.0516 |
| 0.4141 | 181 | 0.1243 |
| 0.4164 | 182 | 0.0599 |
| 0.4186 | 183 | 0.1026 |
| 0.4209 | 184 | 0.077 |
| 0.4232 | 185 | 0.0732 |
| 0.4255 | 186 | 0.0798 |
| 0.4278 | 187 | 0.0538 |
| 0.4301 | 188 | 0.0679 |
| 0.4324 | 189 | 0.0759 |
| 0.4347 | 190 | 0.0761 |
| 0.4369 | 191 | 0.0557 |
| 0.4392 | 192 | 0.0534 |
| 0.4415 | 193 | 0.0747 |
| 0.4438 | 194 | 0.0672 |
| 0.4461 | 195 | 0.0376 |
| 0.4484 | 196 | 0.0466 |
| 0.4507 | 197 | 0.0783 |
| 0.4530 | 198 | 0.0864 |
| 0.4552 | 199 | 0.0423 |
| 0.4575 | 200 | 0.0708 |
| 0.4598 | 201 | 0.0429 |
| 0.4621 | 202 | 0.0718 |
| 0.4644 | 203 | 0.0802 |
| 0.4667 | 204 | 0.073 |
| 0.4690 | 205 | 0.0628 |
| 0.4713 | 206 | 0.055 |
| 0.4735 | 207 | 0.0468 |
| 0.4758 | 208 | 0.0536 |
| 0.4781 | 209 | 0.0429 |
| 0.4804 | 210 | 0.0388 |
| 0.4827 | 211 | 0.0962 |
| 0.4850 | 212 | 0.0475 |
| 0.4873 | 213 | 0.0589 |
| 0.4896 | 214 | 0.0606 |
| 0.4919 | 215 | 0.0512 |
| 0.4941 | 216 | 0.0836 |
| 0.4964 | 217 | 0.0659 |
| 0.4987 | 218 | 0.0924 |
| 0.5010 | 219 | 0.0711 |
| 0.5033 | 220 | 0.0676 |
| 0.5056 | 221 | 0.0393 |
| 0.5079 | 222 | 0.0668 |
| 0.5102 | 223 | 0.0511 |
| 0.5124 | 224 | 0.0575 |
| 0.5147 | 225 | 0.0594 |
| 0.5170 | 226 | 0.126 |
| 0.5193 | 227 | 0.0787 |
| 0.5216 | 228 | 0.0509 |
| 0.5239 | 229 | 0.0684 |
| 0.5262 | 230 | 0.0792 |
| 0.5285 | 231 | 0.0501 |
| 0.5307 | 232 | 0.0988 |
| 0.5330 | 233 | 0.0414 |
| 0.5353 | 234 | 0.0596 |
| 0.5376 | 235 | 0.0607 |
| 0.5399 | 236 | 0.0556 |
| 0.5422 | 237 | 0.0578 |
| 0.5445 | 238 | 0.0238 |
| 0.5468 | 239 | 0.0509 |
| 0.5490 | 240 | 0.0431 |
| 0.5513 | 241 | 0.0377 |
| 0.5536 | 242 | 0.0814 |
| 0.5559 | 243 | 0.0779 |
| 0.5582 | 244 | 0.0574 |
| 0.5605 | 245 | 0.0681 |
| 0.5628 | 246 | 0.0513 |
| 0.5651 | 247 | 0.0573 |
| 0.5673 | 248 | 0.0758 |
| 0.5696 | 249 | 0.0442 |
| 0.5719 | 250 | 0.0458 |
| 0.5742 | 251 | 0.0853 |
| 0.5765 | 252 | 0.0825 |
| 0.5788 | 253 | 0.065 |
| 0.5811 | 254 | 0.0429 |
| 0.5834 | 255 | 0.0438 |
| 0.5856 | 256 | 0.1028 |
| 0.5879 | 257 | 0.04 |
| 0.5902 | 258 | 0.0406 |
| 0.5925 | 259 | 0.0465 |
| 0.5948 | 260 | 0.068 |
| 0.5971 | 261 | 0.0532 |
| 0.5994 | 262 | 0.0503 |
| 0.6017 | 263 | 0.0421 |
| 0.6039 | 264 | 0.0663 |
| 0.6062 | 265 | 0.0621 |
| 0.6085 | 266 | 0.0845 |
| 0.6108 | 267 | 0.049 |
| 0.6131 | 268 | 0.0503 |
| 0.6154 | 269 | 0.0392 |
| 0.6177 | 270 | 0.0505 |
| 0.6200 | 271 | 0.0594 |
| 0.6222 | 272 | 0.0573 |
| 0.6245 | 273 | 0.0383 |
| 0.6268 | 274 | 0.0568 |
| 0.6291 | 275 | 0.0386 |
| 0.6314 | 276 | 0.0573 |
| 0.6337 | 277 | 0.0397 |
| 0.6360 | 278 | 0.0459 |
| 0.6383 | 279 | 0.0624 |
| 0.6405 | 280 | 0.0706 |
| 0.6428 | 281 | 0.0743 |
| 0.6451 | 282 | 0.0405 |
| 0.6474 | 283 | 0.0761 |
| 0.6497 | 284 | 0.0583 |
| 0.6520 | 285 | 0.0444 |
| 0.6543 | 286 | 0.0305 |
| 0.6566 | 287 | 0.0716 |
| 0.6589 | 288 | 0.041 |
| 0.6611 | 289 | 0.043 |
| 0.6634 | 290 | 0.0574 |
| 0.6657 | 291 | 0.0479 |
| 0.6680 | 292 | 0.062 |
| 0.6703 | 293 | 0.0441 |
| 0.6726 | 294 | 0.0657 |
| 0.6749 | 295 | 0.0515 |
| 0.6772 | 296 | 0.0718 |
| 0.6794 | 297 | 0.0839 |
| 0.6817 | 298 | 0.0751 |
| 0.6840 | 299 | 0.073 |
| 0.6863 | 300 | 0.0656 |
| 0.6886 | 301 | 0.0717 |
| 0.6909 | 302 | 0.0457 |
| 0.6932 | 303 | 0.0761 |
| 0.6955 | 304 | 0.0557 |
| 0.6977 | 305 | 0.0646 |
| 0.7000 | 306 | 0.0688 |
| 0.7023 | 307 | 0.0396 |
| 0.7046 | 308 | 0.0444 |
| 0.7069 | 309 | 0.0627 |
| 0.7092 | 310 | 0.0594 |
| 0.7115 | 311 | 0.0496 |
| 0.7138 | 312 | 0.0406 |
| 0.7160 | 313 | 0.0513 |
| 0.7183 | 314 | 0.0483 |
| 0.7206 | 315 | 0.0527 |
| 0.7229 | 316 | 0.0646 |
| 0.7252 | 317 | 0.0351 |
| 0.7275 | 318 | 0.0432 |
| 0.7298 | 319 | 0.06 |
| 0.7321 | 320 | 0.0487 |
| 0.7343 | 321 | 0.0398 |
| 0.7366 | 322 | 0.0279 |
| 0.7389 | 323 | 0.0594 |
| 0.7412 | 324 | 0.0808 |
| 0.7435 | 325 | 0.0461 |
| 0.7458 | 326 | 0.0452 |
| 0.7481 | 327 | 0.0887 |
| 0.7504 | 328 | 0.057 |
| 0.7526 | 329 | 0.082 |
| 0.7549 | 330 | 0.0693 |
| 0.7572 | 331 | 0.0245 |
| 0.7595 | 332 | 0.0476 |
| 0.7618 | 333 | 0.051 |
| 0.7641 | 334 | 0.0539 |
| 0.7664 | 335 | 0.0325 |
| 0.7687 | 336 | 0.0431 |
| 0.7709 | 337 | 0.0534 |
| 0.7732 | 338 | 0.0346 |
| 0.7755 | 339 | 0.0577 |
| 0.7778 | 340 | 0.086 |
| 0.7801 | 341 | 0.0705 |
| 0.7824 | 342 | 0.0412 |
| 0.7847 | 343 | 0.0426 |
| 0.7870 | 344 | 0.0829 |
| 0.7892 | 345 | 0.0767 |
| 0.7915 | 346 | 0.0702 |
| 0.7938 | 347 | 0.0662 |
| 0.7961 | 348 | 0.0436 |
| 0.7984 | 349 | 0.0292 |
| 0.8007 | 350 | 0.0586 |
| 0.8030 | 351 | 0.0416 |
| 0.8053 | 352 | 0.0874 |
| 0.8075 | 353 | 0.0378 |
| 0.8098 | 354 | 0.036 |
| 0.8121 | 355 | 0.0426 |
| 0.8144 | 356 | 0.0375 |
| 0.8167 | 357 | 0.0296 |
| 0.8190 | 358 | 0.0535 |
| 0.8213 | 359 | 0.0654 |
| 0.8236 | 360 | 0.0756 |
| 0.8259 | 361 | 0.0591 |
| 0.8281 | 362 | 0.0603 |
| 0.8304 | 363 | 0.0664 |
| 0.8327 | 364 | 0.0403 |
| 0.8350 | 365 | 0.0418 |
| 0.8373 | 366 | 0.047 |
| 0.8396 | 367 | 0.077 |
| 0.8419 | 368 | 0.0597 |
| 0.8442 | 369 | 0.0683 |
| 0.8464 | 370 | 0.0557 |
| 0.8487 | 371 | 0.0487 |
| 0.8510 | 372 | 0.0499 |
| 0.8533 | 373 | 0.0328 |
| 0.8556 | 374 | 0.0211 |
| 0.8579 | 375 | 0.0411 |
| 0.8602 | 376 | 0.0648 |
| 0.8625 | 377 | 0.0583 |
| 0.8647 | 378 | 0.0483 |
| 0.8670 | 379 | 0.0362 |
| 0.8693 | 380 | 0.0616 |
| 0.8716 | 381 | 0.0634 |
| 0.8739 | 382 | 0.0542 |
| 0.8762 | 383 | 0.053 |
| 0.8785 | 384 | 0.0436 |
| 0.8808 | 385 | 0.0426 |
| 0.8830 | 386 | 0.0503 |
| 0.8853 | 387 | 0.0522 |
| 0.8876 | 388 | 0.083 |
| 0.8899 | 389 | 0.0317 |
| 0.8922 | 390 | 0.0571 |
| 0.8945 | 391 | 0.0464 |
| 0.8968 | 392 | 0.0179 |
| 0.8991 | 393 | 0.0389 |
| 0.9013 | 394 | 0.0317 |
| 0.9036 | 395 | 0.0605 |
| 0.9059 | 396 | 0.0389 |
| 0.9082 | 397 | 0.0407 |
| 0.9105 | 398 | 0.0478 |
| 0.9128 | 399 | 0.0304 |
| 0.9151 | 400 | 0.0572 |
| 0.9174 | 401 | 0.037 |
| 0.9196 | 402 | 0.062 |
| 0.9219 | 403 | 0.0539 |
| 0.9242 | 404 | 0.039 |
| 0.9265 | 405 | 0.0265 |
| 0.9288 | 406 | 0.0398 |
| 0.9311 | 407 | 0.0369 |
| 0.9334 | 408 | 0.053 |
| 0.9357 | 409 | 0.0503 |
| 0.9379 | 410 | 0.0535 |
| 0.9402 | 411 | 0.0645 |
| 0.9425 | 412 | 0.0328 |
| 0.9448 | 413 | 0.0438 |
| 0.9471 | 414 | 0.0435 |
| 0.9494 | 415 | 0.1018 |
| 0.9517 | 416 | 0.0403 |
| 0.9540 | 417 | 0.0577 |
| 0.9562 | 418 | 0.0234 |
| 0.9585 | 419 | 0.041 |
| 0.9608 | 420 | 0.0226 |
| 0.9631 | 421 | 0.0497 |
| 0.9654 | 422 | 0.0493 |
| 0.9677 | 423 | 0.0223 |
| 0.9700 | 424 | 0.0192 |
| 0.9723 | 425 | 0.0322 |
| 0.9745 | 426 | 0.0483 |
| 0.9768 | 427 | 0.041 |
| 0.9791 | 428 | 0.0628 |
| 0.9814 | 429 | 0.0861 |
| 0.9837 | 430 | 0.0645 |
| 0.9860 | 431 | 0.0386 |
| 0.9883 | 432 | 0.0378 |
| 0.9906 | 433 | 0.0613 |
| 0.9929 | 434 | 0.067 |
| 0.9951 | 435 | 0.049 |
| 0.9974 | 436 | 0.0644 |
| 0.9997 | 437 | 0.02 |
| 1.0 | 438 | 0.0001 |
## Framework Versions
- Python: 3.12.3
- Sentence Transformers: 5.3.0
- Transformers: 4.56.2
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.3.0
- Tokenizers: 0.22.2
## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{oord2019representationlearningcontrastivepredictive,
    title = {Representation Learning with Contrastive Predictive Coding},
    author = {Aaron van den Oord and Yazhe Li and Oriol Vinyals},
    year = {2019},
    eprint = {1807.03748},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG},
    url = {https://arxiv.org/abs/1807.03748},
}
```