metadata
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:175555
- loss:BinaryCrossEntropyLoss
base_model: cross-encoder/ms-marco-MiniLM-L6-v2
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
  results:
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: reranker
      type: reranker
    metrics:
    - type: map
      value: 0.8938843308244206
      name: Map
    - type: mrr@10
      value: 0.9404932678998363
      name: Mrr@10
    - type: ndcg@10
      value: 0.9271673543093844
      name: Ndcg@10
CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
This is a Cross Encoder model finetuned from cross-encoder/ms-marco-MiniLM-L6-v2 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: cross-encoder/ms-marco-MiniLM-L6-v2
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
['what is the average payment volume per transaction for american express?', '(table): company the american express of payments volume ( billions ) is 637 ; the american express of total volume ( billions ) is 647 ; the american express of total transactions ( billions ) is 5.0 ; the american express of cards ( millions ) is 86 ;'],
['what is the average payment volume per transaction for american express?', '(text): largest operators of open-loop and closed-loop retail electronic payments networks the largest operators of open-loop and closed-loop retail electronic payments networks are visa , mastercard , american express , discover , jcb and diners club .'],
['what is the average payment volume per transaction for american express?', '(text): with the exception of discover , which primarily operates in the united states , all of the other network operators can be considered multi- national or global providers of payments network services .'],
['what is the average payment volume per transaction for american express?', '(text): based on payments volume , total volume , number of transactions and number of cards in circulation , visa is the largest retail electronic payments network in the world .'],
['what is the average payment volume per transaction for american express?', '(text): the following chart compares our network with those of our major competitors for calendar year 2007 : company payments volume volume transactions cards ( billions ) ( billions ) ( billions ) ( millions ) visa inc. ( 1 ) .'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'what is the average payment volume per transaction for american express?',
[
'(table): company the american express of payments volume ( billions ) is 637 ; the american express of total volume ( billions ) is 647 ; the american express of total transactions ( billions ) is 5.0 ; the american express of cards ( millions ) is 86 ;',
'(text): largest operators of open-loop and closed-loop retail electronic payments networks the largest operators of open-loop and closed-loop retail electronic payments networks are visa , mastercard , american express , discover , jcb and diners club .',
'(text): with the exception of discover , which primarily operates in the united states , all of the other network operators can be considered multi- national or global providers of payments network services .',
'(text): based on payments volume , total volume , number of transactions and number of cards in circulation , visa is the largest retail electronic payments network in the world .',
'(text): the following chart compares our network with those of our major competitors for calendar year 2007 : company payments volume volume transactions cards ( billions ) ( billions ) ( billions ) ( millions ) visa inc. ( 1 ) .',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
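The result of `rank` is a list of `corpus_id`/`score` entries ordered by score. As a minimal sketch of that ordering step, the same structure can be built from `predict`-style scores in plain Python. The scores below are made-up values, not outputs of this model, and `rank_from_scores` is a hypothetical helper, not part of the library.

```python
# Hypothetical scores for five passages; real values would come from model.predict(pairs).
scores = [4.2, -1.3, -3.8, 0.6, -0.9]

def rank_from_scores(scores, top_k=None):
    """Return [{'corpus_id': i, 'score': s}, ...] sorted by descending score."""
    ranked = sorted(
        ({"corpus_id": i, "score": s} for i, s in enumerate(scores)),
        key=lambda item: item["score"],
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]

ranks = rank_from_scores(scores)
print([r["corpus_id"] for r in ranks])
# [0, 3, 4, 1, 2]
```

Each `corpus_id` is the position of the passage in the input list, so it can be used to look the passage text up again after sorting.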
Evaluation
Metrics
Cross Encoder Reranking
- Dataset: reranker
- Evaluated with CrossEncoderRerankingEvaluator with these parameters: { "at_k": 10 }
| Metric | Value |
|---|---|
| map | 0.8939 |
| mrr@10 | 0.9405 |
| ndcg@10 | 0.9272 |
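To make the three metrics concrete, here is a minimal sketch of their textbook definitions for a single query, given the binary relevance labels of the passages in ranked order. It is an illustration under that assumption and is not claimed to match the evaluator's implementation detail for detail; the reported values are averages over all evaluation queries.

```python
import math

def mrr_at_k(labels, k=10):
    """Reciprocal rank of the first relevant passage within the top k."""
    for rank, rel in enumerate(labels[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(labels, k=10):
    """Discounted cumulative gain in the top k, divided by the ideal DCG."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(labels[:k]))
    ideal = sorted(labels, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def average_precision(labels):
    """Mean of precision@rank over the ranks that hold a relevant passage."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Hypothetical ranking: relevant passages ended up at ranks 2 and 4.
ranked_labels = [0, 1, 0, 1, 0]
print(mrr_at_k(ranked_labels), ndcg_at_k(ranked_labels), average_precision(ranked_labels))
```

MAP is the mean of `average_precision` across queries, and the same averaging applies to the other two metrics.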
Training Details
Training Dataset
Unnamed Dataset
- Size: 175,555 training samples
- Columns: query, passage, and label
- Approximate statistics based on the first 1000 samples:
| | query | passage | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 41 characters; mean: 89.26 characters; max: 186 characters | min: 11 characters; mean: 182.61 characters; max: 1853 characters | min: 0.0; mean: 0.07; max: 1.0 |
- Samples:
| query | passage | label |
|---|---|---|
| what is the the interest expense in 2009? | (text): if libor changes by 100 basis points , our annual interest expense would change by $ 3.8 million . | 1.0 |
| what is the the interest expense in 2009? | (text): interest rate to a variable interest rate based on the three-month libor plus 2.05% ( 2.05 % ) ( 2.34% ( 2.34 % ) as of october 31 , 2009 ) . | 0.0 |
| what is the the interest expense in 2009? | (text): foreign currency exposure as more fully described in note 2i . | 0.0 |
- Loss: BinaryCrossEntropyLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": null }
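The loss parameters above indicate an identity activation and no positive-class weighting. As a rough sketch of what that means per training pair, assuming the loss applies a sigmoid to the raw model logit inside the cross-entropy (the usual "with logits" form), the value can be computed in plain Python. The function `bce_with_logits` is a hypothetical stand-in written for illustration, not the library's code.

```python
import math

def bce_with_logits(logit, label, pos_weight=None):
    """Binary cross-entropy on a raw logit, written in a numerically stable form."""
    # log(sigmoid(x)) and log(1 - sigmoid(x)) computed without overflow.
    softplus_neg_abs = math.log1p(math.exp(-abs(logit)))
    log_sig = min(logit, 0.0) - softplus_neg_abs
    log_one_minus_sig = -max(logit, 0.0) - softplus_neg_abs
    # pos_weight=None corresponds to "pos_weight": null, i.e. no reweighting.
    w = 1.0 if pos_weight is None else pos_weight
    return -(w * label * log_sig + (1.0 - label) * log_one_minus_sig)

# A confident, correct positive costs little; a confident, wrong one costs a lot.
print(bce_with_logits(4.0, 1.0), bce_with_logits(-4.0, 1.0))
```

With a label mean of about 0.07 in the sampled statistics, positives are a small minority; leaving `pos_weight` unset means each positive pair contributes with the same weight as each negative pair.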
Evaluation Dataset
Unnamed Dataset
- Size: 25,007 evaluation samples
- Columns: query, passage, and label
- Approximate statistics based on the first 1000 samples:
| | query | passage | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 52 characters; mean: 86.04 characters; max: 137 characters | min: 11 characters; mean: 166.61 characters; max: 717 characters | min: 0.0; mean: 0.06; max: 1.0 |
- Samples:
| query | passage | label |
|---|---|---|
| what is the average payment volume per transaction for american express? | (table): company the american express of payments volume ( billions ) is 637 ; the american express of total volume ( billions ) is 647 ; the american express of total transactions ( billions ) is 5.0 ; the american express of cards ( millions ) is 86 ; | 1.0 |
| what is the average payment volume per transaction for american express? | (text): largest operators of open-loop and closed-loop retail electronic payments networks the largest operators of open-loop and closed-loop retail electronic payments networks are visa , mastercard , american express , discover , jcb and diners club . | 0.0 |
| what is the average payment volume per transaction for american express? | (text): with the exception of discover , which primarily operates in the united states , all of the other network operators can be considered multi- national or global providers of payments network services . | 0.0 |
- Loss: BinaryCrossEntropyLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": null }
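The evaluation data is stored as one (query, passage, label) row per pair, while reranking metrics are computed per query. A minimal sketch of grouping such rows by query follows. The rows are shortened, hypothetical stand-ins for the real samples, and the key names `query`, `positive`, and `negative` are an assumption about a convenient per-query layout that should be checked against the evaluator's documentation before use.

```python
from collections import defaultdict

# Hypothetical rows in the (query, passage, label) layout described above.
rows = [
    ("question a", "(table): passage that answers a", 1.0),
    ("question a", "(text): unrelated passage", 0.0),
    ("question a", "(text): another unrelated passage", 0.0),
    ("question b", "(text): passage that answers b", 1.0),
    ("question b", "(table): unrelated table row", 0.0),
]

def group_by_query(rows, threshold=0.5):
    """Collect the passages of each query into positive and negative lists."""
    grouped = defaultdict(lambda: {"positive": [], "negative": []})
    for query, passage, label in rows:
        key = "positive" if label >= threshold else "negative"
        grouped[query][key].append(passage)
    return [{"query": q, **parts} for q, parts in grouped.items()]

samples = group_by_query(rows)
print(len(samples), [len(s["negative"]) for s in samples])
```

Grouping first also makes it easy to spot queries that have no positive passage at all, which cannot contribute a meaningful reciprocal rank.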
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 0.0001
- weight_decay: 0.01
- num_train_epochs: 1
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
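These settings, together with the linear scheduler listed among the full hyperparameters, imply a learning rate that ramps up over the first tenth of training and then decays linearly to zero. The sketch below works through the arithmetic. It assumes a single training process, no gradient accumulation, a final partial batch that is kept, and warmup steps rounded up from the ratio; `linear_schedule_lr` is a hypothetical helper that mirrors the usual linear-warmup, linear-decay shape.

```python
import math

def linear_schedule_lr(step, base_lr, total_steps, warmup_steps):
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

train_samples = 175_555   # training set size from this card
batch_size = 64           # per_device_train_batch_size, assuming one device
base_lr = 1e-4            # learning_rate
total_steps = math.ceil(train_samples / batch_size)   # num_train_epochs = 1
warmup_steps = math.ceil(0.1 * total_steps)           # warmup_ratio = 0.1

print(total_steps, warmup_steps)
# 2744 275
```

Under these assumptions, 10 steps correspond to 10 / 2744 ≈ 0.0036 of an epoch, which agrees with the first row of the training log.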
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 0.0001
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | Validation Loss | reranker_ndcg@10 |
|---|---|---|---|---|
| 0.0036 | 10 | 0.3268 | - | - |
| 0.0073 | 20 | 0.247 | - | - |
| 0.0109 | 30 | 0.2451 | - | - |
| 0.0146 | 40 | 0.2029 | - | - |
| 0.0182 | 50 | 0.1739 | - | - |
| 0.0219 | 60 | 0.172 | - | - |
| 0.0255 | 70 | 0.1425 | - | - |
| 0.0292 | 80 | 0.138 | - | - |
| 0.0328 | 90 | 0.1304 | - | - |
| 0.0364 | 100 | 0.1561 | - | - |
| 0.0401 | 110 | 0.1627 | - | - |
| 0.0437 | 120 | 0.1974 | - | - |
| 0.0474 | 130 | 0.1339 | - | - |
| 0.0510 | 140 | 0.1137 | - | - |
| 0.0547 | 150 | 0.1333 | - | - |
| 0.0583 | 160 | 0.1296 | - | - |
| 0.0620 | 170 | 0.1723 | - | - |
| 0.0656 | 180 | 0.1099 | - | - |
| 0.0692 | 190 | 0.1105 | - | - |
| 0.0729 | 200 | 0.0917 | 0.1133 | 0.9034 |
| 0.0765 | 210 | 0.1012 | - | - |
| 0.0802 | 220 | 0.1296 | - | - |
| 0.0838 | 230 | 0.1332 | - | - |
| 0.0875 | 240 | 0.095 | - | - |
| 0.0911 | 250 | 0.1351 | - | - |
| 0.0948 | 260 | 0.1138 | - | - |
| 0.0984 | 270 | 0.1318 | - | - |
| 0.1020 | 280 | 0.1164 | - | - |
| 0.1057 | 290 | 0.1418 | - | - |
| 0.1093 | 300 | 0.1337 | - | - |
| 0.1130 | 310 | 0.1169 | - | - |
| 0.1166 | 320 | 0.1314 | - | - |
| 0.1203 | 330 | 0.1197 | - | - |
| 0.1239 | 340 | 0.1002 | - | - |
| 0.1276 | 350 | 0.1124 | - | - |
| 0.1312 | 360 | 0.0932 | - | - |
| 0.1348 | 370 | 0.1629 | - | - |
| 0.1385 | 380 | 0.1501 | - | - |
| 0.1421 | 390 | 0.1097 | - | - |
| 0.1458 | 400 | 0.0756 | 0.1138 | 0.8984 |
| 0.1494 | 410 | 0.1174 | - | - |
| 0.1531 | 420 | 0.1472 | - | - |
| 0.1567 | 430 | 0.1391 | - | - |
| 0.1603 | 440 | 0.1188 | - | - |
| 0.1640 | 450 | 0.1555 | - | - |
| 0.1676 | 460 | 0.1148 | - | - |
| 0.1713 | 470 | 0.0753 | - | - |
| 0.1749 | 480 | 0.104 | - | - |
| 0.1786 | 490 | 0.1313 | - | - |
| 0.1822 | 500 | 0.1125 | - | - |
| 0.1859 | 510 | 0.0772 | - | - |
| 0.1895 | 520 | 0.1045 | - | - |
| 0.1931 | 530 | 0.1101 | - | - |
| 0.1968 | 540 | 0.109 | - | - |
| 0.2004 | 550 | 0.124 | - | - |
| 0.2041 | 560 | 0.0934 | - | - |
| 0.2077 | 570 | 0.1305 | - | - |
| 0.2114 | 580 | 0.1163 | - | - |
| 0.2150 | 590 | 0.1004 | - | - |
| 0.2187 | 600 | 0.0917 | 0.1206 | 0.9025 |
| 0.2223 | 610 | 0.0942 | - | - |
| 0.2259 | 620 | 0.1223 | - | - |
| 0.2296 | 630 | 0.1156 | - | - |
| 0.2332 | 640 | 0.0924 | - | - |
| 0.2369 | 650 | 0.1372 | - | - |
| 0.2405 | 660 | 0.0984 | - | - |
| 0.2442 | 670 | 0.0876 | - | - |
| 0.2478 | 680 | 0.0926 | - | - |
| 0.2515 | 690 | 0.0819 | - | - |
| 0.2551 | 700 | 0.1034 | - | - |
| 0.2587 | 710 | 0.1022 | - | - |
| 0.2624 | 720 | 0.0661 | - | - |
| 0.2660 | 730 | 0.124 | - | - |
| 0.2697 | 740 | 0.1231 | - | - |
| 0.2733 | 750 | 0.1307 | - | - |
| 0.2770 | 760 | 0.0973 | - | - |
| 0.2806 | 770 | 0.0721 | - | - |
| 0.2843 | 780 | 0.0734 | - | - |
| 0.2879 | 790 | 0.0806 | - | - |
| 0.2915 | 800 | 0.0824 | 0.0996 | 0.9079 |
| 0.2952 | 810 | 0.1037 | - | - |
| 0.2988 | 820 | 0.0771 | - | - |
| 0.3025 | 830 | 0.1407 | - | - |
| 0.3061 | 840 | 0.1196 | - | - |
| 0.3098 | 850 | 0.1087 | - | - |
| 0.3134 | 860 | 0.0737 | - | - |
| 0.3171 | 870 | 0.0986 | - | - |
| 0.3207 | 880 | 0.1042 | - | - |
| 0.3243 | 890 | 0.0971 | - | - |
| 0.3280 | 900 | 0.0824 | - | - |
| 0.3316 | 910 | 0.0842 | - | - |
| 0.3353 | 920 | 0.1361 | - | - |
| 0.3389 | 930 | 0.086 | - | - |
| 0.3426 | 940 | 0.0861 | - | - |
| 0.3462 | 950 | 0.1039 | - | - |
| 0.3499 | 960 | 0.1085 | - | - |
| 0.3535 | 970 | 0.1316 | - | - |
| 0.3571 | 980 | 0.0806 | - | - |
| 0.3608 | 990 | 0.0873 | - | - |
| 0.3644 | 1000 | 0.0952 | 0.0981 | 0.9101 |
| 0.3681 | 1010 | 0.1194 | - | - |
| 0.3717 | 1020 | 0.1114 | - | - |
| 0.3754 | 1030 | 0.122 | - | - |
| 0.3790 | 1040 | 0.094 | - | - |
| 0.3827 | 1050 | 0.0971 | - | - |
| 0.3863 | 1060 | 0.1285 | - | - |
| 0.3899 | 1070 | 0.103 | - | - |
| 0.3936 | 1080 | 0.1065 | - | - |
| 0.3972 | 1090 | 0.0885 | - | - |
| 0.4009 | 1100 | 0.1022 | - | - |
| 0.4045 | 1110 | 0.1129 | - | - |
| 0.4082 | 1120 | 0.1229 | - | - |
| 0.4118 | 1130 | 0.0999 | - | - |
| 0.4155 | 1140 | 0.0879 | - | - |
| 0.4191 | 1150 | 0.0763 | - | - |
| 0.4227 | 1160 | 0.0852 | - | - |
| 0.4264 | 1170 | 0.0914 | - | - |
| 0.4300 | 1180 | 0.1004 | - | - |
| 0.4337 | 1190 | 0.1143 | - | - |
| 0.4373 | 1200 | 0.1364 | 0.0940 | 0.9246 |
| 0.4410 | 1210 | 0.1017 | - | - |
| 0.4446 | 1220 | 0.09 | - | - |
| 0.4483 | 1230 | 0.0687 | - | - |
| 0.4519 | 1240 | 0.0733 | - | - |
| 0.4555 | 1250 | 0.1049 | - | - |
| 0.4592 | 1260 | 0.0918 | - | - |
| 0.4628 | 1270 | 0.0848 | - | - |
| 0.4665 | 1280 | 0.0736 | - | - |
| 0.4701 | 1290 | 0.1129 | - | - |
| 0.4738 | 1300 | 0.0713 | - | - |
| 0.4774 | 1310 | 0.0876 | - | - |
| 0.4810 | 1320 | 0.0866 | - | - |
| 0.4847 | 1330 | 0.1016 | - | - |
| 0.4883 | 1340 | 0.1061 | - | - |
| 0.4920 | 1350 | 0.0791 | - | - |
| 0.4956 | 1360 | 0.0938 | - | - |
| 0.4993 | 1370 | 0.1235 | - | - |
| 0.5029 | 1380 | 0.0693 | - | - |
| 0.5066 | 1390 | 0.065 | - | - |
| 0.5102 | 1400 | 0.0839 | 0.1007 | 0.9214 |
| 0.5138 | 1410 | 0.0914 | - | - |
| 0.5175 | 1420 | 0.0786 | - | - |
| 0.5211 | 1430 | 0.0916 | - | - |
| 0.5248 | 1440 | 0.0606 | - | - |
| 0.5284 | 1450 | 0.1417 | - | - |
| 0.5321 | 1460 | 0.0856 | - | - |
| 0.5357 | 1470 | 0.0865 | - | - |
| 0.5394 | 1480 | 0.0917 | - | - |
| 0.5430 | 1490 | 0.0774 | - | - |
| 0.5466 | 1500 | 0.0951 | - | - |
| 0.5503 | 1510 | 0.074 | - | - |
| 0.5539 | 1520 | 0.0797 | - | - |
| 0.5576 | 1530 | 0.0817 | - | - |
| 0.5612 | 1540 | 0.1137 | - | - |
| 0.5649 | 1550 | 0.1139 | - | - |
| 0.5685 | 1560 | 0.0889 | - | - |
| 0.5722 | 1570 | 0.1075 | - | - |
| 0.5758 | 1580 | 0.1021 | - | - |
| 0.5794 | 1590 | 0.1115 | - | - |
| 0.5831 | 1600 | 0.1047 | 0.0952 | 0.9229 |
| 0.5867 | 1610 | 0.1056 | - | - |
| 0.5904 | 1620 | 0.116 | - | - |
| 0.5940 | 1630 | 0.0989 | - | - |
| 0.5977 | 1640 | 0.1102 | - | - |
| 0.6013 | 1650 | 0.1006 | - | - |
| 0.6050 | 1660 | 0.0956 | - | - |
| 0.6086 | 1670 | 0.1003 | - | - |
| 0.6122 | 1680 | 0.0984 | - | - |
| 0.6159 | 1690 | 0.0734 | - | - |
| 0.6195 | 1700 | 0.079 | - | - |
| 0.6232 | 1710 | 0.0872 | - | - |
| 0.6268 | 1720 | 0.1077 | - | - |
| 0.6305 | 1730 | 0.0833 | - | - |
| 0.6341 | 1740 | 0.0984 | - | - |
| 0.6378 | 1750 | 0.0727 | - | - |
| 0.6414 | 1760 | 0.1062 | - | - |
| 0.6450 | 1770 | 0.1013 | - | - |
| 0.6487 | 1780 | 0.0892 | - | - |
| 0.6523 | 1790 | 0.0765 | - | - |
| 0.6560 | 1800 | 0.0698 | 0.0962 | 0.9208 |
| 0.6596 | 1810 | 0.0658 | - | - |
| 0.6633 | 1820 | 0.1386 | - | - |
| 0.6669 | 1830 | 0.1094 | - | - |
| 0.6706 | 1840 | 0.103 | - | - |
| 0.6742 | 1850 | 0.1075 | - | - |
| 0.6778 | 1860 | 0.091 | - | - |
| 0.6815 | 1870 | 0.106 | - | - |
| 0.6851 | 1880 | 0.0753 | - | - |
| 0.6888 | 1890 | 0.0685 | - | - |
| 0.6924 | 1900 | 0.1045 | - | - |
| 0.6961 | 1910 | 0.087 | - | - |
| 0.6997 | 1920 | 0.0866 | - | - |
| 0.7034 | 1930 | 0.1253 | - | - |
| 0.7070 | 1940 | 0.0915 | - | - |
| 0.7106 | 1950 | 0.061 | - | - |
| 0.7143 | 1960 | 0.0744 | - | - |
| 0.7179 | 1970 | 0.0643 | - | - |
| 0.7216 | 1980 | 0.0571 | - | - |
| 0.7252 | 1990 | 0.1004 | - | - |
| 0.7289 | 2000 | 0.1075 | 0.0936 | 0.9237 |
| 0.7325 | 2010 | 0.0637 | - | - |
| 0.7362 | 2020 | 0.1167 | - | - |
| 0.7398 | 2030 | 0.1113 | - | - |
| 0.7434 | 2040 | 0.1314 | - | - |
| 0.7471 | 2050 | 0.0764 | - | - |
| 0.7507 | 2060 | 0.1297 | - | - |
| 0.7544 | 2070 | 0.0841 | - | - |
| 0.7580 | 2080 | 0.0967 | - | - |
| 0.7617 | 2090 | 0.0916 | - | - |
| 0.7653 | 2100 | 0.1196 | - | - |
| 0.7690 | 2110 | 0.1072 | - | - |
| 0.7726 | 2120 | 0.0974 | - | - |
| 0.7762 | 2130 | 0.0772 | - | - |
| 0.7799 | 2140 | 0.1147 | - | - |
| 0.7835 | 2150 | 0.1003 | - | - |
| 0.7872 | 2160 | 0.0944 | - | - |
| 0.7908 | 2170 | 0.0886 | - | - |
| 0.7945 | 2180 | 0.062 | - | - |
| 0.7981 | 2190 | 0.0817 | - | - |
| 0.8017 | 2200 | 0.1096 | 0.0919 | 0.9262 |
| 0.8054 | 2210 | 0.0821 | - | - |
| 0.8090 | 2220 | 0.0866 | - | - |
| 0.8127 | 2230 | 0.0824 | - | - |
| 0.8163 | 2240 | 0.108 | - | - |
| 0.8200 | 2250 | 0.0746 | - | - |
| 0.8236 | 2260 | 0.0708 | - | - |
| 0.8273 | 2270 | 0.0898 | - | - |
| 0.8309 | 2280 | 0.0876 | - | - |
| 0.8345 | 2290 | 0.0898 | - | - |
| 0.8382 | 2300 | 0.0935 | - | - |
| 0.8418 | 2310 | 0.0655 | - | - |
| 0.8455 | 2320 | 0.106 | - | - |
| 0.8491 | 2330 | 0.0806 | - | - |
| 0.8528 | 2340 | 0.091 | - | - |
| 0.8564 | 2350 | 0.0575 | - | - |
| 0.8601 | 2360 | 0.059 | - | - |
| 0.8637 | 2370 | 0.0889 | - | - |
| 0.8673 | 2380 | 0.0955 | - | - |
| 0.8710 | 2390 | 0.0841 | - | - |
| 0.8746 | 2400 | 0.0759 | 0.0896 | 0.9256 |
| 0.8783 | 2410 | 0.0558 | - | - |
| 0.8819 | 2420 | 0.0921 | - | - |
| 0.8856 | 2430 | 0.0865 | - | - |
| 0.8892 | 2440 | 0.0787 | - | - |
| 0.8929 | 2450 | 0.0803 | - | - |
| 0.8965 | 2460 | 0.0838 | - | - |
| 0.9001 | 2470 | 0.0837 | - | - |
| 0.9038 | 2480 | 0.097 | - | - |
| 0.9074 | 2490 | 0.0673 | - | - |
| 0.9111 | 2500 | 0.0944 | - | - |
| 0.9147 | 2510 | 0.0858 | - | - |
| 0.9184 | 2520 | 0.0761 | - | - |
| 0.9220 | 2530 | 0.0868 | - | - |
| 0.9257 | 2540 | 0.0398 | - | - |
| 0.9293 | 2550 | 0.0494 | - | - |
| 0.9329 | 2560 | 0.123 | - | - |
| 0.9366 | 2570 | 0.0956 | - | - |
| 0.9402 | 2580 | 0.065 | - | - |
| 0.9439 | 2590 | 0.0662 | - | - |
| 0.9475 | 2600 | 0.0747 | 0.0882 | 0.9272 |
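Because `load_best_model_at_end` is enabled, the final weights come from one of the evaluated checkpoints rather than necessarily the last step. The card does not state which metric was used for that selection, so the sketch below simply checks the evaluation rows copied from the log above under two plausible criteria: lowest validation loss and highest reranker_ndcg@10.

```python
# (step, validation_loss, reranker_ndcg@10) for every evaluated step in the log.
eval_rows = [
    (200, 0.1133, 0.9034), (400, 0.1138, 0.8984), (600, 0.1206, 0.9025),
    (800, 0.0996, 0.9079), (1000, 0.0981, 0.9101), (1200, 0.0940, 0.9246),
    (1400, 0.1007, 0.9214), (1600, 0.0952, 0.9229), (1800, 0.0962, 0.9208),
    (2000, 0.0936, 0.9237), (2200, 0.0919, 0.9262), (2400, 0.0896, 0.9256),
    (2600, 0.0882, 0.9272),
]

# Pick the best checkpoint under each criterion.
best_by_loss = min(eval_rows, key=lambda row: row[1])
best_by_ndcg = max(eval_rows, key=lambda row: row[2])

print(best_by_loss[0], best_by_ndcg[0])
# 2600 2600
```

Both criteria point to step 2600, whose ndcg@10 of 0.9272 matches the value reported in the Metrics section.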
Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.1.2
- Transformers: 4.57.3
- PyTorch: 2.9.0+cu126
- Accelerate: 1.12.0
- Datasets: 4.4.1
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}