CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2

This is a Cross Encoder model fine-tuned from cross-encoder/ms-marco-MiniLM-L6-v2 using the sentence-transformers library. It computes a relevance score for each pair of texts, and these scores can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub (repository ID as shown on this model's Hub page)
model = CrossEncoder("svk2118/reranker-22m")
# Get scores for pairs of texts
pairs = [
    ['what is the average payment volume per transaction for american express?', '(table): company the american express of payments volume ( billions ) is 637 ; the american express of total volume ( billions ) is 647 ; the american express of total transactions ( billions ) is 5.0 ; the american express of cards ( millions ) is 86 ;'],
    ['what is the average payment volume per transaction for american express?', '(text): largest operators of open-loop and closed-loop retail electronic payments networks the largest operators of open-loop and closed-loop retail electronic payments networks are visa , mastercard , american express , discover , jcb and diners club .'],
    ['what is the average payment volume per transaction for american express?', '(text): with the exception of discover , which primarily operates in the united states , all of the other network operators can be considered multi- national or global providers of payments network services .'],
    ['what is the average payment volume per transaction for american express?', '(text): based on payments volume , total volume , number of transactions and number of cards in circulation , visa is the largest retail electronic payments network in the world .'],
    ['what is the average payment volume per transaction for american express?', '(text): the following chart compares our network with those of our major competitors for calendar year 2007 : company payments volume volume transactions cards ( billions ) ( billions ) ( billions ) ( millions ) visa inc. ( 1 ) .'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'what is the average payment volume per transaction for american express?',
    [
        '(table): company the american express of payments volume ( billions ) is 637 ; the american express of total volume ( billions ) is 647 ; the american express of total transactions ( billions ) is 5.0 ; the american express of cards ( millions ) is 86 ;',
        '(text): largest operators of open-loop and closed-loop retail electronic payments networks the largest operators of open-loop and closed-loop retail electronic payments networks are visa , mastercard , american express , discover , jcb and diners club .',
        '(text): with the exception of discover , which primarily operates in the united states , all of the other network operators can be considered multi- national or global providers of payments network services .',
        '(text): based on payments volume , total volume , number of transactions and number of cards in circulation , visa is the largest retail electronic payments network in the world .',
        '(text): the following chart compares our network with those of our major competitors for calendar year 2007 : company payments volume volume transactions cards ( billions ) ( billions ) ( billions ) ( millions ) visa inc. ( 1 ) .',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
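
The list returned by rank pairs each passage index (corpus_id) with its score, ordered from most to least relevant. As a minimal sketch, the helper below rebuilds that ordering from a plain list of scores such as the one predict returns. The name rank_from_scores and the score values are illustrative assumptions, not part of the library and not output from this model.

```python
from typing import Dict, List, Optional, Sequence, Union


def rank_from_scores(
    scores: Sequence[float], top_k: Optional[int] = None
) -> List[Dict[str, Union[int, float]]]:
    """Order passage indices by descending score.

    Returns the same [{'corpus_id': ..., 'score': ...}] shape shown above.
    """
    ranked = sorted(
        ({"corpus_id": i, "score": float(s)} for i, s in enumerate(scores)),
        key=lambda item: item["score"],
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]


# Placeholder scores for five passages (hypothetical values, NOT model output).
example_scores = [0.91, 0.12, 0.05, 0.30, 0.44]
top2 = rank_from_scores(example_scores, top_k=2)
print(top2)
```

In a retrieve-then-rerank setup, the corpus_id values index back into the candidate list that was passed in, so the top-k passages can be looked up directly.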

Evaluation

Metrics

Cross Encoder Reranking

Metric Value
map 0.8939
mrr@10 0.9405
ndcg@10 0.9272
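
For readers unfamiliar with these metrics, the sketch below computes average precision, reciprocal rank at 10, and NDCG at 10 for a single query from binary relevance labels listed in ranked order. MAP and MRR@10 are then the means of the per-query values. These are the textbook definitions; the values in the table above come from the library's evaluator, whose details (for example tie handling) may differ. The function names are illustrative.

```python
import math
from typing import List


def average_precision(labels: List[int]) -> float:
    """AP for one query; `labels` are 0/1 relevance flags in ranked order."""
    hits, precision_sum = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / hits if hits else 0.0


def reciprocal_rank_at_k(labels: List[int], k: int = 10) -> float:
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, label in enumerate(labels[:k], start=1):
        if label:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(labels: List[int], k: int = 10) -> float:
    """DCG of the given order divided by the DCG of the ideal order."""
    def dcg(flags: List[int]) -> float:
        return sum(f / math.log2(r + 1) for r, f in enumerate(flags[:k], start=1))

    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal else 0.0


# Toy ranking: relevant passages at positions 2 and 4.
ranked_labels = [0, 1, 0, 1]
ap = average_precision(ranked_labels)
rr = reciprocal_rank_at_k(ranked_labels)
ndcg = ndcg_at_k(ranked_labels)
print(ap, rr, round(ndcg, 4))
```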

Training Details

Training Dataset

Unnamed Dataset

  • Size: 175,555 training samples
  • Columns: query, passage, and label
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 41 characters, mean 89.26 characters, max 186 characters
    • passage (string): min 11 characters, mean 182.61 characters, max 1853 characters
    • label (float): min 0.0, mean 0.07, max 1.0
  • Samples:
    • query: what is the the interest expense in 2009?
      passage: (text): if libor changes by 100 basis points , our annual interest expense would change by $ 3.8 million .
      label: 1.0
    • query: what is the the interest expense in 2009?
      passage: (text): interest rate to a variable interest rate based on the three-month libor plus 2.05% ( 2.05 % ) ( 2.34% ( 2.34 % ) as of october 31 , 2009 ) .
      label: 0.0
    • query: what is the the interest expense in 2009?
      passage: (text): foreign currency exposure as more fully described in note 2i .
      label: 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
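
The Identity activation means the loss is computed on the model's raw logit, and a null pos_weight means positive and negative pairs are weighted equally, even though only about 7% of the sampled labels are positive. As a minimal sketch of what binary cross-entropy on a logit computes, the pure-Python function below uses a numerically stable form. The function name is illustrative, and pos_weight=1.0 stands in for the null setting.

```python
import math


def bce_with_logits(logit: float, label: float, pos_weight: float = 1.0) -> float:
    """Binary cross-entropy on a raw logit z with target y.

    loss = -[pos_weight * y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))]
    """

    def softplus(x: float) -> float:
        # Stable log(1 + exp(x)); avoids overflow for large |x|.
        return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

    # -log(sigmoid(z)) == softplus(-z) and -log(1 - sigmoid(z)) == softplus(z)
    return pos_weight * label * softplus(-logit) + (1.0 - label) * softplus(logit)


# An uninformative logit of 0 costs log(2) for either label.
loss_pos = bce_with_logits(0.0, 1.0)
loss_neg = bce_with_logits(0.0, 0.0)
# Up-weighting positives (hypothetical value) scales only the positive term.
loss_pos_weighted = bce_with_logits(0.0, 1.0, pos_weight=3.0)
print(loss_pos, loss_neg, loss_pos_weighted)
```

Setting pos_weight above 1 is one common way to counter label imbalance; this card reports that no such weighting was used.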
    

Evaluation Dataset

Unnamed Dataset

  • Size: 25,007 evaluation samples
  • Columns: query, passage, and label
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 52 characters, mean 86.04 characters, max 137 characters
    • passage (string): min 11 characters, mean 166.61 characters, max 717 characters
    • label (float): min 0.0, mean 0.06, max 1.0
  • Samples:
    • query: what is the average payment volume per transaction for american express?
      passage: (table): company the american express of payments volume ( billions ) is 637 ; the american express of total volume ( billions ) is 647 ; the american express of total transactions ( billions ) is 5.0 ; the american express of cards ( millions ) is 86 ;
      label: 1.0
    • query: what is the average payment volume per transaction for american express?
      passage: (text): largest operators of open-loop and closed-loop retail electronic payments networks the largest operators of open-loop and closed-loop retail electronic payments networks are visa , mastercard , american express , discover , jcb and diners club .
      label: 0.0
    • query: what is the average payment volume per transaction for american express?
      passage: (text): with the exception of discover , which primarily operates in the united states , all of the other network operators can be considered multi- national or global providers of payments network services .
      label: 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 0.0001
  • weight_decay: 0.01
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
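
To make these settings concrete, the sketch below derives the number of optimizer steps and the learning rate at a given step under a linear schedule with warmup (lr_scheduler_type is linear in the full list below). It assumes a single training device, which the card does not state, and rounds step counts up. Under that assumption one epoch is 2744 steps, which agrees with the training logs, where step 10 is reported at epoch 0.0036. The function name linear_lr is illustrative.

```python
import math

TRAIN_SAMPLES = 175_555  # training set size reported in this card
BATCH_SIZE = 64          # per_device_train_batch_size
EPOCHS = 1               # num_train_epochs
WARMUP_RATIO = 0.1       # warmup_ratio
PEAK_LR = 1e-4           # learning_rate
NUM_DEVICES = 1          # ASSUMPTION: single device, not stated in the card

steps_per_epoch = math.ceil(TRAIN_SAMPLES / (BATCH_SIZE * NUM_DEVICES))
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup to PEAK_LR, then linear decay to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, warmup_steps)
print(linear_lr(warmup_steps), linear_lr(total_steps))
```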

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0001
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss reranker_ndcg@10
0.0036 10 0.3268 - -
0.0073 20 0.247 - -
0.0109 30 0.2451 - -
0.0146 40 0.2029 - -
0.0182 50 0.1739 - -
0.0219 60 0.172 - -
0.0255 70 0.1425 - -
0.0292 80 0.138 - -
0.0328 90 0.1304 - -
0.0364 100 0.1561 - -
0.0401 110 0.1627 - -
0.0437 120 0.1974 - -
0.0474 130 0.1339 - -
0.0510 140 0.1137 - -
0.0547 150 0.1333 - -
0.0583 160 0.1296 - -
0.0620 170 0.1723 - -
0.0656 180 0.1099 - -
0.0692 190 0.1105 - -
0.0729 200 0.0917 0.1133 0.9034
0.0765 210 0.1012 - -
0.0802 220 0.1296 - -
0.0838 230 0.1332 - -
0.0875 240 0.095 - -
0.0911 250 0.1351 - -
0.0948 260 0.1138 - -
0.0984 270 0.1318 - -
0.1020 280 0.1164 - -
0.1057 290 0.1418 - -
0.1093 300 0.1337 - -
0.1130 310 0.1169 - -
0.1166 320 0.1314 - -
0.1203 330 0.1197 - -
0.1239 340 0.1002 - -
0.1276 350 0.1124 - -
0.1312 360 0.0932 - -
0.1348 370 0.1629 - -
0.1385 380 0.1501 - -
0.1421 390 0.1097 - -
0.1458 400 0.0756 0.1138 0.8984
0.1494 410 0.1174 - -
0.1531 420 0.1472 - -
0.1567 430 0.1391 - -
0.1603 440 0.1188 - -
0.1640 450 0.1555 - -
0.1676 460 0.1148 - -
0.1713 470 0.0753 - -
0.1749 480 0.104 - -
0.1786 490 0.1313 - -
0.1822 500 0.1125 - -
0.1859 510 0.0772 - -
0.1895 520 0.1045 - -
0.1931 530 0.1101 - -
0.1968 540 0.109 - -
0.2004 550 0.124 - -
0.2041 560 0.0934 - -
0.2077 570 0.1305 - -
0.2114 580 0.1163 - -
0.2150 590 0.1004 - -
0.2187 600 0.0917 0.1206 0.9025
0.2223 610 0.0942 - -
0.2259 620 0.1223 - -
0.2296 630 0.1156 - -
0.2332 640 0.0924 - -
0.2369 650 0.1372 - -
0.2405 660 0.0984 - -
0.2442 670 0.0876 - -
0.2478 680 0.0926 - -
0.2515 690 0.0819 - -
0.2551 700 0.1034 - -
0.2587 710 0.1022 - -
0.2624 720 0.0661 - -
0.2660 730 0.124 - -
0.2697 740 0.1231 - -
0.2733 750 0.1307 - -
0.2770 760 0.0973 - -
0.2806 770 0.0721 - -
0.2843 780 0.0734 - -
0.2879 790 0.0806 - -
0.2915 800 0.0824 0.0996 0.9079
0.2952 810 0.1037 - -
0.2988 820 0.0771 - -
0.3025 830 0.1407 - -
0.3061 840 0.1196 - -
0.3098 850 0.1087 - -
0.3134 860 0.0737 - -
0.3171 870 0.0986 - -
0.3207 880 0.1042 - -
0.3243 890 0.0971 - -
0.3280 900 0.0824 - -
0.3316 910 0.0842 - -
0.3353 920 0.1361 - -
0.3389 930 0.086 - -
0.3426 940 0.0861 - -
0.3462 950 0.1039 - -
0.3499 960 0.1085 - -
0.3535 970 0.1316 - -
0.3571 980 0.0806 - -
0.3608 990 0.0873 - -
0.3644 1000 0.0952 0.0981 0.9101
0.3681 1010 0.1194 - -
0.3717 1020 0.1114 - -
0.3754 1030 0.122 - -
0.3790 1040 0.094 - -
0.3827 1050 0.0971 - -
0.3863 1060 0.1285 - -
0.3899 1070 0.103 - -
0.3936 1080 0.1065 - -
0.3972 1090 0.0885 - -
0.4009 1100 0.1022 - -
0.4045 1110 0.1129 - -
0.4082 1120 0.1229 - -
0.4118 1130 0.0999 - -
0.4155 1140 0.0879 - -
0.4191 1150 0.0763 - -
0.4227 1160 0.0852 - -
0.4264 1170 0.0914 - -
0.4300 1180 0.1004 - -
0.4337 1190 0.1143 - -
0.4373 1200 0.1364 0.0940 0.9246
0.4410 1210 0.1017 - -
0.4446 1220 0.09 - -
0.4483 1230 0.0687 - -
0.4519 1240 0.0733 - -
0.4555 1250 0.1049 - -
0.4592 1260 0.0918 - -
0.4628 1270 0.0848 - -
0.4665 1280 0.0736 - -
0.4701 1290 0.1129 - -
0.4738 1300 0.0713 - -
0.4774 1310 0.0876 - -
0.4810 1320 0.0866 - -
0.4847 1330 0.1016 - -
0.4883 1340 0.1061 - -
0.4920 1350 0.0791 - -
0.4956 1360 0.0938 - -
0.4993 1370 0.1235 - -
0.5029 1380 0.0693 - -
0.5066 1390 0.065 - -
0.5102 1400 0.0839 0.1007 0.9214
0.5138 1410 0.0914 - -
0.5175 1420 0.0786 - -
0.5211 1430 0.0916 - -
0.5248 1440 0.0606 - -
0.5284 1450 0.1417 - -
0.5321 1460 0.0856 - -
0.5357 1470 0.0865 - -
0.5394 1480 0.0917 - -
0.5430 1490 0.0774 - -
0.5466 1500 0.0951 - -
0.5503 1510 0.074 - -
0.5539 1520 0.0797 - -
0.5576 1530 0.0817 - -
0.5612 1540 0.1137 - -
0.5649 1550 0.1139 - -
0.5685 1560 0.0889 - -
0.5722 1570 0.1075 - -
0.5758 1580 0.1021 - -
0.5794 1590 0.1115 - -
0.5831 1600 0.1047 0.0952 0.9229
0.5867 1610 0.1056 - -
0.5904 1620 0.116 - -
0.5940 1630 0.0989 - -
0.5977 1640 0.1102 - -
0.6013 1650 0.1006 - -
0.6050 1660 0.0956 - -
0.6086 1670 0.1003 - -
0.6122 1680 0.0984 - -
0.6159 1690 0.0734 - -
0.6195 1700 0.079 - -
0.6232 1710 0.0872 - -
0.6268 1720 0.1077 - -
0.6305 1730 0.0833 - -
0.6341 1740 0.0984 - -
0.6378 1750 0.0727 - -
0.6414 1760 0.1062 - -
0.6450 1770 0.1013 - -
0.6487 1780 0.0892 - -
0.6523 1790 0.0765 - -
0.6560 1800 0.0698 0.0962 0.9208
0.6596 1810 0.0658 - -
0.6633 1820 0.1386 - -
0.6669 1830 0.1094 - -
0.6706 1840 0.103 - -
0.6742 1850 0.1075 - -
0.6778 1860 0.091 - -
0.6815 1870 0.106 - -
0.6851 1880 0.0753 - -
0.6888 1890 0.0685 - -
0.6924 1900 0.1045 - -
0.6961 1910 0.087 - -
0.6997 1920 0.0866 - -
0.7034 1930 0.1253 - -
0.7070 1940 0.0915 - -
0.7106 1950 0.061 - -
0.7143 1960 0.0744 - -
0.7179 1970 0.0643 - -
0.7216 1980 0.0571 - -
0.7252 1990 0.1004 - -
0.7289 2000 0.1075 0.0936 0.9237
0.7325 2010 0.0637 - -
0.7362 2020 0.1167 - -
0.7398 2030 0.1113 - -
0.7434 2040 0.1314 - -
0.7471 2050 0.0764 - -
0.7507 2060 0.1297 - -
0.7544 2070 0.0841 - -
0.7580 2080 0.0967 - -
0.7617 2090 0.0916 - -
0.7653 2100 0.1196 - -
0.7690 2110 0.1072 - -
0.7726 2120 0.0974 - -
0.7762 2130 0.0772 - -
0.7799 2140 0.1147 - -
0.7835 2150 0.1003 - -
0.7872 2160 0.0944 - -
0.7908 2170 0.0886 - -
0.7945 2180 0.062 - -
0.7981 2190 0.0817 - -
0.8017 2200 0.1096 0.0919 0.9262
0.8054 2210 0.0821 - -
0.8090 2220 0.0866 - -
0.8127 2230 0.0824 - -
0.8163 2240 0.108 - -
0.8200 2250 0.0746 - -
0.8236 2260 0.0708 - -
0.8273 2270 0.0898 - -
0.8309 2280 0.0876 - -
0.8345 2290 0.0898 - -
0.8382 2300 0.0935 - -
0.8418 2310 0.0655 - -
0.8455 2320 0.106 - -
0.8491 2330 0.0806 - -
0.8528 2340 0.091 - -
0.8564 2350 0.0575 - -
0.8601 2360 0.059 - -
0.8637 2370 0.0889 - -
0.8673 2380 0.0955 - -
0.8710 2390 0.0841 - -
0.8746 2400 0.0759 0.0896 0.9256
0.8783 2410 0.0558 - -
0.8819 2420 0.0921 - -
0.8856 2430 0.0865 - -
0.8892 2440 0.0787 - -
0.8929 2450 0.0803 - -
0.8965 2460 0.0838 - -
0.9001 2470 0.0837 - -
0.9038 2480 0.097 - -
0.9074 2490 0.0673 - -
0.9111 2500 0.0944 - -
0.9147 2510 0.0858 - -
0.9184 2520 0.0761 - -
0.9220 2530 0.0868 - -
0.9257 2540 0.0398 - -
0.9293 2550 0.0494 - -
0.9329 2560 0.123 - -
0.9366 2570 0.0956 - -
0.9402 2580 0.065 - -
0.9439 2590 0.0662 - -
0.9475 2600 0.0747 0.0882 0.9272
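
Validation ran every 200 steps, and load_best_model_at_end is True, so the checkpoint kept at the end is the best evaluated one. The card does not say which metric decided "best", so the sketch below parses rows in the format of the table above and reports the best evaluated step by validation loss and by reranker_ndcg@10. The sample rows are copied from the table; the names LogRow and parse_row are illustrative.

```python
from typing import NamedTuple, Optional


class LogRow(NamedTuple):
    epoch: float
    step: int
    train_loss: float
    val_loss: Optional[float]
    ndcg_at_10: Optional[float]


def _maybe_float(value: str) -> Optional[float]:
    """Rows without an evaluation show '-' in the last two columns."""
    return None if value == "-" else float(value)


def parse_row(line: str) -> LogRow:
    epoch, step, train, val, ndcg = line.split()
    return LogRow(float(epoch), int(step), float(train),
                  _maybe_float(val), _maybe_float(ndcg))


# A few rows copied verbatim from the training log table.
rows = [parse_row(line) for line in [
    "0.0729 200 0.0917 0.1133 0.9034",
    "0.0765 210 0.1012 - -",
    "0.4373 1200 0.1364 0.0940 0.9246",
    "0.8746 2400 0.0759 0.0896 0.9256",
    "0.9475 2600 0.0747 0.0882 0.9272",
]]

evaluated = [r for r in rows if r.val_loss is not None]
best_by_loss = min(evaluated, key=lambda r: r.val_loss)
best_by_ndcg = max(evaluated, key=lambda r: r.ndcg_at_10)
print(best_by_loss.step, best_by_ndcg.step)
```

In the full table, step 2600 has both the lowest validation loss (0.0882) and the highest reranker_ndcg@10 (0.9272), which matches the ndcg@10 value reported under Metrics.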

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.3
  • PyTorch: 2.9.0+cu126
  • Accelerate: 1.12.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Model size (svk2118/reranker-22m): 22.7M params, stored as Safetensors with tensor type F32
