CrossEncoder based on BAAI/bge-reranker-v2-m3

This is a Cross Encoder model finetuned from BAAI/bge-reranker-v2-m3 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: BAAI/bge-reranker-v2-m3
  • Maximum Sequence Length: 1024 tokens
  • Number of Output Labels: 1 label

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ['Paloma and Monkey Gland, are what type of item?', "Monkey Gland. The Monkey Gland is a cocktail of gin, orange juice, grenadine and absinthe created in the 1920s by Harry MacElhone, owner of Harry's New York Bar in Paris, France."],
    ['Tess Asplund fought against the Neo-Nazi movement that existed in which countries?', 'Tess Asplund. Tess Asplund, born 1974, is a Swedish activist who gained attention following her protest against neo-Nazis in Borlänge, Sweden.  David Lagerlof is the photographer of the viral image of Asplund, which shows her facing uniformed members of the Swedish Nordic Resistance Movement with her fist in the air.  She is originally from Colombia and describes herself as Afro-Swedish.  About the incident, Asplund is quoted as having said “If this picture of me can get more people to dare to show resistance, then it’s all good...the people must unite and show that it is not okay that racism is becoming normalised and that fascists are running around on our streets.”'],
    ['Which magazine, Arizona Highways or Adventist World, is published by Herald Publishing Association?', 'My Worlds: The Collection. My Worlds: The Collection is the first compilation album released by Canadian recording artist Justin Bieber.  As the international alternative to the Walmart and Sam\'s Club exclusive "My Worlds Acoustic" (2010), "My Worlds: The Collection" was released in numerous European countries on November 19, 2010.  The album consists of two discs; the first is a slightly altered version of "My Worlds Acoustic", and the second is "My Worlds", a compilation itself made up of "My World" (2009) and "My World 2.0" (2010).  In addition, the album also features a new song, an inspirational ballad entitled "Pray", a Jaden Smith collaboration, "Never Say Never", and remixes of "Somebody to Love".  The new versions of the songs were produced by Bieber\'s music director, Dan Kanter, his vocal producer Kuk Harrell, and also producer Rob Wells.  While most reviewers complimented the set , several thought that its release was unneeded.  The album charted moderately in Europe, reaching the top half of several album charts.'],
    ['Who made the sculpture of the an American professional baseball player and manager regarded as one of the greatest players in baseball history?', 'Ted Williams. Theodore Samuel Williams (August 30, 1918 – July 5, 2002) was an American professional baseball player and manager.  He played his entire 19-year Major League Baseball (MLB) career as a left fielder for the Boston Red Sox from 1939 to 1960, only interrupted by service time during World War II and the Korean War.  Nicknamed "The Kid", "The Splendid Splinter", "Teddy Ballgame", "The Thumper", and "The Greatest Hitter Who Ever Lived", Williams is regarded as one of the greatest players in baseball history.  Williams was also an outstanding fielder, especially in the difficult left field of Fenway Park in Boston, where he played his entire Major League career at that position.'],
    ['Paul-Werner Krapke is notable for his management of a main battle tank developed by who?', 'Paul-Werner Krapke. Paul-Werner Krapke (born 1915) is a German armored fighting vehicle engineer, notable for his management of the Leopard 2 project.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Paloma and Monkey Gland, are what type of item?',
    [
        "Monkey Gland. The Monkey Gland is a cocktail of gin, orange juice, grenadine and absinthe created in the 1920s by Harry MacElhone, owner of Harry's New York Bar in Paris, France.",
        'Tess Asplund. Tess Asplund, born 1974, is a Swedish activist who gained attention following her protest against neo-Nazis in Borlänge, Sweden.  David Lagerlof is the photographer of the viral image of Asplund, which shows her facing uniformed members of the Swedish Nordic Resistance Movement with her fist in the air.  She is originally from Colombia and describes herself as Afro-Swedish.  About the incident, Asplund is quoted as having said “If this picture of me can get more people to dare to show resistance, then it’s all good...the people must unite and show that it is not okay that racism is becoming normalised and that fascists are running around on our streets.”',
        'My Worlds: The Collection. My Worlds: The Collection is the first compilation album released by Canadian recording artist Justin Bieber.  As the international alternative to the Walmart and Sam\'s Club exclusive "My Worlds Acoustic" (2010), "My Worlds: The Collection" was released in numerous European countries on November 19, 2010.  The album consists of two discs; the first is a slightly altered version of "My Worlds Acoustic", and the second is "My Worlds", a compilation itself made up of "My World" (2009) and "My World 2.0" (2010).  In addition, the album also features a new song, an inspirational ballad entitled "Pray", a Jaden Smith collaboration, "Never Say Never", and remixes of "Somebody to Love".  The new versions of the songs were produced by Bieber\'s music director, Dan Kanter, his vocal producer Kuk Harrell, and also producer Rob Wells.  While most reviewers complimented the set , several thought that its release was unneeded.  The album charted moderately in Europe, reaching the top half of several album charts.',
        'Ted Williams. Theodore Samuel Williams (August 30, 1918 – July 5, 2002) was an American professional baseball player and manager.  He played his entire 19-year Major League Baseball (MLB) career as a left fielder for the Boston Red Sox from 1939 to 1960, only interrupted by service time during World War II and the Korean War.  Nicknamed "The Kid", "The Splendid Splinter", "Teddy Ballgame", "The Thumper", and "The Greatest Hitter Who Ever Lived", Williams is regarded as one of the greatest players in baseball history.  Williams was also an outstanding fielder, especially in the difficult left field of Fenway Park in Boston, where he played his entire Major League career at that position.',
        'Paul-Werner Krapke. Paul-Werner Krapke (born 1915) is a German armored fighting vehicle engineer, notable for his management of the Leopard 2 project.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
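The list returned by rank pairs each document's position in the input list (corpus_id) with its score, ordered from most to least relevant. The ordering step can be sketched in plain Python as below. The scores are made-up illustrative values, not real model output, and rank_from_scores is a hypothetical helper, not part of the library.

```python
def rank_from_scores(scores, top_k=None):
    """Order documents by descending score; scores[i] belongs to document i."""
    ranked = sorted(
        ({"corpus_id": i, "score": float(s)} for i, s in enumerate(scores)),
        key=lambda item: item["score"],
        reverse=True,
    )
    # Optionally keep only the best top_k entries.
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical scores for five documents (illustrative values only).
example_scores = [0.98, 0.02, 0.01, 0.03, 0.05]
example_ranking = rank_from_scores(example_scores, top_k=3)
print(example_ranking)
```

In a reranking pipeline, the corpus_id values would typically be used to index back into the candidate list produced by a first-stage retriever.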

Evaluation

Metrics

Cross Encoder Binary Classification

Metric               validation   train_subset
accuracy             0.9933       0.976
accuracy_threshold   0.162        0.5996
f1                   0.9933       0.9771
f1_threshold         0.162        0.5996
precision            1.0          0.9734
recall               0.9867       0.9808
average_precision    0.9992       0.9929
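The accuracy_threshold and f1_threshold rows are score cut-offs: a pair scoring at or above the cut-off is predicted as label 1, otherwise 0. A minimal sketch of how such a cut-off turns scores into the reported metrics follows. The toy scores and labels are invented for illustration, binary_metrics is a hypothetical helper, and only the two threshold values are taken from the table.

```python
def binary_metrics(scores, labels, threshold):
    """Accuracy, precision, recall and F1 for scores cut at `threshold`."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented toy data; not drawn from the evaluation sets.
toy_scores = [0.91, 0.40, 0.17, 0.08, 0.75, 0.12]
toy_labels = [1, 1, 1, 0, 1, 0]

metrics_low = binary_metrics(toy_scores, toy_labels, threshold=0.162)
metrics_high = binary_metrics(toy_scores, toy_labels, threshold=0.5996)
print(metrics_low)
print(metrics_high)
```

On this toy data the lower cut-off keeps every positive, while the higher one trades recall for a stricter decision, which is why the best threshold can differ between splits.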

Training Details

Training Dataset

Unnamed Dataset

  • Size: 8,000 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
               sentence_0                sentence_1                label
    type       string                    string                    float
    details    min: 35 characters        min: 74 characters        min: 0.0
               mean: 103.16 characters   mean: 550.07 characters   mean: 0.51
               max: 498 characters       max: 2501 characters      max: 1.0
  • Samples:
    sentence_0 sentence_1 label
    Paloma and Monkey Gland, are what type of item? Monkey Gland. The Monkey Gland is a cocktail of gin, orange juice, grenadine and absinthe created in the 1920s by Harry MacElhone, owner of Harry's New York Bar in Paris, France. 1.0
    Tess Asplund fought against the Neo-Nazi movement that existed in which countries? Tess Asplund. Tess Asplund, born 1974, is a Swedish activist who gained attention following her protest against neo-Nazis in Borlänge, Sweden. David Lagerlof is the photographer of the viral image of Asplund, which shows her facing uniformed members of the Swedish Nordic Resistance Movement with her fist in the air. She is originally from Colombia and describes herself as Afro-Swedish. About the incident, Asplund is quoted as having said “If this picture of me can get more people to dare to show resistance, then it’s all good...the people must unite and show that it is not okay that racism is becoming normalised and that fascists are running around on our streets.” 1.0
    Which magazine, Arizona Highways or Adventist World, is published by Herald Publishing Association? My Worlds: The Collection. My Worlds: The Collection is the first compilation album released by Canadian recording artist Justin Bieber. As the international alternative to the Walmart and Sam's Club exclusive "My Worlds Acoustic" (2010), "My Worlds: The Collection" was released in numerous European countries on November 19, 2010. The album consists of two discs; the first is a slightly altered version of "My Worlds Acoustic", and the second is "My Worlds", a compilation itself made up of "My World" (2009) and "My World 2.0" (2010). In addition, the album also features a new song, an inspirational ballad entitled "Pray", a Jaden Smith collaboration, "Never Say Never", and remixes of "Somebody to Love". The new versions of the songs were produced by Bieber's music director, Dan Kanter, his vocal producer Kuk Harrell, and also producer Rob Wells. While most reviewers complimented the set , several thought that its release was unneeded. The album charted moderately in Europe, reachi... 0.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
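With an Identity activation and no pos_weight, the loss is computed on the model's raw logit rather than on a probability, which corresponds to unweighted binary cross-entropy on logits. A small sketch of that formula in plain Python is given below. It is an illustration of the math only, not the library's implementation, and bce_with_logits is a hypothetical name.

```python
import math


def bce_with_logits(logits, labels):
    """Mean binary cross-entropy computed directly from raw logits.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)),
    which equals -[y*log(sigmoid(x)) + (1 - y)*log(1 - sigmoid(x))].
    """
    total = 0.0
    for x, y in zip(logits, labels):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# Invented logits and labels for illustration.
toy_loss = bce_with_logits([2.0, -1.0], [1.0, 0.0])
print(round(toy_loss, 4))
```

Labels are floats (the label column has type float), so the same formula also accepts soft targets between 0.0 and 1.0.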
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch   Step   Training Loss   validation_average_precision   train_subset_average_precision
0.125   250    -               0.9988                         0.9896
0.25    500    0.2247          0.9989                         0.9896
0.375   750    -               0.9988                         0.9918
0.5     1000   0.2255          0.9992                         0.9929
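The Epoch and Step columns can be cross-checked against the dataset size and batch size reported earlier. The arithmetic below is a sketch that assumes the 8,000-sample figure and the logged epoch fractions are both accurate. The conclusion about the number of devices is an inference from those numbers, not something the card states.

```python
# (epoch, step) pairs copied from the training log.
logged = [(0.125, 250), (0.25, 500), (0.375, 750), (0.5, 1000)]

dataset_size = 8000        # from "Size: 8,000 training samples"
per_device_batch_size = 2  # from per_device_train_batch_size
grad_accum_steps = 1       # from gradient_accumulation_steps

# Every log row should imply the same number of optimizer steps per epoch.
implied = {round(step / epoch) for epoch, step in logged}
assert len(implied) == 1, "log rows disagree on steps per epoch"
steps_per_epoch = implied.pop()

samples_per_step = dataset_size / steps_per_epoch
# Inference only: samples per step divided by the per-device batch size.
implied_devices = samples_per_step / (per_device_batch_size * grad_accum_steps)

print(steps_per_epoch, samples_per_step, implied_devices)
```

Under these assumptions each optimizer step consumes four samples, twice the per-device batch size. One possible reading is that training ran on two devices, but other explanations (for example, a different effective dataset size) cannot be ruled out from the card alone.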

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.2.2
  • Transformers: 4.44.2
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}