---
tags:
  - sentence-transformers
  - cross-encoder
  - reranker
  - generated_from_trainer
  - dataset_size:1600
  - loss:BinaryCrossEntropyLoss
base_model: cross-encoder/ms-marco-MiniLM-L6-v2
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
  - pearson
  - spearman
model-index:
  - name: CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
    results:
      - task:
          type: cross-encoder-correlation
          name: Cross Encoder Correlation
        dataset:
          name: val
          type: val
        metrics:
          - type: pearson
            value: 0.9929011064605967
            name: Pearson
          - type: spearman
            value: 0.9384365513352464
            name: Spearman
---

# CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2

This is a Cross Encoder model finetuned from cross-encoder/ms-marco-MiniLM-L6-v2 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description

### Model Sources

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ['How to implement error handling', 'Implementation guide for error handling. Step-by-step instructions with code examples. Covers setup, configuration, and best practices.'],
    ['Where is UserModel defined', 'Source code location and module structure. Class definitions and interface documentation. File paths and import statements.'],
    ['Add metrics collection to scheduler', 'Feature implementation guide with API extensions. Configuration options and customization points. Testing requirements.'],
    ['Fix bug in data processor', 'Company holiday schedule and PTO policy. HR contact information.'],
    ['Refactor authentication middleware for better readability', 'Refactoring patterns and code improvement strategies. Before/after examples with measurable improvements. Migration guide.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How to implement error handling',
    [
        'Implementation guide for error handling. Step-by-step instructions with code examples. Covers setup, configuration, and best practices.',
        'Source code location and module structure. Class definitions and interface documentation. File paths and import statements.',
        'Feature implementation guide with API extensions. Configuration options and customization points. Testing requirements.',
        'Company holiday schedule and PTO policy. HR contact information.',
        'Refactoring patterns and code improvement strategies. Before/after examples with measurable improvements. Migration guide.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
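`model.rank` simply orders the candidates by their predicted scores. The same reranking step can be sketched in plain Python from the `predict` output (the scores and document names below are illustrative placeholders, not real model output):

```python
# Illustrative scores, as model.predict(pairs) might return them
# (raw model outputs; higher means more relevant to the query).
scores = [2.1, -0.3, 1.4, -2.8, 0.9]
documents = [
    "error handling guide",
    "source code layout",
    "metrics feature guide",
    "HR holiday policy",
    "refactoring patterns",
]

# Pair each candidate with its score and sort descending,
# mirroring the ordered corpus_id/score dicts that CrossEncoder.rank returns.
ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
print(ranked[0])  # the top-ranked document and its score
```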

## Evaluation

### Metrics

#### Cross Encoder Correlation

| Metric   | Value  |
|:---------|:-------|
| pearson  | 0.9929 |
| spearman | 0.9384 |
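Pearson here measures the linear correlation between the model's predicted scores and the gold labels on the val split; Spearman measures the same agreement on ranks. As a minimal, self-contained illustration of the Pearson metric itself (the predictions and labels below are made up, not the val data):

```python
import math

def pearson(xs, ys):
    # Sample Pearson correlation: covariance divided by the
    # product of the standard deviations of the two series.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy predicted scores vs. binary gold labels (illustrative only).
preds = [0.95, 0.10, 0.88, 0.05, 0.70]
labels = [1.0, 0.0, 1.0, 0.0, 1.0]
print(pearson(preds, labels))  # close to 1.0 when scores track labels
```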

## Training Details

### Training Dataset

#### Unnamed Dataset

  • Size: 1,600 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:

    |         | sentence_0 | sentence_1 | label |
    |:--------|:-----------|:-----------|:------|
    | type    | string     | string     | float |
    | details | min: 20 characters<br>mean: 32.95 characters<br>max: 61 characters | min: 48 characters<br>mean: 110.73 characters<br>max: 158 characters | min: 0.0<br>mean: 0.56<br>max: 1.0 |

  • Samples:

    | sentence_0 | sentence_1 | label |
    |:-----------|:-----------|:------|
    | How to implement error handling | Implementation guide for error handling. Step-by-step instructions with code examples. Covers setup, configuration, and best practices. | 1.0 |
    | Where is UserModel defined | Source code location and module structure. Class definitions and interface documentation. File paths and import statements. | 1.0 |
    | Add metrics collection to scheduler | Feature implementation guide with API extensions. Configuration options and customization points. Testing requirements. | 1.0 |
  • Loss: BinaryCrossEntropyLoss with these parameters:

    ```json
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
    ```
    

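With `activation_fn` set to `Identity`, the model's raw logit is passed straight into the binary cross-entropy loss (the computation performed by `torch.nn.BCEWithLogitsLoss`). A small pure-Python sketch of the per-example term:

```python
import math

def bce_with_logits(logit, label, pos_weight=1.0):
    # Sigmoid squashes the raw logit to a probability, then standard
    # binary cross-entropy is applied; pos_weight scales the positive term
    # ("pos_weight": null in the config means the default weight of 1.0).
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(pos_weight * label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

# A logit of 0.0 maps to probability 0.5, giving a loss of ln(2) for either label.
```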
### Training Hyperparameters

#### Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • eval_strategy: steps
  • per_device_eval_batch_size: 16
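The non-default values above map onto the training arguments consumed by the sentence-transformers trainer. A sketch of how this configuration might be reconstructed (the output directory is a placeholder; everything not listed keeps its default):

```python
from sentence_transformers.cross_encoder import CrossEncoderTrainingArguments

# Reproduces the non-default settings from this card; output_dir is illustrative.
args = CrossEncoderTrainingArguments(
    output_dir="models/reranker",          # placeholder path
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    eval_strategy="steps",
)
```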

#### All Hyperparameters

  • per_device_train_batch_size: 16
  • num_train_epochs: 3
  • max_steps: -1
  • learning_rate: 5e-05
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_steps: 0
  • optim: adamw_torch_fused
  • optim_args: None
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • optim_target_modules: None
  • gradient_accumulation_steps: 1
  • average_tokens_across_devices: True
  • max_grad_norm: 1
  • label_smoothing_factor: 0.0
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • use_liger_kernel: False
  • liger_kernel_config: None
  • use_cache: False
  • neftune_noise_alpha: None
  • torch_empty_cache_steps: None
  • auto_find_batch_size: False
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • include_num_input_tokens_seen: no
  • log_level: passive
  • log_level_replica: warning
  • disable_tqdm: False
  • project: huggingface
  • trackio_space_id: trackio
  • eval_strategy: steps
  • per_device_eval_batch_size: 16
  • prediction_loss_only: True
  • eval_on_start: False
  • eval_do_concat_batches: True
  • eval_use_gather_object: False
  • eval_accumulation_steps: None
  • include_for_metrics: []
  • batch_eval_metrics: False
  • save_only_model: False
  • save_on_each_node: False
  • enable_jit_checkpoint: False
  • push_to_hub: False
  • hub_private_repo: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_always_push: False
  • hub_revision: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • restore_callback_states_from_checkpoint: False
  • full_determinism: False
  • seed: 42
  • data_seed: None
  • use_cpu: False
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • dataloader_prefetch_factor: None
  • remove_unused_columns: True
  • label_names: None
  • train_sampling_strategy: random
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • ddp_backend: None
  • ddp_timeout: 1800
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • deepspeed: None
  • debug: []
  • skip_memory_metrics: True
  • do_predict: False
  • resume_from_checkpoint: None
  • warmup_ratio: None
  • local_rank: -1
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

### Training Logs

| Epoch | Step | val_spearman |
|:-----:|:----:|:------------:|
| 1.0   | 100  | 0.9382       |
| 2.0   | 200  | 0.9384       |
| 3.0   | 300  | 0.9384       |

### Framework Versions

  • Python: 3.13.5
  • Sentence Transformers: 5.3.0
  • Transformers: 5.5.0
  • PyTorch: 2.11.0+cu130
  • Accelerate: 1.13.0
  • Datasets: 4.8.4
  • Tokenizers: 0.22.2

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```