# CrossEncoder
This is a Cross Encoder model trained using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
## Model Details

### Model Description
- Model Type: Cross Encoder
- Maximum Sequence Length: 8192 tokens
- Number of Output Labels: 1 label
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder

# Download from the Hugging Face Hub (replace with the actual model ID)
model = CrossEncoder("cross_encoder_model_id")

# Score pairs of texts
pairs = [
    ['Hexacarboxylporphyrin/Creatinine [Molar ratio] in 24 hour Urine', '[Molar ratio] in Hexacarboxylporphyrin/Creatinine 24 hour Urn'],
    ['HLA-A2 Ql (Bld/Tiss donor)', 'HLA-A11 donor) (Bld/Tiss Ql'],
    ['Urea nitrogen [Mass/volume] in Urine', 'POC Urine Urea nitrogen Measurement'],
    ['Cauliflower IgG (S) [Mass/Vol]', 'POC Cauliflower Immune globulin (S) Radioallergosorbent'],
    ['Mannose-binding protein [Mass/volume] in Serum', 'MBP Level Serum Quantitative'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on their similarity to a single text
ranks = model.rank(
    'Hexacarboxylporphyrin/Creatinine [Molar ratio] in 24 hour Urine',
    [
        '[Molar ratio] in Hexacarboxylporphyrin/Creatinine 24 hour Urn',
        'HLA-A11 donor) (Bld/Tiss Ql',
        'POC Urine Urea nitrogen Measurement',
        'POC Cauliflower Immune globulin (S) Radioallergosorbent',
        'MBP Level Serum Quantitative',
    ],
)
```
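`rank` returns one dictionary per candidate text, sorted by descending relevance, each carrying the candidate's index in the input list (`corpus_id`) and its score. A minimal sketch of inspecting the result:

```python
# Print candidates from most to least relevant to the query.
for rank in ranks:
    print(f"{rank['score']:.4f}\t{rank['corpus_id']}")
```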
## Training Details

### Training Dataset

#### Unnamed Dataset

### Evaluation Dataset

#### Unnamed Dataset

### Training Hyperparameters

#### Non-Default Hyperparameters
- `per_device_train_batch_size`: 64
- `num_train_epochs`: 1
- `learning_rate`: 1e-07
- `warmup_steps`: 0.1
- `bf16`: True
- `eval_strategy`: steps
- `per_device_eval_batch_size`: 64
- `batch_sampler`: no_duplicates
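These values map directly onto the `CrossEncoderTrainingArguments` API from Sentence Transformers. Because the card leaves the dataset unnamed, the sketch below is only a hypothetical reconstruction of how such a run could be wired up: the dataset columns, the base model ID, and the `BinaryCrossEntropyLoss` choice are assumptions, not taken from this card. Note also that `warmup_steps: 0.1` above reads like a warmup ratio, so the sketch passes `warmup_ratio=0.1` instead.

```python
from datasets import Dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss
from sentence_transformers.training_args import BatchSamplers

# Hypothetical pair dataset; the card does not name the real one.
train_dataset = Dataset.from_dict({
    "text1": ["Urea nitrogen [Mass/volume] in Urine"],
    "text2": ["POC Urine Urea nitrogen Measurement"],
    "label": [1.0],
})

# Placeholder ID; the card does not name the base model.
model = CrossEncoder("cross_encoder_model_id")

args = CrossEncoderTrainingArguments(
    output_dir="output",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    num_train_epochs=1,
    learning_rate=1e-7,
    warmup_ratio=0.1,  # the card lists `warmup_steps: 0.1`, which reads like a ratio
    bf16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # matches `batch_sampler: no_duplicates`
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # tiny stand-in; the real eval set is unnamed
    loss=BinaryCrossEntropyLoss(model),  # assumed loss; not stated in the card
)
trainer.train()
```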
#### All Hyperparameters

<details><summary>Click to expand</summary>
- `per_device_train_batch_size`: 64
- `num_train_epochs`: 1
- `max_steps`: -1
- `learning_rate`: 1e-07
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: None
- `warmup_steps`: 0.1
- `optim`: adamw_torch_fused
- `optim_args`: None
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `optim_target_modules`: None
- `gradient_accumulation_steps`: 1
- `average_tokens_across_devices`: True
- `max_grad_norm`: 1.0
- `label_smoothing_factor`: 0.0
- `bf16`: True
- `fp16`: False
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `use_cache`: False
- `neftune_noise_alpha`: None
- `torch_empty_cache_steps`: None
- `auto_find_batch_size`: False
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `include_num_input_tokens_seen`: no
- `log_level`: passive
- `log_level_replica`: warning
- `disable_tqdm`: False
- `project`: huggingface
- `trackio_space_id`: trackio
- `eval_strategy`: steps
- `per_device_eval_batch_size`: 64
- `prediction_loss_only`: True
- `eval_on_start`: False
- `eval_do_concat_batches`: True
- `eval_use_gather_object`: False
- `eval_accumulation_steps`: None
- `include_for_metrics`: []
- `batch_eval_metrics`: False
- `save_only_model`: False
- `save_on_each_node`: False
- `enable_jit_checkpoint`: False
- `push_to_hub`: False
- `hub_private_repo`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_always_push`: False
- `hub_revision`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `restore_callback_states_from_checkpoint`: False
- `full_determinism`: False
- `seed`: 42
- `data_seed`: None
- `use_cpu`: False
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `dataloader_prefetch_factor`: None
- `remove_unused_columns`: True
- `label_names`: None
- `train_sampling_strategy`: random
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `ddp_backend`: None
- `ddp_timeout`: 1800
- `fsdp`: []
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `deepspeed`: None
- `debug`: []
- `skip_memory_metrics`: True
- `do_predict`: False
- `resume_from_checkpoint`: None
- `warmup_ratio`: None
- `local_rank`: -1
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>
### Training Logs

| Epoch  | Step  | Training Loss | Validation Loss |
|:------:|:-----:|:-------------:|:---------------:|
| 0.1000 | 1751  | 1.5995        | 1.2376          |
| 0.1999 | 3502  | 1.1824        | 1.0858          |
| 0.2999 | 5253  | 1.0610        | 1.0036          |
| 0.3999 | 7004  | 1.0037        | 0.9503          |
| 0.4999 | 8755  | 0.9602        | 0.9021          |
| 0.5998 | 10506 | 0.9261        | 0.8669          |
| 0.6998 | 12257 | 0.8943        | 0.8422          |
| 0.7998 | 14008 | 0.8777        | 0.8264          |
| 0.8997 | 15759 | 0.8619        | 0.8176          |
| 0.9997 | 17510 | 0.8668        | 0.8150          |
| 1.0    | 17515 | -             | 0.8150          |
### Framework Versions
- Python: 3.10.20
- Sentence Transformers: 5.3.0
- Transformers: 5.4.0
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.8.4
- Tokenizers: 0.22.2
## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```