This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base on the touch-rugby-modernbert-pairs dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
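The Pooling and Normalize modules listed above average the token vectors (pooling_mode_mean_tokens is True) and then rescale the result to unit length. The following is a minimal pure-Python sketch of those two steps, assuming toy 2-dimensional token vectors rather than the model's 768 dimensions. The names `mean_pool` and `l2_normalize` are illustrative and are not part of the Sentence Transformers API.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [t / max(count, 1) for t in total]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


# Toy input: two real tokens and one padding token (mask 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vectors have unit length, the dot product of two embeddings equals their cosine similarity.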
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Trelis/modernbert-embed-base-touch-rugby-ft-v2")
# Run inference
sentences = [
'Who indicates to commence play at the start of a Touch Rugby match?',
'6.2\tThe Team coach(s) and Team officials may move from one position to the other \nbut shall do so without delay.While in a position at the end of the Field of Play, \nthe Team coach(s) or Team official must remain no closer than five (5) metres \nfrom the Dead Ball Line and must not coach or communicate (verbal or non-\nverbal) with either Team or the Referees.7\u2002 Commencement and Recommencement of Play \n7.1\tTeam captains are to toss a coin in the presence of the Referee(s) with the \nwinning captain’s Team having the choice of the direction the Team wishes \nto run in the first half; the choice of Interchange Areas for the duration of the \nmatch, including any extra time; and the choice of which team will commence \nthe match in Possession.7.2\tA player of the Attacking Team is to commence the match with a Tap at the \ncentre of the Halfway Line following the indication to commence play from the \nReferee.',
'See Appendix 1.Forced Interchange\nWhen a player is required to undertake a compulsory Interchange for \nan Infringement ruled more serious than a Penalty but less serious \nthan a Permanent Interchange, Sin Bin or Dismissal.Forward\nA position or direction towards the Dead Ball Line beyond the Team’s \nAttacking Try Line.Full Time\nThe expiration of the second period of time allowed for play.Half\nThe player who takes Possession following a Rollball.Half Time\nThe break in play between the two halves of a match.Imminent\nAbout to occur, it is almost certain to occur.Infringement\nThe action of a player contrary to the Rules of the game.In-Goal Area\nThe area in the Field of Play bounded by the Sidelines, the Try Lines \nand the Dead Ball Lines.There are two (2), one (1) at each end of the \nField of Play.See Appendix 1.Interchange\nThe act of an on-field player leaving the Field of Play to be replaced \nby an off-field player entering the Field of Play.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
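For semantic search, the similarity scores are typically used to rank candidate chunks against a question. The sketch below shows that ranking step in pure Python, assuming small hand-written vectors in place of real `model.encode` outputs; `cosine` and `rank_chunks` are hypothetical helper names used only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_index, score) pairs sorted from most to least similar."""
    scores = [cosine(query_vec, chunk) for chunk in chunk_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy stand-ins: one question vector and three chunk vectors.
query = [1.0, 0.0, 0.0]
chunks = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # close to the query
    [0.5, 0.5, 0.0],  # partially aligned
]

ranking = rank_chunks(query, chunks)
for index, score in ranking:
    print(index, round(score, 4))
```

With real embeddings, the first sentence in the example above would play the role of `query` and the two rule excerpts the role of `chunks`.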
Columns: question and related_chunk

| | question | related_chunk |
|---|---|---|
| type | string | string |

| question | related_chunk |
|---|---|
| When may Onside players of the Defending Team move forward if the Half is not within one metre of the Rollball? | 13.10 A player ceases to be the Half once the ball is passed to another player.13.11 Defending players are not to interfere with the performance of the Rollball or the |
| Besides awarding tries, what other scoring-related task does the Referee perform? | An approach may only be made during a break in play or at |
| What happens if a team has fewer than four players on the field during a match? | FIT Playing Rules - 5th Edition |
MultipleNegativesRankingLoss with these parameters:
{
  "scale": 20.0,
  "similarity_fct": "cos_sim"
}
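To make the two loss parameters concrete: for each question, the loss treats its paired chunk as the positive and every other chunk in the batch as a negative, multiplies the cosine similarities by `scale`, and applies cross-entropy. The following is a simplified pure-Python sketch of that idea, not the library's implementation; `mnr_loss` and `cos_sim` here are illustrative functions written for this example.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # Scaled similarity of this anchor to every candidate in the batch.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Each anchor matches its own positive: loss is close to zero.
aligned = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# Positives swapped: each anchor prefers the wrong candidate, loss is large.
swapped = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

When every candidate scores the same, this loss equals the natural log of the batch size, which for a full batch of 32 is about 3.47. That gives a rough reference point for reading loss values.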
Columns: question and related_chunk

| | question | related_chunk |
|---|---|---|
| type | string | string |

| question | related_chunk |
|---|---|
| What is the definition of the 'Defending Team' in Touch Rugby Rules 5th Edition? | Except as permitted under the |
| What is the minimum number of players required on the field for a touch rugby match to begin or continue? | FIT Playing Rules - 5th Edition |
| What are the possible outcomes of a Referee's Ruling? | See Appendix 1.Forced Interchange |
MultipleNegativesRankingLoss with these parameters:
{
  "scale": 20.0,
  "similarity_fct": "cos_sim"
}
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
learning_rate: 5e-06
num_train_epochs: 1
lr_scheduler_type: constant
warmup_ratio: 0.3

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-06
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: constant
lr_scheduler_kwargs: {}
warmup_ratio: 0.3
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.2222 | 2 | 2.8177 | 2.5945 |
| 0.4444 | 4 | 2.9155 | 2.5693 |
| 0.6667 | 6 | 2.9114 | 2.5402 |
| 0.8889 | 8 | 2.7999 | 2.5098 |
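The Epoch and Step columns of the log make it possible to estimate how small the training run was. The sketch below works through that arithmetic, assuming a single training device together with the listed per_device_train_batch_size of 32, gradient_accumulation_steps of 1, and dataloader_drop_last of False. The helper names are hypothetical.

```python
def steps_per_epoch_from_log(step, epoch_fraction):
    """Infer optimizer steps per epoch from one logged (step, epoch) pair."""
    return round(step / epoch_fraction)


def dataset_size_bounds(steps_per_epoch, batch_size):
    """Range of dataset sizes that yield this many batches when the
    final partial batch is kept (drop_last is False)."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


# (step, epoch) pairs copied from the training log.
log_rows = [(2, 0.2222), (4, 0.4444), (6, 0.6667), (8, 0.8889)]

estimates = {steps_per_epoch_from_log(step, epoch) for step, epoch in log_rows}
steps = estimates.pop()
low, high = dataset_size_bounds(steps, 32)
print(steps, low, high)
```

Under those assumptions, the log is consistent with 9 optimizer steps in the single epoch and a training set of between 257 and 288 pairs.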
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
answerdotai/ModernBERT-base