metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:359997
- loss:MultipleNegativesRankingLoss
base_model: prajjwal1/bert-small
widget:
- source_sentence: >-
When do you use Ms. or Mrs.? Is one for a married woman and one for one
that's not married? Which one is for what?
sentences:
- >-
When do you use Ms. or Mrs.? Is one for a married woman and one for one
that's not married? Which one is for what?
- Nations that do/does otherwise? Which one do I use?
- What is the best way to make money on Quora?
- source_sentence: >-
Which ointment is applied to the face of UFC fighters at the commencement
of a bout? What does it do?
sentences:
- Why don't bikes have a gear indicator?
- >-
Which ointment is applied to the face of UFC fighters at the
commencement of a bout? What does it do?
- How do I get the body of a UFC Fighter?
- source_sentence: Do you love the life you live?
sentences:
- Which file formats are compatible with iTunes?
- Do you love the life you're living?
- >-
What is the best way to find a person just using their phone by trying
to track the other persons phone and get a location from it?
- source_sentence: >-
Can I do shoulder and triceps workout on same day? What other combinations
like this can I do?
sentences:
- >-
Can I do shoulder and triceps workout on same day? I can What other
combinations like thisdo?
- How can I save a Snapchat video that others posted?
- >-
Can I do shoulder and triceps workout on same day? What other
combinations like this can I do?
- source_sentence: I am a married woman and I'm in love with married man. what should I do?
sentences:
- How can I earn money easily online?
- >-
I am not a married woman and I 'm in love with married man . what should
I do ?
- I am a married woman and I'm in love with married man. what should I do?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_ndcg@10
- cosine_mrr@1
- cosine_mrr@5
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on prajjwal1/bert-small
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: val
type: val
metrics:
- type: cosine_accuracy@1
value: 0.828025
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9027
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.931025
name: Cosine Accuracy@5
- type: cosine_precision@1
value: 0.828025
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.3008999999999999
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.186205
name: Cosine Precision@5
- type: cosine_recall@1
value: 0.828025
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.9027
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.931025
name: Cosine Recall@5
- type: cosine_ndcg@10
value: 0.8942284691055087
name: Cosine Ndcg@10
- type: cosine_mrr@1
value: 0.828025
name: Cosine Mrr@1
- type: cosine_mrr@5
value: 0.8677179166666629
name: Cosine Mrr@5
- type: cosine_mrr@10
value: 0.8721162896825339
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8742240723304836
name: Cosine Map@100
SentenceTransformer based on prajjwal1/bert-small
This is a sentence-transformers model finetuned from prajjwal1/bert-small. It maps sentences & paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: prajjwal1/bert-small
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 512 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 512, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
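The Pooling module above has pooling_mode_cls_token set to True, so each sentence embedding is the final hidden state at the first ([CLS]) position rather than an average over tokens. The following is a minimal sketch of that difference in plain Python. The function names and the toy 4-dimensional vectors are illustrative assumptions, not part of the library; the real model produces 512-dimensional vectors.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the vector at position 0, the [CLS] slot in BERT-style models."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the vectors of non-padding tokens (shown only for contrast)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Made-up hidden states for the tokens "[CLS] hello world [PAD]".
hidden = [
    [1.0, 0.0, 2.0, 0.0],  # [CLS]
    [0.0, 2.0, 0.0, 0.0],  # hello
    [2.0, 1.0, 1.0, 3.0],  # world
    [9.0, 9.0, 9.0, 9.0],  # [PAD], ignored by mean pooling via the mask
]
mask = [1, 1, 1, 0]

cls_vector = cls_pool(hidden)
mean_vector = mean_pool(hidden, mask)
print("CLS pooling: ", cls_vector)
print("Mean pooling:", mean_vector)
```

With CLS pooling the attention mask plays no role in the pooling step itself, since only position 0 is read.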
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("redis/model-b-structured")
# Run inference
sentences = [
"I am a married woman and I'm in love with married man. what should I do?",
"I am a married woman and I'm in love with married man. what should I do?",
"I am not a married woman and I 'm in love with married man . what should I do ?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 512)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 1.0000, 0.4050],
# [1.0000, 1.0000, 0.4050],
# [0.4050, 0.4050, 1.0000]])
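The similarity scores above are cosine similarities, as listed under Similarity Function. As a rough sketch of what such a pairwise matrix contains, the snippet below computes it in plain Python for toy 2-dimensional vectors. The helper names are hypothetical and this is not how the library implements model.similarity.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their L2 norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for embeddings: two identical vectors and one orthogonal to them.
toy_embeddings = [[3.0, 4.0], [3.0, 4.0], [4.0, -3.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(value, 4) for value in row])
```

Identical inputs score 1.0, which is why the first two sentences in the example above, being the same string, produce 1.0000 against each other.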
Evaluation
Metrics
Information Retrieval
- Dataset: val
- Evaluated with InformationRetrievalEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.828 |
| cosine_accuracy@3 | 0.9027 |
| cosine_accuracy@5 | 0.931 |
| cosine_precision@1 | 0.828 |
| cosine_precision@3 | 0.3009 |
| cosine_precision@5 | 0.1862 |
| cosine_recall@1 | 0.828 |
| cosine_recall@3 | 0.9027 |
| cosine_recall@5 | 0.931 |
| cosine_ndcg@10 | 0.8942 |
| cosine_mrr@1 | 0.828 |
| cosine_mrr@5 | 0.8677 |
| cosine_mrr@10 | 0.8721 |
| cosine_map@100 | 0.8742 |
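In the table above, accuracy@k and recall@k are equal at every k, and precision@k equals accuracy@k divided by k (0.9027 / 3 ≈ 0.3009 and 0.931025 / 5 = 0.186205). That pattern is what one would expect if every query has exactly one relevant document. This is an inference from the numbers, not something the card states. Under that assumption, the metrics can be sketched from the rank of the single relevant document per query. The function name and toy ranks below are hypothetical.

```python
from typing import Dict, List, Optional


def ir_metrics_single_relevant(ranks: List[Optional[int]], k: int) -> Dict[str, float]:
    """Compute @k metrics when each query has exactly one relevant document.

    ranks[i] is the 1-based rank of that document for query i,
    or None if it was not retrieved at all.
    """
    if not ranks:
        raise ValueError("expected at least one query")
    n = len(ranks)
    in_top_k = [r for r in ranks if r is not None and r <= k]
    hits = len(in_top_k)
    accuracy = hits / n
    return {
        "accuracy": accuracy,            # a relevant document appears in the top k
        "precision": hits / (n * k),     # at most 1 of the k slots can be relevant
        "recall": accuracy,              # 1 relevant document, so found or not
        "mrr": sum(1.0 / r for r in in_top_k) / n,
    }


# Toy example: five queries, one of which never retrieves its match.
toy_ranks = [1, 1, 2, 4, None]
metrics_at_3 = ir_metrics_single_relevant(toy_ranks, k=3)
print(metrics_at_3)

# Consistency check against the reported values in the table above.
print(round(0.9027 / 3, 4), round(0.931025 / 5, 6))
```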
Training Details
Training Dataset
Unnamed Dataset
- Size: 359,997 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 4 tokens; mean: 15.46 tokens; max: 49 tokens | min: 4 tokens; mean: 15.52 tokens; max: 49 tokens | min: 4 tokens; mean: 16.63 tokens; max: 59 tokens |
- Samples:
| anchor | positive | negative |
|---|---|---|
| Shall I upgrade my iPhone 5s to iOS 10 final version? | Should I upgrade an iPhone 5s to iOS 10? | Shall my iPhone 5s upgrade Ito iOS 10 final version? |
| Is Donald Trump really going to be the president of United States? | Do you think Donald Trump could conceivably be the next President of the United States? | Is Donald Trump really going not to be the president of United States ? |
| What are real tips to improve work life balance? | What are the best ways to create a work life balance? | How far is Miami from Fort Lauderdale? |
- Loss: MultipleNegativesRankingLoss with these parameters:
{ "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
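To make the loss parameters concrete, here is a simplified sketch of the idea behind MultipleNegativesRankingLoss with scale 20.0 and cosine similarity. Each anchor is scored against every positive and every negative in the batch, the scores are multiplied by the scale, and a cross-entropy loss treats the anchor's own positive as the correct class. This is an illustration in plain Python with hypothetical function names and toy 2-dimensional vectors, not the library's implementation.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(
    anchors: List[Sequence[float]],
    positives: List[Sequence[float]],
    negatives: List[Sequence[float]],
    scale: float = 20.0,
) -> float:
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

# Positives aligned with their anchors: the loss should be close to zero.
aligned_loss = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]], negatives)
# Positives swapped between anchors: the loss should be large.
swapped_loss = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]], negatives)
print(aligned_loss, swapped_loss)
```

A larger scale sharpens the softmax, so a small gap in cosine similarity between the correct positive and a competing candidate translates into a larger change in the loss.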
Evaluation Dataset
Unnamed Dataset
- Size: 40,000 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 6 tokens; mean: 15.71 tokens; max: 65 tokens | min: 6 tokens; mean: 15.79 tokens; max: 65 tokens | min: 5 tokens; mean: 16.59 tokens; max: 77 tokens |
- Samples:
| anchor | positive | negative |
|---|---|---|
| Why were feathered dinosaur fossils only found in the last 20 years? | Why were feathered dinosaur fossils only found in the last 20 years? | Why are only few people aware that many dinosaurs had feathers? |
| If FOX News is the conservative news station, which cable news network is for liberals/progressives? | If FOX News is the conservative news station, which cable news network is for liberals/progressives? | How much did Fox News and conservative leaning media networks stoke the anger that contributed to Donald Trump's popularity? |
| How can guys last longer during sex? | How do I last longer in sex? | Why does economics require calculus? |
- Loss: MultipleNegativesRankingLoss with these parameters:
{ "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- learning_rate: 2e-05
- weight_decay: 0.001
- max_steps: 14060
- warmup_ratio: 0.1
- fp16: True
- dataloader_drop_last: True
- dataloader_num_workers: 1
- dataloader_prefetch_factor: 1
- load_best_model_at_end: True
- optim: adamw_torch
- ddp_find_unused_parameters: False
- push_to_hub: True
- hub_model_id: redis/model-b-structured
- eval_on_start: True
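The hyperparameters above fix the length and shape of the run. Assuming a single device (an assumption, though it matches the Epoch column in the training logs, where step 100 corresponds to epoch 0.0711), 359,997 samples at batch size 256 with dataloader_drop_last give 1,406 steps per epoch, so max_steps of 14060 is 10 epochs and warmup_ratio of 0.1 is 1,406 warmup steps. The sketch below reproduces that arithmetic and a generic linear-warmup, linear-decay schedule. The function name is hypothetical and the schedule is the standard shape, not code taken from the trainer.

```python
TRAIN_SAMPLES = 359_997
BATCH_SIZE = 256
MAX_STEPS = 14_060
WARMUP_RATIO = 0.1
BASE_LR = 2e-05

# drop_last=True discards the final partial batch, hence floor division.
steps_per_epoch = TRAIN_SAMPLES // BATCH_SIZE
total_epochs = MAX_STEPS / steps_per_epoch
warmup_steps = round(MAX_STEPS * WARMUP_RATIO)


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = max(0, MAX_STEPS - step)
    return BASE_LR * remaining / (MAX_STEPS - warmup_steps)


print("steps per epoch:", steps_per_epoch)
print("total epochs:", total_epochs)
print("warmup steps:", warmup_steps)
for step in (0, 703, 1406, 7733, 14060):
    print(step, linear_schedule_lr(step))
```

Because max_steps is set, it determines the run length; the num_train_epochs value of 3.0 in the full list is not what limits training here.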
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.001
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3.0
- max_steps: 14060
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: True
- dataloader_num_workers: 1
- dataloader_prefetch_factor: 1
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: False
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: redis/model-b-structured
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: True
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | Validation Loss | val_cosine_ndcg@10 |
|---|---|---|---|---|
| 0 | 0 | - | 1.7418 | 0.7821 |
| 0.0711 | 100 | 2.0777 | 0.7932 | 0.8130 |
| 0.1422 | 200 | 0.7966 | 0.4005 | 0.8510 |
| 0.2134 | 300 | 0.3991 | 0.2603 | 0.8615 |
| 0.2845 | 400 | 0.3153 | 0.2051 | 0.8652 |
| 0.3556 | 500 | 0.2593 | 0.1740 | 0.8681 |
| 0.4267 | 600 | 0.2231 | 0.1568 | 0.8707 |
| 0.4979 | 700 | 0.2017 | 0.1443 | 0.8727 |
| 0.5690 | 800 | 0.1933 | 0.1322 | 0.8746 |
| 0.6401 | 900 | 0.1818 | 0.1217 | 0.8755 |
| 0.7112 | 1000 | 0.1714 | 0.1141 | 0.8769 |
| 0.7824 | 1100 | 0.157 | 0.1060 | 0.8780 |
| 0.8535 | 1200 | 0.1467 | 0.0998 | 0.8788 |
| 0.9246 | 1300 | 0.1394 | 0.0937 | 0.8805 |
| 0.9957 | 1400 | 0.1343 | 0.0910 | 0.8813 |
| 1.0669 | 1500 | 0.1222 | 0.0853 | 0.8822 |
| 1.1380 | 1600 | 0.1173 | 0.0820 | 0.8821 |
| 1.2091 | 1700 | 0.1082 | 0.0797 | 0.8828 |
| 1.2802 | 1800 | 0.1105 | 0.0777 | 0.8835 |
| 1.3514 | 1900 | 0.1093 | 0.0734 | 0.8833 |
| 1.4225 | 2000 | 0.1034 | 0.0744 | 0.8840 |
| 1.4936 | 2100 | 0.1016 | 0.0713 | 0.8845 |
| 1.5647 | 2200 | 0.0995 | 0.0699 | 0.8851 |
| 1.6358 | 2300 | 0.0994 | 0.0679 | 0.8849 |
| 1.7070 | 2400 | 0.1024 | 0.0667 | 0.8867 |
| 1.7781 | 2500 | 0.0911 | 0.0658 | 0.8868 |
| 1.8492 | 2600 | 0.0907 | 0.0640 | 0.8861 |
| 1.9203 | 2700 | 0.0941 | 0.0632 | 0.8859 |
| 1.9915 | 2800 | 0.093 | 0.0625 | 0.8870 |
| 2.0626 | 2900 | 0.0814 | 0.0618 | 0.8875 |
| 2.1337 | 3000 | 0.0811 | 0.0609 | 0.8868 |
| 2.2048 | 3100 | 0.0773 | 0.0602 | 0.8880 |
| 2.2760 | 3200 | 0.0813 | 0.0590 | 0.8873 |
| 2.3471 | 3300 | 0.0806 | 0.0584 | 0.8876 |
| 2.4182 | 3400 | 0.0765 | 0.0575 | 0.8882 |
| 2.4893 | 3500 | 0.0774 | 0.0581 | 0.8889 |
| 2.5605 | 3600 | 0.0761 | 0.0560 | 0.8883 |
| 2.6316 | 3700 | 0.0735 | 0.0560 | 0.8886 |
| 2.7027 | 3800 | 0.0711 | 0.0555 | 0.8891 |
| 2.7738 | 3900 | 0.0747 | 0.0551 | 0.8889 |
| 2.8450 | 4000 | 0.0731 | 0.0552 | 0.8897 |
| 2.9161 | 4100 | 0.0708 | 0.0543 | 0.8898 |
| 2.9872 | 4200 | 0.0778 | 0.0536 | 0.8901 |
| 3.0583 | 4300 | 0.0697 | 0.0540 | 0.8893 |
| 3.1294 | 4400 | 0.0668 | 0.0533 | 0.8900 |
| 3.2006 | 4500 | 0.0679 | 0.0526 | 0.8893 |
| 3.2717 | 4600 | 0.0652 | 0.0532 | 0.8902 |
| 3.3428 | 4700 | 0.0673 | 0.0520 | 0.8899 |
| 3.4139 | 4800 | 0.0625 | 0.0514 | 0.8903 |
| 3.4851 | 4900 | 0.0669 | 0.0515 | 0.8912 |
| 3.5562 | 5000 | 0.0641 | 0.0515 | 0.8915 |
| 3.6273 | 5100 | 0.0637 | 0.0509 | 0.8909 |
| 3.6984 | 5200 | 0.0635 | 0.0506 | 0.8908 |
| 3.7696 | 5300 | 0.0606 | 0.0499 | 0.8915 |
| 3.8407 | 5400 | 0.0633 | 0.0503 | 0.8917 |
| 3.9118 | 5500 | 0.0656 | 0.0498 | 0.8913 |
| 3.9829 | 5600 | 0.0658 | 0.0492 | 0.8916 |
| 4.0541 | 5700 | 0.0606 | 0.0489 | 0.8917 |
| 4.1252 | 5800 | 0.0585 | 0.0485 | 0.8914 |
| 4.1963 | 5900 | 0.0613 | 0.0490 | 0.8914 |
| 4.2674 | 6000 | 0.0568 | 0.0487 | 0.8909 |
| 4.3385 | 6100 | 0.0576 | 0.0481 | 0.8918 |
| 4.4097 | 6200 | 0.0603 | 0.0481 | 0.8915 |
| 4.4808 | 6300 | 0.0569 | 0.0480 | 0.8918 |
| 4.5519 | 6400 | 0.0553 | 0.0477 | 0.8921 |
| 4.6230 | 6500 | 0.057 | 0.0472 | 0.8918 |
| 4.6942 | 6600 | 0.0602 | 0.0472 | 0.8925 |
| 4.7653 | 6700 | 0.0541 | 0.0468 | 0.8922 |
| 4.8364 | 6800 | 0.0588 | 0.0468 | 0.8917 |
| 4.9075 | 6900 | 0.0588 | 0.0471 | 0.8920 |
| 4.9787 | 7000 | 0.0549 | 0.0469 | 0.8921 |
| 5.0498 | 7100 | 0.0522 | 0.0466 | 0.8920 |
| 5.1209 | 7200 | 0.0527 | 0.0462 | 0.8924 |
| 5.1920 | 7300 | 0.0519 | 0.0461 | 0.8924 |
| 5.2632 | 7400 | 0.0544 | 0.0459 | 0.8927 |
| 5.3343 | 7500 | 0.0549 | 0.0456 | 0.8925 |
| 5.4054 | 7600 | 0.0527 | 0.0460 | 0.8932 |
| 5.4765 | 7700 | 0.0519 | 0.0453 | 0.8920 |
| 5.5477 | 7800 | 0.0528 | 0.0455 | 0.8928 |
| 5.6188 | 7900 | 0.0525 | 0.0451 | 0.8929 |
| 5.6899 | 8000 | 0.0535 | 0.0454 | 0.8931 |
| 5.7610 | 8100 | 0.0526 | 0.0452 | 0.8931 |
| 5.8321 | 8200 | 0.0507 | 0.0454 | 0.8930 |
| 5.9033 | 8300 | 0.0511 | 0.0451 | 0.8932 |
| 5.9744 | 8400 | 0.0489 | 0.0451 | 0.8930 |
| 6.0455 | 8500 | 0.0509 | 0.0451 | 0.8929 |
| 6.1166 | 8600 | 0.0487 | 0.0447 | 0.8931 |
| 6.1878 | 8700 | 0.0494 | 0.0449 | 0.8932 |
| 6.2589 | 8800 | 0.0474 | 0.0444 | 0.8932 |
| 6.3300 | 8900 | 0.049 | 0.0448 | 0.8934 |
| 6.4011 | 9000 | 0.0492 | 0.0446 | 0.8934 |
| 6.4723 | 9100 | 0.0493 | 0.0443 | 0.8931 |
| 6.5434 | 9200 | 0.0517 | 0.0442 | 0.8931 |
| 6.6145 | 9300 | 0.0502 | 0.0445 | 0.8938 |
| 6.6856 | 9400 | 0.0501 | 0.0441 | 0.8935 |
| 6.7568 | 9500 | 0.0484 | 0.0439 | 0.8935 |
| 6.8279 | 9600 | 0.0472 | 0.0437 | 0.8935 |
| 6.8990 | 9700 | 0.0484 | 0.0435 | 0.8936 |
| 6.9701 | 9800 | 0.051 | 0.0433 | 0.8933 |
| 7.0413 | 9900 | 0.0496 | 0.0435 | 0.8935 |
| 7.1124 | 10000 | 0.0469 | 0.0434 | 0.8937 |
| 7.1835 | 10100 | 0.0479 | 0.0432 | 0.8935 |
| 7.2546 | 10200 | 0.0476 | 0.0430 | 0.8937 |
| 7.3257 | 10300 | 0.0454 | 0.0431 | 0.8934 |
| 7.3969 | 10400 | 0.0445 | 0.0430 | 0.8937 |
| 7.4680 | 10500 | 0.0471 | 0.0427 | 0.8936 |
| 7.5391 | 10600 | 0.0441 | 0.0429 | 0.8938 |
| 7.6102 | 10700 | 0.046 | 0.0429 | 0.8932 |
| 7.6814 | 10800 | 0.046 | 0.0428 | 0.8934 |
| 7.7525 | 10900 | 0.049 | 0.0428 | 0.8938 |
| 7.8236 | 11000 | 0.0476 | 0.0427 | 0.8939 |
| 7.8947 | 11100 | 0.0468 | 0.0425 | 0.8938 |
| 7.9659 | 11200 | 0.0465 | 0.0426 | 0.8940 |
| 8.0370 | 11300 | 0.048 | 0.0428 | 0.8938 |
| 8.1081 | 11400 | 0.0448 | 0.0425 | 0.8937 |
| 8.1792 | 11500 | 0.0431 | 0.0424 | 0.8939 |
| 8.2504 | 11600 | 0.0428 | 0.0424 | 0.8935 |
| 8.3215 | 11700 | 0.046 | 0.0424 | 0.8937 |
| 8.3926 | 11800 | 0.0471 | 0.0423 | 0.8938 |
| 8.4637 | 11900 | 0.0466 | 0.0424 | 0.8943 |
| 8.5349 | 12000 | 0.0431 | 0.0421 | 0.8941 |
| 8.6060 | 12100 | 0.0462 | 0.0421 | 0.8938 |
| 8.6771 | 12200 | 0.0425 | 0.0423 | 0.8941 |
| 8.7482 | 12300 | 0.0455 | 0.0421 | 0.8941 |
| 8.8193 | 12400 | 0.0445 | 0.0422 | 0.8940 |
| 8.8905 | 12500 | 0.0455 | 0.0422 | 0.8943 |
| 8.9616 | 12600 | 0.0448 | 0.0421 | 0.8941 |
| 9.0327 | 12700 | 0.0462 | 0.0421 | 0.8940 |
| 9.1038 | 12800 | 0.0429 | 0.0421 | 0.8939 |
| 9.1750 | 12900 | 0.0452 | 0.0421 | 0.8942 |
| 9.2461 | 13000 | 0.0439 | 0.0420 | 0.8943 |
| 9.3172 | 13100 | 0.0472 | 0.0420 | 0.8942 |
| 9.3883 | 13200 | 0.0447 | 0.0420 | 0.8943 |
| 9.4595 | 13300 | 0.0426 | 0.0420 | 0.8942 |
| 9.5306 | 13400 | 0.0445 | 0.0420 | 0.8942 |
| 9.6017 | 13500 | 0.0436 | 0.0419 | 0.8942 |
| 9.6728 | 13600 | 0.0445 | 0.0419 | 0.8943 |
| 9.7440 | 13700 | 0.0477 | 0.0419 | 0.8943 |
| 9.8151 | 13800 | 0.0439 | 0.0419 | 0.8942 |
| 9.8862 | 13900 | 0.0438 | 0.0419 | 0.8942 |
| 9.9573 | 14000 | 0.0468 | 0.0419 | 0.8942 |
Framework Versions
- Python: 3.10.18
- Sentence Transformers: 5.2.0
- Transformers: 4.57.3
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 4.4.2
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}