metadata
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:91097
- loss:MultipleNegativesRankingLoss
base_model: NeuML/pubmedbert-base-embeddings
widget:
- source_sentence: >-
    <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>provider gen suffix<VALUE>Jr, Sr, II,
    III||III||III||M||II||Jr ||III||Jr.||Sr.
  sentences:
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Gen suffix
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Attr-OBSERVATION
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>State License
- source_sentence: >-
    <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Trust-Based Relational Intervention
    (TBRI)<VALUE>FALSE
  sentences:
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Attr-CHILD PSYCHOLOGY
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Supervising Physician NPI
  - >-
    <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Attr-Trust-Based Relational
    Intervention (TBRI)
- source_sentence: >-
    <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>action<VALUE>Now has AMS Medicare_x000D_
    Effective Date 10/7/2023_x000D_
    TIN 74-1613878||Effective 11/7/2023 - Previously credentialed; please add
    to Baylor contract and fee schedules under TIN 74-1613878 and link to all
    applicable Baylor products/networks. Close Panel and Directory Suppress as
    they are Hospital Based.||Now Enrolled in AMS Medicaid_x000D_
    Effective Date 8/1/2023 _x000D_
    TIN 74-1613878
  sentences:
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Attr-OBSERVATION
  - >-
    <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Attr-Autism/Applied Behavioral Analysis
    (ABA)
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Change Request Type
- source_sentence: <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>language_written_indicator<VALUE>YES||NO
  sentences:
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Medicare Number
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Language Written?
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Education Specialty
- source_sentence: >-
    <DOMAIN>ROSTER<TASK>COL_MAP<TEXT> affiliate address<VALUE>200 E
    ARIZONA||3480 E GUASTI RD
  sentences:
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Hospital State
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Hospital Addr Line 1
  - <DOMAIN>ROSTER<TASK>COL_MAP<TEXT>State License
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: txt_std_ra_automapper_molina
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: txt std ra automapper molina
      type: txt_std_ra_automapper_molina
    metrics:
    - type: cosine_accuracy
      value: 0.9426305294036865
      name: Cosine Accuracy
txt_std_ra_automapper_molina
This is a sentence-transformers model finetuned from NeuML/pubmedbert-base-embeddings. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: NeuML/pubmedbert-base-embeddings
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
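The Pooling module above has only `pooling_mode_mean_tokens` enabled, so each sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool`, the toy vectors, and the all-padding guard are assumptions made for illustration.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention_mask entry is 1; padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding row (an assumption for this sketch).
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: two real tokens and one padding token, 2 dimensions instead of 768.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

In the real model the same averaging would run over 768-dimensional BERT token outputs for up to 512 tokens.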
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    '<DOMAIN>ROSTER<TASK>COL_MAP<TEXT> affiliate address<VALUE>200 E ARIZONA||3480 E GUASTI RD',
    '<DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Hospital Addr Line 1',
    '<DOMAIN>ROSTER<TASK>COL_MAP<TEXT>Hospital State',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9630, 0.0580],
#         [0.9630, 1.0000, 0.0549],
#         [0.0580, 0.0549, 1.0000]])
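The example inputs in this card share one layout: a `<DOMAIN>` tag, a `<TASK>` tag, the column header after `<TEXT>`, and, for source columns, sample values after `<VALUE>` joined with `||`. A small helper can assemble such strings. This is a sketch inferred from the examples shown here; the helper name `build_col_map_text` and its defaults are hypothetical and not part of any library.

```python
from typing import Iterable, Optional


def build_col_map_text(
    header: str,
    values: Optional[Iterable[str]] = None,
    domain: str = "ROSTER",
    task: str = "COL_MAP",
) -> str:
    """Format a column header (and optional sample values) like the card's examples."""
    text = f"<DOMAIN>{domain}<TASK>{task}<TEXT>{header}"
    if values is not None:
        # Sample values appear joined by a double pipe in the card's examples.
        text += "<VALUE>" + "||".join(values)
    return text


# A source column with sample values, and three candidate target columns.
query = build_col_map_text("language_written_indicator", ["YES", "NO"])
candidates = [
    build_col_map_text(name)
    for name in ("Medicare Number", "Language Written?", "Education Specialty")
]
print(query)
print(candidates[1])
```

The resulting strings can be passed to `model.encode` as in the usage snippet, with the highest-similarity candidate taken as the suggested mapping.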
Evaluation
Metrics
Triplet
- Dataset: txt_std_ra_automapper_molina
- Evaluated with TripletEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.9426 |
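For orientation, cosine accuracy on triplets can be read as the share of (anchor, positive, negative) triplets in which the anchor is more cosine-similar to the positive than to the negative. The sketch below computes that on toy 2-dimensional vectors. It is an illustration under that reading, not the evaluator's code, and the function names are assumptions.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def cosine_accuracy(triplets: Sequence[Tuple[Vector, Vector, Vector]]) -> float:
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    hits = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return hits / len(triplets)


toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: counted as correct
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),  # negative is closer: counted as wrong
]
accuracy = cosine_accuracy(toy_triplets)
print(accuracy)
```

Under this reading, the reported 0.9426 means roughly 94 of every 100 evaluation triplets rank the positive above the negative.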
Training Details
Training Dataset
Unnamed Dataset
- Size: 91,097 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 21 tokens; mean: 45.98 tokens; max: 185 tokens | min: 17 tokens; mean: 19.65 tokens; max: 30 tokens | min: 17 tokens; mean: 19.86 tokens; max: 30 tokens |
- Samples: anchor, positive, and negative texts, each tagged with domain ROSTER and task COL_MAP
- Loss: MultipleNegativesRankingLoss with these parameters:
{ "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
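To make the loss parameters concrete: with (anchor, positive, negative) triplets, this kind of ranking loss can be viewed as a cross-entropy over scaled cosine similarities, where each anchor must pick its own positive out of all positives and negatives in the batch. The following pure-Python sketch follows that view with `scale = 20.0`. It is a simplified illustration, not the library's implementation, and the function names and toy batch are assumptions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def mnr_loss(
    anchors: List[Vector],
    positives: List[Vector],
    negatives: List[Vector],
    scale: float = 20.0,
) -> float:
    """Mean cross-entropy where anchor i should rank positive i above every other candidate."""
    # Candidates: every positive in the batch, followed by every explicit negative.
    candidates = list(positives) + list(negatives)
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]
good_loss = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]], negatives)  # positives aligned
bad_loss = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]], negatives)   # positives swapped
print(good_loss, bad_loss)
```

The scale multiplies the cosine scores before the softmax, so a larger value penalizes mis-ranked candidates more sharply.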
Evaluation Dataset
Unnamed Dataset
- Size: 9,709 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 21 tokens; mean: 46.15 tokens; max: 253 tokens | min: 17 tokens; mean: 19.7 tokens; max: 30 tokens | min: 17 tokens; mean: 19.94 tokens; max: 30 tokens |
- Samples: anchor, positive, and negative texts, each tagged with domain ROSTER and task COL_MAP
- Loss: MultipleNegativesRankingLoss with these parameters:
{ "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 2
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
- batch_sampler: no_duplicates
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
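As a rough guide to what `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_ratio: 0.1` imply together, the sketch below estimates the step counts and evaluates a linear warmup followed by linear decay. It assumes a single device, no gradient accumulation, and that the batch sampler drops no batches; the function name `linear_schedule_lr` is hypothetical and the code is an approximation of the schedule, not the trainer's own.

```python
import math


def linear_schedule_lr(step: int, total_steps: int, warmup_steps: int, peak_lr: float = 5e-5) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# 91,097 training samples, batch size 16, 2 epochs (values taken from this card).
steps_per_epoch = math.ceil(91_097 / 16)
total_steps = steps_per_epoch * 2
warmup_steps = math.ceil(0.1 * total_steps)

print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps))
```

The estimate of about 5,694 steps per epoch is consistent with the training log, where step 5700 corresponds to epoch 1.0011.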
Training Logs
| Epoch | Step | Training Loss | Validation Loss | txt_std_ra_automapper_molina_cosine_accuracy |
|---|---|---|---|---|
| -1 | -1 | - | - | 0.9426 |
| 0.0088 | 50 | 1.7127 | - | - |
| 0.0176 | 100 | 0.8476 | - | - |
| 0.0263 | 150 | 0.5572 | - | - |
| 0.0351 | 200 | 0.4306 | 0.2925 | - |
| 0.0439 | 250 | 0.3464 | - | - |
| 0.0527 | 300 | 0.2764 | - | - |
| 0.0615 | 350 | 0.2381 | - | - |
| 0.0702 | 400 | 0.1622 | 0.1312 | - |
| 0.0790 | 450 | 0.1617 | - | - |
| 0.0878 | 500 | 0.1492 | - | - |
| 0.0966 | 550 | 0.1429 | - | - |
| 0.1054 | 600 | 0.114 | 0.0835 | - |
| 0.1142 | 650 | 0.1203 | - | - |
| 0.1229 | 700 | 0.0901 | - | - |
| 0.1317 | 750 | 0.1014 | - | - |
| 0.1405 | 800 | 0.0796 | 0.0614 | - |
| 0.1493 | 850 | 0.0631 | - | - |
| 0.1581 | 900 | 0.0989 | - | - |
| 0.1668 | 950 | 0.0627 | - | - |
| 0.1756 | 1000 | 0.0809 | 0.0670 | - |
| 0.1844 | 1050 | 0.0638 | - | - |
| 0.1932 | 1100 | 0.0664 | - | - |
| 0.2020 | 1150 | 0.0419 | - | - |
| 0.2107 | 1200 | 0.0569 | 0.0506 | - |
| 0.2195 | 1250 | 0.0842 | - | - |
| 0.2283 | 1300 | 0.0557 | - | - |
| 0.2371 | 1350 | 0.0653 | - | - |
| 0.2459 | 1400 | 0.065 | 0.0438 | - |
| 0.2547 | 1450 | 0.0459 | - | - |
| 0.2634 | 1500 | 0.0644 | - | - |
| 0.2722 | 1550 | 0.0494 | - | - |
| 0.2810 | 1600 | 0.0532 | 0.0296 | - |
| 0.2898 | 1650 | 0.0792 | - | - |
| 0.2986 | 1700 | 0.0592 | - | - |
| 0.3073 | 1750 | 0.0503 | - | - |
| 0.3161 | 1800 | 0.0353 | 0.0375 | - |
| 0.3249 | 1850 | 0.0556 | - | - |
| 0.3337 | 1900 | 0.0545 | - | - |
| 0.3425 | 1950 | 0.035 | - | - |
| 0.3512 | 2000 | 0.0286 | 0.0219 | - |
| 0.3600 | 2050 | 0.0305 | - | - |
| 0.3688 | 2100 | 0.014 | - | - |
| 0.3776 | 2150 | 0.0307 | - | - |
| 0.3864 | 2200 | 0.0374 | 0.0242 | - |
| 0.3952 | 2250 | 0.039 | - | - |
| 0.4039 | 2300 | 0.0218 | - | - |
| 0.4127 | 2350 | 0.0416 | - | - |
| 0.4215 | 2400 | 0.038 | 0.0186 | - |
| 0.4303 | 2450 | 0.0282 | - | - |
| 0.4391 | 2500 | 0.0171 | - | - |
| 0.4478 | 2550 | 0.0282 | - | - |
| 0.4566 | 2600 | 0.0227 | 0.0185 | - |
| 0.4654 | 2650 | 0.02 | - | - |
| 0.4742 | 2700 | 0.0215 | - | - |
| 0.4830 | 2750 | 0.0328 | - | - |
| 0.4917 | 2800 | 0.0118 | 0.0165 | - |
| 0.5005 | 2850 | 0.0278 | - | - |
| 0.5093 | 2900 | 0.0072 | - | - |
| 0.5181 | 2950 | 0.0252 | - | - |
| 0.5269 | 3000 | 0.0162 | 0.0151 | - |
| 0.5357 | 3050 | 0.0241 | - | - |
| 0.5444 | 3100 | 0.0042 | - | - |
| 0.5532 | 3150 | 0.0157 | - | - |
| 0.5620 | 3200 | 0.0256 | 0.0141 | - |
| 0.5708 | 3250 | 0.0106 | - | - |
| 0.5796 | 3300 | 0.0138 | - | - |
| 0.5883 | 3350 | 0.0292 | - | - |
| 0.5971 | 3400 | 0.0133 | 0.0164 | - |
| 0.6059 | 3450 | 0.0105 | - | - |
| 0.6147 | 3500 | 0.0148 | - | - |
| 0.6235 | 3550 | 0.0101 | - | - |
| 0.6322 | 3600 | 0.0101 | 0.0136 | - |
| 0.6410 | 3650 | 0.0271 | - | - |
| 0.6498 | 3700 | 0.028 | - | - |
| 0.6586 | 3750 | 0.0057 | - | - |
| 0.6674 | 3800 | 0.0273 | 0.0101 | - |
| 0.6762 | 3850 | 0.0201 | - | - |
| 0.6849 | 3900 | 0.0164 | - | - |
| 0.6937 | 3950 | 0.0425 | - | - |
| 0.7025 | 4000 | 0.0168 | 0.0112 | - |
| 0.7113 | 4050 | 0.0174 | - | - |
| 0.7201 | 4100 | 0.0153 | - | - |
| 0.7288 | 4150 | 0.0166 | - | - |
| 0.7376 | 4200 | 0.0252 | 0.0078 | - |
| 0.7464 | 4250 | 0.0098 | - | - |
| 0.7552 | 4300 | 0.0145 | - | - |
| 0.7640 | 4350 | 0.0141 | - | - |
| 0.7727 | 4400 | 0.0119 | 0.0088 | - |
| 0.7815 | 4450 | 0.0108 | - | - |
| 0.7903 | 4500 | 0.0146 | - | - |
| 0.7991 | 4550 | 0.0104 | - | - |
| 0.8079 | 4600 | 0.0068 | 0.0116 | - |
| 0.8166 | 4650 | 0.0233 | - | - |
| 0.8254 | 4700 | 0.0028 | - | - |
| 0.8342 | 4750 | 0.0255 | - | - |
| 0.8430 | 4800 | 0.009 | 0.0127 | - |
| 0.8518 | 4850 | 0.0293 | - | - |
| 0.8606 | 4900 | 0.0045 | - | - |
| 0.8693 | 4950 | 0.0048 | - | - |
| 0.8781 | 5000 | 0.0178 | 0.0132 | - |
| 0.8869 | 5050 | 0.0059 | - | - |
| 0.8957 | 5100 | 0.0221 | - | - |
| 0.9045 | 5150 | 0.0082 | - | - |
| 0.9132 | 5200 | 0.0111 | 0.0097 | - |
| 0.9220 | 5250 | 0.0021 | - | - |
| 0.9308 | 5300 | 0.0034 | - | - |
| 0.9396 | 5350 | 0.0449 | - | - |
| 0.9484 | 5400 | 0.0128 | 0.0066 | - |
| 0.9571 | 5450 | 0.0095 | - | - |
| 0.9659 | 5500 | 0.009 | - | - |
| 0.9747 | 5550 | 0.0169 | - | - |
| 0.9835 | 5600 | 0.0115 | 0.0060 | - |
| 0.9923 | 5650 | 0.0204 | - | - |
| 1.0011 | 5700 | 0.0116 | - | - |
| 1.0098 | 5750 | 0.0049 | - | - |
| 1.0186 | 5800 | 0.0064 | 0.0096 | - |
| 1.0274 | 5850 | 0.0061 | - | - |
| 1.0362 | 5900 | 0.0011 | - | - |
| 1.0450 | 5950 | 0.018 | - | - |
| 1.0537 | 6000 | 0.0231 | 0.0056 | - |
| 1.0625 | 6050 | 0.0081 | - | - |
| 1.0713 | 6100 | 0.0021 | - | - |
| 1.0801 | 6150 | 0.006 | - | - |
| 1.0889 | 6200 | 0.0116 | 0.0078 | - |
| 1.0976 | 6250 | 0.0074 | - | - |
| 1.1064 | 6300 | 0.0082 | - | - |
| 1.1152 | 6350 | 0.0011 | - | - |
| 1.1240 | 6400 | 0.0051 | 0.0101 | - |
| 1.1328 | 6450 | 0.007 | - | - |
| 1.1416 | 6500 | 0.0015 | - | - |
| 1.1503 | 6550 | 0.0037 | - | - |
| 1.1591 | 6600 | 0.0027 | 0.0073 | - |
| 1.1679 | 6650 | 0.0005 | - | - |
| 1.1767 | 6700 | 0.0239 | - | - |
| 1.1855 | 6750 | 0.0136 | - | - |
| 1.1942 | 6800 | 0.0251 | 0.0070 | - |
| 1.2030 | 6850 | 0.0004 | - | - |
| 1.2118 | 6900 | 0.0065 | - | - |
| 1.2206 | 6950 | 0.0109 | - | - |
| 1.2294 | 7000 | 0.0009 | 0.0043 | - |
| 1.2381 | 7050 | 0.0086 | - | - |
| 1.2469 | 7100 | 0.003 | - | - |
| 1.2557 | 7150 | 0.0044 | - | - |
| 1.2645 | 7200 | 0.0118 | 0.0058 | - |
| 1.2733 | 7250 | 0.0093 | - | - |
| 1.2821 | 7300 | 0.0023 | - | - |
| 1.2908 | 7350 | 0.002 | - | - |
| 1.2996 | 7400 | 0.007 | 0.0061 | - |
| 1.3084 | 7450 | 0.0162 | - | - |
| 1.3172 | 7500 | 0.0011 | - | - |
| 1.3260 | 7550 | 0.007 | - | - |
| 1.3347 | 7600 | 0.0014 | 0.0048 | - |
| 1.3435 | 7650 | 0.0033 | - | - |
| 1.3523 | 7700 | 0.0007 | - | - |
| 1.3611 | 7750 | 0.0017 | - | - |
| 1.3699 | 7800 | 0.0078 | 0.0049 | - |
| 1.3786 | 7850 | 0.0049 | - | - |
| 1.3874 | 7900 | 0.003 | - | - |
| 1.3962 | 7950 | 0.0028 | - | - |
| 1.4050 | 8000 | 0.0038 | 0.0033 | - |
| 1.4138 | 8050 | 0.0158 | - | - |
| 1.4226 | 8100 | 0.0008 | - | - |
| 1.4313 | 8150 | 0.0007 | - | - |
| 1.4401 | 8200 | 0.0038 | 0.0024 | - |
| 1.4489 | 8250 | 0.0177 | - | - |
| 1.4577 | 8300 | 0.0044 | - | - |
| 1.4665 | 8350 | 0.0064 | - | - |
| 1.4752 | 8400 | 0.0005 | 0.0049 | - |
| 1.4840 | 8450 | 0.0146 | - | - |
| 1.4928 | 8500 | 0.001 | - | - |
| 1.5016 | 8550 | 0.0014 | - | - |
| 1.5104 | 8600 | 0.0041 | 0.0038 | - |
| 1.5191 | 8650 | 0.0072 | - | - |
| 1.5279 | 8700 | 0.0014 | - | - |
| 1.5367 | 8750 | 0.0135 | - | - |
| 1.5455 | 8800 | 0.0148 | 0.0039 | - |
| 1.5543 | 8850 | 0.0017 | - | - |
| 1.5630 | 8900 | 0.007 | - | - |
| 1.5718 | 8950 | 0.012 | - | - |
| 1.5806 | 9000 | 0.0004 | 0.0024 | - |
| 1.5894 | 9050 | 0.0026 | - | - |
| 1.5982 | 9100 | 0.0109 | - | - |
| 1.6070 | 9150 | 0.0009 | - | - |
| 1.6157 | 9200 | 0.0054 | 0.0022 | - |
| 1.6245 | 9250 | 0.0032 | - | - |
| 1.6333 | 9300 | 0.0135 | - | - |
| 1.6421 | 9350 | 0.0131 | - | - |
| 1.6509 | 9400 | 0.0049 | 0.0021 | - |
| 1.6596 | 9450 | 0.0003 | - | - |
| 1.6684 | 9500 | 0.0027 | - | - |
| 1.6772 | 9550 | 0.0008 | - | - |
| **1.686** | **9600** | **0.0124** | **0.002** | **-** |
| 1.6948 | 9650 | 0.0026 | - | - |
| 1.7035 | 9700 | 0.004 | - | - |
| 1.7123 | 9750 | 0.0008 | - | - |
| 1.7211 | 9800 | 0.0058 | 0.0028 | - |
| 1.7299 | 9850 | 0.0133 | - | - |
| 1.7387 | 9900 | 0.0005 | - | - |
| 1.7475 | 9950 | 0.0007 | - | - |
| 1.7562 | 10000 | 0.0009 | 0.0040 | - |
| 1.7650 | 10050 | 0.0018 | - | - |
| 1.7738 | 10100 | 0.0002 | - | - |
| 1.7826 | 10150 | 0.002 | - | - |
| 1.7914 | 10200 | 0.0021 | 0.0042 | - |
| 1.8001 | 10250 | 0.002 | - | - |
| 1.8089 | 10300 | 0.0003 | - | - |
| 1.8177 | 10350 | 0.001 | - | - |
| 1.8265 | 10400 | 0.0092 | 0.0045 | - |
| 1.8353 | 10450 | 0.0032 | - | - |
| 1.8440 | 10500 | 0.0002 | - | - |
| 1.8528 | 10550 | 0.0026 | - | - |
| 1.8616 | 10600 | 0.0003 | 0.0048 | - |
| 1.8704 | 10650 | 0.001 | - | - |
| 1.8792 | 10700 | 0.0126 | - | - |
| 1.8880 | 10750 | 0.0172 | - | - |
| 1.8967 | 10800 | 0.0002 | 0.0034 | - |
| 1.9055 | 10850 | 0.0038 | - | - |
| 1.9143 | 10900 | 0.0005 | - | - |
| 1.9231 | 10950 | 0.0001 | - | - |
| 1.9319 | 11000 | 0.0162 | 0.0033 | - |
| 1.9406 | 11050 | 0.0037 | - | - |
| 1.9494 | 11100 | 0.0003 | - | - |
| 1.9582 | 11150 | 0.0021 | - | - |
| 1.9670 | 11200 | 0.0098 | 0.0032 | - |
| 1.9758 | 11250 | 0.0003 | - | - |
| 1.9845 | 11300 | 0.0017 | - | - |
| 1.9933 | 11350 | 0.0048 | - | - |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.10.17
- Sentence Transformers: 5.1.0
- Transformers: 4.55.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.7.0
- Datasets: 3.5.0
- Tokenizers: 0.21.4
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}