SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Supported Modality: Text

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 768, 'pooling_mode': 'mean', 'include_prompt': True})
)
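
The Pooling module above uses pooling_mode 'mean': the sentence embedding is the average of the token embeddings, with padding positions left out. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy 2-dimensional vectors are assumptions made for illustration (the real model produces 768-dimensional vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: three tokens of dimension 2; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

The padding vector is deliberately large here to show that masked positions do not affect the result.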

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Komalverma/custom_bge_baai_cfr")
# Run inference
sentences = [
    'Placing the health of (or, with respect to a pregnant woman, health of unborn child) in serious there is to a safe to another hospital before delivery; <> may pose a the health or or the unborn child. [ENUM Hospital] includes access as defined section 1861(mm)(1) Act and a emergency hospital as in section 1861(kkk)(2).',
    '<> Placing the health of the individual (or, with respect to a pregnant woman, the health of the woman or her unborn child) in serious jeopardy; <> That there is inadequate time to effect a safe transfer to another hospital before delivery; or <> That transfer may pose a threat to the health or safety of the woman or the unborn child. [ENUM Hospital] includes a critical access hospital as defined in section 1861(mm)(1) of the Act and a rural emergency hospital as defined in section 1861(kkk)(2).',
    '<> If CMS determines that a facility or organization that had previously been determined to be provider-based under this section no longer qualifies for provider-based status, and if the failure to qualify for provider-based status resulted from a material change in the relationship between the provider and the facility or organization that the provider did not report to CMS under paragraph (c) of this section, CMS will take the actions with respect to notice to the provider, adjustment of payments, and continuation of payment described in paragraphs (j)(3), (j)(4), and (j)(5) of this section, and will recover past payments to the provider to the extent described in paragraph (j)(1)(ii) of this section.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9431, 0.4788],
#         [0.9431, 1.0000, 0.5043],
#         [0.4788, 0.5043, 1.0000]])
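
Since the model's similarity function is cosine similarity and its stated use is retrieval, the scoring step can be illustrated without the library. This is a minimal sketch under stated assumptions: the helper names cosine_similarity and rank_by_similarity are hypothetical, and the 2-dimensional vectors stand in for real 768-dimensional embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Convention chosen for this sketch: zero vectors match nothing.
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: Sequence[float], corpus: Sequence[Sequence[float]]) -> List[int]:
    """Return corpus indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, doc) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


query = [1.0, 0.0]
corpus = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
print(rank_by_similarity(query, corpus))  # [1, 2, 0]
```

Cosine similarity ignores vector length, which is why the document [2.0, 0.0] ranks first even though it is twice as long as the query.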

Training Details

Training Dataset

Unnamed Dataset

  • Size: 24,712 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
              sentence_0        sentence_1
    type      string            string
    min       23 tokens         36 tokens
    mean      93.61 tokens      133.76 tokens
    max       283 tokens        316 tokens
  • Samples:
    Sample 1
      sentence_0: Be specially designed to respond medical provide acute to transport sick and and with all State and local laws governing an <> Be equipped emergency warning lights and required by State or laws. Be with telecommunications equipment as required or local law to minimum, one two-way voice radio wireless telephone. <> Be with a oxygen medical equipment as required State or local
      sentence_1: <> Be specially designed to respond to medical emergencies or provide acute medical care to transport the sick and injured and comply with all State and local laws governing an emergency transportation vehicle. <> Be equipped with emergency warning lights and sirens, as required by State or local laws. <> Be equipped with telecommunications equipment as required by State or local law to include, at a minimum, one two-way voice radio or wireless telephone. <> Be equipped with a stretcher, linens, emergency medical supplies, oxygen equipment, and other lifesaving emergency medical equipment as required by State or local laws.
    Sample 2
      sentence_0: Except paragraph (b) this section, a Part D plan sponsor that approves request for expedited determination must notify the enrollee (and the prescribing physician prescriber involved, appropriate) decision, whether adverse or as as the enrollee's condition requires, no [NUM] hours after receiving For the sponsor must notify (and the prescribing physician other prescriber involved, as appropriate) of its determination as expeditiously the enrollee's health condition requires, but later [NUM] hours after of the physician's or other prescriber's supporting statement. If a supporting is not received by end of 14 days from receipt of the exceptions Part D sponsor must notify enrollee prescribing physician involved, appropriate) of expeditiously as the enrollee's condition requires, later [NUM] hours from end of 14 days from receipt of request.
      sentence_1: Except as provided in paragraph (b) of this section, a Part D plan sponsor that approves a request for expedited determination must make its determination and notify the enrollee (and the prescribing physician or other prescriber involved, as appropriate) of its decision, whether adverse or favorable, as expeditiously as the enrollee's health condition requires, but no later than 24 hours after receiving the request. For an exceptions request, the Part D plan sponsor must notify the enrollee (and the prescribing physician or other prescriber involved, as appropriate) of its determination as expeditiously as the enrollee's health condition requires, but no later than 24 hours after receipt of the physician's or other prescriber's supporting statement. If a supporting statement is not received by the end of 14 calendar days from receipt of the exceptions request, the Part D plan sponsor must notify the enrollee (and the prescribing physician or other prescriber involved, as appropriate) ...
    Sample 3
      sentence_0: subpart implements sections 1902(a)(38), 1903(i)(2), and 1903(n) of Social Security Act. It forth State plan requirements Disclosure by and fiscal and control and of on a provider's other persons of offenses against Medicare, Medicaid, or the title XX services
      sentence_1: This subpart implements sections 1124, 1126, 1902(a)(38), 1903(i)(2), and 1903(n) of the Social Security Act. It sets forth State plan requirements regarding— <> Disclosure by providers and fiscal agents of ownership and control information; and <> Disclosure of information on a provider's owners and other persons convicted of criminal offenses against Medicare, Medicaid, or the title XX services program.
  • Loss: DenoisingAutoEncoderLoss with these parameters:
    {
        "decoder_name_or_path": "BAAI/bge-base-en-v1.5",
        "need_retokenization": false
    }
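
The samples above pair a corrupted sentence (sentence_0) with its clean original (sentence_1), which is the input/target arrangement a denoising auto-encoder objective expects: the model encodes the noisy text and a decoder tries to reconstruct the original. A minimal sketch of word-deletion noise follows. It is an illustration only: the function delete_words, the whitespace tokenization, and del_ratio=0.3 are assumptions, since the card does not state how the noise was produced (the mean token counts of 93.61 versus 133.76 only loosely suggest that about 30% of tokens were dropped).

```python
import random
from typing import Optional


def delete_words(text: str, del_ratio: float = 0.3, rng: Optional[random.Random] = None) -> str:
    """Drop each whitespace-separated word independently with probability del_ratio."""
    rng = rng or random.Random()
    words = text.split()
    if not words:
        return text
    kept = [word for word in words if rng.random() >= del_ratio]
    # Keep at least one word so the encoder never receives an empty input.
    if not kept:
        kept = [rng.choice(words)]
    return " ".join(kept)


clean = "Be equipped with emergency warning lights and sirens, as required by State or local laws."
noisy = delete_words(clean, del_ratio=0.3, rng=random.Random(0))
# (noisy, clean) then plays the role of (sentence_0, sentence_1).
print(noisy)
print(clean)
```

Word order is preserved, so the noisy text is always a subsequence of the clean text, as in the samples shown above.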
    

Training Hyperparameters

Non-Default Hyperparameters

  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
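
With learning_rate 5e-05, lr_scheduler_type linear, and warmup_steps 0, the learning rate is expected to decay in a straight line from its peak to zero over the run. The sketch below shows that rule in plain Python; it is not the library's scheduler code. The function name linear_lr is hypothetical, and total_steps=30890 is an assumption derived from 24,712 samples, batch size 8, and 10 epochs on a single device.

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 30890, warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Start, midpoint, and end of the assumed 30,890-step run.
for step in (0, 15445, 30890):
    print(step, linear_lr(step))
```

With no warmup the first optimizer step already uses the peak rate, and the rate has roughly halved by the middle of epoch 5.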

Training Logs

Epoch Step Training Loss
0.1619 500 7.6991
0.3237 1000 6.1472
0.4856 1500 5.3852
0.6475 2000 4.7963
0.8093 2500 4.3753
0.9712 3000 4.0604
1.1331 3500 3.76
1.2949 4000 3.5502
1.4568 4500 3.3828
1.6186 5000 3.274
1.7805 5500 3.1832
1.9424 6000 3.0938
2.1042 6500 2.9526
2.2661 7000 2.8591
2.4280 7500 2.818
2.5898 8000 2.7473
2.7517 8500 2.7077
2.9136 9000 2.6896
3.0754 9500 2.5649
3.2373 10000 2.4759
3.3992 10500 2.439
3.5610 11000 2.4331
3.7229 11500 2.3935
3.8848 12000 2.4138
4.0466 12500 2.311
4.2085 13000 2.1906
4.3703 13500 2.214
4.5322 14000 2.1814
4.6941 14500 2.1606
4.8559 15000 2.16
5.0178 15500 2.1203
5.1797 16000 1.9845
5.3415 16500 1.9753
5.5034 17000 1.9799
5.6653 17500 1.9741
5.8271 18000 1.9665
5.9890 18500 1.9645
6.1509 19000 1.8199
6.3127 19500 1.8093
6.4746 20000 1.8284
6.6365 20500 1.8244
6.7983 21000 1.8078
6.9602 21500 1.8021
7.1220 22000 1.7215
7.2839 22500 1.7091
7.4458 23000 1.6928
7.6076 23500 1.687
7.7695 24000 1.6959
7.9314 24500 1.6889
8.0932 25000 1.6431
8.2551 25500 1.6154
8.4170 26000 1.6315
8.5788 26500 1.6223
8.7407 27000 1.6144
8.9026 27500 1.6187
9.0644 28000 1.6091
9.2263 28500 1.5862
9.3882 29000 1.5785
9.5500 29500 1.5802
9.7119 30000 1.5989
9.8737 30500 1.5853
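
The Epoch column in the log can be reproduced from the dataset size and batch size reported earlier: 24,712 samples at 8 per batch give 3,089 optimizer steps per epoch, so a step count divided by 3,089 is the fractional epoch. The sketch below assumes a single device and uses gradient_accumulation_steps of 1, as listed in the hyperparameters; the helper names are hypothetical.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches in one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def epoch_at_step(step: int, num_samples: int = 24712, batch_size: int = 8) -> float:
    """Fractional epoch reached after the given number of optimizer steps."""
    return step / steps_per_epoch(num_samples, batch_size)


print(steps_per_epoch(24712, 8))        # 3089
print(round(epoch_at_step(500), 4))     # 0.1619
print(round(epoch_at_step(30500), 4))   # 9.8737
```

The same arithmetic gives 30,890 steps for the full 10 epochs, so the last logged row (step 30,500) falls 390 steps before the end of training.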

Training Time

  • Training: 1.4 hours

Framework Versions

  • Python: 3.12.6
  • Sentence Transformers: 5.4.1
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu129
  • Accelerate: 1.10.1
  • Datasets: 4.8.4
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

DenoisingAutoEncoderLoss

@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}

Repository

  • Model ID: Komalverma/custom_bge_baai_cfr
  • Model size: 0.1B params (Safetensors)
  • Tensor type: F32