Libraries

How to use Komalverma/custom_bge_baai_cfr with sentence-transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Komalverma/custom_bge_baai_cfr")

sentences = [
    "<> The home health aide does not need to be present during the supervisory assessment described in paragraph (h)(1)(i)(A) of this section. <> The supervisory assessment must be completed onsite (that is, an in person visit), or on the rare occasion by using two-way audio-video telecommunications technology that allows for real-time interaction between the registered nurse (or other appropriate skilled professional) and the patient, not to exceed 1 virtual supervisory assessment per patient in a 60-day episode. <> If an area of concern in aide services is noted by the supervising registered nurse or other appropriate skilled professional, then the supervising individual must make an on-site visit to the location where the patient is receiving care in order to observe and assess the aide while he or she is performing care.",
    "<> The State need not require the facility to disclose the same information described in this paragraph (e) more than once on the same enrollment application submission. Federal financial participation (FFP) is not available in payments made to a disclosing entity that fails to disclose ownership or control information as required by this section.",
    "<> The home health aide does not need to be present during the supervisory assessment described in paragraph (h)(1)(i)(A) of this section. <> The supervisory assessment must be completed onsite (that is, an in person visit), or on the rare occasion by using two-way audio-video telecommunications technology that allows for real-time interaction between the registered nurse (or other appropriate skilled professional) and the patient, not to exceed 1 virtual supervisory assessment per patient in a 60-day episode. <> If an area of concern in aide services is noted by the supervising registered nurse or other appropriate skilled professional, then the supervising individual must make an on-site visit to the location where the patient is receiving care in order to observe and assess the aide while he or she is performing care.",
    "<> A medical device distributor or wholesaler that is not otherwise a manufacturer of a device or medical supplies. [ENUM Coordination and management of care (or coordinating and managing care)] (i) means the deliberate organization of patient care activities and sharing of information between two or more VBE participants, one or more VBE participants and the VBE, or one or more VBE participants and patients, that is designed to achieve safer, more effective, or more efficient care to improve the health outcomes of the target patient population."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: BAAI/bge-base-en-v1.5
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Supported Modality: Text

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
  (1): Pooling({'embedding_dimension': 768, 'pooling_mode': 'mean', 'include_prompt': True})
)
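
The Pooling module above is configured with pooling_mode 'mean', which averages the token embeddings from the Transformer module into a single sentence vector while ignoring padding positions. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function name `mean_pool` and the toy 2-dimensional vectors are assumptions for illustration; the model itself produces 768-dimensional vectors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask value is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy example: three tokens of dimension 2; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padded token does not influence the result because its mask value is 0.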

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Komalverma/custom_bge_baai_cfr")
# Run inference
sentences = [
    'Placing the health of (or, with respect to a pregnant woman, health of unborn child) in serious there is to a safe to another hospital before delivery; <> may pose a the health or or the unborn child. [ENUM Hospital] includes access as defined section 1861(mm)(1) Act and a emergency hospital as in section 1861(kkk)(2).',
    '<> Placing the health of the individual (or, with respect to a pregnant woman, the health of the woman or her unborn child) in serious jeopardy; <> That there is inadequate time to effect a safe transfer to another hospital before delivery; or <> That transfer may pose a threat to the health or safety of the woman or the unborn child. [ENUM Hospital] includes a critical access hospital as defined in section 1861(mm)(1) of the Act and a rural emergency hospital as defined in section 1861(kkk)(2).',
    '<> If CMS determines that a facility or organization that had previously been determined to be provider-based under this section no longer qualifies for provider-based status, and if the failure to qualify for provider-based status resulted from a material change in the relationship between the provider and the facility or organization that the provider did not report to CMS under paragraph (c) of this section, CMS will take the actions with respect to notice to the provider, adjustment of payments, and continuation of payment described in paragraphs (j)(3), (j)(4), and (j)(5) of this section, and will recover past payments to the provider to the extent described in paragraph (j)(1)(ii) of this section.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9431, 0.4788],
#         [0.9431, 1.0000, 0.5043],
#         [0.4788, 0.5043, 1.0000]])
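
Since the model is described as usable for retrieval with cosine similarity, one way to use the embeddings is to rank candidate passages against a query. The sketch below shows only the ranking step in pure Python. The helper names `cosine` and `rank_documents` are hypothetical, and the toy 2-dimensional vectors stand in for the 768-dimensional output of `model.encode`.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumed); in practice these would come from model.encode(...).
query = [1.0, 0.0]
docs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rank_documents(query, docs)
print([index for index, _ in ranking])  # [1, 2, 0]
```

Cosine similarity ignores vector length, so `docs[1]` scores 1.0 against the query even though it is twice as long.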

Training Details

Training Dataset

Unnamed Dataset

Size: 24,712 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 1000 samples:
	sentence_0	sentence_1
type	string	string
details	min: 23 tokens, mean: 93.61 tokens, max: 283 tokens	min: 36 tokens, mean: 133.76 tokens, max: 316 tokens

Samples:

sentence_0	sentence_1
Be specially designed to respond medical provide acute to transport sick and and with all State and local laws governing an <> Be equipped emergency warning lights and required by State or laws. Be with telecommunications equipment as required or local law to minimum, one two-way voice radio wireless telephone. <> Be with a oxygen medical equipment as required State or local	<> Be specially designed to respond to medical emergencies or provide acute medical care to transport the sick and injured and comply with all State and local laws governing an emergency transportation vehicle. <> Be equipped with emergency warning lights and sirens, as required by State or local laws. <> Be equipped with telecommunications equipment as required by State or local law to include, at a minimum, one two-way voice radio or wireless telephone. <> Be equipped with a stretcher, linens, emergency medical supplies, oxygen equipment, and other lifesaving emergency medical equipment as required by State or local laws.
Except paragraph (b) this section, a Part D plan sponsor that approves request for expedited determination must notify the enrollee (and the prescribing physician prescriber involved, appropriate) decision, whether adverse or as as the enrollee's condition requires, no [NUM] hours after receiving For the sponsor must notify (and the prescribing physician other prescriber involved, as appropriate) of its determination as expeditiously the enrollee's health condition requires, but later [NUM] hours after of the physician's or other prescriber's supporting statement. If a supporting is not received by end of 14 days from receipt of the exceptions Part D sponsor must notify enrollee prescribing physician involved, appropriate) of expeditiously as the enrollee's condition requires, later [NUM] hours from end of 14 days from receipt of request.	Except as provided in paragraph (b) of this section, a Part D plan sponsor that approves a request for expedited determination must make its determination and notify the enrollee (and the prescribing physician or other prescriber involved, as appropriate) of its decision, whether adverse or favorable, as expeditiously as the enrollee's health condition requires, but no later than 24 hours after receiving the request. For an exceptions request, the Part D plan sponsor must notify the enrollee (and the prescribing physician or other prescriber involved, as appropriate) of its determination as expeditiously as the enrollee's health condition requires, but no later than 24 hours after receipt of the physician's or other prescriber's supporting statement. If a supporting statement is not received by the end of 14 calendar days from receipt of the exceptions request, the Part D plan sponsor must notify the enrollee (and the prescribing physician or other prescriber involved, as appropriate) ...
subpart implements sections 1902(a)(38), 1903(i)(2), and 1903(n) of Social Security Act. It forth State plan requirements Disclosure by and fiscal and control and of on a provider's other persons of offenses against Medicare, Medicaid, or the title XX services	This subpart implements sections 1124, 1126, 1902(a)(38), 1903(i)(2), and 1903(n) of the Social Security Act. It sets forth State plan requirements regarding— <> Disclosure by providers and fiscal agents of ownership and control information; and <> Disclosure of information on a provider's owners and other persons convicted of criminal offenses against Medicare, Medicaid, or the title XX services program.

Loss: DenoisingAutoEncoderLoss with these parameters:

{
    "decoder_name_or_path": "BAAI/bge-base-en-v1.5",
    "need_retokenization": false
}
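
In the samples above, sentence_0 reads like sentence_1 with words removed. That matches the usual setup for a denoising auto-encoder objective, where the model is trained to reconstruct the original text from a corrupted copy. The card does not state how the noise was generated, so the following is only a sketch assuming random word-level deletion. The function name `delete_words` and the default `del_ratio=0.3` are assumptions; the ratio is loosely suggested by the mean token counts (93.61 versus 133.76) and is not a documented setting.

```python
import random
from typing import Optional


def delete_words(text: str, del_ratio: float = 0.3, seed: Optional[int] = None) -> str:
    """Drop each whitespace-separated word independently with probability del_ratio."""
    rng = random.Random(seed)
    words = text.split()
    if not words:
        return text
    kept = [word for word in words if rng.random() >= del_ratio]
    # Keep at least one word so the noised input is never empty.
    if not kept:
        kept = [rng.choice(words)]
    return " ".join(kept)


original = (
    "This subpart implements sections 1124, 1126, 1902(a)(38), 1903(i)(2), "
    "and 1903(n) of the Social Security Act."
)
noisy = delete_words(original, seed=0)
# A training pair in the card's column layout: (sentence_0, sentence_1).
pair = (noisy, original)
print(pair)
```

The surviving words keep their original order, which is also what the sample rows show.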

Training Hyperparameters

Non-Default Hyperparameters

num_train_epochs: 10
multi_dataset_batch_sampler: round_robin

All Hyperparameters

overwrite_output_dir: False
do_predict: False
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 10
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}
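
With learning_rate 5e-05, lr_scheduler_type linear, and warmup_steps 0, the learning rate would be expected to decay in a straight line from its initial value to zero over the run. The sketch below illustrates that schedule in plain Python. The function name `linear_lr` is hypothetical, and the default `total_steps=30890` is an assumption derived from 24,712 samples at batch size 8 for 10 epochs on a single device.

```python
def linear_lr(
    step: int,
    base_lr: float = 5e-05,
    total_steps: int = 30890,  # assumed: ceil(24712 / 8) * 10 on one device
    warmup_steps: int = 0,
) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


for step in (0, 15445, 30890):
    print(step, linear_lr(step))
```

Under these assumptions the rate is 5e-05 at step 0, half of that at the midpoint, and 0 at the final step.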

Training Logs

Epoch	Step	Training Loss
0.1619	500	7.6991
0.3237	1000	6.1472
0.4856	1500	5.3852
0.6475	2000	4.7963
0.8093	2500	4.3753
0.9712	3000	4.0604
1.1331	3500	3.76
1.2949	4000	3.5502
1.4568	4500	3.3828
1.6186	5000	3.274
1.7805	5500	3.1832
1.9424	6000	3.0938
2.1042	6500	2.9526
2.2661	7000	2.8591
2.4280	7500	2.818
2.5898	8000	2.7473
2.7517	8500	2.7077
2.9136	9000	2.6896
3.0754	9500	2.5649
3.2373	10000	2.4759
3.3992	10500	2.439
3.5610	11000	2.4331
3.7229	11500	2.3935
3.8848	12000	2.4138
4.0466	12500	2.311
4.2085	13000	2.1906
4.3703	13500	2.214
4.5322	14000	2.1814
4.6941	14500	2.1606
4.8559	15000	2.16
5.0178	15500	2.1203
5.1797	16000	1.9845
5.3415	16500	1.9753
5.5034	17000	1.9799
5.6653	17500	1.9741
5.8271	18000	1.9665
5.9890	18500	1.9645
6.1509	19000	1.8199
6.3127	19500	1.8093
6.4746	20000	1.8284
6.6365	20500	1.8244
6.7983	21000	1.8078
6.9602	21500	1.8021
7.1220	22000	1.7215
7.2839	22500	1.7091
7.4458	23000	1.6928
7.6076	23500	1.687
7.7695	24000	1.6959
7.9314	24500	1.6889
8.0932	25000	1.6431
8.2551	25500	1.6154
8.4170	26000	1.6315
8.5788	26500	1.6223
8.7407	27000	1.6144
8.9026	27500	1.6187
9.0644	28000	1.6091
9.2263	28500	1.5862
9.3882	29000	1.5785
9.5500	29500	1.5802
9.7119	30000	1.5989
9.8737	30500	1.5853
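
The Epoch column in the log can be reproduced from the dataset size and batch size reported above. This short check assumes a single device and no gradient accumulation (gradient_accumulation_steps is 1); the variable and function names are chosen here for illustration.

```python
import math

NUM_SAMPLES = 24_712  # training samples
BATCH_SIZE = 8        # per_device_train_batch_size
NUM_EPOCHS = 10       # num_train_epochs

steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)   # 3089
print(total_steps)       # 30890
print(epoch_at(500))     # 0.1619
print(epoch_at(30500))   # 9.8737
```

The values for steps 500 and 30500 match the first and last rows of the table, and the last logged step (30500) falls just short of the 30,890 total because logging happens every 500 steps.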

Training Time

Training: 1.4 hours

Framework Versions

Python: 3.12.6
Sentence Transformers: 5.4.1
Transformers: 4.56.0
PyTorch: 2.8.0+cu129
Accelerate: 1.10.1
Datasets: 4.8.4
Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

DenoisingAutoEncoderLoss

@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}