Initial commit: Fine-tuned embedding-gemma-300m on GeoGPT-QA dataset
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:20000
  - loss:MultipleNegativesRankingLoss
base_model: google/embeddinggemma-300m
widget:
  - source_sentence: >
      What is the importance of careful management of intravascular volume and
      hypertension in preventing acute pulmonary edema in post-MI parturients
      during the immediate postpartum period?
    sentences:
      - >-
        True resistant hypertension is defined as office blood pressure (BP)
        ≥140/90 mm Hg despite being treated with at least 3 antihypertensive
        drugs including a diuretic. It is confirmed by initial 24-hour
        ambulatory blood pressure monitoring (≥130/80 mm Hg), which excludes the
        white coat effect.
      - >-
        Careful management of intravascular volume and hypertension is essential
        in preventing acute pulmonary edema in post-MI parturients during the
        immediate postpartum period. By controlling intravascular volume and
        lowering systolic pressure, the left ventricle can eject to a smaller
        end-systolic volume, reducing pulmonary congestion caused by diastolic
        dysfunction. Strict control of hypertension is particularly important as
        it is associated with a better short-term prognosis than hypotension in
        patients with pulmonary edema and reversible/transient diastolic
        dysfunction. Proper management can help prevent complications and
        improve outcomes in these patients.
      - >-
        Non-HDL-C is selected as a superior lipid component for assessing
        cardiovascular health because it has been found to have higher
        prediction of cardiovascular disease compared to total cholesterol. It
        represents cholesterol contained in atherogenic lipoprotein particles
        and is recommended for screening followed by a fasting lipid profile
        when the non-HDL-C is above a certain threshold. It is also associated
        with obesity and can be used to grade the lipid metric.
  - source_sentence: >
      What diagnostic tests can be used to assess the severity and type of
      regurgitation following prosthetic valve replacement?
    sentences:
      - >-
        The major complications associated with the Hancock II bioprosthesis in
        the aortic position include thromboembolism, prosthetic valve
        endocarditis, major bleeding, and non-structural valve failure.
      - >-
        Anticoagulant therapy is generally recommended for patients with
        valvular heart disease, particularly when associated with atrial
        fibrillation. This is because the frequency and consequences of
        thromboembolic events are usually greater than the risk of bleeding.
        However, the risks of thromboembolism and bleeding must be balanced, as
        anticoagulant therapy carries a hemorrhagic risk. 
      - >-
        Echocardiography is the diagnostic test of choice for assessing
        prosthetic valve function and determining the severity and type of
        regurgitation. However, other imaging modalities, such as angiography,
        can also be used to assess the spatial and anatomic dimensions of
        paravalvular leakage (PVL) in surgical prosthetic valves. These tests
        help determine whether the regurgitation is functional or abnormal, and
        if abnormal, whether it is central or paravalvular.
  - source_sentence: >
      What are the treatment strategies used for patients with PLE following the
      Fontan operation?
    sentences:
      - >-
        The potential obstacles in targeting SERCA2a for heart failure treatment
        include the choice of AAV vectors, which have advantages but are limited
        by the usage of neutralizing antibodies against AAV and relatively low
        transduction efficiency. Additionally, long-term effects need to be
        studied in larger groups of patients. Alternative approaches, such as
        transplantation of induced pluripotent stem cells or deriving de novo
        cardiomyocytes, may enhance the long-term benefits of the therapy.
      - >-
        The rate of sudden cardiac death can be decreased through the
        identification and treatment of at-risk patients using evidence-based
        pharmacotherapy and interventional strategies. Primary prevention
        involves using medications such as beta-blockers, aspirin, statins, and
        angiotensin-converting enzyme inhibitors, as well as revascularization.
        Secondary prevention focuses on patients who have already experienced an
        acute myocardial infarction and aims to prevent further episodes of
        sudden cardiac death.
      - >-
        Treatment strategies for patients with PLE following the Fontan
        operation include medical therapy, such as controlled-release budesonide
        and sildenafil, as well as interventional and surgical therapies, such
        as Fontan revision and Fontan fenestration creation. However, limited
        studies have reported improved survival for patients with PLE following
        the Fontan operation.
  - source_sentence: >-
      How can positron emission tomography (PET) contribute to the diagnostic
      armamentarium for assessing myocardial viability?
    sentences:
      - >-
        The indications for intervention in cases of coronary artery fistulas
        include the presence of significant left-to-right shunt, left
        ventricular volume overload, myocardial ischemia, left ventricular
        dysfunction, congestive cardiac failure, and prevention of
        endocarditis/endarteritis. Interventions can be percutaneous or
        surgical, with options such as surgical closure of the fistula or
        percutaneous intervention using occlusion coils, vascular plugs, covered
        stents, or umbrella devices.
      - >-
        Positron emission tomography (PET) is considered the gold standard for
        detecting viable myocardium. It can accurately differentiate between
        viable myocardium and scar tissue (fibrosis) based on the presence or
        absence of metabolism and normal perfusion/metabolism ratio. PET studies
        have shown positive predictive values ranging from 75 to 80% and
        negative predictive values from 78 to 92% in differentiating viable
        myocardium from fibrosis. However, defining the viability of myocardial
        areas is challenging due to the mixture of vital myocardium, scar
        tissue, and hibernating areas within the three-dimensional organ.
      - >-
        Semi-quantitative scoring systems that include comorbidities and
        measures of functional capacity have been developed to estimate
        mortality in patients with severe aortic stenosis. These scores have
        good predictive accuracy and can help evaluate the life expectancy of
        patients in different age groups.
  - source_sentence: >-
      What are the limitations of current alternative technologies, such as
      drug-eluting stents and coated balloons, in the treatment of arterial
      lesions?
    sentences:
      - >-
        Apart from oxygen supply-demand imbalance and lactic acidosis, factors
        such as the buffering capacity of H' and volume changes of the
        exercising muscle cells may play a role in muscle fatigue. The decrease
        in pH and increase in intracellular osmolality when lactate accumulates
        without a simultaneous loss of an intracellular anion can contribute to
        muscle fatigue. Additionally, intracellular acidosis accompanying the
        increase in lactate may block excitation-contraction coupling, further
        contributing to fatigue.
      - >-
        MSCT (Multi-Slice Computed Tomography) and 3D TEE (Transesophageal
        Echocardiography) have a role in VIV procedures when the size of the SHV
        (Surgical Heart Valve) is not known. These imaging techniques can
        provide valuable information about the size and design of the SHV, which
        is important for selecting the appropriate THV (Transcatheter Heart
        Valve) valve type and sizes. By using MSCT or 3D TEE, healthcare
        professionals can gather the necessary information to ensure the proper
        placement of the THV during a VIV/VIR procedure.
      - >-
        Current studies on alternative technologies are limited to evaluating
        their use in non-complex and minimally calcified lesions. These
        technologies have theoretical advantages in inhibiting intimal
        hyperplasia and intra-stent stenosis, but their effectiveness in more
        complex lesions is still under evaluation.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on google/embeddinggemma-300m

This is a sentence-transformers model fine-tuned from google/embeddinggemma-300m. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google/embeddinggemma-300m
  • Maximum Sequence Length: 2048 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
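
The list above names cosine similarity as the similarity function. As a minimal sketch in plain Python, it can be computed as below. The helper name `cosine_similarity` is illustrative and not part of the library, and the toy vectors are 3-dimensional stand-ins for the 768-dimensional outputs.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for 768-dimensional embeddings.
u = [1.0, 0.0, 0.0]
v = [1.0, 1.0, 0.0]
print(round(cosine_similarity(u, v), 4))  # 0.7071
```

The score depends only on direction, so scaling either vector leaves it unchanged.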

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
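
To make the module stack above concrete, here is a rough plain-Python sketch of what modules (1) to (4) do to the token vectors produced by the transformer. It is an illustration under assumptions, not the library's implementation. The helper names are hypothetical, the weights are random placeholders rather than learned parameters, and the sizes are shrunk from 768 -> 3072 -> 768 to 4 -> 8 -> 4.

```python
import math
import random

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (module 1, mean pooling)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def linear_no_bias(vector, weight):
    """Bias-free linear map with identity activation (modules 2 and 3).

    `weight` has shape (out_features, in_features), stored as a list of rows.
    """
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]

def l2_normalize(vector):
    """Scale the vector to unit length (module 4)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def random_matrix(rows, cols, rng):
    """Placeholder weights; the real model uses learned parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Toy sizes: the real model uses 768 -> 3072 -> 768.
HIDDEN, EXPANDED = 4, 8
rng = random.Random(0)
w_up = random_matrix(EXPANDED, HIDDEN, rng)
w_down = random_matrix(HIDDEN, EXPANDED, rng)

# Three token vectors; the last one is padding and is masked out.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)  # [2.0, 2.0, 2.0, 2.0]
embedding = l2_normalize(linear_no_bias(linear_no_bias(pooled, w_up), w_down))
print(len(embedding), sum(x * x for x in embedding))
```

Because the final module normalizes each embedding to unit length, the dot product of two embeddings equals their cosine similarity.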

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yasserrmd/cardio-gemma-300m-emb")
# Run inference
queries = [
    "What are the limitations of current alternative technologies, such as drug-eluting stents and coated balloons, in the treatment of arterial lesions?",
]
documents = [
    'Current studies on alternative technologies are limited to evaluating their use in non-complex and minimally calcified lesions. These technologies have theoretical advantages in inhibiting intimal hyperplasia and intra-stent stenosis, but their effectiveness in more complex lesions is still under evaluation.',
    'MSCT (Multi-Slice Computed Tomography) and 3D TEE (Transesophageal Echocardiography) have a role in VIV procedures when the size of the SHV (Surgical Heart Valve) is not known. These imaging techniques can provide valuable information about the size and design of the SHV, which is important for selecting the appropriate THV (Transcatheter Heart Valve) valve type and sizes. By using MSCT or 3D TEE, healthcare professionals can gather the necessary information to ensure the proper placement of the THV during a VIV/VIR procedure.',
    "Apart from oxygen supply-demand imbalance and lactic acidosis, factors such as the buffering capacity of H' and volume changes of the exercising muscle cells may play a role in muscle fatigue. The decrease in pH and increase in intracellular osmolality when lactate accumulates without a simultaneous loss of an intracellular anion can contribute to muscle fatigue. Additionally, intracellular acidosis accompanying the increase in lactate may block excitation-contraction coupling, further contributing to fatigue.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.4404, 0.0641, 0.0375]])
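
A common next step is to rank the documents by their score for a query. The sketch below works on plain lists and reuses the three scores from the example output above. With a real similarity tensor, one would first convert the row with `.tolist()`. The function name `rank_documents` and the short labels are illustrative only.

```python
# Scores copied from the example output above (one query, three documents).
scores = [0.4404, 0.0641, 0.0375]
# Short labels standing in for the full document strings.
labels = ["alternative technologies", "MSCT and 3D TEE", "muscle fatigue"]

def rank_documents(scores, labels):
    """Return (label, score) pairs sorted from most to least similar."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)

ranking = rank_documents(scores, labels)
for position, (label, score) in enumerate(ranking, start=1):
    print(f"{position}. {label}: {score:.4f}")
```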

Training Details

Training Dataset

Unnamed Dataset

  • Size: 20,000 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (type: string): min 8 tokens, mean 22.56 tokens, max 57 tokens
    • sentence_1 (type: string): min 17 tokens, mean 89.45 tokens, max 260 tokens
  • Samples:
    • sentence_0: What are the key features of diabetic cardiomyopathy and how are they affected by 11β-HSD1 inhibition?
      sentence_1: Diabetic cardiomyopathy is characterized by fibrosis and hypertrophy in the heart tissues. In the low dose STZ-high fat model of type 2 diabetes, diabetic mice showed increased collagen deposition and irregular/disorganized muscle fibers in the heart. However, treatment with PF, an inhibitor of 11β-HSD1, normalized these alterations, indicating that 11β-HSD1 inhibition can prevent the development of diabetic cardiomyopathy.
    • sentence_0: How does tissue Doppler imaging (TDI) contribute to the assessment of myocardial dyssynchrony?
      sentence_1: Tissue Doppler imaging (TDI) is a technique used in echocardiography to evaluate the motion of the left ventricle. By analyzing myocardial regional velocity curves, TDI can provide information on the timing of systolic contractions in different myocardial segments. In the context of assessing dyssynchrony, TDI can measure the time-to-peak myocardial sustained systolic velocities (Ts) in all 12 left ventricular (LV) segments. The standard deviation of Ts (Ts-SD) can then be calculated to determine the presence of significant systolic IVD.
    • sentence_0: How is conventional coronary angiography performed?
      sentence_1: Conventional coronary angiography is performed via a femoral approach using approximately 40 mL of nonionic contrast material. A minimum of six orthogonal views are obtained to evaluate the coronary arteries. The images are evaluated by a board-certified cardiologist who assesses the diameter stenosis by visual estimation.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
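
To show what these parameters control, the following is a simplified plain-Python sketch of a multiple-negatives ranking objective with cosine similarity and a scale of 20.0. It is a sketch under assumptions, not the library's code, and the function names are illustrative. Each sentence_0 is scored against every sentence_1 in the batch, and the loss is the cross-entropy of picking its own pair.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i].

    Every other positive in the batch acts as an in-batch negative.
    """
    if len(anchors) != len(positives):
        raise ValueError("anchors and positives must have the same length")
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy 2-dimensional "embeddings" for a batch of two pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each anchor is closest to its own pair
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each anchor is closest to the wrong pair

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
print(loss_aligned < loss_swapped)  # True
```

With a per-device batch size of 6 and no explicit hard negatives, each sentence_0 would see five in-batch negatives per step, assuming training ran on a single device.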
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 6
  • per_device_eval_batch_size: 6
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 6
  • per_device_eval_batch_size: 6
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch   Step  Training Loss
0.1500   500  0.0276
0.2999  1000  0.0145
0.4499  1500  0.0072
0.5999  2000  0.007
0.7499  2500  0.0039
0.8998  3000  0.0044
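
The Epoch column can be reproduced from the step number. Assuming a single device, 20,000 samples at batch size 6 with the last partial batch kept (dataloader_drop_last: False) give 3,334 steps per epoch, and the epoch fraction is the step divided by that count. The short check below is a worked calculation, not output from the training run.

```python
import math

TRAIN_SAMPLES = 20_000
BATCH_SIZE = 6  # per_device_train_batch_size, assuming one device

# The last partial batch is kept, so round up.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)

def epoch_at(step):
    """Fraction of the epoch completed after `step` optimizer updates."""
    return step / steps_per_epoch

logged_steps = (500, 1000, 1500, 2000, 2500, 3000)
epochs = [f"{epoch_at(step):.4f}" for step in logged_steps]
print(steps_per_epoch)  # 3334
print(epochs)  # ['0.1500', '0.2999', '0.4499', '0.5999', '0.7499', '0.8998']
```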

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.1
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}