bge_MNR / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:98112
  - loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-small-en-v1.5
widget:
  - source_sentence: >-
      Represent this question for retrieving relevant documents: I'm curious
      about strategies businesses can use to ensure data privacy and security.
    sentences:
      - >-
        The Vicsek fractal is a self-replicating fractal, where each square is
        replaced by a pattern of five smaller squares in each iteration. The
        number of line segments in each iteration can be calculated using the
        formula:


        Number of line segments = 4 * (5^(n-1))


        where n is the iteration number.


        For the fifth iteration (n=5):


        Number of line segments = 4 * (5^(5-1))

        Number of line segments = 4 * (5^4)

        Number of line segments = 4 * (625)

        Number of line segments = 2500


        So, there are 2500 line segments needed to draw the fifth iteration of
        the Vicsek fractal.
      - >-
        **Advanced Techniques and Strategies for Data Privacy and Security:**


        1. **Data encryption:** Encrypting data both at rest and in transit
        ensures it remains protected even if accessed by unauthorized
        individuals.


        2. **Regular security audits:** Conducting regular security audits helps
        businesses identify and address vulnerabilities in their systems and
        processes.


        3. **Data masking:** Data masking techniques such as tokenization and
        pseudonymization protect sensitive data by replacing it with
        non-identifiable values.


        4. **Access control management:** Implementing robust access control
        measures, such as role-based access control (RBAC) and multi-factor
        authentication (MFA), restricts access to data to authorized personnel
        only.


        5. **Data minimization:** Businesses should only collect and store data
        that is absolutely necessary for their operations to reduce the risk of
        data breaches.


        **Practical Examples of Data Privacy and Security:**


        - **Healthcare:** Hospitals and medical facilities use data encryption
        to protect patient health records, ensuring compliance with HIPAA
        regulations.


        - **Financial Services:** Banks and financial institutions implement MFA
        and access control measures to safeguard customer financial data.


        - **Retail:** E-commerce companies use data masking techniques to
        protect sensitive customer information, such as credit card numbers.


        **Interview Questions on Data Privacy and Security:**


        - Describe the key principles of data privacy and security.


        - Explain the different methods used for data encryption and their
        strengths and weaknesses.


        - How can organizations implement effective access control mechanisms to
        protect data?


        - What are the best practices for conducting security audits to ensure
        data privacy?


        - Discuss the ethical and legal implications of data privacy and
        security breaches.
      - >-
        First, let's write the system of linear equations as an augmented
        matrix:


        [ 1  2 -1 |  5]

        [ 2 -3  4 |  7]

        [-6  7 -5 | -1]


        Now, we'll perform forward elimination to convert the matrix into an
        upper triangular matrix.


        Step 1: Eliminate x from the second and third rows.


        To eliminate x from the second row, we'll subtract 2 times the first row
        from the second row:


        [ 1  2 -1 |  5]

        [ 0 -7  6 | -3]

        [-6  7 -5 | -1]


        To eliminate x from the third row, we'll add 6 times the first row to
        the third row:


        [ 1  2 -1 |  5]

        [ 0 -7  6 | -3]

        [ 0  5 -1 | 29]


        Step 2: Eliminate y from the third row.


        To eliminate y from the third row, we'll add (5/7) times the second row
        to the third row:


        [ 1  2 -1 |  5]

        [ 0 -7  6 | -3]

        [ 0  0  1 |  4]


        Now, we have an upper triangular matrix, and we can perform back
        substitution to find the values of x, y, and z.


        Step 3: Back substitution


        From the third row, we have z = 4.


        Now, we'll substitute z into the second row to find y:


        -7y + 6(4) = -3

        -7y + 24 = -3

        -7y = -27

        y = 27/7


        Finally, we'll substitute y and z into the first row to find x:


        x + 2(27/7) - 4 = 5

        x + 54/7 - 4 = 5

        x = 5 - 54/7 + 4

        x = (35 - 54 + 28)/7

        x = 9/7


        So, the solution to the system of linear equations is:


        x = 9/7

        y = 27/7

        z = 4
  - source_sentence: >-
      Represent this question for retrieving relevant documents: Solve the
      following system of linear equations with complex coefficients: 


      2x + 5y + 3iz = 8i 


      x - 3y + 2z = 1 


      -4x + 2y + 5iz = -5
    sentences:
      - >-
        To determine whether tympanic membrane (TM) electrodes induce behavioral
        pure-tone threshold shifts. Pure-tone thresholds (250 to 8000 Hz) were
        measured twice in test (n = 18) and control (n = 10) groups. TM
        electrodes were placed between first and second threshold measurements
        in the test group, whereas the control group did not receive electrodes.
        Pure-tone threshold shifts were compared between groups. The effect of
        TM electrode contact location on threshold shifts was evaluated in the
        test group. TM electrodes significantly increased average low-frequency
        thresholds, 7.5 dB at 250 Hz and 4.2 dB at 500 Hz, and shifts were as
        large as 25 dB in individual ears. Also, threshold shifts did not appear
        to vary at any frequency with TM electrode contact location.
      - >-
        To solve the given system of linear equations with complex coefficients,
        we can use the Gaussian elimination method. The given system of
        equations is:


        (1) 2x + 5y + 3iz = 8i

        (2) x - 3y + 2z = 1

        (3) -4x + 2y + 5iz = -5


        First, we will write the augmented matrix for this system:


        |  2   5   3i |  8i |

        |  1  -3   2  |  1  |

        | -4   2   5i | -5  |


        Next, we will perform row operations to get the matrix in row-echelon
        form. We will start by making the first element of the second row 0. To
        do this, we can subtract half of the first row from the second row:


        |  2   5   3i |  8i |

        |  0  -5.5 -1i | -3i |

        | -4   2   5i | -5  |


        Now, we will make the first element of the third row 0. To do this, we
        can add twice the first row to the third row:


        |  2   5   3i |  8i  |

        |  0  -5.5 -1i | -3i  |

        |  0   12  11i | 11i  |


        Next, we will make the second element of the third row 0. To do this, we
        can add (12/5.5) times the second row to the third row:


        |  2   5    3i  |  8i  |

        |  0  -5.5 -1i  | -3i  |

        |  0   0   10.8i|  6i  |


        Now, we have the matrix in row-echelon form. We can now solve for the
        variables using back-substitution.


        From the third row, we have:


        10.8i * z = 6i


        Dividing both sides by 10.8i, we get:


        z = 6i / 10.8i = 6/10.8 = 1/1.8 = 5/9


        Now, we can substitute z back into the second row to find y:


        -5.5y - 1i(5/9) = -3i


        Multiplying both sides by -1, we get:


        5.5y + (5i/9) = 3i


        Subtracting 5i/9 from both sides, we get:


        5.5y = 3i - 5i/9 = (22i - 5i) / 9 = 17i/9


        Dividing both sides by 5.5, we get:


        y = (17i/9) / 5.5 = 17i / 49.5 = 17i / (99/2) = 34i / 99


        Finally, we can substitute y and z back into the first row to find x:


        2x + 5(34i/99) + 3i(5/9) = 8i


        Multiplying both sides by 99, we get:


        198x + 5(34i) + 3i(55) = 792i


        198x + 170i + 165i = 792i


        198x = 792i - 335i = 457i


        Dividing both sides by 198, we get:


        x = 457i / 198


        So, the solution to the given system of linear equations is:


        x = 457i / 198

        y = 34i / 99

        z = 5/9
      - >-
        Remodelling of the asthmatic airway includes increased deposition of
        proteoglycan (PG) molecules. One of the stimuli driving airway
        remodelling may be excessive mechanical stimulation. We hypothesized
        that fibroblasts from asthmatic patients would respond to excessive
        mechanical strain with up-regulation of message for PGs. We obtained
        fibroblasts from asthmatic patients (AF) and normal volunteers (NF)
        using endobronchial biopsy. Cells were maintained in culture until the
        fifth passage and then grown on a flexible collagen-coated membrane.
        Using the Flexercell device, cells were then subjected to cyclic stretch
        at 30% amplitude at 1 Hz for 24 h. Control cells were unstrained. Total
        RNA was extracted from the cell layer and quantitative RT-PCR performed
        for decorin, lumican and versican mRNA. In unstrained cells, the
        expression of decorin mRNA was greater in AF than NF. With strain, NF
        showed increased expression of versican mRNA and AF showed increased
        expression of versican and decorin mRNA. The relative increase in
        versican mRNA expression with strain was greater in AF than NF.
  - source_sentence: >-
      Represent this question for retrieving relevant documents: What is the
      total arc length of the Lévy C curve after iterating 8 times if the
      original line segment had a length of 1 unit?
    sentences:
      - >-
        Pose estimation is indeed a fascinating area in computer vision, but
        it's not entirely a walk in the park. Estimating the pose of a human or
        object involves a combination of complex mathematical techniques and
        algorithms. Let's delve deeper into some key aspects of pose estimation:


        1). **3D vs 2D Pose Estimation**: 
         - 3D Pose Estimation aims to determine the 3-dimensional pose of a subject, providing depth information along with the 2D coordinates. This requires specialized techniques like stereo cameras or depth sensors to capture the 3D structure of the scene.
         - In comparison, 2D Pose Estimation focuses on estimating the 2D pose of a subject within a single image or video frame, providing information about joint locations in the image plane.

        2). **Model-based Pose Estimation**: 
         - This approach leverages predefined models of human (or object) skeletons with known joint connections. The model is then fitted to the input image or video data to estimate the pose of the subject.  
         - A prominent example of Model-based Pose Estimation is the popular OpenPose library, which utilizes a part-based model to estimate human poses.

        3). **Model-free Pose Estimation**: 
         - Contrary to model-based methods, model-free approaches do not rely on predefined models. Instead, they directly learn to estimate the pose from raw image or video data. 
         - One such technique is the Convolutional Pose Machine (CPM) which uses convolutional neural networks to predict heatmaps for body joints, which are then refined to obtain the final pose estimation.

        4). **Case Study: Human Pose Estimation in Sports Analysis**: 
         - Pose estimation plays a crucial role in sports analysis, enabling the quantification of player movements and kinematics. 
         - For instance, in soccer, pose estimation techniques can be employed to track player positions, analyze their running patterns, and evaluate their performance during matches.

        5). **Comparative Analysis with Similar Concepts**: 
         - Object Detection: While both pose estimation and object detection involve locating and identifying objects in images or videos, pose estimation specifically focuses on determining the pose or configuration of the object, while object detection primarily aims to identify and localize the object's presence.
         - Motion Capture: Pose estimation is closely related to motion capture, which involves tracking and recording the movements of human subjects. Motion capture systems typically employ specialized sensors or cameras to capture highly accurate 3D pose data, whereas pose estimation algorithms typically rely on computer vision techniques to infer poses from 2D or 3D image or video data.

        6). **Common Misconceptions and Clarifications**: 
         - Pose estimation is not limited to humans: It can also be used to estimate the pose of objects, animals, and even vehicles.
         - Pose estimation is distinct from facial expression recognition: While both involve analyzing images or videos of people, pose estimation focuses on body posture and joint locations, whereas facial expression recognition aims to identify and interpret facial expressions.
      - >-
        The Lévy C curve is a self-replicating fractal that is created by
        iteratively replacing a straight line segment with two segments, each of
        which is 1/sqrt(2) times the length of the original segment, and joined
        at a right angle. 


        After each iteration, the total arc length of the curve increases by a
        factor of 2/sqrt(2), which is equal to sqrt(2). 


        If the original line segment has a length of 1 unit, then after 8
        iterations, the total arc length of the Lévy C curve will be:


        Arc length = Original length * (sqrt(2))^n

        Arc length = 1 * (sqrt(2))^8

        Arc length = 1 * 2^4

        Arc length = 1 * 16

        Arc length = 16 units


        So, the total arc length of the Lévy C curve after iterating 8 times is
        16 units.
      - >-
        If the dictator keeps X points for themselves, the receiver will get the
        remaining points, which can be calculated as:


        Y = 10 - X


        To find the fractional amount of the total points the receiver received,
        we can create a fraction with Y as the numerator and the total points
        (10) as the denominator:


        Fraction = Y/10 = (10 - X)/10


        So, the receiver gets a fractional amount of (10 - X)/10 of the total
        points.
  - source_sentence: >-
      Represent this question for retrieving relevant documents: Detailed
      Elaboration on Dimensionality Reduction and Industry Application
    sentences:
      - >-
        **Dimensionality Reduction: A Comprehensive Overview**


        Dimensionality reduction is a fundamental concept in machine learning
        and data analysis. It involves transforming high-dimensional data into a
        lower-dimensional representation while preserving the most important
        information. Dimensionality reduction techniques have a wide range of
        applications in various industries, such as:


        * **Feature engineering:** reducing the number of features in a dataset
        to improve the efficiency of machine learning algorithms.

        * **Visualization:** enabling the visualization of high-dimensional data
        by projecting it onto a lower-dimensional subspace.

        * **Data compression:** reducing the storage and transmission costs of
        large datasets.


        **Specific Industry Applications:**


        * **Computer vision:** Extracting meaningful features from images and
        videos for object recognition, image segmentation, and facial
        recognition.

        * **Natural language processing:** Reducing the dimensionality of text
        data for text classification, document summarization, and machine
        translation.

        * **Bioinformatics:** Analyzing gene expression data and identifying
        biomarkers for disease diagnosis and drug discovery.

        * **Financial modeling:** Identifying patterns and trends in financial
        data for risk assessment, portfolio optimization, and fraud detection.

        * **Recommendation systems:** Generating personalized recommendations
        for products, movies, or music based on user preferences.


        To further enhance your understanding, I can provide detailed
        explanations of specific techniques, industry case studies, or address
        any specific questions you may have.
      - >-
        Hepatocellular carcinoma is one of the most common malignancies
        worldwide. The only curative treatment is surgery. As hepatocellular
        carcinoma is often associated with liver cirrhosis, patients are at risk
        for postoperative liver failure. In the recent years, platelets are
        thought to play an important role in liver regeneration. The aim of this
        study was to discover the relevance of postoperative platelet counts
        after liver resection for hepatocellular carcinoma. Data of 68 patients
        who underwent liver resection for hepatocellular carcinoma between July
        2007 and July 2012 in a single centre were analysed. Postoperative
        morbidity and mortality were evaluated in regard to postoperative
        platelet counts. Comparative analysis between patients with platelet
        counts ≤100 × 10^9/l and >100 × 10^9/l at d1 was performed in regard to
        postoperative outcome. Within this cohort, 43 patients (63%) suffered
        from histologically proven liver cirrhosis. Postoperative mortality was
        statistically significant associated with postoperative reduced platelet
        counts. Comparative analysis showed significantly elevated postoperative
        bilirubin levels and lower prothrombin time in patients with platelet
        counts ≤100 × 10^9/l at d1.
      - >-
        Let G be a group of order 25. Since 25 = 5^2 and 5 is prime, by the
        Sylow theorems, the number of 5-Sylow subgroups in G, denoted by n_5,
        satisfies:


        1. n_5 divides 25/5 = 5, and

        2. n_5 ≡ 1 (mod 5).


        From these conditions, we have that n_5 can only be 1 or 5. 


        Case 1: n_5 = 1

        In this case, there is only one 5-Sylow subgroup, say H, in G. By the
        Sylow theorems, H is a normal subgroup of G. Since the order of H is 5,
        which is prime, H is cyclic, i.e., H ≅ C_5 (the cyclic group of order
        5). 


        Now, let g be an element of G that is not in H. Since H is normal in G,
        the set {gh : h ∈ H} is also a subgroup of G. Let K = {gh : h ∈ H}. Note
        that the order of K is also 5, as there is a one-to-one correspondence
        between the elements of H and K. Thus, K is also a cyclic group of order
        5, i.e., K ≅ C_5.


        Since the orders of H and K are both 5, their intersection is trivial,
        i.e., H ∩ K = {e}, where e is the identity element of G. Moreover, since
        the order of G is 25, any element of G can be written as a product of
        elements from H and K. Therefore, G is the internal direct product of H
        and K, i.e., G ≅ H × K ≅ C_5 × C_5.


        Case 2: n_5 = 5

        In this case, there are five 5-Sylow subgroups in G. Let H be one of
        these subgroups. Since the order of H is 5, which is prime, H is cyclic,
        i.e., H ≅ C_5.


        Now, consider the action of G on the set of 5-Sylow subgroups by
        conjugation. This action gives rise to a homomorphism φ: G → S_5, where
        S_5 is the symmetric group on 5 elements. The kernel of φ, say N, is a
        normal subgroup of G. Since the action is nontrivial, N is a proper
        subgroup of G, and thus, the order of N is either 1 or 5. If the order
        of N is 1, then G is isomorphic to a subgroup of S_5, which is a
        contradiction since the order of G is 25 and there is no subgroup of S_5
        with order 25. Therefore, the order of N must be 5.


        Since the order of N is 5, N is a cyclic group of order 5, i.e., N ≅
        C_5. Moreover, N is a normal subgroup of G. Let g be an element of G
        that is not in N. Then, the set {gn : n ∈ N} is also a subgroup of G.
        Let K = {gn : n ∈ N}. Note that the order of K is also 5, as there is a
        one-to-one correspondence between the elements of N and K. Thus, K is
        also a cyclic group of order 5, i.e., K ≅ C_5.


        Since the orders of N and K are both 5, their intersection is trivial,
        i.e., N ∩ K = {e}, where e is the identity element of G. Moreover, since
        the order of G is 25, any element of G can be written as a product of
        elements from N and K. Therefore, G is the internal direct product of N
        and K, i.e., G ≅ N × K ≅ C_5 × C_5.


        In conclusion, a group of order 25 is either cyclic or isomorphic to the
        direct product of two cyclic groups of order 5.
  - source_sentence: >-
      Represent this question for retrieving relevant documents: Does low
      25-Hydroxyvitamin D Level be Associated with Peripheral Arterial Disease
      in Type 2 Diabetes Patients?
    sentences:
      - "Patients with type 2 diabetes have an increased risk of atherosclerosis and vascular disease. Vitamin D deficiency is associated with vascular disease and is prevalent in diabetes patients. We undertook this study to determine the association between 25-hydroxyvitamin D (25[OH]D) levels and prevalence of peripheral arterial disease (PAD) in type 2 diabetes patients. A total of 1028 type 2 diabetes patients were recruited at Nanjing Medical University Affiliated Nanjing Hospital from November 2011 to October 2013. PAD was defined as an ankle-brachial index (ABI)\_<\_0.9. Cardiovascular risk factors (blood pressure, HbA1c, lipid profile), comorbidities, carotid intima-media thickness (IMT) and 25(OH)D were assessed. Overall prevalence of PAD and of decreased 25(OH)D (<30\_ng/mL) were 20.1% (207/1028) and 54.6% (561/1028), respectively. PAD prevalence was higher in participants with decreased (23.9%) than in those with normal (15.6%) 25(OH)D (≥30\_ng/mL, p\_<0.01). Decreased 25(OH)D was associated with increased risk of PAD (odds ratio [OR], 1.69, 95% CI: 1.17-2.44, p\_<0.001) and PAD was significantly more likely to occur in participants ≥65\_years of age (OR, 2.56, 95% CI: 1.51 -4.48, vs. 1.21, 95% CI: 0.80-1.83, p-interaction\_=\_0.027). After adjusting for known cardiovascular risk factors and potential confounding variables, the association of decreased 25(OH)D and PAD remained significant in patients <65\_years of age (OR, 1.55; 95% CI: 1.14-2.12, p\_=\_0.006)."
      - >-
        No study has been performed to compare the impacts of migraine and major
        depressive episode (MDE) on depression, anxiety and somatic symptoms,
        and health-related quality of life (HRQoL) among psychiatric
        outpatients. The aim of this study was to investigate the above issue.
        This study enrolled consecutive psychiatric outpatients with mood and/or
        anxiety disorders who undertook a first visit to a medical center.
        Migraine was diagnosed according to the International Classification of
        Headache Disorders, 2nd edition. Three psychometric scales and the
        Short-Form 36 were administered. General linear models were used to
        estimate the difference in scores contributed by either migraine or MDE.
        Multiple linear regressions were employed to compare the variance of
        these scores explained by migraine or MDE. Among 214 enrolled
        participants, 35.0% had migraine. Bipolar II disorder patients (70.0%)
        had the highest percentage of migraine, followed by major depressive
        disorder (49.1%) and only anxiety disorder (24.5%). Patients with
        migraine had worse depression, anxiety, and somatic symptoms and lower
        SF-36 scores than those without. The estimated differences in the scores
        of physical functioning, bodily pain, and somatic symptoms contributed
        by migraine were not lower than those contributed by MDE. The regression
        model demonstrated the variance explained by migraine was significantly
        greater than that explained by MDE in physical and pain symptoms.
      - >-
        Based on the information provided, we only know the number of patients
        who died within the first year after the surgery. To determine the
        probability of a patient surviving at least two years, we would need
        additional information about the number of patients who died in the
        second year or survived beyond that.


        Without this information, it is not possible to calculate the
        probability of a patient surviving at least two years after the surgery.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on BAAI/bge-small-en-v1.5
    results:
      - task:
          type: logging
          name: Logging
        dataset:
          name: ir eval
          type: ir-eval
        metrics:
          - type: cosine_accuracy@1
            value: 0.9241493167018252
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9788131706869669
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9906447766669724
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9965147207190681
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9241493167018252
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3262710568956556
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.1981289553333945
            name: Cosine Precision@5
          - type: cosine_recall@1
            value: 0.9241493167018252
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9788131706869669
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9906447766669724
            name: Cosine Recall@5
          - type: cosine_ndcg@10
            value: 0.9634519649573985
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9524509418552345
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9526115405885596
            name: Cosine Map@100

SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-small-en-v1.5. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
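The architecture above takes the [CLS] token vector as the sentence embedding (`pooling_mode_cls_token: True`) and then L2-normalizes it, so the cosine similarity of two embeddings equals their dot product. The following is a minimal pure-Python sketch of those two steps on toy data. The helper names (`cls_pool`, `l2_normalize`, `dot`) and the 4-dimensional vectors are illustrative assumptions, not part of the library; the real model produces 384-dimensional vectors.

```python
import math

def cls_pool(token_embeddings):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy token embeddings for two sentences (3 tokens x 4 dims each).
# Only the first row ([CLS]) contributes to the pooled embedding.
sent_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0], [1.0, 0.0, 0.0, 0.0]]
sent_b = [[4.0, 3.0, 0.0, 0.0], [0.0, 1.0, 2.0, 3.0], [5.0, 5.0, 5.0, 5.0]]

emb_a = l2_normalize(cls_pool(sent_a))
emb_b = l2_normalize(cls_pool(sent_b))
cosine = dot(emb_a, emb_b)
print(round(cosine, 4))
```

Because normalization is built into the model, downstream code can rank by dot product or cosine similarity interchangeably.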

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sucharush/bge_MNR")
# Run inference
sentences = [
    'Represent this question for retrieving relevant documents: Does low 25-Hydroxyvitamin D Level be Associated with Peripheral Arterial Disease in Type 2 Diabetes Patients?',
    'Patients with type 2 diabetes have an increased risk of atherosclerosis and vascular disease. Vitamin D deficiency is associated with vascular disease and is prevalent in diabetes patients. We undertook this study to determine the association between 25-hydroxyvitamin D (25[OH]D) levels and prevalence of peripheral arterial disease (PAD) in type 2 diabetes patients. A total of 1028 type 2 diabetes patients were recruited at Nanjing Medical University Affiliated Nanjing Hospital from November 2011 to October 2013. PAD was defined as an ankle-brachial index (ABI)\xa0<\xa00.9. Cardiovascular risk factors (blood pressure, HbA1c, lipid profile), comorbidities, carotid intima-media thickness (IMT) and 25(OH)D were assessed. Overall prevalence of PAD and of decreased 25(OH)D (<30\xa0ng/mL) were 20.1% (207/1028) and 54.6% (561/1028), respectively. PAD prevalence was higher in participants with decreased (23.9%) than in those with normal (15.6%) 25(OH)D (≥30\xa0ng/mL, p\xa0<0.01). Decreased 25(OH)D was associated with increased risk of PAD (odds ratio [OR], 1.69, 95% CI: 1.17-2.44, p\xa0<0.001) and PAD was significantly more likely to occur in participants ≥65\xa0years of age (OR, 2.56, 95% CI: 1.51 -4.48, vs. 1.21, 95% CI: 0.80-1.83, p-interaction\xa0=\xa00.027). After adjusting for known cardiovascular risk factors and potential confounding variables, the association of decreased 25(OH)D and PAD remained significant in patients <65\xa0years of age (OR, 1.55; 95% CI: 1.14-2.12, p\xa0=\xa00.006).',
    'Based on the information provided, we only know the number of patients who died within the first year after the surgery. To determine the probability of a patient surviving at least two years, we would need additional information about the number of patients who died in the second year or survived beyond that.\n\nWithout this information, it is not possible to calculate the probability of a patient surviving at least two years after the surgery.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
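For retrieval, the query samples in this card all begin with the prefix "Represent this question for retrieving relevant documents: ", while the documents carry no prefix. The sketch below shows one way to apply that prefix and to turn one row of similarity scores into a ranked list. The helpers `with_query_prompt` and `rank_documents` and the score values are hypothetical and are not part of sentence-transformers.

```python
QUERY_PROMPT = "Represent this question for retrieving relevant documents: "

def with_query_prompt(question):
    """Prefix a raw question with the prompt seen in the training samples."""
    if question.startswith(QUERY_PROMPT):
        return question
    return QUERY_PROMPT + question

def rank_documents(query_scores, top_k=3):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    order = sorted(range(len(query_scores)),
                   key=lambda i: query_scores[i], reverse=True)
    return [(i, query_scores[i]) for i in order[:top_k]]

# Hypothetical similarity scores of one query against four documents.
scores = [0.12, 0.83, 0.40, 0.05]
ranked = rank_documents(scores, top_k=2)
print(with_query_prompt("What is dimensionality reduction?"))
print(ranked)
```

With the real model, such a score row could presumably be obtained as `model.similarity(model.encode([with_query_prompt(q)]), model.encode(docs))[0].tolist()`, using the same `encode` and `similarity` calls shown above.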

Evaluation

Metrics

Logging

  • Dataset: ir-eval
  • Evaluated with main.LoggingEvaluator
| Metric             | Value  |
|:-------------------|:-------|
| cosine_accuracy@1  | 0.9241 |
| cosine_accuracy@3  | 0.9788 |
| cosine_accuracy@5  | 0.9906 |
| cosine_accuracy@10 | 0.9965 |
| cosine_precision@1 | 0.9241 |
| cosine_precision@3 | 0.3263 |
| cosine_precision@5 | 0.1981 |
| cosine_recall@1    | 0.9241 |
| cosine_recall@3    | 0.9788 |
| cosine_recall@5    | 0.9906 |
| cosine_ndcg@10     | 0.9635 |
| cosine_mrr@10      | 0.9525 |
| cosine_map@100     | 0.9526 |
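In the values above, recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (0.9788 / 3 ≈ 0.3263, 0.9906 / 5 ≈ 0.1981). That pattern is what one would expect if each query has exactly one relevant document, although the card does not state this explicitly. The sketch below computes the per-query metrics under that assumption; the function name `ir_metrics` and the toy rankings are hypothetical, and this is not the evaluator's actual code.

```python
def ir_metrics(ranked_ids, relevant_ids, k):
    """Accuracy@k, precision@k, recall@k and reciprocal rank@k for one query."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank  # rank of the first relevant hit
            break
    return {
        "accuracy": 1.0 if hits else 0.0,          # at least one hit in top k
        "precision": len(hits) / k,                # fraction of top k that is relevant
        "recall": len(hits) / len(relevant_ids),   # fraction of relevant docs found
        "mrr": reciprocal_rank,
    }

# Three toy queries, each with exactly one relevant document (an assumption).
queries = [
    (["d1", "d7", "d3"], {"d1"}),  # relevant doc at rank 1
    (["d4", "d2", "d9"], {"d2"}),  # relevant doc at rank 2
    (["d5", "d6", "d8"], {"d0"}),  # relevant doc not retrieved
]
k = 3
per_query = [ir_metrics(ranked, relevant, k) for ranked, relevant in queries]
mean = {name: sum(m[name] for m in per_query) / len(per_query)
        for name in per_query[0]}
print(mean)
```

With one relevant document per query, a hit contributes 1 to accuracy and recall but only 1/k to precision, which explains why the precision values shrink as k grows even though retrieval quality improves.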

Training Details

Training Dataset

Unnamed Dataset

  • Size: 98,112 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 |
    |:--------|:-----------|:-----------|
    | type    | string     | string     |
    | details | min: 18 tokens; mean: 55.27 tokens; max: 512 tokens | min: 9 tokens; mean: 317.52 tokens; max: 512 tokens |
  • Samples:
    Sample 1
      sentence_0: Represent this question for retrieving relevant documents: Are elevated levels of pro-inflammatory oxylipins in older subjects normalized by flaxseed consumption?
      sentence_1: Oxylipins, including eicosanoids, are highly bioactive molecules endogenously produced from polyunsaturated fatty acids. Oxylipins play a key role in chronic disease progression. It is possible, but unknown, if oxylipin concentrations change with the consumption of functional foods or differ with subject age. Therefore, in a parallel comparator trial, 20 healthy individuals were recruited into a younger (19-28years) or older (45-64years) age group (n=10/group). Participants ingested one muffin/day containing 30g of milled flaxseed (6g alpha-linolenic acid) for 4weeks. Plasma oxylipins were isolated through solid phase extraction, analyzed with HPLC-MS/MS targeted lipidomics, and quantified with the stable isotope dilution method. At baseline, the older group exhibited 13 oxylipins ≥2-fold the concentration of the younger group. Specifically, pro-inflammatory oxylipins 5-hydroxyeicosatetraenoic acid, 9,10,13-trihydroxyoctadecenoic acid, and 9,12,13-trihydroxyoctadecenoic acid were signi...

    Sample 2
      sentence_0: Represent this question for retrieving relevant documents: Find the isometries of the metric $ds^2 = dx^2 + dy^2$ over the rectangle $R=[0,a] \times [0,b]$, subject to the additional condition that any isometry $f$ maps $(0,0)$ to $(x_0, y_0)$. Find $x_0$ and $y_0$ such that the isometry $f$ is given by $f(x,y) = (x_0 + x, y_0 - y)$.
      sentence_1: An isometry is a transformation that preserves the distance between points. In this case, we are looking for transformations that preserve the metric $ds^2 = dx^2 + dy^2$. Let's consider the transformation $f(x,y) = (x_0 + x, y_0 - y)$ and find the conditions on $x_0$ and $y_0$ for it to be an isometry.

      First, let's compute the differential of the transformation:

      $$df = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} dx \\ -dy \end{pmatrix}$$

      Now, let's compute the metric under this transformation:

      $$ds'^2 = (dx')^2 + (dy')^2 = dx^2 + (-dy)^2 = dx^2 + dy^2$$

      Since $ds'^2 = ds^2$, the transformation $f(x,y) = (x_0 + x, y_0 - y)$ is an isometry.

      Now, let's find the conditions on $x_0$ and $y_0$ such that the isometry maps $(0,0)$ to $(x_0, y_0)$. Applying the transformation to $(0,0)$, we get:

      $$f(0,0) = (x_0 + 0, y_0 - 0) = (x_0, y_0)$$

      Since the transformation maps $(0,0)$ to $(x_0, y_0)$, there are no additional conditions...

    Sample 3
      sentence_0: Represent this question for retrieving relevant documents: Do two di-leucine motifs regulate trafficking and function of mouse ASIC2a?
      sentence_1: Acid-sensing ion channels (ASICs) are proton-gated cation channels that mediate acid-induced responses in neurons. ASICs are important for mechanosensation, learning and memory, fear, pain, and neuronal injury. ASIC2a is widely expressed in the nervous system and modulates ASIC channel trafficking and activity in both central and peripheral systems. Here, to better understand mechanisms regulating ASIC2a, we searched for potential protein motifs that regulate ASIC2a trafficking.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
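
As a rough illustration of what these parameters do, the sketch below computes an in-batch-negatives ranking loss in plain Python. Each sentence_0 embedding is scored against every sentence_1 embedding in the batch by cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the matching pair as the target. The helper names (`cos_sim`, `mnr_loss`) and the toy 2-D embeddings are assumptions for illustration only; the library's implementation operates on batched tensors.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the row of logits.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_z - logits[i]
    return total / len(anchors)


# Toy 2-D embeddings (hypothetical values).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.2, 0.8]]  # each positive is close to its own anchor
swapped = [[0.2, 0.8], [0.9, 0.1]]  # positives paired with the wrong anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

A larger `scale` sharpens the softmax over similarities, so a positive that is only slightly closer than the in-batch negatives already yields a low loss.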
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 1
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: round_robin
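
The `no_duplicates` batch sampler is relevant to MultipleNegativesRankingLoss because every other sentence_1 in a batch is treated as a negative, so a repeated text within a batch would act as a false negative. The sketch below shows the idea with a greedy pure-Python batcher. The function `no_duplicate_batches` and the toy pairs are hypothetical and simplified; this is not the library's sampler.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily group (sentence_0, sentence_1) pairs into batches so that
    no text appears twice within the same batch. Pairs that would introduce
    a duplicate, or that do not fit, are deferred to a later batch.
    The final batch may be smaller than batch_size."""
    remaining = list(pairs)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for pair in remaining:
            if len(batch) < batch_size and not (set(pair) & seen):
                batch.append(pair)
                seen.update(pair)
            else:
                deferred.append(pair)
        batches.append(batch)
        remaining = deferred
    return batches


# Toy data (hypothetical): two questions share the same answer text.
pairs = [("q1", "docA"), ("q2", "docA"), ("q3", "docB"), ("q4", "docC")]
batches = no_duplicate_batches(pairs, batch_size=2)
print(batches)
```

Here `("q2", "docA")` is pushed to the second batch, so `docA` never serves as a negative for a question it actually answers.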

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: round_robin
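
With `lr_scheduler_type: linear`, `warmup_steps: 0` and `learning_rate: 5e-05`, the learning rate decays linearly from its initial value to zero over training. The following is a minimal sketch of such a schedule, assuming 3,066 total optimizer steps (the final step in the training logs). The function name `linear_lr` is hypothetical and the code is not taken from the trainer.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 3066  # assumed from the last logged training step
for step in (0, 1533, 3066):
    print(step, linear_lr(step, total_steps))
```

Halfway through training (step 1533) the learning rate is about half of its initial value, and it reaches zero at the last step.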

Training Logs

| Epoch | Step | Training Loss | ir-eval_cosine_ndcg@10 |
|:--|:--|:--|:--|
| 0.1631 | 500 | 0.021 | 0.9523 |
| 0.3262 | 1000 | 0.0069 | 0.9600 |
| 0.4892 | 1500 | 0.0051 | 0.9593 |
| 0.6523 | 2000 | 0.0055 | 0.9605 |
| 0.8154 | 2500 | 0.0053 | 0.9638 |
| 0.9785 | 3000 | 0.0056 | 0.9634 |
| 1.0 | 3066 | - | 0.9635 |
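
The Epoch column follows from the step count: 98,112 training samples at a batch size of 32 give 3,066 steps per epoch, so step 500 corresponds to 500 / 3066 ≈ 0.1631 epochs. A quick check, assuming a single device and no gradient accumulation (variable names are illustrative):

```python
import math

train_samples = 98_112
batch_size = 32
steps_per_epoch = math.ceil(train_samples / batch_size)

logged_steps = [500, 1000, 1500, 2000, 2500, 3000, 3066]
epochs = [round(step / steps_per_epoch, 4) for step in logged_steps]
print(steps_per_epoch, epochs)
```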

Framework Versions

  • Python: 3.12.8
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}