metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:35100
  - loss:SoftmaxLoss
base_model: AI-Growth-Lab/PatentSBERTa
widget:
  - source_sentence: >-
      1. A multi-modal monitor system to obtain quantitative, coordinated
      measurement of emissions from a turbine having at least one of a blade and
      a rotor, comprising: a first sensor for measuring at least one type of
      emission generated by the turbine during movement of the at least one of
      the blade and the rotor and generating a first emission signal; a second
      sensor for measuring a different type of emission generated by the turbine
      and generating a second emission signal; a third sensor for measuring a
      different type of emission than that measured by the first and second
      sensors; a data storage unit capable of storing emission signals over
      time; and a housing containing at least the first, second and third
      sensors and capable of being placed operationally at a distance from the
      turbine in an outdoor location to be monitored; wherein each of the first,
      second, and third sensors measures one type of emission selected from
      mechanical wave, optical radiation, electrical, vibration, audible sound,
      and infrasound.
    sentences:
      - >-
        1. An in-wheel motor installed inside a wheel disk of a wheel to
        rotationally drive the wheel around a shaft of the wheel by way of
        applying a current thereto, the in-wheel motor comprising: a coreless
        cylindrical coil body to which a lead wire for applying a current is
        connected, the shaft being inserted in an inner circumferential side of
        the coil body, the coil body supported at one end by a coil body support
        member that is fixed to the shaft; a cylindrical outer yoke that is
        disposed on an outer circumferential side of the coil body, and is fixed
        to the wheel disk; a magnet that is fixed on an inner circumferential
        face of the outer yoke, an inner surface of the magnet disposed
        proximate an outer circumferential face of the coil body; a cylindrical
        inner yoke having an outer circumferential face disposed proximate to an
        inner circumferential face of the coil body, the inner yoke being fixed
        to the outer yoke and being rotatable around the shaft; a brake disk
        that is fixed to an inner circumferential side of the inner yoke; and a
        caliper that is provided on the inner circumferential side of the inner
        yoke to brake the brake disk.
      - >-
        1. A multi-modal monitor system to obtain quantitative, coordinated
        measurement of emissions from a turbine having at least one of a blade
        and a rotor, comprising: a first sensor for measuring at least one type
        of emission generated by the turbine during movement of the at least one
        of the blade and the rotor and generating a first emission signal; a
        second sensor for measuring a different type of emission generated by
        the turbine and generating a second emission signal; a third sensor for
        measuring a different type of emission than that measured by the first
        and second sensors; a data storage unit capable of storing emission
        signals over time; and a housing containing at least the first, second
        and third sensors and capable of being placed operationally at a
        distance from the turbine in an outdoor location to be monitored;
        wherein each of the first, second, and third sensors measures one type
        of emission selected from mechanical wave, optical radiation,
        electrical, vibration, audible sound, and infrasound.
      - >-
        1. An identification medium comprising a cholesteric liquid crystal
        layer on which a hologram is formed, a first supporting member and a
        second supporting member between which the cholesteric liquid crystal
        layer is sandwiched, and at least one thereof is made of transparent
        material which does not disturb circularly polarized light reflected
        from the cholesteric liquid crystal layer, and a mounting region to be
        sewn onto an object, the first supporting member and the second
        supporting member extending to the mounting region and being adhered
        directly to each other by an adhesive layer in the mounting region,
        wherein the first supporting member is a polyurethane film or a cloth,
        and the cholesteric liquid crystal layer is affixed to both of the first
        supporting member and the second supporting member by adhesive layers.
  - source_sentence: >-
      1. A process of making light olefins, in a combined Oxygenate to Olefin
      (XTO)-Olefin Cracking (OC) process, from an oxygen-containing,
      halogenide-containing or sulphur-containing organic feedstock comprising:
      selecting a molecular sieve having pores of 10- or more-membered rings,
      wherein the molecular sieve is a zeolite; contacting the molecular sieve
      with a metal silicate, different from said molecular sieve, comprising at
      least one alkaline earth metal to form a catalyst composite, wherein the
      catalyst composite comprises at least 10 wt % of the zeolite and at least
      0.1 wt % of silicate based on a total weight of the catalyst composite;
      providing a first portion and a second portion of a feedstock that is an
      oxygen-containing, halogenide-containing, or sulphur-containing organic
      feedstock; providing an XTO reaction zone, an OC reaction zone and a
      catalyst regeneration zone, wherein one or more catalysts are in the XTO
      reaction zone and the same one or more catalysts are in the OC reaction
      zone, wherein at least one of the one or more catalysts is the catalyst
      composite; wherein the one or more catalysts circulate in the three zones,
      such that at least a portion of the one or more catalysts from the
      catalyst regeneration zone is passed to the OC reaction zone, at least a
      portion of the one or more catalysts in the OC reaction zone is passed to
      the XTO reaction zone and at least a portion of the one or more catalysts
      in the XTO reaction zone is passed to the catalyst regeneration zone;
      contacting the first portion of the feedstock in the XTO reactor with the
      one or more catalysts at conditions effective to convert at least a
      portion of the feedstock to form an XTO reactor effluent comprising light
      olefins and a heavy hydrocarbon fraction; separating the light olefins
      from the heavy hydrocarbon fraction; and contacting the heavy hydrocarbon
      fraction and the second portion of the feedstock in the OC reactor with
      the one or more catalysts at conditions effective to convert at least a
      portion of the heavy hydrocarbon fraction and the feedstock to light
      olefins.
    sentences:
      - >-
        1. A process of making light olefins, in a combined Oxygenate to Olefin
        (XTO)-Olefin Cracking (OC) process, from an oxygen-containing,
        halogenide-containing or sulphur-containing organic feedstock
        comprising: selecting a molecular sieve having pores of 10- or
        more-membered rings, wherein the molecular sieve is a zeolite;
        contacting the molecular sieve with a metal silicate, different from
        said molecular sieve, comprising at least one alkaline earth metal to
        form a catalyst composite, wherein the catalyst composite comprises at
        least 10 wt % of the zeolite and at least 0.1 wt % of silicate based on
        a total weight of the catalyst composite; providing a first portion and
        a second portion of a feedstock that is an oxygen-containing,
        halogenide-containing, or sulphur-containing organic feedstock;
        providing an XTO reaction zone, an OC reaction zone and a catalyst
        regeneration zone, wherein one or more catalysts are in the XTO reaction
        zone and the same one or more catalysts are in the OC reaction zone,
        wherein at least one of the one or more catalysts is the catalyst
        composite; wherein the one or more catalysts circulate in the three
        zones, such that at least a portion of the one or more catalysts from
        the catalyst regeneration zone is passed to the OC reaction zone, at
        least a portion of the one or more catalysts in the OC reaction zone is
        passed to the XTO reaction zone and at least a portion of the one or
        more catalysts in the XTO reaction zone is passed to the catalyst
        regeneration zone; contacting the first portion of the feedstock in the
        XTO reactor with the one or more catalysts at conditions effective to
        convert at least a portion of the feedstock to form an XTO reactor
        effluent comprising light olefins and a heavy hydrocarbon fraction;
        separating the light olefins from the heavy hydrocarbon fraction; and
        contacting the heavy hydrocarbon fraction and the second portion of the
        feedstock in the OC reactor with the one or more catalysts at conditions
        effective to convert at least a portion of the heavy hydrocarbon
        fraction and the feedstock to light olefins.
      - >-
        1. A needle assembly system comprising: a needle assembly including a
        needle and a needle support; a cover including a distal portion adapted
        to house at least a distal end of the needle and a proximal portion
        adapted to house the needle support, wherein the proximal portion
        includes a first portion and a second portion, wherein the first portion
        has a first inner diameter substantially equal to an outer diameter of
        the needle support such that the first portion of the proximal portion
        of the cover is frictionally engaged with the needle support in a first
        position and the second portion has a second inner diameter greater than
        the diameter of the needle support such that there is radial separation
        between the cover and the needle support in a second position, wherein
        the second portion of the proximal portion has a length greater than or
        equal to a length of the needle support, wherein the needle support is
        configured to be axially advanced from the first position to the second
        position such that a proximal end of the needle assembly does not extend
        past a proximal end of the cover.
      - >-
        1. A membrane electrode assembly for a polymer electrolyte fuel cell,
        comprising: an electrolyte membrane; a catalyst layer; a conductive
        porous gas diffusion layer, wherein the catalyst layer and the
        electrolyte membrane have common boundaries; and grooves for allowing
        one of passage and retention of a fluid being formed in the common
        boundaries, and wherein the grooves have a tapered shape such that a
        width of each groove is largest at the common boundary, and wherein the
        catalyst layer is disposed between the gas diffusion layer and the
        electrolyte membrane.
  - source_sentence: >-
      1. A computer hardware-implemented method of preventing a cascading
      failure in a complex stream computer system, wherein a cascading failure
      results in an untrustworthy output from the complex stream computer
      system, and wherein the computer hardware-implemented method comprises:
      receiving a first set of binary data that identifies multiple
      subcomponents in a complex stream computer system, wherein the identified
      multiple subcomponents comprise multiple upstream subcomponents and a
      downstream subcomponent, and wherein the multiple upstream subcomponents
      execute upstream computational processes; receiving a second set of binary
      data that identifies multiple outputs generated by the multiple upstream
      subcomponents; receiving a third set of binary data that identifies
      multiple inputs to the downstream subcomponent, wherein the identified
      multiple inputs to the downstream subcomponent are the identified multiple
      outputs generated by the multiple upstream subcomponents, and wherein the
      identified multiple inputs are inputs to a downstream computational
      process that is executed by the downstream subcomponent; examining, by
      computer hardware, each of the upstream computational processes to
      determine an accuracy of each of the identified multiple outputs based
      upon: generating, by computer hardware, accuracy values by assigning a
      determined accuracy value to each of the identified multiple outputs,
      wherein the determined accuracy value describes a confidence level of an
      accuracy of each of the identified multiple outputs, and wherein each of
      the identified multiple outputs are created by a separate upstream
      computational process in separate upstream subcomponents from the multiple
      upstream subcomponents; generating, by the computer hardware, weighting
      values by assigning a weighting value to each of the identified multiple
      inputs to the downstream subcomponent, wherein the weighting value
      describes a criticality level of each of the identified multiple inputs
      when executing the downstream computational process in the downstream
      subcomponent; and utilizing, by the computer hardware, the determined
      accuracy values and the weighting values to dynamically adjust which of
      the identified multiple inputs are used by the downstream subcomponent
      until an output from the downstream subcomponent meets a predefined
      trustworthiness level, wherein a trustworthiness of the output from the
      downstream subcomponent is based on the determined accuracy value of each
      of the identified multiple outputs and the weighting value of each of the
      identified multiple inputs to the downstream subcomponent.
    sentences:
      - >-
        1. A method comprising: encoding, by a processing module of a computing
        device, a data segment of a data object into a set of encoded data
        slices; determining, by the processing module, storage requirements of
        the data object; determining, by the processing module, memory device
        capabilities of a plurality of distributed storage units based on types
        of memory devices, wherein at least one of the distributed storage units
        of the plurality of distributed storage units includes multiple types of
        memory devices, and wherein a first type of memory device has first
        memory characteristics and a second type of memory device has second
        memory characteristics; determining, by the processing module, a storage
        mode based on one or more of the storage requirements of the data
        object, the memory device capabilities of a dispersed storage network
        (DSN) memory, and a type of data, the storage mode including a time
        phase indicator specifying one or more time intervals for a given set of
        storage requirements; identifying, by the processing module, a set of
        distributed storage units of the plurality of distributed storage units
        that have at least one or more of the multiple types of memory devices
        based on the storage mode; and sending, by the computing device, at
        least a write threshold number of encoded data slices of the data
        segment to the set of distributed storage units for storage in the at
        least one or more of the multiple types of memory devices in accordance
        with the storage mode, wherein the write threshold number is greater
        than a decode threshold number and less than a total number, wherein the
        decode threshold number corresponds to a minimum number of encoded data
        slices of the set of encoded data slices that is needed to recover the
        data segment, wherein the total number corresponds to a number of
        encoded data slices in the set of encoded data slices.
      - >-
        1. A chemical looping combustion apparatus for solid fuels using
        different oxygen carriers, comprising: a solid fuel chemical looping
        combustor configured to receive solid fuels and to produce carbon
        dioxide and steam by combustion of the solid fuels; a gaseous fuel
        chemical looping combustor configured to receive gaseous fuels and to
        produce carbon dioxide and steam by combustion of the gaseous fuels; and
        a devolatilization reactor configured to produce solids and gases by
        devolatilizing the solid fuels, wherein the solid fuels received by the
        solid fuel chemical looping combustor and the gaseous fuels received by
        the gaseous fuel chemical looping combustor are the solids and the gases
        produced by the devolatilization reactor, respectively, wherein the
        solid fuel chemical looping combustor comprises: an oxidation reactor; a
        loop seal configured to receive a metallic oxide from the oxidation
        reactor; a reduction reactor configured to cause the solid fuels flowing
        from the devolatilization reactor and the metallic oxide transferred
        from the loop seal to react with each other, thereby reducing the oxygen
        carriers; a downcomer connected to an outlet of the loop seal and
        extending to a lower portion of the reduction reactor to receive the
        solid fuels, wherein the oxygen carriers reduced in the reduction
        reactor are provided to the oxidation reactor such that the oxygen
        carriers are re-circulated, and wherein the solid fuels are introduced
        into the reduction reactor from a middle point of a longitudinal length
        of the downcomer.
      - >-
        1. A computer hardware-implemented method of preventing a cascading
        failure in a complex stream computer system, wherein a cascading failure
        results in an untrustworthy output from the complex stream computer
        system, and wherein the computer hardware-implemented method comprises:
        receiving a first set of binary data that identifies multiple
        subcomponents in a complex stream computer system, wherein the
        identified multiple subcomponents comprise multiple upstream
        subcomponents and a downstream subcomponent, and wherein the multiple
        upstream subcomponents execute upstream computational processes;
        receiving a second set of binary data that identifies multiple outputs
        generated by the multiple upstream subcomponents; receiving a third set
        of binary data that identifies multiple inputs to the downstream
        subcomponent, wherein the identified multiple inputs to the downstream
        subcomponent are the identified multiple outputs generated by the
        multiple upstream subcomponents, and wherein the identified multiple
        inputs are inputs to a downstream computational process that is executed
        by the downstream subcomponent; examining, by computer hardware, each of
        the upstream computational processes to determine an accuracy of each of
        the identified multiple outputs based upon: generating, by computer
        hardware, accuracy values by assigning a determined accuracy value to
        each of the identified multiple outputs, wherein the determined accuracy
        value describes a confidence level of an accuracy of each of the
        identified multiple outputs, and wherein each of the identified multiple
        outputs are created by a separate upstream computational process in
        separate upstream subcomponents from the multiple upstream
        subcomponents; generating, by the computer hardware, weighting values by
        assigning a weighting value to each of the identified multiple inputs to
        the downstream subcomponent, wherein the weighting value describes a
        criticality level of each of the identified multiple inputs when
        executing the downstream computational process in the downstream
        subcomponent; and utilizing, by the computer hardware, the determined
        accuracy values and the weighting values to dynamically adjust which of
        the identified multiple inputs are used by the downstream subcomponent
        until an output from the downstream subcomponent meets a predefined
        trustworthiness level, wherein a trustworthiness of the output from the
        downstream subcomponent is based on the determined accuracy value of
        each of the identified multiple outputs and the weighting value of each
        of the identified multiple inputs to the downstream subcomponent.
  - source_sentence: >-
      1. A method comprising the steps of: (a) providing one or more tissues,
      cell types, or a lysate thereof, obtained from a patient administered at
      least one dose of a compound of formula I:  or a pharmaceutically
      acceptable salt thereof, wherein: Ring A is selected from: Ring A is an
      optionally substituted group selected from phenyl, an 8-10 membered
      bicyclic partially unsaturated or aryl ring, a 5-6 membered monocyclic
      heteroaryl ring having 1-4 heteroatoms independently selected from
      nitrogen, oxygen, or sulfur, or an 8-10 membered bicyclic heteroaryl ring
      having 1-5 heteroatoms independently selected from nitrogen, oxygen, or
      sulfur; Ring B is phenyl, a 5-6 membered heteroaryl ring having 1-3
      heteroatoms independently selected from N, O or S, a 5-6 membered
      saturated heterocyclic ring having 1-2 heteroatoms independently selected
      from N, O or S, or an 8-10 membered bicyclic partially unsaturated or aryl
      ring having 1-3 heteroatoms independently selected from N, O or S; R R L
      is a bivalent C L is a bivalent C L is a bivalent C L is a bivalent C L is
      a bivalent C Y is hydrogen, C L is a bivalent C Y is C L is a covalent
      bond and Y is selected from: L is —C(O)— and Y is selected from: L is
      —N(R)C(O)— and Y is selected from: L is a bivalent C L is —CH each R R W
      is a bivalent C R m is 0, 1, 2, 3 or 4; each R (b) contacting said tissue,
      cell type, or a lysate thereof, with a compound of formula I, tethered to
      a detectable moiety to form a probe compound, wherein at least one protein
      kinase present in said tissue, cell type, or a lysate thereof, is
      covalently modified and the detectable moiety is selected from the group
      consisting of a fluorescent label, mass-tag, chemiluminescent group,
      chromophore, electron dense group, or an energy transfer agent; and (c)
      measuring the amount of said protein kinase covalently modified by the
      probe compound thereby to determine occupancy of said protein kinase by
      said compound of formula I as compared to occupancy of said protein kinase
      by said probe compound.
    sentences:
      - >-
        1. A detection circuit that is connectable to a magnetic sensor in which
        a first sensor unit and a second sensor unit are arranged at a
        predetermined angle with respect to each other, each sensor unit having
        a bridge circuit of magnetoresistance elements, the detection circuit
        comprising: a first comparison circuit including: a second comparison
        circuit including: a rotation angle calculation circuit that calculates
        a rotation angle of a magnetic field based on one of the comparison
        results of the first comparison circuit and a comparison result of the
        second comparison circuit, the rotation angle calculation circuit
        including a logic circuit that generates a third detection signal based
        on a comparison result of the third comparator and a comparison result
        of the fourth comparator.
      - >-
        1. A method comprising the steps of: (a) providing one or more tissues,
        cell types, or a lysate thereof, obtained from a patient administered at
        least one dose of a compound of formula I:  or a pharmaceutically
        acceptable salt thereof, wherein: Ring A is selected from: Ring A is an
        optionally substituted group selected from phenyl, an 8-10 membered
        bicyclic partially unsaturated or aryl ring, a 5-6 membered monocyclic
        heteroaryl ring having 1-4 heteroatoms independently selected from
        nitrogen, oxygen, or sulfur, or an 8-10 membered bicyclic heteroaryl
        ring having 1-5 heteroatoms independently selected from nitrogen,
        oxygen, or sulfur; Ring B is phenyl, a 5-6 membered heteroaryl ring
        having 1-3 heteroatoms independently selected from N, O or S, a 5-6
        membered saturated heterocyclic ring having 1-2 heteroatoms
        independently selected from N, O or S, or an 8-10 membered bicyclic
        partially unsaturated or aryl ring having 1-3 heteroatoms independently
        selected from N, O or S; R R L is a bivalent C L is a bivalent C L is a
        bivalent C L is a bivalent C L is a bivalent C Y is hydrogen, C L is a
        bivalent C Y is C L is a covalent bond and Y is selected from: L is
        —C(O)— and Y is selected from: L is —N(R)C(O)— and Y is selected from: L
        is a bivalent C L is —CH each R R W is a bivalent C R m is 0, 1, 2, 3 or
        4; each R (b) contacting said tissue, cell type, or a lysate thereof,
        with a compound of formula I, tethered to a detectable moiety to form a
        probe compound, wherein at least one protein kinase present in said
        tissue, cell type, or a lysate thereof, is covalently modified and the
        detectable moiety is selected from the group consisting of a fluorescent
        label, mass-tag, chemiluminescent group, chromophore, electron dense
        group, or an energy transfer agent; and (c) measuring the amount of said
        protein kinase covalently modified by the probe compound thereby to
        determine occupancy of said protein kinase by said compound of formula I
        as compared to occupancy of said protein kinase by said probe compound.
      - >-
        1. A method of treating lupus in a mammal, the method comprising
        administering to the mammal an antibody which binds an interleukin 3
        receptor α (IL-3Rα) chain and which kills a plasmacytoid dendritic cell
        (pDC) or basophil to which it binds to thereby treat lupus in the
        mammal, wherein the antibody comprises the variable regions of antibody
        7G3 or is a humanized form of antibody 7G3, and wherein the antibody is
        not conjugated to a toxic compound that kills a cell to which the
        antibody binds, and wherein the antibody is capable of inducing an
        enhanced level of effector function, and wherein the effector function
        is antibody-dependent cell cytotoxicity (ADCC) and/or antibody-dependent
        cell mediated phagocytosis (ADCP).
  - source_sentence: >-
      1. A sputtering target having a component composition that contains 1 to
      40 at % of Ga, 0.05 to 2 at % of Na as metal element components, and the
      balance composed of Cu and unavoidable impurities, wherein the sputtering
      target contains Na in at least one form selected from among sodium
      fluoride, sodium sulfide, and sodium selenide and the content of oxygen is
      from 100 to 1,000 ppm.
    sentences:
      - >-
        1. An insulation bobbin unit of a stator, comprising: a first insulation
        bobbin having a first body and a plurality of first extension members
        coupled with the first body, wherein the first body has a first assembly
        hole, each of the extension members has a first wound portion, the first
        wound portion has a first top plate and one first side wall located on
        one side of the first top plate, and a thickness of the first top plate
        is smaller than that of the first side wall; and a second insulation
        bobbin having a second body and a plurality of second extension members,
        wherein the second body is coupled with the first body and has a second
        assembly hole aligning and communicating with the first assembly hole,
        the second extension members are coupled with the second body and
        aligned with the first extension members, each of the second extension
        members has a second wound portion, the second wound portion has a
        second top plate and one second side wall located on one side of the
        second top plate, and a room is defined by the first top plate, the
        first side wall, the second top plate and the second side wall, wherein
        the first side wall is aligned with one edge of the second top plate
        that is not mounted with the second side wall, and the second side wall
        is aligned with one edge of the first top plate that is not mounted with
        the first side wall.
      - >-
        1. A sputtering target having a component composition that contains 1 to
        40 at % of Ga, 0.05 to 2 at % of Na as metal element components, and the
        balance composed of Cu and unavoidable impurities, wherein the
        sputtering target contains Na in at least one form selected from among
        sodium fluoride, sodium sulfide, and sodium selenide and the content of
        oxygen is from 100 to 1,000 ppm.
      - >-
        1. An electrical energy supply system providing voltage to a first load,
        comprising: an external power group providing an external voltage; and a
        DC supply device receiving the external voltage and comprising: a first
        bus receiving the external voltage and coupled to the first load; a
        first converting unit converting the external voltage into a first
        converted voltage when a voltage level of the first bus reaches a
        pre-determined level, and converting a first stored voltage to generate
        a converted result when the voltage level of the first bus is less than
        the pre-determined level; a first storage unit storing the first
        converted voltage when the voltage level of the first bus reaches the
        pre-determined level and providing the first stored voltage to the first
        converting unit when the voltage level of the first bus is less than the
        pre-determined level; and a first smart energy management system (SEMS)
        controlling at least one of the first converting unit, the external
        power group and the first load according to at least one of the external
        voltage, a voltage level of the first bus and a voltage level of the
        first storage unit, wherein the first SEMS controls the external power
        group to adjust the external voltage according to the voltage level of
        the first bus.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on AI-Growth-Lab/PatentSBERTa

This is a sentence-transformers model fine-tuned from AI-Growth-Lab/PatentSBERTa. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and related tasks.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: AI-Growth-Lab/PatentSBERTa
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
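
As an illustration of the similarity function listed above, the following is a minimal sketch of cosine similarity in plain Python. The 4-dimensional vectors are made-up stand-ins for the model's 768-dimensional embeddings, and the names `cosine_similarity`, `claim_a`, `claim_b`, and `claim_c` are hypothetical, chosen only for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors (assumed values, not real model output).
claim_a = [1.0, 2.0, 2.0, 0.0]
claim_b = [2.0, 4.0, 4.0, 0.0]   # same direction as claim_a, twice the length
claim_c = [2.0, -1.0, 0.0, 5.0]  # orthogonal to claim_a

sim_ab = cosine_similarity(claim_a, claim_b)  # vectors point the same way
sim_ac = cosine_similarity(claim_a, claim_c)  # vectors share no direction
print(sim_ab, sim_ac)
```

Because the score depends only on direction, scaling an embedding does not change it; two claims with identical text would therefore be expected to score at the top of the range.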

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
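
The Pooling configuration above enables only pooling_mode_cls_token, so the sentence embedding is the vector of the first token rather than an average over all tokens. The following is a minimal NumPy sketch of that difference, not the library's implementation: the random array is a hypothetical stand-in for the Transformer module's output, and mean_pooling is included only for contrast.

```python
import numpy as np


def cls_pooling(token_embeddings: np.ndarray) -> np.ndarray:
    """Take the first-token vector: (batch, seq_len, dim) -> (batch, dim)."""
    return token_embeddings[:, 0, :]


def mean_pooling(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average the vectors of non-padding tokens (shown for contrast only)."""
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Hypothetical token embeddings: 2 sequences, 5 tokens each, 768 dimensions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 5, 768))
# The second sequence has two padding positions at the end.
mask = np.array([[1, 1, 1, 1, 1], [1, 1, 1, 0, 0]])

cls_vec = cls_pooling(tokens)
mean_vec = mean_pooling(tokens, mask)
print(cls_vec.shape, mean_vec.shape)
```

With CLS pooling the attention mask plays no role at this stage, because only position 0 of each sequence is read.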

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    '1. A sputtering target having a component composition that contains 1 to 40 at % of Ga, 0.05 to 2 at % of Na as metal element components, and the balance composed of Cu and unavoidable impurities, wherein the sputtering target contains Na in at least one form selected from among sodium fluoride, sodium sulfide, and sodium selenide and the content of oxygen is from 100 to 1,000 ppm.',
    '1. A sputtering target having a component composition that contains 1 to 40 at % of Ga, 0.05 to 2 at % of Na as metal element components, and the balance composed of Cu and unavoidable impurities, wherein the sputtering target contains Na in at least one form selected from among sodium fluoride, sodium sulfide, and sodium selenide and the content of oxygen is from 100 to 1,000 ppm.',
    '1. An electrical energy supply system providing voltage to a first load, comprising: an external power group providing an external voltage; and a DC supply device receiving the external voltage and comprising: a first bus receiving the external voltage and coupled to the first load; a first converting unit converting the external voltage into a first converted voltage when a voltage level of the first bus reaches a pre-determined level, and converting a first stored voltage to generate a converted result when the voltage level of the first bus is less than the pre-determined level; a first storage unit storing the first converted voltage when the voltage level of the first bus reaches the pre-determined level and providing the first stored voltage to the first converting unit when the voltage level of the first bus is less than the pre-determined level; and a first smart energy management system (SEMS) controlling at least one of the first converting unit, the external power group and the first load according to at least one of the external voltage, a voltage level of the first bus and a voltage level of the first storage unit, wherein the first SEMS controls the external power group to adjust the external voltage according to the voltage level of the first bus.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 1.0000, 0.0550],
#         [1.0000, 1.0000, 0.0550],
#         [0.0550, 0.0550, 1.0000]])
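
The scores above are cosine similarities, the similarity function listed in the model description. As a rough illustration of why the two identical input sentences score 1.0000 against each other, here is a small NumPy sketch; the 4-dimensional vectors are made-up stand-ins for the model's 768-dimensional embeddings, and cosine_similarity_matrix is a hypothetical helper, not part of the library.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Toy vectors (assumed values, chosen for illustration only).
toy = np.array([
    [1.0, 0.0, 2.0, 0.0],
    [2.0, 0.0, 4.0, 0.0],  # same direction as row 0, different length
    [0.0, 3.0, 0.0, 1.0],  # orthogonal to rows 0 and 1
])

sims = cosine_similarity_matrix(toy, toy)
print(np.round(sims, 4))
```

Cosine similarity depends only on direction, so rows 0 and 1 match perfectly despite their different norms, while the orthogonal row scores 0 against both.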

Training Details

Training Dataset

Unnamed Dataset

  • Size: 35,100 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    column      type    details
    sentence_0  string  min: 13 tokens; mean: 192.87 tokens; max: 512 tokens
    sentence_1  string  min: 13 tokens; mean: 192.87 tokens; max: 512 tokens
    label       int     0: ~50.10%; 1: ~49.90%
  • Samples:
    Sample 1
      sentence_0: 1. A method for producing float glass, comprising: feeding air to a first ion transport membrane which produces a stream of pure oxygen and a stream of oxygen-depleted air; feeding the stream of pure oxygen to a glass melting furnace; feeding a mixture of steam and a hydrocarbon fuel to one side of a second ion transport membrane and the stream of oxygen-depleted air to the other side of the second oxygen transport membrane to produce a stream of syngas and a nitrogen-rich stream; feeding the stream of syngas to a third ion transport membrane to produce a stream of pure hydrogen and a stream of hydrogen-depleted syngas; feeding the nitrogen-rich stream the hydrogen-depleted syngas stream to a combustor to produce an oxygen-free stream of nitrogen and carbon dioxide; removing H mixing the stream of pure hydrogen and the purified stream of nitrogen and carbon dioxide; and feeding the mixed stream to the surface of a float glass bath downstream of the glass melting furnace.
      sentence_1: 1. A method for producing float glass, comprising: feeding air to a first ion transport membrane which produces a stream of pure oxygen and a stream of oxygen-depleted air; feeding the stream of pure oxygen to a glass melting furnace; feeding a mixture of steam and a hydrocarbon fuel to one side of a second ion transport membrane and the stream of oxygen-depleted air to the other side of the second oxygen transport membrane to produce a stream of syngas and a nitrogen-rich stream; feeding the stream of syngas to a third ion transport membrane to produce a stream of pure hydrogen and a stream of hydrogen-depleted syngas; feeding the nitrogen-rich stream the hydrogen-depleted syngas stream to a combustor to produce an oxygen-free stream of nitrogen and carbon dioxide; removing H mixing the stream of pure hydrogen and the purified stream of nitrogen and carbon dioxide; and feeding the mixed stream to the surface of a float glass bath downstream of the glass melting furnace.
      label: 1
    Sample 2
      sentence_0: 1. An application device for a cosmetic product comprising: a holding member, an application member having a surface for application of the product, and a heating electric element; wherein the heating electric element is formed of at least one resistor mounted on a printed circuit positioned, at least in part at a distal end of the application member, and in that a surface area of the orthogonal projection of the resistor on a plane defined by the printed circuit is less than or equal to 10 mm
      sentence_1: 1. An application device for a cosmetic product comprising: a holding member, an application member having a surface for application of the product, and a heating electric element; wherein the heating electric element is formed of at least one resistor mounted on a printed circuit positioned, at least in part at a distal end of the application member, and in that a surface area of the orthogonal projection of the resistor on a plane defined by the printed circuit is less than or equal to 10 mm
      label: 0
    Sample 3
      sentence_0: 1. A vehicle communication network comprises: a plurality of vehicle control modules; a network fabric, wherein the network fabric comprises: a network manager operably coupled to the network fabric, wherein the network manager is operable to: wherein the data bridge is operable to:
      sentence_1: 1. A vehicle communication network comprises: a plurality of vehicle control modules; a network fabric, wherein the network fabric comprises: a network manager operably coupled to the network fabric, wherein the network manager is operable to: wherein the data bridge is operable to:
      label: 1
  • Loss: SoftmaxLoss
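
For orientation, the classification objective described in the Sentence-BERT paper cited below feeds the concatenation (u, v, |u - v|) of the two sentence embeddings to a linear classifier and applies cross-entropy over the labels. The NumPy sketch below illustrates that formulation under stated assumptions: two labels (0 and 1, as in the statistics above), random stand-in embeddings, and a zero-initialized classifier. Whether this model's training used exactly this concatenation is an assumption, and softmax_loss is a hypothetical helper rather than the library class.

```python
import numpy as np


def softmax_loss(u, v, labels, weight, bias):
    """Cross-entropy over a linear classifier on (u, v, |u - v|)."""
    features = np.concatenate([u, v, np.abs(u - v)], axis=1)  # (batch, 3 * dim)
    logits = features @ weight + bias                          # (batch, num_labels)
    # Numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Mean negative log-likelihood of the correct label.
    return float(-log_probs[np.arange(len(labels)), labels].mean())


rng = np.random.default_rng(0)
dim, num_labels, batch = 768, 2, 16
u = rng.normal(size=(batch, dim))   # stand-in embeddings for sentence_0
v = rng.normal(size=(batch, dim))   # stand-in embeddings for sentence_1
labels = rng.integers(0, num_labels, size=batch)

# Assumption: a zero-initialized classifier, used only to show the baseline loss.
weight = np.zeros((3 * dim, num_labels))
bias = np.zeros(num_labels)

loss = softmax_loss(u, v, labels, weight, bias)
print(loss)
```

A classifier that assigns equal probability to both labels yields a loss of ln 2 (about 0.693), which gives a reference point for reading the training loss values reported in the logs.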

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
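
The combination of lr_scheduler_type: linear, learning_rate: 5e-05, and warmup_steps: 0 listed above means the learning rate starts at its peak and decays linearly toward zero. The sketch below is a hand-written approximation of that schedule, not a call into the training library. The total of 2194 steps is an assumption: it is ceil(35100 / 16) for one epoch on a single device with no gradient accumulation.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed total number of optimizer steps (see the lead-in above).
TOTAL_STEPS = 2194

for step in (0, 500, 1000, 1500, 2000, TOTAL_STEPS):
    print(step, f"{linear_lr(step, TOTAL_STEPS):.3e}")
```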

Training Logs

Epoch   Step  Training Loss
0.2279  500   0.5229
0.4558  1000  0.4447
0.6837  1500  0.4322
0.9116  2000  0.4234
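
The Epoch column can be cross-checked against the dataset size and batch size reported earlier. The short calculation below assumes a single device (gradient_accumulation_steps is 1 and dataloader_drop_last is False per the hyperparameter list), so one epoch is ceil(35100 / 16) optimizer steps.

```python
import math

num_samples = 35_100   # training samples reported in this card
batch_size = 16        # per_device_train_batch_size; single device assumed

# The last partial batch is kept, so round up.
steps_per_epoch = math.ceil(num_samples / batch_size)
print("steps per epoch:", steps_per_epoch)

# Fraction of the epoch completed at each logged step.
epoch_fractions = [round(step / steps_per_epoch, 4) for step in (500, 1000, 1500, 2000)]
for step, frac in zip((500, 1000, 1500, 2000), epoch_fractions):
    print(step, frac)
```

Under that assumption the computed fractions match the logged Epoch values, which suggests the run completed roughly 2194 steps in total.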

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 5.2.2
  • Transformers: 5.1.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.5.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers and SoftmaxLoss

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}