anasse15
/
MNLP_M3_document_encoder

Sentence Similarity
sentence-transformers
Safetensors
modernbert
feature-extraction
Generated from Trainer
dataset_size:12689
loss:TripletLossWithLogging
Eval Results (legacy)
text-embeddings-inference

  • Libraries
  • sentence-transformers

    How to use anasse15/MNLP_M3_document_encoder with sentence-transformers:

    from sentence_transformers import SentenceTransformer
    
    model = SentenceTransformer("anasse15/MNLP_M3_document_encoder")
    
    sentences = [
        "Which of the following statements is true regarding the properties of zinc-activated ion channels and quaternary carbon atoms?\nA. Quaternary carbon atoms are primarily involved in the activation of zinc-activated ion channels.\nB. Both zinc-activated ion channels and quaternary carbon atoms are unique to the rat genome.\nC. Zinc-activated ion channels are cation-permeable and can activate spontaneously, while quaternary carbon atoms are found in hydrocarbons with at least five carbon atoms.\nD. Zinc-activated ion channels are exclusively found in the human genome, while quaternary carbon atoms can only exist in linear alkanes.",
        "A quaternary carbon is a carbon atom bound to four other carbon atoms. For this reason, quaternary carbon atoms are found only in hydrocarbons having at least five carbon atoms. Quaternary carbon atoms can occur in branched alkanes, but not in linear alkanes.\n\nSynthesis \nThe formation of chiral quaternary carbon centers has been a synthetic challenge. Chemists have developed asymmetric Diels–Alder reactions, Heck reaction, Enyne cyclization, cycloaddition reactions, C–H activation, Allylic substitution,  Pauson–Khand reaction,  etc. to construct asymmetric quaternary carbons.\n\nReferences \n\nChemical nomenclature\nOrganic chemistry",
        "Severe fever with thrombocytopenia syndrome (SFTS) is an emerging infectious disease caused by Dabie bandavirus also known as the SFTS virus, first reported between late March and mid-July 2009 in rural areas of Hubei and Henan provinces in Central China. SFTS has fatality rates ranging from 12% to as high as 30% in some areas. The major clinical symptoms of SFTS are fever, vomiting, diarrhea, multiple organ failure, thrombocytopenia (low platelet count), leucopenia (low white blood cell count), and elevated liver enzyme levels.\n\nVirology\nSFTS virus (SFTSV) is a virus in the order Bunyavirales. Person-to-person transmission was not noted in early reports but has since been documented.\n\nThe life cycle of the SFTSV most likely involves arthropod vectors and animal hosts. Humans appear to be largely accidental hosts. SFTSV has been detected in Haemaphysalis longicornis ticks.\n\nEpidemiology\nSFTS occurs in China's rural areas from March to November with the majority of cases from April to July. In 2013, Japan and Korea also reported several cases with deaths.\n\nIn July 2013, South Korea reported a total of eight deaths since August 2012.\n\nIn July 2017, Japanese doctors reported that a woman had died of SFTS after being bitten by a cat that may have itself infected by a tick. The woman had no visible tick bites, leading doctors to believe that the cat — which died as well — was the transmission vector.\n\nIn early 2020 an outbreak occurred in East China, more than 37 people were found with SFTS in Jiangsu province, while 23 more were found infected in Anhui province in August 2020. Seven people have died.\n\nEvolution\nThe virus originated 50–150 years ago and has undergone a recent population expansion.\n\nHistory\nIn 2009 Xue-jie Yu and colleagues isolated the SFTS virus (SFTSV) from SFTS patients’ blood.\n\nReferences\n\nExternal links \n\nArthropod-borne viral fevers and viral haemorrhagic fevers\nInsect-borne diseases\nZoonoses",
        "Lecticans, also known as hyalectans, are a family of proteoglycans (a type protein that is attached to chains of negatively charged polysaccharides) that are components of the extracellular matrix.  There are four members of the lectican family: aggrecan, brevican, neurocan, and versican.  Lecticans interact with hyaluronic acid and tenascin-R to form a ternary complex.\n\nTissue distribution \n\nAggrecan is a major component of extracellular matrix in cartilage whereas versican is widely expressed in a number of connective tissues including those in vascular smooth muscle, skin epithelial cells, and the cells of central and peripheral nervous system. The expression of neurocan and brevican is largely restricted to neural tissues.\n\nStructure \n\nAll four lecticans contain an N-terminal globular domain (G1 domain) that in turn contains an immunoglobulin V-set domain and a Link domain that binds hyaluronic acid; a long extended central domain (CS) that is modified with covalently attached sulfated glycosaminoglycan chains, and a C-terminal globular domain (G3 domain) containing of one or more EGF repeats, a C-type lectin domain and a CRP-like domain. Aggrecan has in addition a globular domain (G2 domain) that is situated between the G1 and CS domains.\n\nSee also \nHyaladherin\n\nReferences \n\nProtein families"
    ]
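    # Encode every text into one embedding vector (one row per input).
    embeddings = model.encode(sentences)
    
    # Pairwise similarity between all inputs; row 0 compares the question with each passage.
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # torch.Size([4, 4])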
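In the snippet above, the first input is a multiple-choice question and the remaining three are candidate passages, so the first row of the similarity matrix can be read as question-to-passage scores. The following is a minimal sketch of that ranking step in plain Python. It assumes cosine similarity is the scoring function, which this page does not confirm for this model, and it uses small toy vectors in place of real embeddings. The names `cosine` and `rank_documents` are hypothetical helpers and not part of sentence-transformers.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return (document index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings (assumption:
# with the real model, these would be rows of model.encode(sentences)).
query = [1.0, 0.0, 1.0]
docs = [
    [0.9, 0.1, 1.1],     # points almost the same way as the query
    [0.0, 1.0, 0.0],     # orthogonal to the query
    [-1.0, 0.0, -1.0],   # points the opposite way
]

ranking = rank_documents(query, docs)
for index, score in ranking:
    print(index, round(score, 3))
```

With real embeddings, `query` would correspond to `embeddings[0]` and `docs` to `embeddings[1:]`, so the top-ranked index identifies the passage the encoder considers closest to the question.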
MNLP_M3_document_encoder
600 MB
  • 1 contributor
History: 2 commits
anasse15
Add new SentenceTransformer model
5229ac8 verified 11 months ago
  • 1_Pooling
    Add new SentenceTransformer model 11 months ago
  • .gitattributes
    1.52 kB
    initial commit 11 months ago
  • README.md
    46 kB
    Add new SentenceTransformer model 11 months ago
  • config.json
    1.21 kB
    Add new SentenceTransformer model 11 months ago
  • config_sentence_transformers.json
    205 Bytes
    Add new SentenceTransformer model 11 months ago
  • model.safetensors
    596 MB
    xet
    Add new SentenceTransformer model 11 months ago
  • modules.json
    229 Bytes
    Add new SentenceTransformer model 11 months ago
  • sentence_bert_config.json
    54 Bytes
    Add new SentenceTransformer model 11 months ago
  • special_tokens_map.json
    694 Bytes
    Add new SentenceTransformer model 11 months ago
  • tokenizer.json
    3.58 MB
    Add new SentenceTransformer model 11 months ago
  • tokenizer_config.json
    20.9 kB
    Add new SentenceTransformer model 11 months ago