Stergios-Konstantinidis/MNLP_M2_document_encoder

Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
Generated from Trainer
dataset_size:21000
loss:ContrastiveTensionLoss
text-embeddings-inference

Instructions to use Stergios-Konstantinidis/MNLP_M2_document_encoder with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • sentence-transformers

    How to use Stergios-Konstantinidis/MNLP_M2_document_encoder with sentence-transformers:

    from sentence_transformers import SentenceTransformer
    
    model = SentenceTransformer("Stergios-Konstantinidis/MNLP_M2_document_encoder")
    
    # Example passages (scientific text with inline LaTeX); raw strings keep the backslashes intact.
    sentences = [
        "The lemma follows by invoking Lemma 4.1 and Lemma A.1.\n\u220e",
        (
            r"To better address non-stationarity with changing uncertainty, we introduce Location-Scale Noise Model (LSNM) into DDPMs, which relaxes the traditional Additive Noise Model (ANM) by incorporating a contextually changing variance: $\mathbf{Y}=f(\mathbf{X})+\sqrt{g(\mathbf{X})}\boldsymbol{\epsilon}$, where $g(\mathbf{X})$ is an $\mathbf{X}$-dependent variance model. "
            r"LSNM is capable of modeling both the contextual mean through $f(\mathbf{X})$ and the contextual uncertainty through $\sqrt{g(\mathbf{X})}$. In the special case where $g(\mathbf{X})\equiv 1$, this simplifies to the standard ANM. "
            "Building upon this more flexible and expressive assumption, we propose the Non-stationary Diffusion Model (NsDiff) framework, which provides an uncertainty-aware noise schedule for both forward and reverse diffusion processes. In summary, our contributions are as follows: "
            "• We observe that the ANM is inadequate for capturing the varying uncertainty and propose a novel framework that integrates LSNM to allow for explicit uncertainty modeling. This work is the first attempt to introduce LSNM into probabilistic time series forecasting. "
            "• To fundamentally elevate the noise modeling capabilities of DDPM, we seamlessly integrate time-varying variances into the core diffusion process through an uncertainty-aware noise schedule that dynamically adapts the noise variance at each step. "
            "• Experimental results indicate that NsDiff achieves superior performance in capturing uncertainty. Specifically, in comparison to the second-best recent baseline TMDM, NsDiff improves up to 66.3% on real-world datasets and 88.3% on synthetic datasets."
        ),
        (
            r"The deep neural network representation of the Bifrost simulations is highly compressed compared to the original Bifrost data: the deep neural network has 44,261 floating point values whereas the Bifrost simulation cube has $96\cdot 96\cdot 64\cdot 20=11,796,480$ floating point values. "
            "This corresponds to a compression by a factor of 267; this compression factor may be different for other numerical simulations and depends on their smoothness. "
            "In addition, the deep neural network can be evaluated at any point in space and time covered by the simulations, therefore enabling a trivial way to interpolate between grid points; furthermore, gradients are calculated with high efficiency with automatic differentiation. "
            "As such, it might be worth considering releasing deep-neural-network representations of (magneto)hydrodynamic simulations."
        ),
        (
            r"$\epsilon_{y}(\mu)=\begin{cases}"
            r"\frac{1}{n_{t}}\sum\limits_{i=n_{k}}^{n_{t}}e_{y}(t_{i},\mu)=\frac{1}{n_{t}}\sum\limits_{i=n_{k}}^{n_{t}}|\tilde{y}(t_{i},\mu)-y(t_{i},\mu)|&\text{if }\frac{1}{n_{t}}\sum\limits_{i=n_{k}}^{n_{t}}|y(t_{i},\mu)|\leq 1,"
            r"\\ \frac{1}{n_{t}}\sum\limits_{i=n_{k}}^{n_{t}}e_{y,rel}(t_{i},\mu)=\frac{1}{n_{t}}\sum\limits_{i=n_{k}}^{n_{t}}|\tilde{y}(t_{i},\mu)-y(t_{i},\mu)|/|y(t_{i},\mu)|&\text{if }\frac{1}{n_{t}}\sum\limits_{i=n_{k}}^{n_{t}}|y(t_{i},\mu)|>1."
            r"\end{cases}$ (12)"
        ),
    ]
    embeddings = model.encode(sentences)
    
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # torch.Size([4, 4])
  • Notebooks
  • Google Colab
  • Kaggle
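
The `model.similarity(embeddings, embeddings)` call in the sentence-transformers snippet above returns one score per pair of inputs, which is why four sentences give a 4 x 4 result. As a minimal sketch of what such a matrix contains, the following assumes cosine similarity (the sentence-transformers default when a model does not configure another function; whether this model overrides it is not shown on this page). The function name `cosine_similarity_matrix` and the toy vectors are hypothetical stand-ins, not part of the library or of this model.

```python
import math

def cosine_similarity_matrix(embeddings):
    """Return the pairwise cosine similarity matrix for a list of vectors."""
    # Euclidean norm of each vector, computed once up front.
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in embeddings]
    matrix = []
    for vec_a, norm_a in zip(embeddings, norms):
        row = []
        for vec_b, norm_b in zip(embeddings, norms):
            dot = sum(a * b for a, b in zip(vec_a, vec_b))
            row.append(dot / (norm_a * norm_b))
        matrix.append(row)
    return matrix

# Toy 3-dimensional vectors standing in for real embeddings (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [2.0, 0.0, 0.0],
]
similarities = cosine_similarity_matrix(toy_embeddings)
print(len(similarities), len(similarities[0]))
```

Entry `[i][j]` compares input `i` with input `j`, so the diagonal is (up to rounding) 1 and the matrix is symmetric. Vectors that point in the same direction score 1 regardless of length, which is the case for the first and last toy vectors here.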
MNLP_M2_document_encoder
91.9 MB
  • 1 contributor
History: 2 commits
Stergios-Konstantinidis
Add new SentenceTransformer model
2695b3b verified 12 months ago
  • 1_Pooling
    Add new SentenceTransformer model 12 months ago
  • .gitattributes
    1.52 kB
    initial commit 12 months ago
  • README.md
    65.2 kB
    Add new SentenceTransformer model 12 months ago
  • config.json
    617 Bytes
    Add new SentenceTransformer model 12 months ago
  • config_sentence_transformers.json
    205 Bytes
    Add new SentenceTransformer model 12 months ago
  • model.safetensors
    90.9 MB
    xet
    Add new SentenceTransformer model 12 months ago
  • modules.json
    349 Bytes
    Add new SentenceTransformer model 12 months ago
  • sentence_bert_config.json
    53 Bytes
    Add new SentenceTransformer model 12 months ago
  • special_tokens_map.json
    695 Bytes
    Add new SentenceTransformer model 12 months ago
  • tokenizer.json
    712 kB
    Add new SentenceTransformer model 12 months ago
  • tokenizer_config.json
    1.46 kB
    Add new SentenceTransformer model 12 months ago
  • vocab.txt
    232 kB
    Add new SentenceTransformer model 12 months ago