
dataera2013/mt-1

Tags: Sentence Similarity · sentence-transformers · Safetensors · bert · feature-extraction · Generated from Trainer · dataset_size:200 · loss:MatryoshkaLoss · loss:MultipleNegativesRankingLoss · Eval Results (legacy) · text-embeddings-inference

Instructions for using dataera2013/mt-1 with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.

  • Libraries
  • sentence-transformers

    How to use dataera2013/mt-1 with sentence-transformers:

    from sentence_transformers import SentenceTransformer
    
    # Download the model from the Hugging Face Hub
    model = SentenceTransformer("dataera2013/mt-1")
    
    # Example texts to embed (a question plus three passages)
    sentences = [
        "How do longer inputs enhance the problem-solving capabilities of an LLM?",
        "Longer inputs dramatically increase the scope of problems that can be solved with an LLM: you can now throw in an entire book and ask questions about its contents, but more importantly you can feed in a lot of example code to help the model correctly solve a coding problem. LLM use-cases that involve long inputs are far more interesting to me than short prompts that rely purely on the information already baked into the model weights. Many of my tools were built using this pattern.",
        "If you think about what they do, this isn’t such a big surprise. The grammar rules of programming languages like Python and JavaScript are massively less complicated than the grammar of Chinese, Spanish or English.\nIt’s still astonishing to me how effective they are though.\nOne of the great weaknesses of LLMs is their tendency to hallucinate—to imagine things that don’t correspond to reality. You would expect this to be a particularly bad problem for code—if an LLM hallucinates a method that doesn’t exist, the code should be useless.",
        "blogging\n            68\n\n\n            ai\n            1098\n\n\n            generative-ai\n            942\n\n\n            llms\n            930\n\nNext: Tom Scott, and the formidable power of escalating streaks\nPrevious: Last weeknotes of 2023\n\n\n \n \n\n\nColophon\n©\n2002\n2003\n2004\n2005\n2006\n2007\n2008\n2009\n2010\n2011\n2012\n2013\n2014\n2015\n2016\n2017\n2018\n2019\n2020\n2021\n2022\n2023\n2024\n2025"
    ]
    # Encode the texts into dense embeddings
    embeddings = model.encode(sentences)
    
    # Compute pairwise similarity scores between all embeddings (cosine by default)
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [4, 4]
  • Notebooks
  • Google Colab
  • Kaggle
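
The loss:MatryoshkaLoss tag above indicates the embeddings were trained so that truncated prefixes remain useful on their own. Below is a minimal sketch of loading the model at a reduced dimension via the truncate_dim parameter of sentence-transformers; the choice of 256 is an assumption, since the page shown here does not list the trained Matryoshka dimensions:

    from sentence_transformers import SentenceTransformer
    
    # Keep only the first 256 embedding dimensions (assumes 256 was among
    # the Matryoshka dimensions used during training)
    model = SentenceTransformer("dataera2013/mt-1", truncate_dim=256)
    
    embeddings = model.encode(["How do longer inputs enhance the problem-solving capabilities of an LLM?"])
    print(embeddings.shape)  # (1, 256)

Smaller truncated embeddings trade a little accuracy for faster similarity search and lower storage, which is the usual motivation for Matryoshka training.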
mt-1 (1.34 GB)
1 contributor · History: 2 commits
Latest commit: dataera2013, "Add new SentenceTransformer model" (0b52910, verified, about 1 year ago)
  • 1_Pooling/ · Add new SentenceTransformer model · about 1 year ago
  • .gitattributes · 1.52 kB · initial commit · about 1 year ago
  • README.md · 28.4 kB · Add new SentenceTransformer model · about 1 year ago
  • config.json · 641 Bytes · Add new SentenceTransformer model · about 1 year ago
  • config_sentence_transformers.json · 281 Bytes · Add new SentenceTransformer model · about 1 year ago
  • model.safetensors · 1.34 GB · Add new SentenceTransformer model · about 1 year ago
  • modules.json · 349 Bytes · Add new SentenceTransformer model · about 1 year ago
  • sentence_bert_config.json · 53 Bytes · Add new SentenceTransformer model · about 1 year ago
  • special_tokens_map.json · 695 Bytes · Add new SentenceTransformer model · about 1 year ago
  • tokenizer.json · 712 kB · Add new SentenceTransformer model · about 1 year ago
  • tokenizer_config.json · 1.41 kB · Add new SentenceTransformer model · about 1 year ago
  • vocab.txt · 232 kB · Add new SentenceTransformer model · about 1 year ago
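
The file listing above is the standard sentence-transformers layout: modules.json and sentence_bert_config.json describe the module pipeline, and 1_Pooling/ holds the pooling configuration. A minimal sketch of loading from a local clone instead of the Hub, assuming you have git-cloned the repository into ./mt-1:

    from sentence_transformers import SentenceTransformer
    
    # Assumes a prior: git clone https://huggingface.co/dataera2013/mt-1
    # (git-lfs is needed to fetch the 1.34 GB model.safetensors)
    model = SentenceTransformer("./mt-1")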