Technologic101/finetuned_arctic_ft

Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
Generated from Trainer
dataset_size:156
loss:MatryoshkaLoss
loss:MultipleNegativesRankingLoss
Eval Results (legacy)
text-embeddings-inference

Instructions to use Technologic101/finetuned_arctic_ft with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • sentence-transformers

    How to use Technologic101/finetuned_arctic_ft with sentence-transformers:

    from sentence_transformers import SentenceTransformer
    
    model = SentenceTransformer("Technologic101/finetuned_arctic_ft")
    
    sentences = [
        "What significant multi-modal models were released by major vendors in 2024?",
        "The boring yet crucial secret behind good system prompts is test-driven development. You don’t write down a system prompt and find ways to test it. You write down tests and find a system prompt that passes them.\n\nIt’s become abundantly clear over the course of 2024 that writing good automated evals for LLM-powered systems is the skill that’s most needed to build useful applications on top of these models. If you have a strong eval suite you can adopt new models faster, iterate better and build more reliable and useful product features than your competition.\nVercel’s Malte Ubl:",
        "In 2024, almost every significant model vendor released multi-modal models. We saw the Claude 3 series from Anthropic in March, Gemini 1.5 Pro in April (images, audio and video), then September brought Qwen2-VL and Mistral’s Pixtral 12B and Meta’s Llama 3.2 11B and 90B vision models. We got audio input and output from OpenAI in October, then November saw SmolVLM from Hugging Face and December saw image and video models from Amazon Nova.\nIn October I upgraded my LLM CLI tool to support multi-modal models via attachments. It now has plugins for a whole collection of different vision models.",
        "Intuitively, one would expect that systems this powerful would take millions of lines of complex code. Instead, it turns out a few hundred lines of Python is genuinely enough to train a basic version!\nWhat matters most is the training  data. You need a lot of data to make these things work, and the quantity and quality of the training data appears to be the most important factor in how good the resulting model is.\nIf you can gather the right data, and afford to pay for the GPUs to train it, you can build an LLM."
    ]
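    # Encode each sentence into one embedding vector (one row per sentence).
    embeddings = model.encode(sentences)
    
    # Pairwise similarity between all sentences; returns a 4 x 4 tensor.
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # torch.Size([4, 4])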
  • Notebooks
  • Google Colab
  • Kaggle
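The snippet above compares every sentence with every other one. For retrieval, the more common pattern is to score one query against a set of passages and sort by similarity. The following is a minimal sketch of that step, not the model author's code: `cosine`, `truncate`, `rank_passages`, and `demo` are hypothetical helper names, and the two passages in `demo` are shortened excerpts of the sentences above. The `loss:MatryoshkaLoss` tag suggests the embeddings may tolerate truncation to a shorter prefix, but the trained dimensions are not listed on this page, so any `dim` passed to `truncate` is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity with a zero vector is undefined; report 0
    return dot / (norm_a * norm_b)


def truncate(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit length.

    Only meaningful if the model was trained for that prefix length
    (an assumption here; this page does not list the trained sizes).
    """
    head = [float(x) for x in vec[:dim]]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0.0 else head


def rank_passages(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def demo() -> None:
    """Hypothetical end-to-end run; downloads the model on first call."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("Technologic101/finetuned_arctic_ft")
    query = "What significant multi-modal models were released by major vendors in 2024?"
    passages = [
        "The boring yet crucial secret behind good system prompts is test-driven development.",
        "In 2024, almost every significant model vendor released multi-modal models.",
    ]
    query_vec = model.encode(query)
    passage_vecs = model.encode(passages)
    for index, score in rank_passages(query_vec, passage_vecs):
        print(f"{score:.3f}  {passages[index]}")
```

Calling `demo()` prints the passages ordered by score. The helpers are plain Python so the ranking logic can be checked without downloading the 1.34 GB weights; for larger passage sets, the library's own `model.similarity` shown above is likely the better choice.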
finetuned_arctic_ft
1.34 GB
  • 1 contributor
History: 2 commits
Technologic101
Add new SentenceTransformer model
5872059 verified about 1 year ago
  • 1_Pooling
    Add new SentenceTransformer model about 1 year ago
  • .gitattributes
    1.52 kB
    initial commit about 1 year ago
  • README.md
    29.3 kB
    Add new SentenceTransformer model about 1 year ago
  • config.json
    641 Bytes
    Add new SentenceTransformer model about 1 year ago
  • config_sentence_transformers.json
    281 Bytes
    Add new SentenceTransformer model about 1 year ago
  • model.safetensors
    1.34 GB
    xet
    Add new SentenceTransformer model about 1 year ago
  • modules.json
    349 Bytes
    Add new SentenceTransformer model about 1 year ago
  • sentence_bert_config.json
    53 Bytes
    Add new SentenceTransformer model about 1 year ago
  • special_tokens_map.json
    695 Bytes
    Add new SentenceTransformer model about 1 year ago
  • tokenizer.json
    712 kB
    Add new SentenceTransformer model about 1 year ago
  • tokenizer_config.json
    1.41 kB
    Add new SentenceTransformer model about 1 year ago
  • vocab.txt
    232 kB
    Add new SentenceTransformer model about 1 year ago