Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
Generated from Trainer
dataset_size:156
loss:MatryoshkaLoss
loss:MultipleNegativesRankingLoss
Eval Results (legacy)
text-embeddings-inference
Instructions to use Technologic101/finetuned_arctic_ft with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use Technologic101/finetuned_arctic_ft with sentence-transformers:
from sentence_transformers import SentenceTransformer model = SentenceTransformer("Technologic101/finetuned_arctic_ft") sentences = [ "What significant multi-modal models were released by major vendors in 2024?", "The boring yet crucial secret behind good system prompts is test-driven development. You don’t write down a system prompt and find ways to test it. You write down tests and find a system prompt that passes them.\n\nIt’s become abundantly clear over the course of 2024 that writing good automated evals for LLM-powered systems is the skill that’s most needed to build useful applications on top of these models. If you have a strong eval suite you can adopt new models faster, iterate better and build more reliable and useful product features than your competition.\nVercel’s Malte Ubl:", "In 2024, almost every significant model vendor released multi-modal models. We saw the Claude 3 series from Anthropic in March, Gemini 1.5 Pro in April (images, audio and video), then September brought Qwen2-VL and Mistral’s Pixtral 12B and Meta’s Llama 3.2 11B and 90B vision models. We got audio input and output from OpenAI in October, then November saw SmolVLM from Hugging Face and December saw image and video models from Amazon Nova.\nIn October I upgraded my LLM CLI tool to support multi-modal models via attachments. It now has plugins for a whole collection of different vision models.", "Intuitively, one would expect that systems this powerful would take millions of lines of complex code. Instead, it turns out a few hundred lines of Python is genuinely enough to train a basic version!\nWhat matters most is the training data. You need a lot of data to make these things work, and the quantity and quality of the training data appears to be the most important factor in how good the resulting model is.\nIf you can gather the right data, and afford to pay for the GPUs to train it, you can build an LLM." ] embeddings = model.encode(sentences) similarities = model.similarity(embeddings, embeddings) print(similarities.shape) # [4, 4] - Notebooks
- Google Colab
- Kaggle
Ctrl+K