metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
base_model: distilbert/distilbert-base-uncased
model-index:
- name: prdev/mini-gte
results:
- dataset:
config: en
name: MTEB AmazonCounterfactualClassification (en)
revision: e8379541af4e31359cca9fbcf4b00f2671dba205
split: test
type: mteb/amazon_counterfactual
metrics:
- type: accuracy
value: 74.8955
- type: f1
value: 68.8421
- type: f1_weighted
value: 77.1819
- type: ap
value: 37.7315
- type: ap_weighted
value: 37.7315
- type: main_score
value: 74.8955
task:
type: Classification
pipeline_tag: sentence-similarity
library_name: sentence-transformers
Mini-GTE
This is a DistilBERT-based model trained from GTE-base. It can be used as a faster query encoder for the GTE series or as a standalone model (the MTEB scores reported here are for standalone use).
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: distilbert/distilbert-base-uncased
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
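The similarity function listed above, cosine similarity, compares the direction of two embeddings and ignores their magnitude. The following is a minimal pure-Python sketch of that formula. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions, not part of this model's API; in practice the library computes this over the 768-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real 768-dimensional embeddings.
print(cosine_similarity([1.0, 0.0, 0.0], [2.0, 0.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0 (orthogonal)
```

Because the score depends only on direction, scaling an embedding leaves its similarity to other embeddings unchanged.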
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("prdev/mini-gte")
# Run inference
sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
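The introduction describes the model as a query encoder, so a small retrieval sketch may help. It assumes standalone use, meaning the same model encodes both the query and the documents. Pairing it with a GTE document encoder would mean swapping in a second model for the documents, and the card does not say which GTE checkpoint that would be. The helper names `rank_by_similarity` and `search` are hypothetical, and the scores in the demonstration are made-up placeholders, not model output.

```python
def rank_by_similarity(scores, documents):
    """Pair each document with its score, sorted from most to least similar."""
    if len(scores) != len(documents):
        raise ValueError("need exactly one score per document")
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)


def search(query, documents, model_id="prdev/mini-gte"):
    """Rank documents against a query (assumes one encoder for both sides)."""
    # Imported lazily so rank_by_similarity stays usable without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    query_embedding = model.encode([query])
    document_embeddings = model.encode(documents)
    # model.similarity returns a (1, len(documents)) tensor of scores.
    scores = model.similarity(query_embedding, document_embeddings)[0].tolist()
    return rank_by_similarity(scores, documents)


# Ranking step alone, with placeholder scores (NOT produced by the model).
placeholder_scores = [0.2, 0.9, 0.5]
for doc, score in rank_by_similarity(placeholder_scores, ["doc A", "doc B", "doc C"]):
    print(f"{score:.1f}  {doc}")
```

Calling `search("Is it nice out?", sentences)` with the sentences from the example above would download the model on first use and return `(document, score)` pairs in descending order of similarity.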
Training Details
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.1
- Transformers: 4.48.0.dev0
- PyTorch: 2.1.0a0+32f93b1
- Accelerate: 1.2.0
- Datasets: 2.21.0
- Tokenizers: 0.21.0
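The versions above record the training environment; other versions may well work. If results need to be reproduced closely, a short sketch like the following can compare a local environment against the list. The helper `compare_versions` and the `CARD_VERSIONS` mapping are illustrative names, and the package names are assumed to be the usual PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above (training environment, not hard requirements).
CARD_VERSIONS = {
    "sentence-transformers": "3.3.1",
    "transformers": "4.48.0.dev0",
    "torch": "2.1.0a0+32f93b1",
    "accelerate": "1.2.0",
    "datasets": "2.21.0",
    "tokenizers": "0.21.0",
}


def compare_versions(expected):
    """Return {package: (expected_version, installed_version_or_None)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed locally
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
    status = "match" if installed == wanted else "differs"
    print(f"{package}: card {wanted}, installed {installed} ({status})")
```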