	---
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	base_model: distilbert/distilbert-base-uncased

	model-index:
	- name: prdev/mini-gte
	  results:
	  - dataset:
	      config: en
	      name: MTEB AmazonCounterfactualClassification (en)
	      revision: e8379541af4e31359cca9fbcf4b00f2671dba205
	      split: test
	      type: mteb/amazon_counterfactual
	    metrics:
	    - type: accuracy
	      value: 74.8955
	    - type: f1
	      value: 68.84209999999999
	    - type: f1_weighted
	      value: 77.1819
	    - type: ap
	      value: 37.731500000000004
	    - type: ap_weighted
	      value: 37.731500000000004
	    - type: main_score
	      value: 74.8955
	    task:
	      type: Classification

	pipeline_tag: sentence-similarity
	library_name: sentence-transformers
	---

	# Mini-GTE

	This is a DistilBERT-based model trained from GTE-base. It can be used as a faster query encoder for the GTE series or as a standalone model (the MTEB scores reported here are for standalone use).

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) <!-- at revision 12040accade4e8a0f71eabdb258fecc2e7e948be -->
	- Maximum Sequence Length: 512 tokens
	- Output Dimensionality: 768 dimensions
	- Similarity Function: Cosine Similarity
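
The similarity function listed above, cosine similarity, is the dot product of two vectors divided by the product of their norms. The following is a minimal pure-Python sketch of that formula; the helper name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions, not part of this model or of the Sentence Transformers API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors stand in for the model's 768-dimensional embeddings.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

Scores range from -1 (opposite direction) to 1 (same direction), and the magnitude of either vector does not affect the result.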

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("prdev/mini-gte")
	# Run inference
	sentences = [
	    'The weather is lovely today.',
	    "It's so sunny outside!",
	    'He drove to the stadium.',
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# (3, 768)

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# torch.Size([3, 3])
	```
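
One common use of these embeddings is ranking documents against a query. The sketch below is illustrative only: `rank_by_similarity` and `search` are hypothetical helpers, and the model id `prdev/mini-gte` is taken from the `model-index` metadata of this card. The ranking step is plain Python so it can be checked on toy vectors; `search` downloads the model only when it is called. Pairing this model's query embeddings with document embeddings from a GTE model is not shown here.

```python
import math


def rank_by_similarity(query_vec, doc_vecs, top_k=3):
    """Return (index, score) pairs for the top_k documents by cosine similarity."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    scored = [(i, cos(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def search(query, documents, model_id="prdev/mini-gte", top_k=3):
    """Embed the query and documents, then rank. Downloads the model when called."""
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer(model_id)
    query_vec = model.encode(query).tolist()
    doc_vecs = model.encode(documents).tolist()
    ranked = rank_by_similarity(query_vec, doc_vecs, top_k)
    return [(documents[i], score) for i, score in ranked]


# Toy 2-dimensional vectors exercise the ranking step without loading a model.
ranking = rank_by_similarity([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [0.7, 0.7]], top_k=2)
print([i for i, _ in ranking])  # [1, 2]
```

With the model installed, `search("sunny day", sentences)` would return the sentences paired with their scores, highest first.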


	## Training Details

	### Framework Versions
	- Python: 3.10.12
	- Sentence Transformers: 3.3.1
	- Transformers: 4.48.0.dev0
	- PyTorch: 2.1.0a0+32f93b1
	- Accelerate: 1.2.0
	- Datasets: 2.21.0
	- Tokenizers: 0.21.0
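
To compare a local environment against the versions listed above, something like the following sketch may help. It uses only the standard library's `importlib.metadata`; the `CARD_VERSIONS` mapping and the `installed_versions` helper are illustrative names, and the keys are assumed to be the PyPI distribution names (PyTorch is distributed as `torch`). Several of the listed versions look like pre-release or custom builds, so an exact match may not be obtainable from PyPI.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by (assumed) PyPI distribution name.
CARD_VERSIONS = {
    "sentence-transformers": "3.3.1",
    "transformers": "4.48.0.dev0",
    "torch": "2.1.0a0+32f93b1",
    "accelerate": "1.2.0",
    "datasets": "2.21.0",
    "tokenizers": "0.21.0",
}


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(CARD_VERSIONS).items():
    print(f"{name}: installed={installed}, card={CARD_VERSIONS[name]}")
```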

	## Citation

	### BibTeX
