Feature Extraction
Transformers
ONNX
English
bert
sparse sparsity quantized onnx embeddings int8
mteb
Eval Results (legacy)
How to use zeroshot/gte-small-quant with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="zeroshot/gte-small-quant")

# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("zeroshot/gte-small-quant")
model = AutoModel.from_pretrained("zeroshot/gte-small-quant")
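A feature-extraction call typically yields one vector per token, so a pooling step is still needed to get a single sentence embedding. The sketch below shows masked mean pooling followed by L2 normalization on toy data. It assumes mean pooling is the intended strategy for this model, which the page does not state, and the helper names `mean_pool` and `l2_normalize` are hypothetical, not part of Transformers.

```python
import math
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding tokens."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("one mask entry is required per token vector")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy stand-in for model output: three tokens, two dimensions, last token is padding.
# (Assumption: real output would come from the pipeline or model shown above.)
toy_tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
toy_mask = [1, 1, 0]

sentence_vector = mean_pool(toy_tokens, toy_mask)
unit_vector = l2_normalize(sentence_vector)
```

With real output, the per-token vectors and the attention mask for one input would replace `toy_tokens` and `toy_mask`. Normalizing to unit length makes a plain dot product between two embeddings equal to their cosine similarity.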
Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ Current list of sparse and quantized gte-small ONNX models:
 | Links | Sparsification Method |
 | --------------------------------------------------------------------------------------------------- | ---------------------- |
 | [zeroshot/bge-large-en-v1.5-sparse](https://huggingface.co/zeroshot/gte-small-sparse) | Quantization (INT8) & 50% Pruning |
-| [zeroshot/bge-large-en-v1.5-quant](https://huggingface.co/zeroshot/gte-small
+| [zeroshot/bge-large-en-v1.5-quant](https://huggingface.co/zeroshot/gte-small-quant) | Quantization (INT8) |
 
 BGE models using this architecture:
 