Instructions to use sentence-transformers/all-MiniLM-L6-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use sentence-transformers/all-MiniLM-L6-v2 with sentence-transformers:
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    sentences = [
        "That is a happy person",
        "That is a happy dog",
        "That is a very happy person",
        "Today is a sunny day"
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [4, 4]
- Transformers
How to use sentence-transformers/all-MiniLM-L6-v2 with Transformers:
    # Load model directly
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
    model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
- Inference
- Notebooks
- Google Colab
- Kaggle
Quantization technique used for the quantized all-MiniLM-L6-v2 models
Which quantization technique was used to produce the quantized all-MiniLM-L6-v2 models?
Hello!
It depends. We have (u)int8 quantized models that use the arm64, avx2, avx512, and avx512_vnni quantization configurations, as defined here: https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/configuration#optimum.onnxruntime.AutoQuantizationConfig
And with OpenVINO, we use Static Quantization as described here: https://huggingface.co/docs/optimum/main/en/intel/openvino/optimization#static-quantization
- Tom Aarsen
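For readers unfamiliar with the terms in the reply above, here is a minimal, library-free sketch of uint8 affine quantization. It is an illustration only, not the Optimum or OpenVINO implementation, and the function names (`affine_params`, `quantize`, `dequantize`) are hypothetical. It shows the one thing that separates static from dynamic quantization: static quantization fixes the activation range ahead of time from calibration data, while dynamic quantization derives it from each input at runtime. The reply does not say which of the two the ONNX configurations use, so the sketch shows both.

```python
def affine_params(lo, hi, qmin=0, qmax=255):
    """Scale and zero point that map the float range [lo, hi] onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Round each float to the nearest integer level, clipping to [qmin, qmax]."""
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]


def dequantize(qs, scale, zero_point):
    """Map integer levels back to approximate float values."""
    return [(q - zero_point) * scale for q in qs]


# Hypothetical activation values; the last one is an outlier.
activations = [-1.0, -0.25, 0.0, 0.5, 3.0]

# Dynamic: the range is taken from the input itself at inference time.
dyn_scale, dyn_zp = affine_params(min(activations), max(activations))
dyn_q = quantize(activations, dyn_scale, dyn_zp)
dyn_restored = dequantize(dyn_q, dyn_scale, dyn_zp)

# Static: the range was fixed earlier from calibration data (assumed here
# to have covered only [-1.0, 1.0]), so the outlier saturates at 255.
static_scale, static_zp = affine_params(-1.0, 1.0)
static_q = quantize(activations, static_scale, static_zp)
static_restored = dequantize(static_q, static_scale, static_zp)

print("dynamic:", dyn_q, [round(v, 3) for v in dyn_restored])
print("static: ", static_q, [round(v, 3) for v in static_restored])
```

In this toy case the dynamic range reconstructs every value to within half a quantization step, while the static range clips the outlier. That is why static quantization depends on calibration data that is representative of real inputs.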
Can quantization-aware training preserve the accuracy of the all-MiniLM-L6-v2 model if we apply it?