Parameters for peak performance
Are there any stats on performance on the same dataset when changing the document chunk size, chunk strategy, languages, or model quantization?
By trial and error, it seems to me that smaller chunks (a few sentences at most) tend to perform better.
I am trying to compare different embedders at their best, using the proper parameters for each.
First, thank you for your interest in the GTE series models. This is a very interesting question. We do not currently have such experimental data, and because we lack evaluation data of this kind, our previous experiments did not show that shorter texts perform better.
The retrieval effectiveness of the model can be influenced by various factors, such as text length and language. We speculate that the better performance of shorter texts may be due to two reasons:
- The training data for existing models predominantly consists of short texts, as it is relatively difficult to obtain relevance data for long texts.
- The semantic expression of short texts is more precise and concise, which is more conducive to semantic search.
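Since no published numbers exist for this, one way to check the chunk-size effect on your own data is a small harness like the sketch below. It is not an official GTE evaluation. The helper names (`chunk_by_sentences`, `hit_rate_at_k`, `compare_chunk_sizes`), the naive regex sentence splitter, and the hit-rate metric are all assumptions made for illustration. `embed` is any callable that maps a list of strings to a 2-D array of vectors.

```python
import re
from typing import Callable, Dict, List, Sequence, Tuple

import numpy as np

# Hypothetical type alias: any function mapping texts to one vector per text.
EmbedFn = Callable[[List[str]], Sequence[Sequence[float]]]


def chunk_by_sentences(text: str, max_sentences: int) -> List[str]:
    """Split text into chunks of at most `max_sentences` sentences.

    Uses a naive splitter (break after '.', '!' or '?'); swap in a real
    sentence segmenter for production text.
    """
    if max_sentences < 1:
        raise ValueError("max_sentences must be >= 1")
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def _normalize(vectors: Sequence[Sequence[float]]) -> np.ndarray:
    """L2-normalize rows so that a dot product equals cosine similarity."""
    arr = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(arr, axis=1, keepdims=True)
    return arr / np.maximum(norms, 1e-12)


def hit_rate_at_k(
    embed: EmbedFn,
    documents: List[str],
    queries: List[Tuple[str, int]],
    max_sentences: int,
    k: int = 1,
) -> float:
    """Fraction of queries whose top-k chunks include a chunk of the relevant document.

    `queries` holds (query_text, index_of_relevant_document) pairs.
    """
    chunks: List[str] = []
    owners: List[int] = []  # owners[i] is the document index that chunk i came from
    for doc_id, doc in enumerate(documents):
        for chunk in chunk_by_sentences(doc, max_sentences):
            chunks.append(chunk)
            owners.append(doc_id)

    chunk_vecs = _normalize(embed(chunks))
    query_vecs = _normalize(embed([q for q, _ in queries]))
    scores = query_vecs @ chunk_vecs.T  # cosine similarity, shape (queries, chunks)

    hits = 0
    for row, (_, relevant_doc) in zip(scores, queries):
        top_k = np.argsort(-row)[:k]
        hits += int(any(owners[i] == relevant_doc for i in top_k))
    return hits / len(queries)


def compare_chunk_sizes(
    embed: EmbedFn,
    documents: List[str],
    queries: List[Tuple[str, int]],
    sizes: Sequence[int] = (1, 3, 5, 10),
    k: int = 1,
) -> Dict[int, float]:
    """Run the same retrieval test once per chunk size."""
    return {size: hit_rate_at_k(embed, documents, queries, size, k) for size in sizes}
```

To test this model, `embed` could be the `encode` method of `SentenceTransformer("Alibaba-NLP/gte-Qwen2-7B-instruct", trust_remote_code=True)`. Keeping the documents, queries, and `k` fixed while varying only the chunk size (or the embedder) gives a like-for-like comparison on your own corpus.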