Integrate with Sentence Transformers v5.4
Hello!
Pull Request overview
- Integrate this model using a Sentence Transformers CrossEncoder
Details
This PR adds the configuration files needed to load this model directly as a CrossEncoder via Sentence Transformers. The model uses a text-generation Transformer with a LogitScore head that computes the logit difference between the "yes" and "no" tokens, i.e. the model's confidence that a document is relevant to a query.
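As a rough illustration of what such a head computes, the sketch below takes next-token logits at the final position and subtracts the "no" logit from the "yes" logit. The token IDs and logit values are made up for the example (the real IDs live in 1_LogitScore/config.json), and `logit_score` is a hypothetical helper, not part of Sentence Transformers.

```python
import math

def logit_score(last_token_logits, yes_id, no_id):
    """Return logit("yes") - logit("no") for one query-document pair."""
    return last_token_logits[yes_id] - last_token_logits[no_id]

# Toy vocabulary of 5 tokens; IDs 3 and 1 stand in for "yes" and "no" (assumed values).
logits = [0.5, -2.0, 1.0, 4.0, 0.0]
score = logit_score(logits, yes_id=3, no_id=1)
print(score)  # 6.0

# The difference is the log-odds of "yes" vs. "no", so a sigmoid turns it into
# the probability of "yes" under a softmax restricted to those two tokens.
p_yes = 1.0 / (1.0 + math.exp(-score))
softmax_yes = math.exp(logits[3]) / (math.exp(logits[3]) + math.exp(logits[1]))
assert abs(p_yes - softmax_yes) < 1e-12
```

A positive score therefore means "yes" is more likely than "no", and a negative score the reverse.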
A custom chat_template.jinja maps Sentence Transformers' structured messages (with "query" and "document" roles) to the model's expected format with the <Instruct>, <Query>, and <Document> fields, including the <think>\n\n</think> suffix. The template includes a default instruction ("Given a web search query, retrieve relevant passages that answer the query") as a fallback when no prompt is provided.
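The mapping described above might look roughly like the following in plain Python. This is only a sketch: the field separators (colon, space, newline) and the helper name `build_user_content` are assumptions, the surrounding chat-role tokens are omitted, and the authoritative format is whatever chat_template.jinja in this PR renders.

```python
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)

# Empty thinking block that the template appends as a suffix.
THINK_SUFFIX = "<think>\n\n</think>"

def build_user_content(query, document, instruction=None):
    """Format one query-document pair into the three labelled fields."""
    # Fall back to the default instruction when no prompt is provided.
    instruction = instruction or DEFAULT_INSTRUCTION
    # Separator layout is an assumption made for this sketch.
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {document}"

text = build_user_content(
    "What is the capital of China?",
    "The capital of China is Beijing.",
)
print(text + "\n" + THINK_SUFFIX)
```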
Added files:
- modules.json: pipeline: Transformer & LogitScore
- sentence_bert_config.json: text-generation task, flat message format
- config_sentence_transformers.json: default prompt, Identity activation
- chat_template.jinja: custom template for the reranker format
- 1_LogitScore/config.json: yes/no token IDs
Once the Sentence Transformers v5.4 release is out, the model can be used immediately like so:
from sentence_transformers import CrossEncoder
model = CrossEncoder("Qwen/Qwen3-Reranker-0.6B", revision="refs/pr/24")
query = "What is the capital of China?"
documents = [
"The capital of China is Beijing.",
"Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
pairs = [(query, doc) for doc in documents]
scores = model.predict(pairs)
print(scores)
# [ 7.625 -11.375]
rankings = model.rank(query, documents)
print(rankings)
# [{'corpus_id': 0, 'score': 7.625}, {'corpus_id': 1, 'score': -11.375}]
And after merging, the revision argument can be dropped. These scores match the transformers code.
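For readers unfamiliar with the two calls, the relationship between the `predict` scores and the `rank` output can be sketched in plain Python. `rank_from_scores` is a hypothetical helper written for illustration, not the library's implementation. It only mirrors the output shape shown in the example.

```python
def rank_from_scores(scores):
    """Sort documents by score, highest first, keeping each original index."""
    ranked = [{"corpus_id": i, "score": s} for i, s in enumerate(scores)]
    return sorted(ranked, key=lambda item: item["score"], reverse=True)

# Same two scores as in the example, with the document order swapped
# so that the sorting step is visible.
scores = [-11.375, 7.625]
rankings = rank_from_scores(scores)
print(rankings)
# [{'corpus_id': 1, 'score': 7.625}, {'corpus_id': 0, 'score': -11.375}]
```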
Note that none of the old behaviour is affected/changed. It only adds an additional way to run this model in a familiar and common format.
If you are able to merge this before tomorrow's Sentence Transformers v5.4 release, I will be able to include this model in my blog post and documentation as a release model without the revision argument.
- Tom Aarsen