---
tags:
- sentence-transformers
- cross-encoder
- reranker
base_model: mixedbread-ai/mxbai-rerank-base-v2
pipeline_tag: text-ranking
library_name: sentence-transformers
---

# CrossEncoder based on mixedbread-ai/mxbai-rerank-base-v2

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [mixedbread-ai/mxbai-rerank-base-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-base-v2) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [mixedbread-ai/mxbai-rerank-base-v2](https://huggingface.co/mixedbread-ai/mxbai-rerank-base-v2) <!-- at revision 3334116c32105657feed9b0a733f451c012ee61a -->
- **Maximum Sequence Length:** 32768 tokens
- **Number of Output Labels:** 1 label
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross-encoder-testing/mxbai-rerank-base-v2-v6")
# Get scores for pairs of texts
pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (3,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
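
The ranking output above identifies each document only by its `corpus_id`, so a common next step is to map the entries back to their texts. The following is a minimal sketch of that step, not part of the library: `attach_documents` is a hypothetical helper, and the scores in `example_ranks` are placeholders chosen for illustration, not values produced by this model.

```python
from typing import Dict, List, Sequence, Union

# One entry of the list returned by model.rank: {'corpus_id': ..., 'score': ...}
RankResult = Dict[str, Union[int, float]]


def attach_documents(ranks: Sequence[RankResult], documents: Sequence[str]) -> List[dict]:
    """Pair each ranking entry with the text it refers to, highest score first.

    Hypothetical helper for illustration; it only assumes the
    'corpus_id' / 'score' keys shown in the example output above.
    """
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    return [
        {
            "corpus_id": int(r["corpus_id"]),
            "score": float(r["score"]),
            "text": documents[int(r["corpus_id"])],
        }
        for r in ordered
    ]


documents = [
    "There are on average between 55 and 80 calories in an egg depending on its size.",
    "Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.",
    "Most of the calories in an egg come from the yellow yolk in the center.",
]

# Placeholder scores for illustration only; in practice pass the
# result of model.rank(query, documents) instead.
example_ranks = [
    {"corpus_id": 2, "score": 0.1},
    {"corpus_id": 0, "score": 0.9},
    {"corpus_id": 1, "score": 0.4},
]

results = attach_documents(example_ranks, documents)
for entry in results:
    print(f"{entry['score']:.2f}\t{entry['text']}")
```

In recent Sentence Transformers releases, `model.rank` also accepts `return_documents=True` and `top_k`, which may make a helper like this unnecessary; check the Cross Encoder documentation for the version you have installed.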

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Framework Versions
- Python: 3.11.6
- Sentence Transformers: 5.3.0.dev0
- Transformers: 4.57.3
- PyTorch: 2.9.1+cu126
- Accelerate: 1.6.0
- Datasets: 4.2.0
- Tokenizers: 0.22.1
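
To compare a local environment against the versions listed above, something like the following sketch may help. It uses only the standard library's `importlib.metadata`; the `compare_versions` helper and the `CARD_VERSIONS` mapping are hypothetical names introduced here, and the mapping assumes the usual PyPI distribution names (for example `torch` for PyTorch). A mismatch does not necessarily mean the model will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the "Framework Versions" list; keys are assumed
# to be the PyPI distribution names of each framework.
CARD_VERSIONS = {
    "sentence-transformers": "5.3.0.dev0",
    "transformers": "4.57.3",
    "torch": "2.9.1+cu126",
    "accelerate": "1.6.0",
    "datasets": "4.2.0",
    "tokenizers": "0.22.1",
}


def compare_versions(expected):
    """Return {name: (card_version, installed_version_or_None, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment.
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


report = compare_versions(CARD_VERSIONS)
for name, (wanted, installed, same) in report.items():
    status = "match" if same else "differs"
    print(f"{name}: card={wanted} installed={installed or 'not installed'} ({status})")
```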

## Citation

### BibTeX

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->