{
    "model_id": "Alibaba-NLP/gte-multilingual-reranker-base",
    "downloads": 228024,
    "tags": [
        "sentence-transformers",
        "safetensors",
        "new",
        "text-classification",
        "transformers",
        "text-embeddings-inference",
        "text-ranking",
        "custom_code",
        "af",
        "ar",
        "az",
        "be",
        "bg",
        "bn",
        "ca",
        "ceb",
        "cs",
        "cy",
        "da",
        "de",
        "el",
        "en",
        "es",
        "et",
        "eu",
        "fa",
        "fi",
        "fr",
        "gl",
        "gu",
        "he",
        "hi",
        "hr",
        "ht",
        "hu",
        "hy",
        "id",
        "is",
        "it",
        "ja",
        "jv",
        "ka",
        "kk",
        "km",
        "kn",
        "ko",
        "ky",
        "lo",
        "lt",
        "lv",
        "mk",
        "ml",
        "mn",
        "mr",
        "ms",
        "my",
        "ne",
        "nl",
        "no",
        "pa",
        "pl",
        "pt",
        "qu",
        "ro",
        "ru",
        "si",
        "sk",
        "sl",
        "so",
        "sq",
        "sr",
        "sv",
        "sw",
        "ta",
        "te",
        "th",
        "tl",
        "tr",
        "uk",
        "ur",
        "vi",
        "yo",
        "zh",
        "arxiv:2407.19669",
        "license:apache-2.0",
        "region:us"
    ],
    "description": "--- license: apache-2.0 pipeline_tag: text-ranking tags: - transformers - sentence-transformers - text-embeddings-inference language: - af - ar - az - be - bg - bn - ca - ceb - cs - cy - da - de - el - en - es - et - eu - fa - fi - fr - gl - gu - he - hi - hr - ht - hu - hy - id - is - it - ja - jv - ka - kk - km - kn - ko - ky - lo - lt - lv - mk - ml - mn - mr - ms - my - ne - nl - 'no' - pa - pl - pt - qu - ro - ru - si - sk - sl - so - sq - sr - sv - sw - ta - te - th - tl - tr - uk - ur - vi - yo - zh library_name: sentence-transformers --- ## gte-multilingual-reranker-base The **gte-multilingual-reranker-base** model is the first reranker model in the GTE family of models, featuring several key attributes: - **High Performance**: Achieves state-of-the-art (SOTA) results in multilingual retrieval tasks and multi-task representation model evaluations when compared to reranker models of similar size. - **Training Architecture**: Trained using an encoder-only transformer architecture, resulting in a smaller model size. Unlike previous models based on a decoder-only LLM architecture (e.g., gte-qwen2-1.5b-instruct), this model has lower hardware requirements for inference, offering a 10x increase in inference speed. - **Long Context**: Supports text lengths up to **8192** tokens. - **Multilingual Capability**: Supports over **70** languages. ## Model Information - Model Size: 306M - Max Input Tokens: 8192 ### Usage - **For acceleration, it is recommended to install xformers and enable unpadding; refer to enable-unpadding-and-xformers.** - **How to use it offline: new-impl/discussions/2** Using Hugging Face transformers (transformers>=4.36.0). Usage with infinity: Infinity, an MIT-licensed inference REST API server. ## Evaluation [Figure: results of reranking on multiple text retrieval datasets] **More detailed experimental results can be found in the paper**. 
## Cloud API Services In addition to the open-source releases, GTE series models are also available as commercial API services on Alibaba Cloud. - Embedding Models: Three versions of the text embedding models are available: text-embedding-v1/v2/v3, with v3 being the latest API service. - ReRank Models: The gte-rerank model service is available. Note that the models behind the commercial APIs are not entirely identical to the open-source models. ## Citation If you find our paper or models helpful, please consider citing:",
    "model_explanation_gemini": "Multilingual text reranking model supporting 70+ languages with high performance, long-context handling (8192 tokens), and efficient inference for retrieval tasks."
}