mavis-ai/Multilingual-e5-large-Q6
This repository contains a 6-bit quantized MLX-compatible distribution of intfloat/multilingual-e5-large, prepared for use with R.E.V.I.S. as a smaller local semantic embedding model.
The model is intended for local text embedding, semantic recall, RAG retrieval, and multilingual semantic search workflows.
Important Notice
This repository is hosted primarily as a dedicated download source for the R.E.V.I.S. application ecosystem. You are free to download and use this model package for your own local embedding or MLX workflows, subject to the MIT License and the attribution notices included in this repository.
This package is not a new embedding model and has not been fine-tuned. It is a quantized redistribution of intfloat/multilingual-e5-large.
For the original model card, training details, intended usage, and evaluation information, refer to the official upstream model:
- Original model: https://huggingface.co/intfloat/multilingual-e5-large
- Base architecture: XLM-RoBERTa large
- Embedding size: 1024
Quantization
This package stores selected 2D weight tensors using a R.E.V.I.S. MLX-native Q6 format:
- Quantization type: `mlx-native-affine`
- Bits: `6`
- Group size: `64`
- Mode: `affine`
- Stored tensors: packed `.qweight` plus `.scales` and `.biases`
- Expected linear path: `mx.quantized_matmul(..., qweight, scales=scales, biases=biases, group_size=64, bits=6, mode="affine")`
- Expected embedding lookup path: gather the packed rows first, then `mx.dequantize(..., scales=scales, biases=biases, group_size=64, bits=6, mode="affine")`
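The parameters above describe group-wise affine quantization: each group of 64 weights shares one scale and one bias, and each weight is stored as a 6-bit code. The following is a minimal pure-Python sketch of that idea, including the "gather rows first, then dequantize" order for embedding lookups. It is a conceptual illustration only: it does not use MLX, it does not reproduce the MLX bit-packing layout, and the exact rounding and scale selection in MLX may differ. All function names here are hypothetical.

```python
import random

BITS = 6
GROUP_SIZE = 64
LEVELS = (1 << BITS) - 1  # 63, the largest 6-bit code


def quantize_group(values):
    """Map one group of floats to integer codes plus a (scale, bias) pair."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / LEVELS or 1.0  # fall back to 1.0 for constant groups
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Affine reconstruction: w is approximately scale * code + bias."""
    return [scale * c + bias for c in codes]


def quantize_row(row):
    """Quantize a row group by group; returns a list of (codes, scale, bias)."""
    return [quantize_group(row[i:i + GROUP_SIZE])
            for i in range(0, len(row), GROUP_SIZE)]


def dequantize_row(groups):
    """Rebuild one full-precision row from its quantized groups."""
    out = []
    for codes, scale, bias in groups:
        out.extend(dequantize_group(codes, scale, bias))
    return out


def embedding_lookup(quantized_table, token_ids):
    """Gather the quantized rows first, then dequantize only those rows."""
    return [dequantize_row(quantized_table[t]) for t in token_ids]


# Toy embedding table: 4 rows of 128 values (2 groups per row).
random.seed(0)
table = [[random.uniform(-1.0, 1.0) for _ in range(128)] for _ in range(4)]
quantized_table = [quantize_row(row) for row in table]

vectors = embedding_lookup(quantized_table, [2, 0])
max_error = max(abs(a - b) for a, b in zip(vectors[0], table[2]))
print(f"max reconstruction error for row 2: {max_error:.4f}")
```

In this sketch the reconstruction error of each weight is at most half a quantization step, that is `scale / 2` for its group. Gathering before dequantizing means only the looked-up rows are expanded to floating point, rather than the whole embedding table.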
Typical tensor layout:
encoder.layer.0.attention.self.query.weight.qweight
encoder.layer.0.attention.self.query.weight.scales
encoder.layer.0.attention.self.query.weight.biases
The .qweight tensors are MLX packed integer tensors, not plain row-major integer arrays. Non-quantized tensors, such as LayerNorm parameters, bias tensors, and other small metadata tensors, are preserved in their original floating-point representation.
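As a rough illustration of the naming scheme, the sketch below groups a list of tensor names into quantized triplets (`.qweight`, `.scales`, `.biases`) and plain floating-point tensors. The helper name and the two non-quantized example names are hypothetical; a real loader should take the authoritative list from `quantization.json` rather than from suffix matching.

```python
from collections import defaultdict

SUFFIXES = (".qweight", ".scales", ".biases")


def group_quantized_tensors(names):
    """Split tensor names into complete triplets, incomplete triplets, and plain tensors."""
    parts = defaultdict(set)
    plain = []
    for name in names:
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                # Strip the suffix to recover the base weight name.
                parts[name[: -len(suffix)]].add(suffix)
                break
        else:
            plain.append(name)
    complete = sorted(b for b, found in parts.items() if found == set(SUFFIXES))
    incomplete = sorted(b for b, found in parts.items() if found != set(SUFFIXES))
    return complete, incomplete, plain


names = [
    "encoder.layer.0.attention.self.query.weight.qweight",
    "encoder.layer.0.attention.self.query.weight.scales",
    "encoder.layer.0.attention.self.query.weight.biases",
    # Illustrative non-quantized names (assumed, not taken from the package):
    "encoder.layer.0.attention.self.query.bias",
    "encoder.layer.0.attention.output.LayerNorm.weight",
]

complete, incomplete, plain = group_quantized_tensors(names)
print("quantized:", complete)
print("incomplete:", incomplete)
print("plain:", plain)
```

A non-empty `incomplete` list would suggest a truncated or mismatched weights file, which is worth checking before attempting to load.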
This format is optimized for smaller download and storage size than Q8 while preserving very similar keyword-retrieval behavior in R.E.V.I.S. Runtimes should read quantization.json for the exact tensor names and quantization parameters before loading the weights.
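To make the size comparison with Q8 concrete, here is a back-of-the-envelope calculation. It assumes that scales and biases are stored as 16-bit floats, that a Q8 package would use the same group size of 64, and that codes are packed into 32-bit words. None of these are stated in this card, so check the actual tensor dtypes and shapes in the weights file before relying on the numbers.

```python
def bits_per_weight(bits, group_size=64, scale_bits=16, bias_bits=16):
    """Effective storage per weight, including the per-group scale and bias."""
    return bits + (scale_bits + bias_bits) / group_size


def packed_words_per_row(in_features, bits, word_bits=32):
    """Number of words needed to pack one row of codes (ceiling division)."""
    total_bits = in_features * bits
    return -(-total_bits // word_bits)


q6 = bits_per_weight(6)
q8 = bits_per_weight(8)
words = packed_words_per_row(1024, 6)

print(f"Q6: {q6} bits/weight, Q8: {q8} bits/weight, ratio: {q6 / q8:.3f}")
print(f"packed 32-bit words per 1024-wide row at 6 bits: {words}")
```

Under these assumptions the quantized tensors take about 6.5 bits per weight instead of 8.5, roughly a quarter smaller. The total download shrinks by less than that, because the non-quantized tensors keep their original size.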
Optimized for R.E.V.I.S. (Local Cognitive OS)
We host this model package to serve as the local semantic embedding engine for R.E.V.I.S.
R.E.V.I.S. is a 100% local Cognitive OS for Multi-Agentic AI. It transforms your Mac devices into a distributed Agentic Swarm via zero-config Wi-Fi clustering, allowing you to run heavy AI workloads—like recursive web research, dynamic RAG generation, and multi-step logic—without killing single-machine performance.
If you are interested in pushing the absolute limits of local AI and open-weight models, check out our project.
- Official Website: https://mavis-ai.co.jp/revis/
- Watch the 13-min Raw Demo (Multi-node Dynamic RAG): https://x.gd/LxaBF
- Follow our updates on X: https://x.com/mavis_ai_jp
Usage Notes
For retrieval-style tasks, E5 models typically use different text prefixes for queries and passages. R.E.V.I.S. applies its own canonical query and passage formatting internally.
If you use this package outside R.E.V.I.S., refer to the upstream E5 instructions for recommended prompt prefixes and pooling behavior.
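For orientation, the upstream E5 model card prefixes inputs with `query: ` or `passage: ` and derives sentence embeddings by mean pooling over non-padding tokens, followed by L2 normalization. The sketch below shows those steps in plain Python on toy vectors. The function names are hypothetical, the toy vectors stand in for real 1024-dimensional encoder outputs, and the formatting R.E.V.I.S. applies internally may differ.

```python
import math


def format_query(text):
    """Prefix used by upstream E5 for search queries."""
    return f"query: {text}"


def format_passage(text):
    """Prefix used by upstream E5 for documents to be retrieved."""
    return f"passage: {text}"


def mean_pool(token_vectors, attention_mask):
    """Average token vectors where the mask is 1, ignoring padding positions."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


# Toy token vectors standing in for encoder outputs.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]  # the last position is padding and must not affect the result

embedding = l2_normalize(mean_pool(tokens, mask))
print(format_query("how tall is Mount Fuji"))
print(embedding)
```

Using the same prefix convention at indexing time and at query time matters more than the exact wording: mixing prefixed and unprefixed text tends to degrade retrieval quality.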
Files
Recommended repository files:
README.md
LICENSE
NOTICE
weights.00.safetensors
config.json
tokenizer.json
tokenizer_config.json
special_tokens_map.json
quantization.json
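After downloading, a quick check that the local directory contains the files listed above can catch an incomplete download early. This is a minimal sketch; `missing_files` is a hypothetical helper and the directory name in the usage line is only an example.

```python
from pathlib import Path

EXPECTED_FILES = [
    "README.md",
    "LICENSE",
    "NOTICE",
    "weights.00.safetensors",
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "quantization.json",
]


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Example directory name; replace with the actual download location.
    missing = missing_files("Multilingual-e5-large-Q6")
    print("missing:", missing if missing else "none")
```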
License
This repository redistributes a quantized package derived from intfloat/multilingual-e5-large, which is released under the MIT License.
The upstream copyright notice and MIT License text are preserved in LICENSE.
Additional attribution and redistribution notes are included in NOTICE.
Attribution
Original model:
intfloat/multilingual-e5-large
https://huggingface.co/intfloat/multilingual-e5-large
Original authors / associated paper:
Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei
Multilingual E5 Text Embeddings: A Technical Report
R.E.V.I.S. Q6 package:
Prepared and redistributed by MAVIS / R.E.V.I.S.
Quantization: MLX-native affine Q6 package for local MLX embedding runtime
Modification Notice
Compared with the upstream intfloat/multilingual-e5-large release, this repository applies the following packaging modification:
Selected 2D weight tensors were quantized to the R.E.V.I.S. MLX-native affine Q6 representation described in `quantization.json`.
No fine-tuning, additional training, or architecture-level modification has been applied.