---
license: mit
datasets:
- mteb/nfcorpus
language:
- en
pipeline_tag: text-retrieval
library_name: sentence-transformers
tags:
- mteb
- text
- transformers
- text-embeddings-inference
- sparse-encoder
- sparse
- csr
model-index:
- name: NV-Embed-v2
  results:
  - dataset:
      name: MTEB NFCorpus
      type: mteb/nfcorpus
      revision: ec0fa4fe99da2ff19ca1214b7966684033a58814
      config: default
      split: test
      languages:
      - eng-Latn
    metrics:
    - type: ndcg@1
      value: 0.43189
    - type: ndcg@3
      value: 0.41132
    - type: ndcg@5
      value: 0.40406
    - type: ndcg@10
      value: 0.39624
    - type: ndcg@20
      value: 0.38517
    - type: ndcg@100
      value: 0.40068
    - type: ndcg@1000
      value: 0.49126
    - type: map@10
      value: 0.14342
    - type: map@100
      value: 0.21866
    - type: map@1000
      value: 0.2427
    - type: recall@10
      value: 0.1968
    - type: recall@100
      value: 0.45592
    - type: recall@1000
      value: 0.78216
    - type: precision@1
      value: 0.45511
    - type: precision@10
      value: 0.32353
    - type: mrr@10
      value: 0.537792
    - type: main_score
      value: 0.39624
    task:
      type: Retrieval
base_model:
- nvidia/NV-Embed-v2
---
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [GitHub](https://github.com/neilwen987/CSR_Adaptive_Rep) repository.
## Usage
📌 **Tip**: For NV-Embed-V2, Transformers versions **later** than 4.47.0 may degrade performance, because ``model_type=bidir_mistral`` in ``config.json`` is no longer supported.
We recommend using ``Transformers 4.47.0``.
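
To check an environment against the tip above, a small version guard can help. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `is_newer_than_recommended`, `check_installed_transformers`) are hypothetical and not part of this repository or of Transformers. It compares only the leading `major.minor.patch` numbers, so pre-release suffixes such as `rc1` or `.dev0` are ignored.

```python
from importlib import metadata

# Version recommended in the tip above.
RECOMMENDED = (4, 47, 0)


def parse_version(text):
    """Return the leading major.minor.patch numbers of a version string as a tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as "rc1" or "dev0"
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_newer_than_recommended(installed):
    """True if the installed version is later than the recommended 4.47.0."""
    return parse_version(installed) > RECOMMENDED


def check_installed_transformers():
    """Describe how the installed transformers version relates to 4.47.0."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if is_newer_than_recommended(installed):
        return f"transformers {installed} is later than 4.47.0; performance may degrade"
    return f"transformers {installed} is not later than 4.47.0"


if __name__ == "__main__":
    print(check_installed_transformers())
```

If the check reports a later version, pinning the package with `pip install transformers==4.47.0` matches the recommendation above.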
### Sentence Transformers Usage
To evaluate this model on NFCorpus with MTEB after loading it through Sentence Transformers, use the following code snippet:
```python
import mteb
from sentence_transformers import SparseEncoder

# Load the sparse encoder; the checkpoint ships custom modeling code.
model = SparseEncoder(
    "Y-Research-Group/CSR-NV_Embed_v2-Retrieval-NFcorpus",
    trust_remote_code=True,
)

# Instruction prompt prepended to NFCorpus queries.
model.prompts = {
    "NFCorpus-query": "Instruct: Given a question, retrieve relevant documents that answer the question\nQuery:"
}

task = mteb.get_tasks(tasks=["NFCorpus"])
evaluation = mteb.MTEB(tasks=task)
evaluation.run(
    model,
    eval_splits=["test"],
    output_folder="./results/NFCorpus",
    show_progress_bar=True,
    # MTEB doesn't support sparse tensors yet, so we convert to dense tensors.
    encode_kwargs={"convert_to_sparse_tensor": False, "batch_size": 8},
)
```
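
The snippet above converts the sparse embeddings to dense tensors because MTEB does not accept sparse ones. As a toy illustration of why this should not change the retrieval scores, the following pure-Python sketch (not the library's implementation; the vectors, `DIM`, and all helper names are made up for this example) shows that a dot product over sparse index-to-value maps equals the dot product over the corresponding dense vectors, since the zero entries contribute nothing.

```python
def to_dense(sparse, dim):
    """Expand an {index: value} map into a dense list of length dim."""
    dense = [0.0] * dim
    for index, value in sparse.items():
        dense[index] = value
    return dense


def sparse_dot(a, b):
    """Dot product that visits only indices active in both vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller map
    return sum(value * b[index] for index, value in a.items() if index in b)


def dense_dot(a, b):
    """Plain dot product over two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


DIM = 8  # toy dimensionality, far smaller than a real embedding
query = {1: 0.5, 4: 2.0}
document = {1: 1.0, 4: 0.25, 6: 3.0}

sparse_score = sparse_dot(query, document)
dense_score = dense_dot(to_dense(query, DIM), to_dense(document, DIM))

print(sparse_score, dense_score)
```

Both scores come from the two shared indices (1 and 4); index 6 is active only in the document and is multiplied by zero in the dense form. The dense form costs more memory, which is one reason to keep `batch_size` small during evaluation.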
## Citation
```bibtex
@misc{wen2025matryoshkarevisitingsparsecoding,
      title={Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation},
      author={Tiansheng Wen and Yifei Wang and Zequn Zeng and Zhong Peng and Yudi Su and Xinyang Liu and Bo Chen and Hongwei Liu and Stefanie Jegelka and Chenyu You},
      year={2025},
      eprint={2503.01776},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.01776},
}
```