---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-Embedding-4B
library_name: sentence-transformers
---

## Description

This is a [CSRv2](https://arxiv.org/abs/2602.05735) model finetuned on the [MTEB](https://huggingface.co/mteb) STS datasets with [Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B) as the backbone. For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [GitHub repository](https://github.com/Y-Research-SBU/CSRv2).

## Sentence Transformer Usage

You can evaluate this model with Sentence Transformers using the following snippet (taking STS12 as an example):

```python
import mteb
from sentence_transformers import SparseEncoder

model = SparseEncoder(
    "Y-Research-Group/CSRv2-sts",
    trust_remote_code=True,
)
model.prompts = {
    "STS12": "Instruct: Retrieve semantically similar text\n Query:"
}

tasks = mteb.get_tasks(tasks=["STS12"])
evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(
    model,
    eval_splits=["test"],
    output_folder="./results/STS12",
    show_progress_bar=True,
    # MTEB does not support sparse tensors yet, so convert to dense tensors
    encode_kwargs={"convert_to_sparse_tensor": False, "batch_size": 8},
)
```

We suggest using our [default prompts](https://github.com/Y-Research-SBU/CSRv2/blob/main/text/dataset_to_prompt.json) for evaluation.

## Multi-TopK Support

Our model supports different sparsity levels thanks to the **Multi-TopK** loss used in training. You can change the sparsity level by adjusting the `k` parameter in `3_SparseAutoEncoder/config.json`. The sparsity level is set to 2 by default.
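Intuitively, a TopK sparse autoencoder keeps only the `k` largest activations of the latent vector and zeros out the rest, so `k` directly controls how many neurons are active per embedding. The following toy NumPy sketch illustrates the idea (illustrative only, not the actual CSRv2 implementation):

```python
import numpy as np

def topk_sparsify(z: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest activations of z and zero out the rest (toy sketch)."""
    out = np.zeros_like(z)
    idx = np.argsort(z)[-k:]       # indices of the k largest activations
    out[idx] = z[idx]
    return out

# Toy latent vector standing in for a high-dimensional hidden activation
z = np.array([0.1, 2.0, -0.5, 1.5, 0.3, 0.9])
sparse = topk_sparsify(z, k=2)     # k=2 mirrors the default sparsity level
print(np.count_nonzero(sparse))    # -> 2
```

Changing `k` in the config trades off sparsity against reconstruction fidelity in the same way: larger `k` keeps more activations per embedding.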
For instance, if you want to evaluate with sparsity level $K=8$ (i.e., 8 activated neurons in each embedding vector), `3_SparseAutoEncoder/config.json` should look like this:

```json
{
    "input_dim": 2560,
    "hidden_dim": 10240,
    "k": 8,
    "k_aux": 1024,
    "normalize": false,
    "dead_threshold": 30
}
```

## CSRv2 Qwen Series

We will release a series of [CSRv2](https://arxiv.org/abs/2602.05735) models finetuned on common [MTEB](https://huggingface.co/mteb) tasks with [Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B) as the backbone. These tasks are:

- **[Classification](https://huggingface.co/Y-Research-Group/CSRv2-classification)**
- **[Clustering](https://huggingface.co/Y-Research-Group/CSRv2-clustering)**
- **[Retrieval](https://huggingface.co/Y-Research-Group/CSRv2-retrieval)**
- **[STS](https://huggingface.co/Y-Research-Group/CSRv2-sts)**
- **[Pair classification](https://huggingface.co/Y-Research-Group/CSRv2-pair_classification)**
- **[Reranking](https://huggingface.co/Y-Research-Group/CSRv2-reranking)**

## Citation

```bibtex
@inproceedings{guo2026csrv2,
  title={{CSR}v2: Unlocking Ultra-sparse Embeddings},
  author={Guo, Lixuan and Wang, Yifei and Wen, Tiansheng and Wang, Yifan and Feng, Aosong and Chen, Bo and Jegelka, Stefanie and You, Chenyu},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
```