DMIR01/DMRetriever-596M

@@ -12,7 +12,7 @@ This model is trained through the approach described in [DMRetriever: A Family o
 The associated GitHub repository is available [here](https://github.com/KaiYin97/DMRETRIEVER).
 This model has 596M parameters.
-## Model Overview
 **DMRetriever-596M** has the following features:
@@ -24,7 +24,7 @@ This model has 596M parameters.
 For more details, including model training, benchmark evaluation, and inference performance, please refer to our [paper](https://www.arxiv.org/abs/2510.15087), [GitHub](https://github.com/KaiYin97/DMRETRIEVER).
-## DMRetriever series model list
 | **Model** | **Description** | **Backbone** | **Backbone Type** | **Hidden Size** | **#Layers** |
 |:--|:--|:--|:--|:--:|:--:|
@@ -42,7 +42,7 @@ For more details, including model training, benchmark evaluation, and inference
 | [DMRetriever-7.6B-PT](https://huggingface.co/DMIR01/DMRetriever-7.6B-PT) | Pre-trained version of 7.6B | Qwen3-8B | Decoder-only | 4096 | 36 |
-## Usage
 Using HuggingFace Transformers:
@@ -55,7 +55,7 @@ from bidirectional_qwen3 import Qwen3BiModel  # custom bidirectional backbone
 MODEL_ID = "DMIR01/DMRetriever-596M"
-# Device & dtype (GPU -> fp16; CPU -> fp32)
 device = "cuda" if torch.cuda.is_available() else "cpu"
 dtype = torch.float16 if device == "cuda" else torch.float32
@@ -154,13 +154,16 @@ for i, q in enumerate(queries_raw):
 ```
-## Notice
-(1) The backbone is bidirectional Qwen3 instead of Qwen3, so please ensure bidirectional_qwen3 (we provided it in the released model checkpoint folder) is in the model folder.
-(2) Ensure the transformers>4.51.0 to avoid KeyError: 'qwen3'.
-## Citation
 If you find this repository helpful, please kindly consider citing the corresponding paper. Thanks!
 ```
 @article{yin2025dmretriever,

 The associated GitHub repository is available [here](https://github.com/KaiYin97/DMRETRIEVER).
 This model has 596M parameters.
+## 🧠 Model Overview
 **DMRetriever-596M** has the following features:
 For more details, including model training, benchmark evaluation, and inference performance, please refer to our [paper](https://www.arxiv.org/abs/2510.15087), [GitHub](https://github.com/KaiYin97/DMRETRIEVER).
+## 📦 DMRetriever Series Model List
 | **Model** | **Description** | **Backbone** | **Backbone Type** | **Hidden Size** | **#Layers** |
 |:--|:--|:--|:--|:--:|:--:|
 | [DMRetriever-7.6B-PT](https://huggingface.co/DMIR01/DMRetriever-7.6B-PT) | Pre-trained version of 7.6B | Qwen3-8B | Decoder-only | 4096 | 36 |
+## 🚀 Usage
 Using HuggingFace Transformers:
 MODEL_ID = "DMIR01/DMRetriever-596M"
+# Device & dtype
 device = "cuda" if torch.cuda.is_available() else "cpu"
 dtype = torch.float16 if device == "cuda" else torch.float32
 ```
+## ⚠️ Notice
+1. The **backbone** used in DMRetriever is **Bidirectional Qwen3**, not the standard Qwen3.
+   Please ensure that the `bidirectional_qwen3` module (included in the released model checkpoint folder) is correctly placed inside your model directory.
+2. Make sure that your **transformers** library version is **≥ 4.51.0** to avoid the error:
+   `KeyError: 'qwen3'`.
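The version requirement in the second point can be verified before the model is loaded, so that an outdated install fails with a readable message rather than `KeyError: 'qwen3'`. The following is a minimal sketch and not part of the released code: the helper names `release_tuple`, `meets_minimum`, and `check_transformers` are hypothetical, and the comparison looks only at the numeric release components, ignoring pre-release suffixes such as `.dev0`.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = "4.51.0"  # minimum version stated in the notice


def release_tuple(version: str) -> tuple:
    """Return the leading numeric parts of a version string, padded to three fields.

    Pre-release or local suffixes (e.g. '.dev0', '+cu121') are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad so that '4.51' compares equal to '4.51.0'.
    return parts + (0,) * max(0, 3 - len(parts))


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """True if `installed` is at least `minimum` (numeric release parts only)."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_transformers() -> None:
    """Raise a descriptive error if transformers is missing or too old."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError as exc:
        raise RuntimeError("transformers is not installed") from exc
    if not meets_minimum(installed):
        raise RuntimeError(
            f"transformers {installed} found; version >= {MIN_TRANSFORMERS} is required"
        )
```

Calling `check_transformers()` at the top of an inference script, before any model loading, is one way to use it.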
+## 🧾 Citation
 If you find this repository helpful, please kindly consider citing the corresponding paper. Thanks!
 ```
 @article{yin2025dmretriever,