Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ This model is trained through the approach described in [DMRetriever: A Family o
 The associated GitHub repository is available [here](https://github.com/KaiYin97/DMRETRIEVER).
 This model has 596M parameters.

-## Model Overview

 **DMRetriever-596M** has the following features:

@@ -24,7 +24,7 @@ This model has 596M parameters.

 For more details, including model training, benchmark evaluation, and inference performance, please refer to our [paper](https://www.arxiv.org/abs/2510.15087), [GitHub](https://github.com/KaiYin97/DMRETRIEVER).

-## DMRetriever

 | **Model** | **Description** | **Backbone** | **Backbone Type** | **Hidden Size** | **#Layers** |
 |:--|:--|:--|:--|:--:|:--:|
@@ -42,7 +42,7 @@ For more details, including model training, benchmark evaluation, and inference
 | [DMRetriever-7.6B-PT](https://huggingface.co/DMIR01/DMRetriever-7.6B-PT) | Pre-trained version of 7.6B | Qwen3-8B | Decoder-only | 4096 | 36 |


-## Usage

 Using HuggingFace Transformers:

@@ -55,7 +55,7 @@ from bidirectional_qwen3 import Qwen3BiModel # custom bidirectional backbone

 MODEL_ID = "DMIR01/DMRetriever-596M"

-# Device & dtype
 device = "cuda" if torch.cuda.is_available() else "cpu"
 dtype = torch.float16 if device == "cuda" else torch.float32

@@ -154,13 +154,16 @@ for i, q in enumerate(queries_raw):

 ```

-## Notice

-

-

-
 If you find this repository helpful, please kindly consider citing the corresponding paper. Thanks!
 ```
 @article{yin2025dmretriever,
@@ -12,7 +12,7 @@ This model is trained through the approach described in [DMRetriever: A Family o
 The associated GitHub repository is available [here](https://github.com/KaiYin97/DMRETRIEVER).
 This model has 596M parameters.

+## 🧠 Model Overview

 **DMRetriever-596M** has the following features:

@@ -24,7 +24,7 @@ This model has 596M parameters.

 For more details, including model training, benchmark evaluation, and inference performance, please refer to our [paper](https://www.arxiv.org/abs/2510.15087), [GitHub](https://github.com/KaiYin97/DMRETRIEVER).

+## 📦 DMRetriever Series Model List

 | **Model** | **Description** | **Backbone** | **Backbone Type** | **Hidden Size** | **#Layers** |
 |:--|:--|:--|:--|:--:|:--:|
@@ -42,7 +42,7 @@ For more details, including model training, benchmark evaluation, and inference
 | [DMRetriever-7.6B-PT](https://huggingface.co/DMIR01/DMRetriever-7.6B-PT) | Pre-trained version of 7.6B | Qwen3-8B | Decoder-only | 4096 | 36 |


+## 🚀 Usage

 Using HuggingFace Transformers:

@@ -55,7 +55,7 @@ from bidirectional_qwen3 import Qwen3BiModel # custom bidirectional backbone

 MODEL_ID = "DMIR01/DMRetriever-596M"

+# Device & dtype
 device = "cuda" if torch.cuda.is_available() else "cpu"
 dtype = torch.float16 if device == "cuda" else torch.float32

@@ -154,13 +154,16 @@ for i, q in enumerate(queries_raw):

 ```

+## ⚠️ Notice

+1. The **backbone** used in DMRetriever is **Bidirectional Qwen3**, not the standard Qwen3.
+Please ensure that the `bidirectional_qwen3` module (included in the released model checkpoint folder) is correctly placed inside your model directory.

+2. Make sure that your **transformers** library version is **≥ 4.51.0** to avoid the error:
+`KeyError: 'qwen3'`.

+
+## 🧾 Citation
 If you find this repository helpful, please kindly consider citing the corresponding paper. Thanks!
 ```
 @article{yin2025dmretriever,
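
The two requirements in the added Notice section can be checked before loading the model. The following is a minimal sketch and is not part of this commit: the helper names (`parse_release`, `meets_minimum`, `check_transformers`, `module_available`) are hypothetical, and it assumes the checkpoint has been downloaded to a local directory. The version comparison looks only at the numeric release segment, so a pre-release such as `4.51.0rc1` is treated as `4.51.0`.

```python
import importlib
import importlib.util
import sys
from importlib import metadata
from pathlib import Path

MIN_TRANSFORMERS = (4, 51, 0)  # minimum version stated in the Notice


def parse_release(version: str) -> tuple:
    """Turn '4.51.0' or '4.52.0.dev0' into a tuple of its leading integer parts."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric segment, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version: str, minimum: tuple = MIN_TRANSFORMERS) -> bool:
    """Return True if the release segment of `version` is at least `minimum`."""
    release = parse_release(version)
    # Pad short versions so that '4.51' compares equal to (4, 51, 0).
    release += (0,) * (len(minimum) - len(release))
    return release >= minimum


def check_transformers() -> str:
    """Describe whether the installed transformers satisfies the requirement."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed):
        return f"transformers {installed} is new enough"
    return f"transformers {installed} is older than 4.51.0 (expect KeyError: 'qwen3')"


def module_available(model_dir: str, name: str = "bidirectional_qwen3") -> bool:
    """Put `model_dir` on sys.path and report whether `name` can be found there.

    `model_dir` is assumed to be a local copy of the checkpoint folder.
    The module is located but not imported.
    """
    resolved = str(Path(model_dir).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print(check_transformers())
```

Calling `module_available("path/to/local/checkpoint")` before `from bidirectional_qwen3 import Qwen3BiModel` gives an early, readable failure if the custom backbone file is missing; the path shown here is a placeholder.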