Update README.md
README.md
CHANGED
|
@@ -55,13 +55,16 @@ array([[0.66212064, 0.33066642],
|
|
| 55 |
### Evaluation:
|
| 56 |
|
| 57 |
- Dataset: Entire training dataset of Legal Zalo 2021. Our model was not trained on this dataset.
|
| 58 |
-
|
| 59 |
| Model | Accuracy@1 | Accuracy@3 | Accuracy@5 | Accuracy@10 | Accuracy@100 | MRR@10 |
|
| 60 |
|----------------------|------------|------------|------------|-------------|-------------|--------------|
|
| 61 |
-
|
|
|
|
|
|
|
|
| 62 |
| Vietnamese-bi-encoder (BKAI) | 0.7109 | 0.8680 | 0.9014 | 0.9299 | 0.9772 | 0.7951 |
|
| 63 |
| BGE-M3 | 0.5682 | 0.7728 | 0.8382 | 0.8921 | 0.9772 | 0.6822 |
|
| 64 |
|
|
|
|
| 65 |
You can reproduce the evaluation results by running `python evaluation_model.py` (data downloaded from Kaggle).
|
| 66 |
|
| 67 |
**Developer**
|
|
@@ -70,9 +73,9 @@ Member: Nguyễn Nho Trung, Nguyễn Nhật Quang
|
|
| 70 |
|
| 71 |
## Contact
|
| 72 |
|
|
|
|
| 73 |
**Email**:
|
| 74 |
- nguyennhotrung3004@gmail.com
|
| 75 |
-
- nhatquang2306@gmail.com
|
| 76 |
|
| 77 |
## Citation
|
| 78 |
|
|
|
|
| 55 |
### Evaluation:
|
| 56 |
|
| 57 |
- Dataset: Entire training dataset of Legal Zalo 2021. Our model was not trained on this dataset.
|
| 58 |
+
79.443 93.242 95.369 97.403 86.717
|
| 59 |
| Model | Accuracy@1 | Accuracy@3 | Accuracy@5 | Accuracy@10 | Accuracy@100 | MRR@10 |
|
| 60 |
|----------------------|------------|------------|------------|-------------|-------------|--------------|
|
| 61 |
+
| Vietnamese Reranker (Phase 2) | 0.7944 | 0.9324 | 0.9537 | 0.9740 | NA | 0.8672 |
|
| 62 |
+
| Vietnamese_Embedding (Phase 2) | 0.7262 | 0.8927 | 0.9268 | 0.9578 | 0.9925 | 0.8149 |
|
| 63 |
+
| Vietnamese_Embedding (public) | 0.7274 | 0.8992 | 0.9305 | 0.9568 | 0.9922 | 0.8181 |
|
| 64 |
| Vietnamese-bi-encoder (BKAI) | 0.7109 | 0.8680 | 0.9014 | 0.9299 | 0.9772 | 0.7951 |
|
| 65 |
| BGE-M3 | 0.5682 | 0.7728 | 0.8382 | 0.8921 | 0.9772 | 0.6822 |
|
| 66 |
|
| 67 |
+
Vietnamese Reranker (Phase 2) and Vietnamese Reranker (Phased) were trained on 1,100,000 triplets. The score on the legal domain drops slightly, but because the data for this phase is much larger, the models perform very well on other domains.
|
| 68 |
You can reproduce the evaluation results by running `python evaluation_model.py` (data downloaded from Kaggle).
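
For readers unfamiliar with the column names in the table above, the sketch below shows one common way to compute Accuracy@k and MRR@k for a retrieval model. It is an illustration only, not the contents of `evaluation_model.py`: the function names `accuracy_at_k` and `mrr_at_k`, the toy data, and the definitions used (a query counts as a hit if any relevant document appears in the top k; reciprocal rank is taken from the first relevant document within the top k) are assumptions.

```python
from typing import Dict, List, Set


def accuracy_at_k(ranked: Dict[str, List[str]],
                  relevant: Dict[str, Set[str]],
                  k: int) -> float:
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(
        1
        for qid, docs in ranked.items()
        if any(doc in relevant[qid] for doc in docs[:k])
    )
    return hits / len(ranked)


def mrr_at_k(ranked: Dict[str, List[str]],
             relevant: Dict[str, Set[str]],
             k: int) -> float:
    """Mean reciprocal rank of the first relevant document within the top k.

    A query with no relevant document in the top k contributes 0.
    """
    total = 0.0
    for qid, docs in ranked.items():
        for rank, doc in enumerate(docs[:k], start=1):
            if doc in relevant[qid]:
                total += 1.0 / rank
                break
    return total / len(ranked)


# Hypothetical toy data: ranked retrieval results and gold labels per query.
ranked = {
    "q1": ["d3", "d1", "d7"],  # first relevant document at rank 2
    "q2": ["d2", "d9", "d4"],  # no relevant document retrieved
}
relevant = {"q1": {"d1"}, "q2": {"d5"}}

print(accuracy_at_k(ranked, relevant, 1))
print(accuracy_at_k(ranked, relevant, 3))
print(mrr_at_k(ranked, relevant, 10))
```

Under these definitions Accuracy@k can only grow as k increases, which matches the left-to-right trend in each table row.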
|
| 69 |
|
| 70 |
**Developer**
|
|
|
|
| 73 |
|
| 74 |
## Contact
|
| 75 |
|
| 76 |
+
|
| 77 |
**Email**:
|
| 78 |
- nguyennhotrung3004@gmail.com
|
|
|
|
| 79 |
|
| 80 |
## Citation
|
| 81 |
|