AITeamVN/Vietnamese_Embedding

Sentence Similarity

sentence-transformers

text-embeddings-inference


AITeamVN committed on Apr 14, 2025

Commit ee84815 · verified · 1 Parent(s): 95b4a1e

Update README.md

Files changed (1)

README.md +6 -3


@@ -55,13 +55,16 @@ array([[0.66212064, 0.33066642],
 ### Evaluation:
 - Dataset: Entire training dataset of Legal Zalo 2021. Our model was not trained on this dataset.
 | Model                | Accuracy@1 | Accuracy@3 | Accuracy@5 | Accuracy@10 | Accuracy@100 |  MRR@10 |
 |----------------------|------------|------------|------------|-------------|-------------|--------------|
-| Vietnamese_Embedding            | 0.7274     | 0.8992     | 0.9305     | 0.9568      | 0.9922     | 0.8181       |
 | Vietnamese-bi-encoder (BKAI)         | 0.7109     | 0.8680     | 0.9014     | 0.9299      | 0.9772      | 0.7951       |
 | BGE-M3 | 0.5682     | 0.7728     | 0.8382     | 0.8921      | 0.9772      | 0.6822       |
 You can reproduce the evaluation result by running code python evaluation_model.py (data downloaded from Kaggle).
 **Developer**
@@ -70,9 +73,9 @@ Member: Nguyễn Nho Trung, Nguyễn Nhật Quang
 ## Contact
 **Email**:
 - nguyennhotrung3004@gmail.com
-- nhatquang2306@gmail.com
 ## Citation

 ### Evaluation:
 - Dataset: Entire training dataset of Legal Zalo 2021. Our model was not trained on this dataset.
+79.443	93.242	95.369	97.403	86.717
 | Model                | Accuracy@1 | Accuracy@3 | Accuracy@5 | Accuracy@10 | Accuracy@100 |  MRR@10 |
 |----------------------|------------|------------|------------|-------------|-------------|--------------|
+| Vietnamese Reranker (Phase 2)            | 0.7944     | 0.9324    | 0.9537     | 0.9740      | NA     | 0.8672       |
+| Vietnamese_Embedding (Phase 2)          | 0.7262     | 0.8927     | 0.9268     | 0.9578      | 0.9925     | 0.8149       |
+| Vietnamese_Embedding  (public)          | 0.7274     | 0.8992     | 0.9305     | 0.9568      | 0.9922     | 0.8181       |
 | Vietnamese-bi-encoder (BKAI)         | 0.7109     | 0.8680     | 0.9014     | 0.9299      | 0.9772      | 0.7951       |
 | BGE-M3 | 0.5682     | 0.7728     | 0.8382     | 0.8921      | 0.9772      | 0.6822       |
+Vietnamese Reranker (Phase 2) and Vietnamese_Embedding (Phase 2) were trained on 1,100,000 triplets. Although the score on the legal domain drops slightly, the much larger Phase 2 training data makes these models well suited to other domains.
 You can reproduce the evaluation result by running code python evaluation_model.py (data downloaded from Kaggle).
 **Developer**
 ## Contact
 **Email**:
 - nguyennhotrung3004@gmail.com
 ## Citation
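
Note on the metrics in the evaluation table: the README does not show the contents of `evaluation_model.py`, so the following is only a minimal sketch of how Accuracy@k and MRR@k are commonly computed for retrieval, not the authors' script. The function names, the toy query and document IDs, and the data layout (`ranked` maps each query to its ranked document list, `relevant` maps each query to its set of gold documents) are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def accuracy_at_k(ranked: Dict[str, List[str]],
                  relevant: Dict[str, Set[str]],
                  k: int) -> float:
    """Fraction of queries with at least one relevant document in the top k."""
    hits = 0
    for query_id, doc_ids in ranked.items():
        if any(doc_id in relevant[query_id] for doc_id in doc_ids[:k]):
            hits += 1
    return hits / len(ranked)


def mrr_at_k(ranked: Dict[str, List[str]],
             relevant: Dict[str, Set[str]],
             k: int) -> float:
    """Mean reciprocal rank of the first relevant document within the top k.

    A query with no relevant document in the top k contributes 0.
    """
    total = 0.0
    for query_id, doc_ids in ranked.items():
        for rank, doc_id in enumerate(doc_ids[:k], start=1):
            if doc_id in relevant[query_id]:
                total += 1.0 / rank
                break
    return total / len(ranked)


# Hypothetical toy data: three queries, each with a ranked list of document IDs.
ranked = {
    "q1": ["d1", "d2", "d3"],  # relevant document at rank 1
    "q2": ["d5", "d4", "d6"],  # relevant document at rank 2
    "q3": ["d7", "d8", "d9"],  # no relevant document retrieved
}
relevant = {"q1": {"d1"}, "q2": {"d4"}, "q3": {"d0"}}

acc1 = accuracy_at_k(ranked, relevant, k=1)
acc3 = accuracy_at_k(ranked, relevant, k=3)
mrr10 = mrr_at_k(ranked, relevant, k=10)
print(f"Accuracy@1={acc1:.4f} Accuracy@3={acc3:.4f} MRR@10={mrr10:.4f}")
```

Under these definitions Accuracy@k can only grow as k increases, which is consistent with every row of the table, while MRR@10 rewards placing the first relevant document near the top of the ranking.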