Update README.md
README.md
CHANGED
|
@@ -95,22 +95,22 @@ print(cosine_sim_2.item()) # 0.9861876964569092
|
|
| 95 |
## Performance
|
| 96 |
|
| 97 |
Below is a comparison table of the results I achieved against some other embedding models on the following
|
| 98 |
-
benchmarks: [ZAC](https://huggingface.co/datasets/GreenNode/zalo-ai-legal-text-retrieval-vn/viewer/default?views%5B%5D=default_train), [WebFaq](https://huggingface.co/datasets/PaDaS-Lab/webfaq-retrieval), [OwiFaq](https://huggingface.co/datasets/PaDaS-Lab/owi-faq-retrieval), [ViQuAD2.0](https://huggingface.co/datasets/taidng/UIT-ViQuAD2.0), [
|
| 99 |
using the **Recall@3** metric.
|
| 100 |
|
| 101 |
-
| Model Name | ZAC | WebFaq | OwiFaq | ViQuAD2.0 |
|
| 102 |
-
|
| 103 |
-
| [namdp-ptit/ViDense](https://huggingface.co/namdp-ptit/ViDense) | **54.72** | 82.26 | 85.62 | **61.28** | **58.42**
|
| 104 |
-
| [VoVanPhuc/sup-SimCSE-VietNamese-phobert-base](https://huggingface.co/VoVanPhuc/sup-SimCSE-VietNamese-phobert-base) | 53.64 | 81.52 | 85.02 | 59.12 | 55.70
|
| 105 |
-
| [keepitreal/vietnamese-sbert](https://huggingface.co/keepitreal/vietnamese-sbert) | 50.45 | 80.54 | 78.58 | 52.67 | 51.86
|
| 106 |
-
| [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) | 46.12 | **83.45** | **86.08** | 58.27 | 49.02
|
| 107 |
|
| 108 |
Here is the information on these benchmarks:
|
| 109 |
|
| 110 |
* ZAC: train and test splits merged into one benchmark; ~3,200 queries and ~330K documents in the corpus.
|
| 111 |
* WebFAQ and OwiFaq: train and test splits merged into one benchmark; ~124K queries and ~124K documents in the corpus.
|
| 112 |
* ViQuAD2.0: train, validation, and test splits merged into one benchmark; ~39.6K queries and ~39.6K documents in the corpus.
|
| 113 |
-
*
|
| 114 |
|
| 115 |
## Contact
|
| 116 |
|
|
|
|
| 95 |
## Performance
|
| 96 |
|
| 97 |
Below is a comparison table of the results I achieved against some other embedding models on the following
|
| 98 |
+
benchmarks: [ZAC](https://huggingface.co/datasets/GreenNode/zalo-ai-legal-text-retrieval-vn/viewer/default?views%5B%5D=default_train), [WebFaq](https://huggingface.co/datasets/PaDaS-Lab/webfaq-retrieval), [OwiFaq](https://huggingface.co/datasets/PaDaS-Lab/owi-faq-retrieval), [ViQuAD2.0](https://huggingface.co/datasets/taidng/UIT-ViQuAD2.0), [ViLegal](https://huggingface.co/datasets/CATI-AI/vietnamese-legal-retrieval-with-negatives)
|
| 99 |
using the **Recall@3** metric.
|
| 100 |
|
| 101 |
+
| Model Name | ZAC | WebFaq | OwiFaq | ViQuAD2.0 | ViLegal |
|
| 102 |
+
|---------------------------------------------------------------------------------------------------------------------|:----------|:----------|:----------|:----------|:----------|
|
| 103 |
+
| [namdp-ptit/ViDense](https://huggingface.co/namdp-ptit/ViDense) | **54.72** | 82.26 | 85.62 | **61.28** | **58.42** |
|
| 104 |
+
| [VoVanPhuc/sup-SimCSE-VietNamese-phobert-base](https://huggingface.co/VoVanPhuc/sup-SimCSE-VietNamese-phobert-base) | 53.64 | 81.52 | 85.02 | 59.12 | 55.70 |
|
| 105 |
+
| [keepitreal/vietnamese-sbert](https://huggingface.co/keepitreal/vietnamese-sbert) | 50.45 | 80.54 | 78.58 | 52.67 | 51.86 |
|
| 106 |
+
| [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) | 46.12 | **83.45** | **86.08** | 58.27 | 49.02 |
|
| 107 |
|
| 108 |
Here is the information on these benchmarks:
|
| 109 |
|
| 110 |
* ZAC: train and test splits merged into one benchmark; ~3,200 queries and ~330K documents in the corpus.
|
| 111 |
* WebFAQ and OwiFaq: train and test splits merged into one benchmark; ~124K queries and ~124K documents in the corpus.
|
| 112 |
* ViQuAD2.0: train, validation, and test splits merged into one benchmark; ~39.6K queries and ~39.6K documents in the corpus.
|
| 113 |
+
* ViLegal: ~144K queries and ~144K documents in the corpus.
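
For readers unfamiliar with the metric, here is a minimal sketch of how a Recall@3 score could be computed. The README does not say how the reported numbers were produced, so this is an assumption: each query is scored by the fraction of its relevant documents that appear in the top 3 retrieved results, and the scores are averaged and expressed as a percentage. The function name `recall_at_k` and the toy IDs are hypothetical and are not taken from any of the benchmarks.

```python
from typing import Dict, List, Set


def recall_at_k(
    ranked: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 3,
) -> float:
    """Mean per-query recall within the top-k results, as a percentage.

    Assumption: this is one common definition of Recall@k; the README
    does not state which variant was used for the table.
    """
    scores = []
    for query_id, gold in relevant.items():
        if not gold:
            continue  # skip queries with no labelled relevant document
        top_k = set(ranked.get(query_id, [])[:k])
        scores.append(len(top_k & gold) / len(gold))
    return 100.0 * sum(scores) / len(scores) if scores else 0.0


# Toy data with hypothetical IDs: document lists are ordered best-first.
ranked = {
    "q1": ["d3", "d1", "d7", "d2"],
    "q2": ["d5", "d9", "d4", "d8"],
}
relevant = {"q1": {"d1"}, "q2": {"d8"}}

score = recall_at_k(ranked, relevant, k=3)
print(score)  # 50.0: q1 finds d1 in its top 3, q2 does not find d8
```

When every query has exactly one relevant document, this definition coincides with the hit rate at k.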
|
| 114 |
|
| 115 |
## Contact
|
| 116 |
|