Sentence Similarity
sentence-transformers
ONNX
Safetensors
Vietnamese
xlm-roberta
Embedding
text-embeddings-inference
How to use AITeamVN/Vietnamese_Embedding with sentence-transformers:

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("AITeamVN/Vietnamese_Embedding")

    sentences = [
        "The weather is lovely today.",
        "It's so sunny outside!",
        "He drove to the stadium."
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [3, 3]
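The `model.similarity(embeddings, embeddings)` call in the sentence-transformers snippet returns a square matrix of pairwise scores. As far as I know, sentence-transformers uses cosine similarity by default unless the model configuration says otherwise. The sketch below reproduces that computation in plain Python on made-up 2-dimensional vectors. The helper name `cosine_similarity_matrix` and the toy vectors are illustrative assumptions, not part of the library or the model.

```python
import math
from typing import List


def cosine_similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Return the pairwise cosine similarity of every vector with every other vector."""
    # Precompute the Euclidean norm of each vector once.
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv) for v, nv in zip(vectors, norms)]
        for u, nu in zip(vectors, norms)
    ]


# Hypothetical stand-ins for three sentence embeddings (real ones have many more dimensions).
toy_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
sim = cosine_similarity_matrix(toy_embeddings)

# Three inputs give a 3 x 3 matrix, matching the [3, 3] shape in the snippet.
print(len(sim), len(sim[0]))
```

Each diagonal entry compares a vector with itself, so it is 1 up to floating-point rounding, and the matrix is symmetric.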
Update README.md
README.md
CHANGED
@@ -58,15 +58,17 @@ array([[0.66212064, 0.33066642],
 
 | Model | Accuracy@1 | Accuracy@3 | Accuracy@5 | Accuracy@10 | MRR@10 |
 |----------------------|------------|------------|------------|-------------|--------------|
-| Vietnamese_Reranker
-|
+| Vietnamese_Reranker | 0.7944 | 0.9324 | 0.9537 | 0.9740 | 0.8672 |
+| Vietnamese_Embedding_v2 | 0.7262 | 0.8927 | 0.9268 | 0.9578 | 0.8149 |
 | Vietnamese_Embedding (public) | 0.7274 | 0.8992 | 0.9305 | 0.9568 | 0.8181 |
 | Vietnamese-bi-encoder (BKAI) | 0.7109 | 0.8680 | 0.9014 | 0.9299 | 0.7951 |
 | BGE-M3 | 0.5682 | 0.7728 | 0.8382 | 0.8921 | 0.6822 |
 
-Vietnamese_Reranker
+Vietnamese_Reranker and Vietnamese_Embedding_v2 was trained on 1100000 triplets.
 
-Although the score on the legal domain drops a bit on
+Although the score on the legal domain drops a bit on Vietnamese_Embedding_v2, since this phase data is much larger, it is very good for other domains.
+
+You can access 2 model via link: [Vietnamese_Embedding_v2](AITeamVN/Vietnamese_Embedding_v2), [Vietnamese_Reranker](https://huggingface.co/AITeamVN/Vietnamese_Reranker)
 
 You can reproduce the evaluation result by running code python evaluation_model.py (data downloaded from Kaggle).
 
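The table in this change reports Accuracy@k and MRR@10. The `evaluation_model.py` script is not shown here, so the following is only a minimal sketch under the common retrieval definitions: Accuracy@k is the fraction of queries with at least one relevant document in the top k results, and MRR@10 is the mean of 1/rank of the first relevant document within the top 10 (0 if none appears). The function names and the toy rankings are hypothetical and may differ from what the script actually does.

```python
from typing import Dict, Sequence, Set


def first_hit_rank(ranked: Sequence[str], relevant: Set[str], k: int) -> int:
    """Return the 1-based rank of the first relevant id within the top k, or 0 if none."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return rank
    return 0


def accuracy_at_k(rankings: Dict[str, Sequence[str]],
                  relevant: Dict[str, Set[str]], k: int) -> float:
    """Fraction of queries with at least one relevant document in the top k."""
    hits = sum(1 for q in rankings if first_hit_rank(rankings[q], relevant[q], k) > 0)
    return hits / len(rankings)


def mrr_at_k(rankings: Dict[str, Sequence[str]],
             relevant: Dict[str, Set[str]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant document, cut off at k."""
    total = 0.0
    for q in rankings:
        rank = first_hit_rank(rankings[q], relevant[q], k)
        if rank:
            total += 1.0 / rank
    return total / len(rankings)


# Toy data (made up): ranked document ids per query and the relevant ids per query.
rankings = {"q1": ["d3", "d1", "d7"], "q2": ["d2", "d9", "d4"], "q3": ["d5", "d6", "d8"]}
relevant = {"q1": {"d1"}, "q2": {"d2"}, "q3": {"d0"}}

print(accuracy_at_k(rankings, relevant, 1))   # only q2 has a hit at rank 1
print(accuracy_at_k(rankings, relevant, 3))   # q1 and q2 have a hit in the top 3
print(mrr_at_k(rankings, relevant, 10))       # (1/2 + 1 + 0) / 3
```

With these definitions, Accuracy@k can only grow as k increases, which matches the left-to-right trend in every row of the table.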