Improve model card: Add library_name, correct paper link, add code, model, and dataset links
#1 opened by nielsr (HF Staff)

README.md CHANGED

@@ -1,19 +1,21 @@
 ---
-license: apache-2.0
-language:
-- vi
 base_model:
 - vinai/phobert-large
+language:
+- vi
+license: apache-2.0
 pipeline_tag: feature-extraction
 tags:
 - bert
 - wsd
 - vietnamese
 - semantic_similarity
+library_name: transformers
 ---
+
 # ViConBERT: Context-Gloss Aligned Vietnamese Word Embedding for Polysemous and Sense-Aware Representations
 
-[Paper](https://huggingface.co/tkhangg0910/viconbert-base)
+[Paper](https://huggingface.co/papers/2511.12249) | [Code](https://github.com/tkhangg0910/ViConBERT) | [Model](https://huggingface.co/tkhangg0910/viconbert-base) | [Dataset](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
 
 This repository is official implementation of the paper: ViConBERT: Context-Gloss Aligned Vietnamese Word Embedding for Polysemous and Sense-Aware Representations
 
@@ -42,10 +44,10 @@ pip3 install -r requirements.txt
 ### ViConBERT models <a name="models2"></a>
 
 
-Model | #params | Arch. | Max length | Training data
----|---|---|---|---
-[`tkhangg0910/viconbert-base`](https://huggingface.co/tkhangg0910/viconbert-base) | 135M | base | 256 | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
-[`tkhangg0910/viconbert-large`](https://huggingface.co/tkhangg0910/viconbert-large) | 370M | large | 256 | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
+Model | #params | Arch. | Max length | Backbone | Training data
+---|---|---|---|---|---
+[`tkhangg0910/viconbert-base`](https://huggingface.co/tkhangg0910/viconbert-base) | 135M | base | 256 | [PhoBERT-base](https://huggingface.co/vinai/phobert-base) | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
+[`tkhangg0910/viconbert-large`](https://huggingface.co/tkhangg0910/viconbert-large) | 370M | large | 256 | [PhoBERT-large](https://huggingface.co/vinai/phobert-large) | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
 
 
 ### Example usage <a name="usage2"></a>
 
@@ -127,3 +129,17 @@ print(f"Similarity between 2: {target_2} and 3:{target_3}: {sim_2:.4f}")
 <em>Contextual separation of "Khoan", "chạy", and zero-shot ability for unseen words</em>
 </p>
 
+## Citation
+If you find ViConBERT useful for your research and applications, please cite using this BibTeX:
+
+```bibtex
+@article{tkhangg09102025viconbert,
+  title={ViConBERT: Context-Gloss Aligned Vietnamese Word Embedding for Polysemous and Sense-Aware Representations},
+  author={Tkhangg0910 and {others}},
+  journal={arXiv preprint arXiv:2511.12249},
+  year={2025}
+}
+```
+
+## Acknowledgement
+[PhoBERT](https://github.com/VinAIResearch/PhoBERT): ViConBERT used PhoBERT as backbone model.