Improve model card: Add library_name, correct paper link, add code, model, and dataset links
#1 opened by nielsr (HF Staff)

README.md CHANGED

@@ -1,19 +1,21 @@
 ---
-license: apache-2.0
-language:
-- vi
 base_model:
 - vinai/phobert-large
+language:
+- vi
+license: apache-2.0
 pipeline_tag: feature-extraction
 tags:
 - bert
 - wsd
 - vietnamese
 - semantic_similarity
+library_name: transformers
 ---
+
 # ViConBERT: Context-Gloss Aligned Vietnamese Word Embedding for Polysemous and Sense-Aware Representations
 
-[Paper](https://huggingface.co/tkhangg0910/viconbert-base)
+[Paper](https://huggingface.co/papers/2511.12249) | [Code](https://github.com/tkhangg0910/ViConBERT) | [Model](https://huggingface.co/tkhangg0910/viconbert-base) | [Dataset](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
 
 This repository is official implementation of the paper: ViConBERT: Context-Gloss Aligned Vietnamese Word Embedding for Polysemous and Sense-Aware Representations
 
@@ -42,10 +44,10 @@ pip3 install -r requirements.txt
 ### ViConBERT models <a name="models2"></a>
 
 
-Model | #params | Arch. | Max length | Training data
----|---|---|---|---
-[`tkhangg0910/viconbert-base`](https://huggingface.co/tkhangg0910/viconbert-base) | 135M | base | 256 | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
-[`tkhangg0910/viconbert-large`](https://huggingface.co/tkhangg0910/viconbert-large) | 370M | large | 256 | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
+Model | #params | Arch. | Max length | Backbone | Training data
+---|---|---|---|---|---
+[`tkhangg0910/viconbert-base`](https://huggingface.co/tkhangg0910/viconbert-base) | 135M | base | 256 | [PhoBERT-base](https://huggingface.co/vinai/phobert-base) | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
+[`tkhangg0910/viconbert-large`](https://huggingface.co/tkhangg0910/viconbert-large) | 370M | large | 256 | [PhoBERT-large](https://huggingface.co/vinai/phobert-large) | [ViConWSD](https://huggingface.co/datasets/tkhangg0910/ViConWSD)
 
 
 ### Example usage <a name="usage2"></a>
 
@@ -127,3 +129,17 @@ print(f"Similarity between 2: {target_2} and 3:{target_3}: {sim_2:.4f}")
 <em>Contextual separation of "Khoan", "chạy", and zero-shot ability for unseen words</em>
 </p>
 
+## Citation
+If you find ViConBERT useful for your research and applications, please cite using this BibTeX:
+
+```bibtex
+@article{tkhangg09102025viconbert,
+  title={ViConBERT: Context-Gloss Aligned Vietnamese Word Embedding for Polysemous and Sense-Aware Representations},
+  author={Tkhangg0910 and {others}},
+  journal={arXiv preprint arXiv:2511.12249},
+  year={2025}
+}
+```
+
+## Acknowledgement
+[PhoBERT](https://github.com/VinAIResearch/PhoBERT): ViConBERT used PhoBERT as backbone model.