---
license: apache-2.0
language:
- bn
base_model:
- google/electra-small-discriminator
---

# VĀC-BERT

**VĀC-BERT** is a 17-million-parameter model trained on the Vācaspati literary dataset. Despite its compact size, VĀC-BERT achieves performance competitive with state-of-the-art masked-language and downstream models that are more than seven times larger.

## Model Details

- **Architecture:** ELECTRA-small, reduced to 17 M parameters
- **Pretraining Corpus:** Vācaspati — a curated Bangla literary corpus
- **Parameter Count:** 17 M (≈ 1/7 the size of BERT-base)
- **Tokenizer:** WordPiece, vocabulary size 50 K

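As a rough sanity check on the 17 M figure, the parameter count can be estimated from ELECTRA-small-style hyperparameters. Note that the hidden size, embedding size, feed-forward size, layer count, and maximum position count below are assumptions based on the `google/electra-small-discriminator` base model, not values read from the released VĀC-BERT config; only the 50 K vocabulary comes from this card.

```python
# Back-of-envelope parameter count for an ELECTRA-small-style encoder.
# All hyperparameters except vocab_size are assumed, not taken from the config.
vocab_size = 50_000  # WordPiece vocabulary (from the model card)
emb_dim = 128        # factorized embedding size (assumed, as in ELECTRA-small)
hidden = 256         # transformer hidden size (assumed)
ffn = 1024           # feed-forward inner size (assumed)
layers = 12          # number of encoder layers (assumed)
max_pos = 512        # maximum sequence positions (assumed)

# Token, position, and token-type embeddings, plus the emb_dim -> hidden projection.
embeddings = (vocab_size + max_pos + 2) * emb_dim + emb_dim * hidden

# Per encoder layer: Q/K/V/O projections, two feed-forward matrices (with
# biases), and two layer norms (scale + shift).
attention = 4 * (hidden * hidden + hidden)
feed_forward = hidden * ffn + ffn + ffn * hidden + hidden
layer_norms = 2 * 2 * hidden
per_layer = attention + feed_forward + layer_norms

total = embeddings + layers * per_layer
print(f"{total / 1e6:.1f} M parameters")  # lands in the ~16-17 M ballpark
```

Under these assumptions the estimate comes out close to the stated 17 M, with the 50 K-entry embedding table accounting for a large share of the total.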
## Usage Example

```python
from transformers import BertTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the classification model from the Hugging Face Hub.
tokenizer = BertTokenizer.from_pretrained("Vacaspati/VAC-BERT")
model = AutoModelForSequenceClassification.from_pretrained("Vacaspati/VAC-BERT")
```

|
| | We are releasing the Vācaspati dataset. For access to Vācaspati dataset please fill this form. |
| |
|
| | Link: https://forms.gle/DiVm2fSVCyXXMbkU9 |
| |
|
| | Vācaspati dataset can also be accessed from: https://huggingface.co/datasets/Vacaspati/Vacaspati |
| |
|
| | ## Citation |
| |
|
If you use this model, please cite:

```bibtex
@inproceedings{bhattacharyya-etal-2023-vacaspati,
    title = "{VACASPATI}: A Diverse Corpus of {B}angla Literature",
    author = "Bhattacharyya, Pramit and
      Mondal, Joydeep and
      Maji, Subhadip and
      Bhattacharya, Arnab",
    editor = "Park, Jong C. and
      Arase, Yuki and
      Hu, Baotian and
      Lu, Wei and
      Wijaya, Derry and
      Purwarianti, Ayu and
      Krisnadhi, Adila Alfa",
    booktitle = "Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = nov,
    year = "2023",
    address = "Nusa Dua, Bali",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.ijcnlp-main.72/",
    doi = "10.18653/v1/2023.ijcnlp-main.72",
    pages = "1118--1130"
}
```