Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi
Paper • 2309.10272 • Published
How to use md-nishat-008/Mixed-Distil-BERT with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="md-nishat-008/Mixed-Distil-BERT")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("md-nishat-008/Mixed-Distil-BERT")
model = AutoModelForMaskedLM.from_pretrained("md-nishat-008/Mixed-Distil-BERT")

The model is pretrained on the OSCAR dataset for Bangla, English, and Hindi, and then further pretrained on 560k code-mixed data (Bangla-English-Hindi). The base model is Distil-BERT, and the model is intended for datasets that contain code-mixing of these languages.
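The fill-mask pipeline shown earlier can be queried roughly as sketched below. This is a minimal illustration, not part of the original card: the helper names (`load_fill_mask`, `mask_word`, `top_predictions`, `demo`) and the romanized code-mixed sample sentence are hypothetical, and the model is only downloaded when `load_fill_mask()` is actually called.

```python
from typing import Callable, List, Tuple

MODEL_ID = "md-nishat-008/Mixed-Distil-BERT"


def load_fill_mask():
    """Build the fill-mask pipeline (downloads the model on first call)."""
    from transformers import pipeline
    return pipeline("fill-mask", model=MODEL_ID)


def mask_word(sentence: str, position: int, mask_token: str) -> str:
    """Replace the whitespace-separated word at `position` with the mask token."""
    words = sentence.split()
    words[position] = mask_token
    return " ".join(words)


def top_predictions(fill_mask: Callable, text: str, k: int = 3) -> List[Tuple[str, float]]:
    """Return (token, score) pairs for a text containing exactly one mask token."""
    results = fill_mask(text, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]


def demo() -> None:
    """Illustrative run; the sample sentence is an assumption, not from the card."""
    fill_mask = load_fill_mask()
    # Use the tokenizer's own mask token instead of hard-coding "[MASK]".
    text = mask_word("ami tomake onek like kori", 3, fill_mask.tokenizer.mask_token)
    for token, score in top_predictions(fill_mask, text):
        print(f"{token}\t{score:.4f}")
```

Calling `demo()` prints the top candidate tokens with their scores. Reading the mask token from `fill_mask.tokenizer.mask_token` avoids assuming a particular mask string for this tokenizer.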
To cite:
@article{raihan2023mixed,
  title={Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi},
  author={Raihan, Md Nishat and Goswami, Dhiman and Mahmud, Antara},
  journal={arXiv preprint arXiv:2309.10272},
  year={2023}
}