Improve model card metadata and add paper link
#1
by nielsr (HF Staff) - opened

README.md CHANGED
````diff
@@ -1,12 +1,17 @@
 ---
-license: cc-by-nc-sa-4.0
 language:
 - bn
+license: cc-by-nc-sa-4.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 ## Description
 
 **Biswabangla-335M-io** is a 335 million parameters open source instruction-tuned Generative pretrained Language Model for Bangla/Bengali.
 
+This model was presented in the paper [Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages](https://huggingface.co/papers/2401.18034).
+
 Biswabangla is a monolingual Bangla/Bengali Generative Language model. The tokenizer of Biswabangla also works for Assamese language.
 
 This is a pretrained model from scratch at a context size of 4096. Furthermore instruction-tuned on 1 million Bengali input-output pairs across various Bengali NLP tasks.
@@ -21,7 +26,7 @@ If you use our model, please cite our paper [Niyogi and Bhattacharya, 2024](http
 The architecture of Biswabangla is different than the language models, mentioned in [Niyogi and Bhattacharya, 2024](https://arxiv.org/abs/2401.18034)
 
 ### Model Architecture
-Transformer Decoder Only Auto Regressive Model
+Transformer Decoder Only Auto Regressive Model (Llama-based)
 
 ### Limitations
 The model was trained on data that contains toxic language, unsafe content, and societal biases originally crawled from the internet.
@@ -33,9 +38,9 @@ Gyan AI Research does own the output generated from the model.
 
 ### Citations
 
-```
+```bibtex
 @misc{niyogi2024paramanufamilynovelefficient,
-      title={Paramanu:
+      title={Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages},
       author={Mitodru Niyogi and Arnab Bhattacharya},
       year={2024},
       eprint={2401.18034},
@@ -43,3 +48,4 @@ Gyan AI Research does own the output generated from the model.
       primaryClass={cs.CL},
       url={https://arxiv.org/abs/2401.18034},
 }
+```
````
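
Since the updated metadata declares `library_name: transformers` and `pipeline_tag: text-generation`, a minimal usage sketch is shown below. Both the repo id and the prompt format are illustrative assumptions (the card does not document the model's Hub id or its exact instruction template); substitute the real id from this model page.

```python
# Minimal sketch: querying the instruction-tuned model through the standard
# transformers text-generation API. The repo id is a PLACEHOLDER, and the
# plain-prompt format is an assumption -- the card does not specify a template.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/Biswabangla-335M-io"  # placeholder: replace with the real Hub id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# A Bengali instruction: "Write a short poem about the rainy season."
prompt = "বর্ষাকাল নিয়ে একটি ছোট কবিতা লেখো।"
inputs = tokenizer(prompt, return_tensors="pt")

# The model was pretrained with a 4096-token context window, so prompts
# well beyond a single sentence should also fit.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```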
|