Improve model card metadata and add paper link

#1
by nielsr (HF Staff), opened
Files changed (1)
  1. README.md (+10 −4)

README.md CHANGED
@@ -1,12 +1,17 @@
 ---
-license: cc-by-nc-sa-4.0
 language:
 - bn
+license: cc-by-nc-sa-4.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 ## Description
 
 **Biswabangla-335M-io** is a 335 million parameters open source instruction-tuned Generative pretrained Language Model for Bangla/Bengali.
 
+This model was presented in the paper [Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages](https://huggingface.co/papers/2401.18034).
+
 Biswabangla is a monolingual Bangla/Bengali Generative Language model. The tokenizer of Biswabangla also works for Assamese language.
 
 This is a pretrained model from scratch at a context size of 4096. Furthermore instruction-tuned on 1 million Bengali input-output pairs across various Bengali NLP tasks.
@@ -21,7 +26,7 @@ If you use our model, please cite our paper [Niyogi and Bhattacharya, 2024](http
 The architecture of Biswabangla is different than the language models, mentioned in [Niyogi and Bhattacharya, 2024](https://arxiv.org/abs/2401.18034)
 
 ### Model Architecture
-Transformer Decoder Only Auto Regressive Model
+Transformer Decoder Only Auto Regressive Model (Llama-based)
 
 ### Limitations
 The model was trained on data that contains toxic language, unsafe content, and societal biases originally crawled from the internet.
@@ -33,9 +38,9 @@ Gyan AI Research does own the output generated from the model.
 
 ### Citations
 
-```
+```bibtex
 @misc{niyogi2024paramanufamilynovelefficient,
-title={Paramanu: A Family of Novel Efficient Generative Foundation Language Models for Indian Languages},
+title={Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages},
 author={Mitodru Niyogi and Arnab Bhattacharya},
 year={2024},
 eprint={2401.18034},
@@ -43,3 +48,4 @@ Gyan AI Research does own the output generated from the model.
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2401.18034},
 }
+```
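
The front-matter keys this PR adds (`license`, `library_name`, `pipeline_tag`) are standard Hugging Face model card metadata fields. A minimal sketch of checking them after an edit like this one, using a simplified key/value parser rather than a full YAML library (the `CARD` string is copied from the new README header):

```python
# Minimal sketch: sanity-check the model card front matter added in this PR.
# The parsing below is a simplified stand-in for a real YAML parser and only
# handles flat "key: value" lines, which is enough for these fields.

CARD = """\
---
language:
- bn
license: cc-by-nc-sa-4.0
library_name: transformers
pipeline_tag: text-generation
---

## Description
"""

def front_matter(card: str) -> dict:
    """Extract flat key: value pairs from the ----delimited header."""
    head = card.split("---")[1]
    meta = {}
    for line in head.strip().splitlines():
        # Skip YAML list items (e.g. "- bn"); keep only "key: value" lines.
        if ":" in line and not line.startswith("-"):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

meta = front_matter(CARD)
print(meta["pipeline_tag"])  # prints "text-generation"
```

With `pipeline_tag: text-generation` and `library_name: transformers` in place, the Hub can show the model under the text-generation task filter and pick the right inference widget; that is the practical effect of this metadata change.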