nielsr (HF Staff) committed
Commit c856ce7 · verified · 1 Parent(s): caa551c

Improve model card metadata and add paper link


Hi! I'm Niels from the Hugging Face community team.

I've opened this PR to improve the metadata and structure of your model card. Specifically, I've added:
- `pipeline_tag: text-generation` for better discoverability on the Hub.
- `library_name: transformers` as the model uses the Llama architecture and is compatible with the `transformers` library, enabling the "Use in Transformers" button and code snippets.
- A direct link to the paper page on Hugging Face at the start of the model description, without replacing your existing arXiv citation.

Additionally, I've clarified the model architecture by adding "(Llama-based)" and updated the BibTeX citation to match the official paper title and use the `bibtex` code block type.

These changes will make your model more accessible and easier to use for the community.
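For reference, the metadata keys above end up in the README's YAML front matter. The snippet below is just an illustrative sketch (not part of the PR) showing the resulting front-matter block and a hand-rolled check that the keys are present; it deliberately avoids a YAML dependency, so it only handles flat `key: value` lines.

```python
# The YAML front matter of README.md after this PR's metadata additions.
front_matter = """\
language:
- bn
license: cc-by-nc-sa-4.0
library_name: transformers
pipeline_tag: text-generation
"""

# Parse simple "key: value" lines (hand-rolled sketch; a real model card
# should be parsed with a proper YAML loader).
meta = {}
for line in front_matter.splitlines():
    if ":" in line and not line.startswith("-"):
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()

print(meta["pipeline_tag"])   # -> text-generation
print(meta["library_name"])   # -> transformers
```

With `pipeline_tag` and `library_name` set, the Hub can index the model under text-generation and surface the "Use in Transformers" widget.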

Files changed (1): README.md (+10 −4)
README.md CHANGED

````diff
@@ -1,12 +1,17 @@
 ---
-license: cc-by-nc-sa-4.0
 language:
 - bn
+license: cc-by-nc-sa-4.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 ## Description
 
 **Biswabangla-335M-io** is a 335 million parameters open source instruction-tuned Generative pretrained Language Model for Bangla/Bengali.
 
+This model was presented in the paper [Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages](https://huggingface.co/papers/2401.18034).
+
 Biswabangla is a monolingual Bangla/Bengali Generative Language model. The tokenizer of Biswabangla also works for Assamese language.
 
 This is a pretrained model from scratch at a context size of 4096. Furthermore instruction-tuned on 1 million Bengali input-output pairs across various Bengali NLP tasks.
@@ -21,7 +26,7 @@ If you use our model, please cite our paper [Niyogi and Bhattacharya, 2024](http
 The architecture of Biswabangla is different than the language models, mentioned in [Niyogi and Bhattacharya, 2024](https://arxiv.org/abs/2401.18034)
 
 ### Model Architecture
-Transformer Decoder Only Auto Regressive Model
+Transformer Decoder Only Auto Regressive Model (Llama-based)
 
 ### Limitations
 The model was trained on data that contains toxic language, unsafe content, and societal biases originally crawled from the internet.
@@ -33,9 +38,9 @@ Gyan AI Research does own the output generated from the model.
 
 ### Citations
 
-```
+```bibtex
 @misc{niyogi2024paramanufamilynovelefficient,
-title={Paramanu: A Family of Novel Efficient Generative Foundation Language Models for Indian Languages},
+title={Paramanu: Compact and Competitive Monolingual Language Models for Low-Resource Morphologically Rich Indian Languages},
 author={Mitodru Niyogi and Arnab Bhattacharya},
 year={2024},
 eprint={2401.18034},
@@ -43,3 +48,4 @@ Gyan AI Research does own the output generated from the model.
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2401.18034},
 }
+```
````