---
license: lgpl-3.0
language:
- fa
base_model:
- HooshvareLab/bert-base-parsbert-uncased
---

# SINA-BERT: A Pre-trained Language Model for Analysis of Medical Texts in Persian

SINA-BERT is the first Persian medical language model, built on BERT (Devlin et al., 2018). SINA-BERT was pre-trained on a large-scale corpus of medical content, including formal and informal texts collected from a variety of online resources, in order to improve performance on healthcare-related tasks.

## Model Evaluation

SINA-BERT can be used for Persian medical text representation tasks. In our paper, we examined the following tasks:

1) categorization of medical questions,
2) medical sentiment analysis,
3) medical question retrieval.

For each task, we developed annotated Persian datasets and learned a representation for the data of each task, paying particular attention to long and complex medical questions. With the same architecture used across all tasks, SINA-BERT outperforms the BERT-based models previously available for Persian.
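
As an illustration of this kind of downstream use, the sketch below fine-tunes SINA-BERT for medical question categorization. It is a minimal sketch, not the setup from the paper: the label count, the example question, and the missing training loop are all placeholders.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder label count; the real category set comes from the
# annotated dataset described in the paper.
NUM_CATEGORIES = 5

tokenizer = AutoTokenizer.from_pretrained("hooshafzar/SINA-BERT")
model = AutoModelForSequenceClassification.from_pretrained(
    "hooshafzar/SINA-BERT", num_labels=NUM_CATEGORIES
)

# Illustrative Persian medical question
questions = ["آیا سردرد مزمن نیاز به مراجعه به پزشک دارد؟"]
inputs = tokenizer(questions, padding=True, truncation=True, return_tensors="pt")

# The classification head is freshly initialized, so these logits are
# only meaningful after fine-tuning on labeled questions.
logits = model(**inputs).logits  # shape: (batch_size, NUM_CATEGORIES)
```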

To read about the datasets and results, please refer to the SINA-BERT paper: [arXiv:2104.07613v1](https://arxiv.org/pdf/2104.07613).

- **Developed by:** HooshAfzar Salamat Team
- **Language(s) (NLP):** Persian
- **Finetuned from model:** [ParsBERT](https://huggingface.co/HooshvareLab/bert-base-parsbert-uncased)

### Model Sources

- **Repository:** [GitHub](https://github.com/nasrin-taghizadeh/SinaBERT)
- **Paper:** [arXiv paper](https://arxiv.org/pdf/2104.07613)

## How to use

```python
from transformers import AutoConfig, AutoTokenizer, AutoModel

# Download the configuration, tokenizer, and pre-trained weights from the Hub
config = AutoConfig.from_pretrained("hooshafzar/SINA-BERT")
tokenizer = AutoTokenizer.from_pretrained("hooshafzar/SINA-BERT")
model = AutoModel.from_pretrained("hooshafzar/SINA-BERT")
```
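
Once loaded, the model can embed a Persian sentence, e.g. as a building block for the question-retrieval task above. This is a minimal sketch: the example sentence and the mean-pooling step are illustrative choices, not the method used in the paper.

```python
import torch

# Illustrative Persian sentence ("I have a headache and dizziness")
text = "سردرد و سرگیجه دارم"

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the token-level hidden states into one sentence vector
sentence_embedding = outputs.last_hidden_state.mean(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 768])
```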

## Citation

```bibtex
@article{taghizadeh2021sina,
  title={SINA-BERT: a pre-trained language model for analysis of medical texts in Persian},
  author={Taghizadeh, Nasrin and Doostmohammadi, Ehsan and Seifossadat, Elham and Rabiee, Hamid R and Tahaei, Maedeh S},
  journal={arXiv preprint arXiv:2104.07613},
  year={2021}
}
```