    example_title: "Formal 1"
  - text: "در حکم اولیه این شرکت مجاز به فعالیت شد ولی پس از بررسی مجدد، مجوز این شرکت [MASK] شد."
    example_title: "Formal 2"
---

# FaBERT: Pre-training BERT on Persian Blogs

## Model Details

FaBERT is a Persian BERT-base model pre-trained on the diverse HmBlogs corpus, which encompasses both casual and formal Persian texts. In evaluations across various Natural Language Understanding (NLU) tasks, FaBERT consistently demonstrates notable improvements while keeping a compact model size. The model is available on Hugging Face, so integrating it into your projects is hassle-free.

## Features

- Pre-trained on the diverse HmBlogs corpus, consisting of more than 50 GB of text from Persian blogs
- Remarkable performance across various downstream NLP tasks
- BERT architecture with 124 million parameters

## Useful Links

- **Repository:** [FaBERT on GitHub](https://github.com/SBU-NLP-LAB/FaBERT)
- **Paper:** [arXiv preprint](https://arxiv.org/abs/2402.06617)

## Usage

### Loading the Model with the MLM Head

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Make sure to use the default fast tokenizer
tokenizer = AutoTokenizer.from_pretrained("sbunlp/fabert")
model = AutoModelForMaskedLM.from_pretrained("sbunlp/fabert")
```
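
To sanity-check the MLM head, you can run the model through the standard `transformers` fill-mask pipeline. This is generic pipeline usage, not a FaBERT-specific API; the sentence is the second widget example from this card.

```python
from transformers import pipeline

# Standard fill-mask pipeline; "sbunlp/fabert" is this model's Hub id
fill_mask = pipeline("fill-mask", model="sbunlp/fabert")

# Widget example from this card, roughly: "In the initial ruling the company
# was permitted to operate, but after re-review its license was [MASK]."
for prediction in fill_mask(
    "در حکم اولیه این شرکت مجاز به فعالیت شد ولی پس از بررسی مجدد، مجوز این شرکت [MASK] شد."
):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Each prediction contains the filled-in token and its score; `[MASK]` matches the tokenizer's default mask token.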

### Downstream Tasks

Like the original English BERT, FaBERT can be [fine-tuned](https://huggingface.co/docs/transformers/en/training) on many downstream tasks; a minimal sketch follows at the end of this section.

Examples on Persian datasets are available in our [GitHub repository](#useful-links).

**Make sure to use the default fast tokenizer.**
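
As a rough illustration, here is a minimal fine-tuning sketch for a binary text-classification task using the standard `Trainer` API. The toy dataset, label count, and hyperparameters below are illustrative placeholders of ours, not values from the paper or repository; substitute a real Persian dataset in practice.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# FaBERT body plus a freshly initialized classification head (2 labels as an example)
tokenizer = AutoTokenizer.from_pretrained("sbunlp/fabert")  # default fast tokenizer
model = AutoModelForSequenceClassification.from_pretrained("sbunlp/fabert", num_labels=2)

# Toy in-memory dataset so the sketch runs end to end;
# replace with a real Persian dataset (e.g. via `datasets.load_dataset`)
train_dataset = Dataset.from_dict(
    {"text": ["نمونه متن اول", "نمونه متن دوم"], "label": [1, 0]}
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

train_dataset = train_dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="fabert-finetuned",
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    num_train_epochs=3,
)

# With a tokenizer supplied, Trainer pads each batch dynamically
trainer = Trainer(model=model, args=args, train_dataset=train_dataset, tokenizer=tokenizer)
trainer.train()
```

The same pattern applies to the other downstream tasks evaluated below, swapping the head (e.g. `AutoModelForQuestionAnswering`, `AutoModelForTokenClassification`) to match the task.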

## Training Details

FaBERT was pre-trained with the masked language modeling (MLM) objective using whole-word masking (WWM), and the resulting perplexity on the validation set was 7.76.
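
For context, perplexity here is just the exponential of the average masked-token cross-entropy loss, so 7.76 corresponds to a validation loss of about 2.05:

```python
import math

# perplexity = exp(loss)  =>  loss = ln(perplexity)
val_loss = math.log(7.76)
print(f"validation loss ≈ {val_loss:.2f}")  # ≈ 2.05
```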

| Hyperparameter | Value |
|-------------------|:--------------:|
| Batch Size | 32 |
| Optimizer | Adam |
| Learning Rate | 6e-5 |
| Weight Decay | 0.01 |
| Total Steps | 18 Million |
| Warmup Steps | 1.8 Million |
| Precision Format | TF32 |

## Evaluation

Here are some key performance results for the FaBERT model:

**Sentiment Analysis**

| Task | FaBERT | ParsBERT | XLM-R |
|:-------------|:------:|:--------:|:-----:|
| MirasOpinion | **87.51** | 86.73 | 84.92 |
| MirasIrony | 74.82 | 71.08 | **75.51** |
| DeepSentiPers | **79.85** | 74.94 | 79.00 |

**Named Entity Recognition**

| Task | FaBERT | ParsBERT | XLM-R |
|:-------------|:------:|:--------:|:-----:|
| PEYMA | **91.39** | 91.24 | 90.91 |
| ParsTwiner | **82.22** | 81.13 | 79.50 |
| MultiCoNER v2 | 57.92 | **58.09** | 51.47 |

**Question Answering**

| Task | FaBERT | ParsBERT | XLM-R |
|:-------------|:------:|:--------:|:-----:|
| ParsiNLU | **55.87** | 44.89 | 42.55 |
| PQuAD | 87.34 | 86.89 | **87.60** |
| PCoQA | **53.51** | 50.96 | 51.12 |

**Natural Language Inference & QQP**

| Task | FaBERT | ParsBERT | XLM-R |
|:-------------|:------:|:--------:|:-----:|
| FarsTail | **84.45** | 82.52 | 83.50 |
| SBU-NLI | **66.65** | 58.41 | 58.85 |
| ParsiNLU QQP | **82.62** | 77.60 | 79.74 |

**Number of Parameters**

| | FaBERT | ParsBERT | XLM-R |
|:-------------|:------:|:--------:|:-----:|
| Parameter Count (M) | 124 | 162 | 278 |
| Vocabulary Size (K) | 50 | 100 | 250 |

For a more detailed performance analysis, refer to the paper.

## How to Cite

If you use FaBERT in your research or projects, please cite it using the following BibTeX:

```bibtex
@article{masumi2024fabert,
  title={FaBERT: Pre-training BERT on Persian Blogs},
  author={Masumi, Mostafa and Majd, Seyed Soroush and Shamsfard, Mehrnoush and Beigy, Hamid},
  journal={arXiv preprint arXiv:2402.06617},
  year={2024}
}
```