Model Card for mmatinm/mpersian_xlm_roberta_large

This model is XLM-RoBERTa Large fine-tuned for extractive Persian question answering on the PersianQA dataset.
It builds on pedramyazdipoor/persian_xlm_roberta_large, which was itself fine-tuned on the PQuAD dataset.


Model Details

Model Description

  • Developed by: mmatinm
  • Base model: XLM-RoBERTa Large (from Hugging Face Transformers)
  • Language(s): Persian (fa)
  • Task: Extractive Question Answering (SQuAD v2 style)
  • Finetuned from: pedramyazdipoor/persian_xlm_roberta_large

Model Sources

  • Repository: https://huggingface.co/mmatinm/mpersian_xlm_roberta_large

Uses

Direct Use

  • Answering questions given a Persian context paragraph.
  • Can be used as a QA backend in chatbots or search engines for Persian content.

Downstream Use

  • Further fine-tuning for domain-specific QA in Persian.
  • Integration into multi-lingual QA systems.

Out-of-Scope Use

  • Generative or open-ended QA: the model only extracts answer spans from a given context.
  • Languages other than Persian.

Bias, Risks, and Limitations

  • Model performance is dependent on the quality and coverage of PersianQA.
  • May fail on highly domain-specific or slang-heavy texts.
  • May return incorrect spans for ambiguous questions.

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

repo_id = "mmatinm/mpersian_xlm_roberta_large"

# Load the tokenizer and the model with its extractive-QA head from the Hub.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id)
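Once loaded, the model can answer a question over a Persian context paragraph. A minimal end-to-end sketch (the question/context strings are illustrative, and the greedy argmax decoding is a simplification of full SQuAD-style span search):

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

repo_id = "mmatinm/mpersian_xlm_roberta_large"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id)
model.eval()

question = "پایتخت ایران کجاست؟"      # "What is the capital of Iran?"
context = "تهران پایتخت ایران است."   # "Tehran is the capital of Iran."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Greedily pick the most likely start/end token positions and decode the span.
start = outputs.start_logits.argmax(dim=-1).item()
end = outputs.end_logits.argmax(dim=-1).item()
answer = tokenizer.decode(inputs["input_ids"][0][start : end + 1],
                          skip_special_tokens=True)
print(answer)
```

For production use, prefer `pipeline("question-answering", model=repo_id)`, which handles span search, overlapping strides for long contexts, and no-answer scoring for you.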

Results

Model & Method            F1 Score   EM Score   No-Answer F1
XLM-R (LoRA + QA Head)    85.3       71.6       90.7
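The No-Answer F1 column reflects the SQuAD v2-style setup, in which the model may decide that the context contains no answer. The usual decision rule compares the best span score against the null (CLS) score; a simplified, self-contained sketch with hypothetical logits (not outputs of this model):

```python
import numpy as np

def best_span_or_null(start_logits, end_logits, null_threshold=0.0, max_len=30):
    """Return (start, end) of the best answer span, or None when the
    null score beats the best span score by more than the threshold.
    Position 0 is assumed to be the CLS token, which encodes "no answer"."""
    start_logits = np.asarray(start_logits, dtype=float)
    end_logits = np.asarray(end_logits, dtype=float)
    null_score = start_logits[0] + end_logits[0]

    best_score, best_span = -np.inf, None
    for s in range(1, len(start_logits)):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)

    if null_score - best_score > null_threshold:
        return None  # the model prefers "no answer"
    return best_span

# Hypothetical logits with a strong span at token positions 3..5.
starts = [1.0, -2.0, -1.0, 4.0, 0.0, -1.0]
ends   = [1.0, -2.0, -1.0, -1.0, 0.0, 5.0]
print(best_span_or_null(starts, ends))  # → (3, 5)
```

Raising `null_threshold` makes the model more willing to answer; lowering it makes it abstain more often, trading answerable-question F1 against No-Answer F1.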

Citation

@misc{mmatinm2025xlmr,
  title={Persian XLM-RoBERTa Fine-Tuned on PersianQA},
  author={Matin M.},
  year={2025},
  publisher={Hugging Face}
}