Model Card for mmatinm/mpersian_xlm_roberta_large

This model is XLM-RoBERTa Large fine-tuned for extractive Persian question answering on the PersianQA dataset.
It builds on pedramyazdipoor/persian_xlm_roberta_large, which was itself fine-tuned on the PQuAD dataset.


Model Details

Model Description

  • Developed by: mmatinm
  • Base model: XLM-RoBERTa Large (from Hugging Face Transformers)
  • Language(s): Persian (fa)
  • Task: Extractive Question Answering (SQuAD v2 style)
  • Finetuned from: pedramyazdipoor/persian_xlm_roberta_large

Model Sources

  • Repository: https://huggingface.co/mmatinm/mpersian_xlm_roberta_large

Uses

Direct Use

  • Answering questions given a Persian context paragraph.
  • Can be used as a QA backend in chatbots or search engines for Persian content.

Downstream Use

  • Further fine-tuning for domain-specific QA in Persian.
  • Integration into multi-lingual QA systems.

Out-of-Scope Use

  • Generative or open-ended QA: the model only extracts answer spans from a given context.
  • Languages other than Persian.

Bias, Risks, and Limitations

  • Model performance is dependent on the quality and coverage of PersianQA.
  • May fail on highly domain-specific or slang-heavy texts.
  • May return incorrect spans for ambiguous questions.

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

repo_id = "mmatinm/mpersian_xlm_roberta_large"

# Load the tokenizer and the model with its extractive-QA head from the Hub.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id)
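Once loaded, the model can answer a question over a Persian context paragraph. A minimal end-to-end sketch (the question/context strings are illustrative, and the greedy argmax decoding is a simplification of full SQuAD-style span search):

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

repo_id = "mmatinm/mpersian_xlm_roberta_large"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForQuestionAnswering.from_pretrained(repo_id)
model.eval()

question = "پایتخت ایران کجاست؟"      # "What is the capital of Iran?"
context = "تهران پایتخت ایران است."   # "Tehran is the capital of Iran."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Greedily pick the most likely start/end token positions and decode the span.
start = outputs.start_logits.argmax(dim=-1).item()
end = outputs.end_logits.argmax(dim=-1).item()
answer = tokenizer.decode(inputs["input_ids"][0][start : end + 1],
                          skip_special_tokens=True)
print(answer)
```

For production use, prefer `pipeline("question-answering", model=repo_id)`, which handles span search, overlapping strides for long contexts, and no-answer scoring for you.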

Results

Model & Method            F1 Score   EM Score   No-Answer F1
XLM-R (LoRA + QA Head)    85.3       71.6       90.7
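The No-Answer F1 column reflects the SQuAD v2-style setup, in which the model may decide that the context contains no answer. The usual decision rule compares the best span score against the null (CLS) score; a simplified, self-contained sketch with hypothetical logits (not outputs of this model):

```python
import numpy as np

def best_span_or_null(start_logits, end_logits, null_threshold=0.0, max_len=30):
    """Return (start, end) of the best answer span, or None when the
    null score beats the best span score by more than the threshold.
    Position 0 is assumed to be the CLS token, which encodes "no answer"."""
    start_logits = np.asarray(start_logits, dtype=float)
    end_logits = np.asarray(end_logits, dtype=float)
    null_score = start_logits[0] + end_logits[0]

    best_score, best_span = -np.inf, None
    for s in range(1, len(start_logits)):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)

    if null_score - best_score > null_threshold:
        return None  # the model prefers "no answer"
    return best_span

# Hypothetical logits with a strong span at token positions 3..5.
starts = [1.0, -2.0, -1.0, 4.0, 0.0, -1.0]
ends   = [1.0, -2.0, -1.0, -1.0, 0.0, 5.0]
print(best_span_or_null(starts, ends))  # → (3, 5)
```

Raising `null_threshold` makes the model more willing to answer; lowering it makes it abstain more often, trading answerable-question F1 against No-Answer F1.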

Citation

@misc{mmatinm2025xlmr,
  title={Persian XLM-RoBERTa Fine-Tuned on PersianQA},
  author={Matin M.},
  year={2025},
  publisher={Hugging Face}
}