---
language:
- ar
library_name: transformers
pipeline_tag: text-generation
datasets:
- IBB-University/DATA_FATAWA
metrics:
- accuracy
- bleu
- bertscore
widget:
- text: 'اركان الاسلام'
- text: ما حكم الاحتفال بالمولد النبوي
- text: ما هي الكتب السماوية
---
## Testing the model using `transformers`
|
|
```python
from transformers import GPT2TokenizerFast, pipeline

# For the base and medium model sizes:
from transformers import GPT2LMHeadModel
# For the large and mega sizes, use the Grover-based implementation
# instead (requires `pip install arabert`):
# from arabert.aragpt2.grover.modeling_gpt2 import GPT2LMHeadModel

from arabert.preprocess import ArabertPreprocessor

MODEL_NAME = 'IBB-University/ghadeer_question_answer'
arabert_prep = ArabertPreprocessor(model_name=MODEL_NAME)

# Example question taken from the widget samples above
text = "ما حكم الاحتفال بالمولد النبوي"
text_clean = arabert_prep.preprocess(text)

model = GPT2LMHeadModel.from_pretrained(MODEL_NAME)
tokenizer = GPT2TokenizerFast.from_pretrained(MODEL_NAME)
generation_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Feel free to try different decoding settings
generation_pipeline(text_clean,
                    pad_token_id=tokenizer.eos_token_id,
                    max_length=512,
                    penalty_alpha=0.6,
                    top_k=4)[0]['generated_text']
```
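The call above uses contrastive search (`penalty_alpha` together with a small `top_k`). If you want to reuse or tweak these decoding settings, they can be packaged in a `transformers.GenerationConfig`; this is a minimal sketch that only builds and inspects the config, so it runs without downloading the model:

```python
from transformers import GenerationConfig

# Package the decoding settings from the snippet above so they can be
# reused across generate()/pipeline calls via generation_config=decoding.
decoding = GenerationConfig(
    max_length=512,
    penalty_alpha=0.6,  # contrastive-search degeneration penalty
    top_k=4,            # candidate pool size for contrastive search
)

print(decoding.penalty_alpha, decoding.top_k, decoding.max_length)
```

The same object can later be passed as `generation_config=decoding` to `model.generate` or to the pipeline call.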