Commit 5c01fd5
1 Parent(s): aac1054

added citation
README.md CHANGED

@@ -63,7 +63,7 @@ Follow the guide linked [here](https://towardsdatascience.com/fine-tuning-gpt2-o
 
 ## Finetuning using our code with TF 1.15.4:
 
-
+Create the Training TFRecords:
 ```bash
 python create_pretraining_data.py
  --input_file=<RAW TEXT FILE with documents/articles separated by an empty line>
@@ -71,26 +71,26 @@ python create_pretraining_data.py
  --tokenizer_dir=<Directory with the GPT2 Tokenizer files>
 ```
 
-
+Finetuning:
 ```bash
-python3 run_pretraining.py
- --input_file="gs://<GS_BUCKET>/pretraining_data/*"
- --output_dir="gs://<GS_BUCKET>/pretraining_model/"
- --config_file="config/small_hparams.json"
- --batch_size=128
- --eval_batch_size=8
- --num_train_steps=
- --num_warmup_steps=
- --learning_rate=
- --save_checkpoints_steps=
- --max_seq_length=1024
- --max_eval_steps=
- --optimizer="lamb"
- --iterations_per_loop=5000
- --keep_checkpoint_max=10
- --use_tpu=True
- --tpu_name=<TPU NAME>
- --do_train=True
+python3 run_pretraining.py \
+ --input_file="gs://<GS_BUCKET>/pretraining_data/*" \
+ --output_dir="gs://<GS_BUCKET>/pretraining_model/" \
+ --config_file="config/small_hparams.json" \
+ --batch_size=128 \
+ --eval_batch_size=8 \
+ --num_train_steps= \
+ --num_warmup_steps= \
+ --learning_rate= \
+ --save_checkpoints_steps= \
+ --max_seq_length=1024 \
+ --max_eval_steps= \
+ --optimizer="lamb" \
+ --iterations_per_loop=5000 \
+ --keep_checkpoint_max=10 \
+ --use_tpu=True \
+ --tpu_name=<TPU NAME> \
+ --do_train=True \
 --do_eval=False
 ```
 # Model Sizes
@@ -133,13 +133,18 @@ The text generated by AraGPT2 is automatically generated by a neural network mod
 # If you used this model please cite us as :
 
 ```
-@
-
-
-
-
-
-
+@inproceedings{antoun-etal-2021-aragpt2,
+    title = "{A}ra{GPT}2: Pre-Trained Transformer for {A}rabic Language Generation",
+    author = "Antoun, Wissam and
+      Baly, Fady and
+      Hajj, Hazem",
+    booktitle = "Proceedings of the Sixth Arabic Natural Language Processing Workshop",
+    month = apr,
+    year = "2021",
+    address = "Kyiv, Ukraine (Virtual)",
+    publisher = "Association for Computational Linguistics",
+    url = "https://www.aclweb.org/anthology/2021.wanlp-1.21",
+    pages = "196--207",
 }
 ```
 
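For clarity on the "Create the Training TFRecords" step above, here is a minimal sketch of what the raw text file passed to `--input_file` is expected to look like, going only by the flag's description (documents/articles separated by an empty line). The file name and the sample sentences are purely illustrative:

```bash
# Illustrative only: build a tiny raw-text corpus in which each
# document/article is separated from the next by a single empty line.
cat > raw_corpus.txt << 'EOF'
This is the first article. It can span several lines
of plain text belonging to the same document.

This is the second article. The blank line above marks
the boundary between documents.
EOF

# The resulting file would then be passed as:
#   --input_file=raw_corpus.txt
```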
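And for the finetuning step, a hypothetical fully filled-in invocation of `run_pretraining.py`. The README deliberately leaves several flags blank (`--num_train_steps`, `--num_warmup_steps`, `--learning_rate`, `--save_checkpoints_steps`, `--max_eval_steps`); the numbers below are illustrative placeholders only, not the authors' recommended settings, and the bucket and TPU names are made up:

```bash
# Hypothetical example: every value that the README leaves blank is filled
# with a placeholder here, and "my-bucket" / "my-tpu" are stand-ins for
# <GS_BUCKET> and <TPU NAME>.
python3 run_pretraining.py \
 --input_file="gs://my-bucket/pretraining_data/*" \
 --output_dir="gs://my-bucket/pretraining_model/" \
 --config_file="config/small_hparams.json" \
 --batch_size=128 \
 --eval_batch_size=8 \
 --num_train_steps=100000 \
 --num_warmup_steps=10000 \
 --learning_rate=1e-4 \
 --save_checkpoints_steps=5000 \
 --max_seq_length=1024 \
 --max_eval_steps=100 \
 --optimizer="lamb" \
 --iterations_per_loop=5000 \
 --keep_checkpoint_max=10 \
 --use_tpu=True \
 --tpu_name=my-tpu \
 --do_train=True \
 --do_eval=False
```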