Initial upload of FinAraT5 model
- README.md +46 -0
- config.json +30 -0
- pytorch_model.bin +3 -0
- special_tokens_map.json +1 -0
- spiece.model +3 -0
- tokenizer_config.json +1 -0
README.md
ADDED
|
@@ -0,0 +1,46 @@
---
license: apache-2.0
language:
- ar
tags:
- Financial Arabic T5
- MSA
- Financial Annual Report
- Arabic Machine Translation
- Arabic Text Summarization
- Arabic News Title and Question Generation
- Arabic Paraphrasing and Transliteration
- Arabic Code-Switched Translation
---

# FinAraT5-msa-base
# FinAraT5: A text-to-text model for financial Arabic text generation

<img src="FinAraT5.png" alt="FinAraT5" width="45%" height="35%" align="right"/>


# Abstract
Transfer learning and pretrained language models have reshaped the NLP landscape in recent years, starting with the appearance of BERT, BART, and T5. By training on large-scale unlabeled text, these models set a new state of the art on several NLU and NLG tasks. Most available models are pretrained on English corpora and target general-purpose use. In this work, we present FinAraT5, the first text-to-text pretrained Arabic language model designed for financial use cases. The financial sector generates a huge amount of multilingual data, yet no finance-specific pretrained language model exists for Arabic. FinAraT5 is based on AraT5.


---
# How to fine-tune this model
Below is an example of fine-tuning **FinAraT5-base** for news title generation on any dataset:
``` bash
# The leading "!" runs the command from a notebook cell; drop it in a regular shell.
# Replace "model_id" with the Hub ID or local path of the FinAraT5 checkpoint.
!python run_trainier_seq2seq_huggingface.py \
  --learning_rate 5e-5 \
  --max_target_length 128 --max_source_length 128 \
  --per_device_train_batch_size 8 --per_device_eval_batch_size 8 \
  --model_name_or_path "model_id" \
  --output_dir "/content/FinAraT5_FT_title_generation" --overwrite_output_dir \
  --num_train_epochs 3 \
  --train_file "/content/FinARGEn_title_genration_sample_train.tsv" \
  --validation_file "/content/FinARGEn_title_genration_sample_valid.tsv" \
  --task "title_generation" --text_column "document" --summary_column "title" \
  --load_best_model_at_end --metric_for_best_model "eval_bertscore" --greater_is_better True \
  --evaluation_strategy epoch --logging_strategy epoch --predict_with_generate \
  --do_train --do_eval
```
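
The command above names its input columns through `--text_column "document"` and `--summary_column "title"` and reads `.tsv` files. As a minimal sketch of preparing such a file, the snippet below writes a tab-separated file with a header row using only the standard library. It assumes the training script expects a header row and tab delimiters, which this README does not confirm; the sample rows, the output filename, and the helper names `write_tsv` and `read_tsv` are hypothetical.

```python
import csv
from pathlib import Path

# Hypothetical sample rows; replace them with real (document, title) pairs.
ROWS = [
    {"document": "ارتفعت أرباح الشركة في الربع الأخير.", "title": "ارتفاع أرباح الشركة"},
    {"document": "أعلن البنك عن توزيع أرباح\tنقدية على المساهمين.", "title": "توزيع أرباح نقدية"},
]


def _one_line(text):
    """Collapse tabs and newlines so every example stays on one TSV row."""
    return " ".join(text.split())


def write_tsv(path, rows, text_column="document", summary_column="title"):
    """Write rows to a tab-separated file with a header row (assumed layout)."""
    path = Path(path)
    with path.open("w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=[text_column, summary_column], delimiter="\t"
        )
        writer.writeheader()
        for row in rows:
            writer.writerow(
                {
                    text_column: _one_line(row[text_column]),
                    summary_column: _one_line(row[summary_column]),
                }
            )
    return path


def read_tsv(path):
    """Read the file back as a list of dicts, mainly to sanity-check it."""
    with Path(path).open(encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


if __name__ == "__main__":
    out = write_tsv("title_generation_sample_train.tsv", ROWS)
    print(f"wrote {len(read_tsv(out))} examples to {out}")
```

The resulting path can then be passed to `--train_file` or `--validation_file`; if the script turns out to expect a different layout, only `write_tsv` needs to change.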

## Acknowledgments
We gratefully acknowledge the [Google TensorFlow Research Cloud (TFRC)](https://www.tensorflow.org/tfrc) program for free TPU v3-8 access, and we thank the Google Cloud team for the free GCP credits used to perform this research.

config.json
ADDED
|
@@ -0,0 +1,30 @@
{
  "_name_or_path": "/home/patrick/hugging_face/t5/t5-v1_1-base",
  "architectures": [
    "T5ForConditionalGeneration"
  ],
  "d_ff": 2048,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dense_act_fn": "gelu_new",
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "gated-gelu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": true,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "torch_dtype": "float32",
  "transformers_version": "4.24.0.dev0",
  "use_cache": true,
  "vocab_size": 110080
}
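
As a rough cross-check of this configuration, the sketch below estimates the number of weights it implies. It assumes the usual T5 layout (bias-free attention and feed-forward projections, scale-only layer norms, one relative-position bias table per stack) and assumes the input embedding and output head share one matrix, since the config lists no `tie_word_embeddings` entry. The function name `t5_parameter_count` is hypothetical, and the result is an estimate rather than a figure reported by the authors.

```python
# Values copied from config.json above.
CONFIG = {
    "d_model": 768,
    "d_ff": 2048,
    "d_kv": 64,
    "num_heads": 12,
    "num_layers": 12,
    "num_decoder_layers": 12,
    "relative_attention_num_buckets": 32,
    "feed_forward_proj": "gated-gelu",
    "vocab_size": 110080,
}


def t5_parameter_count(cfg, tied_embeddings=True):
    """Estimate the weight count of a T5 model from its config (assumed layout)."""
    d_model = cfg["d_model"]
    inner = cfg["num_heads"] * cfg["d_kv"]

    # q, k, v and o projections, all without biases.
    attention = 4 * d_model * inner
    # Gated feed-forward uses two input projections plus one output projection.
    gated = cfg["feed_forward_proj"].startswith("gated")
    feed_forward = (3 if gated else 2) * d_model * cfg["d_ff"]
    # T5 layer norm has a scale vector only.
    norm = d_model

    encoder_layer = attention + feed_forward + 2 * norm
    # Decoder layers add cross-attention and a third layer norm.
    decoder_layer = 2 * attention + feed_forward + 3 * norm

    # One relative-position bias table per stack, plus a final layer norm.
    rel_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]
    encoder = cfg["num_layers"] * encoder_layer + rel_bias + norm
    decoder = cfg["num_decoder_layers"] * decoder_layer + rel_bias + norm

    embeddings = cfg["vocab_size"] * d_model
    if not tied_embeddings:
        embeddings *= 2  # separate output head

    return encoder + decoder + embeddings


if __name__ == "__main__":
    count = t5_parameter_count(CONFIG)
    print(f"{count:,} parameters, about {4 * count / 1e9:.2f} GB in float32")
```

Under these assumptions the estimate comes to roughly 283M parameters, or about 1.13 GB in float32, which is in line with the 1131173775-byte size recorded for `pytorch_model.bin` in this commit.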
pytorch_model.bin
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:394a709edfe0052bd35130ea6da1a8e4a15e52cdfada70266b27127fd10d83a5
size 1131173775
special_tokens_map.json
ADDED
|
@@ -0,0 +1 @@
{"eos_token": "</s>", "unk_token": "<unk>", "pad_token": "<pad>", "additional_special_tokens": ["<extra_id_0>", "<extra_id_1>", "<extra_id_2>", "<extra_id_3>", "<extra_id_4>", "<extra_id_5>", "<extra_id_6>", "<extra_id_7>", "<extra_id_8>", "<extra_id_9>", "<extra_id_10>", "<extra_id_11>", "<extra_id_12>", "<extra_id_13>", "<extra_id_14>", "<extra_id_15>", "<extra_id_16>", "<extra_id_17>", "<extra_id_18>", "<extra_id_19>", "<extra_id_20>", "<extra_id_21>", "<extra_id_22>", "<extra_id_23>", "<extra_id_24>", "<extra_id_25>", "<extra_id_26>", "<extra_id_27>", "<extra_id_28>", "<extra_id_29>", "<extra_id_30>", "<extra_id_31>", "<extra_id_32>", "<extra_id_33>", "<extra_id_34>", "<extra_id_35>", "<extra_id_36>", "<extra_id_37>", "<extra_id_38>", "<extra_id_39>", "<extra_id_40>", "<extra_id_41>", "<extra_id_42>", "<extra_id_43>", "<extra_id_44>", "<extra_id_45>", "<extra_id_46>", "<extra_id_47>", "<extra_id_48>", "<extra_id_49>", "<extra_id_50>", "<extra_id_51>", "<extra_id_52>", "<extra_id_53>", "<extra_id_54>", "<extra_id_55>", "<extra_id_56>", "<extra_id_57>", "<extra_id_58>", "<extra_id_59>", "<extra_id_60>", "<extra_id_61>", "<extra_id_62>", "<extra_id_63>", "<extra_id_64>", "<extra_id_65>", "<extra_id_66>", "<extra_id_67>", "<extra_id_68>", "<extra_id_69>", "<extra_id_70>", "<extra_id_71>", "<extra_id_72>", "<extra_id_73>", "<extra_id_74>", "<extra_id_75>", "<extra_id_76>", "<extra_id_77>", "<extra_id_78>", "<extra_id_79>", "<extra_id_80>", "<extra_id_81>", "<extra_id_82>", "<extra_id_83>", "<extra_id_84>", "<extra_id_85>", "<extra_id_86>", "<extra_id_87>", "<extra_id_88>", "<extra_id_89>", "<extra_id_90>", "<extra_id_91>", "<extra_id_92>", "<extra_id_93>", "<extra_id_94>", "<extra_id_95>", "<extra_id_96>", "<extra_id_97>", "<extra_id_98>", "<extra_id_99>"]}
|
spiece.model
ADDED
|
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5c7ae4407451bf02b459edec774d0539a06615005dd34d5c85c5c06765ff1606
size 2435308
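
The two binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. The sketch below parses such a pointer and checks a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the code assumes the simple three-line layout shown here.

```python
import hashlib

# Pointer text copied from the spiece.model entry above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5c7ae4407451bf02b459edec774d0539a06615005dd34d5c85c5c06765ff1606
size 2435308
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    hasher = hashlib.new(pointer["algorithm"])
    total = 0
    with open(path, "rb") as handle:
        # Read in chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["algorithm"], info["size"])
```

After downloading the real file, `matches_pointer("spiece.model", parse_lfs_pointer(POINTER))` gives a quick integrity check.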
tokenizer_config.json
ADDED
|
@@ -0,0 +1 @@
{"eos_token": "</s>", "unk_token": "<unk>", "pad_token": "<pad>", "extra_ids": 100, "additional_special_tokens": ["<extra_id_0>", "<extra_id_1>", "<extra_id_2>", "<extra_id_3>", "<extra_id_4>", "<extra_id_5>", "<extra_id_6>", "<extra_id_7>", "<extra_id_8>", "<extra_id_9>", "<extra_id_10>", "<extra_id_11>", "<extra_id_12>", "<extra_id_13>", "<extra_id_14>", "<extra_id_15>", "<extra_id_16>", "<extra_id_17>", "<extra_id_18>", "<extra_id_19>", "<extra_id_20>", "<extra_id_21>", "<extra_id_22>", "<extra_id_23>", "<extra_id_24>", "<extra_id_25>", "<extra_id_26>", "<extra_id_27>", "<extra_id_28>", "<extra_id_29>", "<extra_id_30>", "<extra_id_31>", "<extra_id_32>", "<extra_id_33>", "<extra_id_34>", "<extra_id_35>", "<extra_id_36>", "<extra_id_37>", "<extra_id_38>", "<extra_id_39>", "<extra_id_40>", "<extra_id_41>", "<extra_id_42>", "<extra_id_43>", "<extra_id_44>", "<extra_id_45>", "<extra_id_46>", "<extra_id_47>", "<extra_id_48>", "<extra_id_49>", "<extra_id_50>", "<extra_id_51>", "<extra_id_52>", "<extra_id_53>", "<extra_id_54>", "<extra_id_55>", "<extra_id_56>", "<extra_id_57>", "<extra_id_58>", "<extra_id_59>", "<extra_id_60>", "<extra_id_61>", "<extra_id_62>", "<extra_id_63>", "<extra_id_64>", "<extra_id_65>", "<extra_id_66>", "<extra_id_67>", "<extra_id_68>", "<extra_id_69>", "<extra_id_70>", "<extra_id_71>", "<extra_id_72>", "<extra_id_73>", "<extra_id_74>", "<extra_id_75>", "<extra_id_76>", "<extra_id_77>", "<extra_id_78>", "<extra_id_79>", "<extra_id_80>", "<extra_id_81>", "<extra_id_82>", "<extra_id_83>", "<extra_id_84>", "<extra_id_85>", "<extra_id_86>", "<extra_id_87>", "<extra_id_88>", "<extra_id_89>", "<extra_id_90>", "<extra_id_91>", "<extra_id_92>", "<extra_id_93>", "<extra_id_94>", "<extra_id_95>", "<extra_id_96>", "<extra_id_97>", "<extra_id_98>", "<extra_id_99>"]}
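
The 100 `<extra_id_N>` entries are the sentinel tokens that T5-style models use for span corruption: each masked span in the input is replaced by one sentinel, and the target lists every sentinel followed by the text it hides, ending with one extra sentinel. The sketch below illustrates that format on whitespace tokens. It shows the standard T5 objective, not necessarily the exact preprocessing used for FinAraT5, and the helper names `sentinel` and `corrupt_spans` are hypothetical.

```python
def sentinel(index):
    """Name of the index-th sentinel token, matching the tokenizer config."""
    return f"<extra_id_{index}>"


def corrupt_spans(tokens, spans):
    """Build a span-corruption (input, target) pair.

    `spans` is a sorted list of non-overlapping, half-open (start, end)
    index pairs into `tokens` that should be masked out.
    """
    inputs, targets = [], []
    cursor = 0
    for index, (start, end) in enumerate(spans):
        # Keep the visible text before the span, then mark the gap.
        inputs.extend(tokens[cursor:start])
        inputs.append(sentinel(index))
        # The target repeats the sentinel and then reveals the hidden text.
        targets.append(sentinel(index))
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])
    # A final sentinel closes the target sequence.
    targets.append(sentinel(len(spans)))
    return " ".join(inputs), " ".join(targets)


if __name__ == "__main__":
    words = "Thank you for inviting me to your party last week".split()
    source, target = corrupt_spans(words, [(2, 4), (8, 9)])
    print(source)
    print(target)
```

With `"extra_ids": 100`, at most 99 spans per example can be masked this way, since the closing sentinel also needs an ID.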
|