Commit f1d8837 · update readme
Parent: 67717d7

README.md CHANGED
@@ -20,7 +20,7 @@ language:
 
 ## Model description
 
-AfriTeVa
+AfriTeVa large is a multilingual sequence-to-sequence model pretrained on 10 African languages.
 
 ## Languages
 
@@ -30,7 +30,7 @@ Afaan Oromoo(orm), Amharic(amh), Gahuza(gah), Hausa(hau), Igbo(igb), Nigerian Pi
 
 ### The model
 
--
+- 745M-parameter encoder-decoder architecture (T5-like)
 - 12 layers, 12 attention heads and a 512-token sequence length
 
 ### The dataset
@@ -39,6 +39,27 @@ Afaan Oromoo(orm), Amharic(amh), Gahuza(gah), Hausa(hau), Igbo(igb), Nigerian Pi
 - 143 Million Tokens (1GB of text data)
 - Tokenizer Vocabulary Size: 70,000 tokens
 
+
+## Intended uses & limitations
+
+`afriteva_base` is a pre-trained model, primarily intended to be fine-tuned on multilingual sequence-to-sequence tasks.
+
+```python
+>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+>>> tokenizer = AutoTokenizer.from_pretrained("castorini/afriteva_base")
+>>> model = AutoModelForSeq2SeqLM.from_pretrained("castorini/afriteva_base")
+
+>>> src_text = "Ó hùn ọ́ láti di ara wa bí?"
+>>> tgt_text = "Would you like to be?"
+
+>>> model_inputs = tokenizer(src_text, return_tensors="pt")
+>>> with tokenizer.as_target_tokenizer():
+...     labels = tokenizer(tgt_text, return_tensors="pt").input_ids
+
+>>> model(**model_inputs, labels=labels)  # forward pass
+```
+
 ## Training Procedure
 
 For information on training procedures, please refer to the AfriTeVa [paper](#) or [repository](https://github.com/castorini/afriteva).
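The snippet added in this commit stops at a scoring forward pass. For inference after fine-tuning, the same checkpoint is driven through `generate`; what follows is a minimal sketch, assuming the `castorini/afriteva_base` checkpoint named above, with illustrative decoding settings rather than values from the model card:

```python
# Minimal generation sketch for an AfriTeVa checkpoint.
# Assumes the castorini/afriteva_base checkpoint from the snippet above;
# a raw pre-trained (not fine-tuned) checkpoint will produce untrained output.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("castorini/afriteva_base")
model = AutoModelForSeq2SeqLM.from_pretrained("castorini/afriteva_base")

src_text = "Ó hùn ọ́ láti di ara wa bí?"
inputs = tokenizer(src_text, return_tensors="pt")

# Beam-search settings are illustrative placeholders, not from the model card.
generated = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```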
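Likewise, because the forward pass returns a cross-entropy loss over the target tokens whenever `labels` are supplied, a single fine-tuning step only needs an optimizer around it. A sketch under the same checkpoint assumption; the AdamW choice and learning rate are placeholders, not training settings from the AfriTeVa paper:

```python
# Single fine-tuning step sketch; optimizer and learning rate are
# illustrative placeholders, not settings from the AfriTeVa paper.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("castorini/afriteva_base")
model = AutoModelForSeq2SeqLM.from_pretrained("castorini/afriteva_base")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One (source, target) pair, tokenized as in the README snippet.
batch = tokenizer("Ó hùn ọ́ láti di ara wa bí?", return_tensors="pt")
with tokenizer.as_target_tokenizer():
    batch["labels"] = tokenizer("Would you like to be?", return_tensors="pt").input_ids

outputs = model(**batch)  # outputs.loss is the seq2seq cross-entropy
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```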