Hailay committed on
Commit b21073e · verified · 1 Parent(s): b0faab7

Update README.md

Files changed (1)
  1. README.md +14 -13
README.md CHANGED
@@ -5,23 +5,23 @@ language:
  - ti
  ---
  ---
- ## ***1. Model Description
- Hailay/FT_EXLMR is a fine-tuned version of the EXLMR model, designed specifically for sentiment analysis and text classification tasks in low-resource African languages such as Tigrinya, Amharic, and Oromo. This model leverages the architecture of EXLMR but has been further fine-tuned to improve its performance on multilingual tasks, especially for languages not widely represented in existing NLP models.
+ ## 1. Model Description
+ **Hailay/FT_EXLMR** is a fine-tuned version of the EXLMR model, designed specifically for sentiment analysis and text classification tasks in low-resource African languages such as Tigrinya, Amharic, and Oromo. This model leverages the architecture of EXLMR but has been further fine-tuned to improve its performance on multilingual tasks, especially for languages not widely represented in existing NLP models.
  The model was trained using the AfriSent-Semeval-2023 dataset, a benchmark dataset for African languages, which is publicly available on GitHub:[AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023)

- ## ***2. Intended Use
+ ## 2.Intended Use
  This model is ideal for:

- Researchers and developers working on multilingual sentiment analysis in African languages.
+ Researchers and developers who are working on multilingual sentiment analysis in African languages.
  Applications that require text classification in low-resource languages.
  It is designed specifically for tasks such as:
-
  Sentiment analysis
  Text classification
- Note: The model is not suitable for other tasks like machine translation or named entity recognition without further fine-tuning.

- ## ***3. Training Data**
- The `Hailay/FT_EXLMR` model was trained using the dataset from the **SemEval 2023 Shared Task 12: Sentiment Analysis in African Languages (AfriSenti-SemEval)**. This dataset comprises sentiment-labeled text from 14 African languages:
+ **Note:** The model is not suitable for other tasks like machine translation or named entity recognition without further fine-tuning.
+
+ ## 3.Training Data**
+ The **Hailay/FT_EXLMR** model was trained using the dataset from the **SemEval 2023 Shared Task 12: Sentiment Analysis in African Languages (AfriSenti-SemEval)**. This dataset comprises sentiment-labeled text from 14 African languages:

  1. Algerian Arabic (arq) - Algeria
  2. Amharic (ama) - Ethiopia
@@ -33,19 +33,20 @@ The `Hailay/FT_EXLMR` model was trained using the dataset from the **SemEval 202
  8. Nigerian Pidgin (pcm) - Nigeria
  9. Oromo (orm) - Ethiopia
  10. Swahili (swa) - Kenya/Tanzania
- 11. Tigrinya (tir) - Ethiopia
- 12. Twi (twi) - Ghana
+ 11. Tigrinya (tir) - Ethiopia
+ 12. Twi (twi) - Ghana
  13. Xithonga (tso) - Mozambique
  14. Yoruba (yor) - Nigeria

- The dataset covers multiple countries and linguistic groups, providing diverse data for training multilingual models like `Hailay/FT_EXLMR`. You can access the dataset via the [AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023).
- The Hailay/FT_EXLMR model was trained using the following configuration:
+ The dataset covers diverse data for training multilingual models like `Hailay/FT_EXLMR`.
+ You can access the dataset via the [AfriSent-Semeval-2023 GitHub Repository](https://github.com/afrisenti-semeval/afrisent-semeval-2023).
+ The **Hailay/FT_EXLMR** model was trained using the following configuration:
  Epochs: 3
  Learning Rate: 1e-5
  Optimizer: AdamW
  Batch Size: 16

- ## *** 4. Evaluation
+ ## 4. Evaluation

  The model was evaluated using accuracy and loss as the primary metrics. The results are as follows: