feverlash committed on
Commit 5e50bc7 · verified · 1 Parent(s): bd36f2f

Create README.md

Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ datasets:
+ - kornwtp/indonlu-smsa
+ language:
+ - id
+ metrics:
+ - accuracy
+ base_model:
+ - indobenchmark/indobert-base-p1
+ pipeline_tag: text-classification
+ ---
+
+ # Indonesian Text Sentiment Analysis 🚀
+ **This model is under development.**
+ ## 📌 Overview
+ This project fine-tunes **IndoBERT** (`indobenchmark/indobert-base-p1`), a transformer-based model, to classify the sentiment of Indonesian text.
+
+ ## 📥 Data Collection
+ The fine-tuning data comes from the **IndoNLU** benchmark, specifically the SmSA sentiment analysis dataset:
+ [SmSA (IndoNLU) Dataset](https://metatext.io/datasets/smsa-(indonlu))
+
+ ## 🔄 Data Preparation
+ - **Tokenization**:
+   - Text is tokenized with the **IndoBERT** tokenizer.
+ - **Train-Test Split**:
+   - The dataset is already split into train, validation, and test sets.
+
+ ## 🏋️ Fine-Tuning & Results
+ The model was fine-tuned using **Hugging Face Transformers** with **TensorFlow**.
+
+ ### **📊 Evaluation Metrics**
+ | **Epoch** | **Train Loss** | **Train Accuracy** | **Eval Loss** | **Eval Accuracy** | **Training Time** | **Validation Time** |
+ |-----------|----------------|---------------------|---------------|-------------------|-------------------|---------------------|
+ | **1** | `0.2471` | `88.15%` | `0.2107` | `91.31%` | `7:55 min` | `10 sec` |
+ | **2** | `0.1844` | `90.41%` | `0.2107` | `92.39%` | `7:50 min` | `10 sec` |
+ | **3** | `0.1502` | `91.66%` | `0.2135` | `93.14%` | `7:51 min` | `9 sec` |
+ | **4** | `0.1285` | `92.50%` | `0.2192` | `93.69%` | `7:50 min` | `10 sec` |
+ | **5** | `0.1101` | `93.13%` | `0.2367` | `94.14%` | `7:48 min` | `9 sec` |
+
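The per-epoch figures in the table can be summarized with a few lines of Python. This is a minimal sketch that is not part of the commit: the numbers are transcribed from the table, while the names `RESULTS`, `to_seconds`, and `summarize` are hypothetical helpers introduced only for illustration.

```python
# Per-epoch results transcribed from the evaluation table:
# (epoch, train_loss, train_acc_%, eval_loss, eval_acc_%, train_time "m:ss", eval_seconds)
RESULTS = [
    (1, 0.2471, 88.15, 0.2107, 91.31, "7:55", 10),
    (2, 0.1844, 90.41, 0.2107, 92.39, "7:50", 10),
    (3, 0.1502, 91.66, 0.2135, 93.14, "7:51", 9),
    (4, 0.1285, 92.50, 0.2192, 93.69, "7:50", 10),
    (5, 0.1101, 93.13, 0.2367, 94.14, "7:48", 9),
]


def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '7:55' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def summarize(results):
    """Aggregate wall-clock time and pick epochs by two different criteria."""
    total_train = sum(to_seconds(row[5]) for row in results)
    total_eval = sum(row[6] for row in results)
    # Epoch with the highest evaluation accuracy.
    best_acc_epoch = max(results, key=lambda row: row[4])[0]
    # Epoch with the lowest evaluation loss (min() keeps the first one on ties).
    lowest_loss_epoch = min(results, key=lambda row: row[3])[0]
    return {
        "total_train_seconds": total_train,
        "total_eval_seconds": total_eval,
        "best_eval_accuracy_epoch": best_acc_epoch,
        "lowest_eval_loss_epoch": lowest_loss_epoch,
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(f"{key}: {value}")
```

The two selection criteria disagree on this table: evaluation accuracy peaks at epoch 5, while evaluation loss is lowest at epochs 1 and 2 and rises afterwards, so which checkpoint counts as "best" depends on the metric chosen.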
+ ## ⚙️ Training Parameters
+ - `epochs = 5`
+ - `learning_rate = 5e-5`
+ - `seed_val = 42`
+ - `max_length = 128`
+ - `batch_size = 32`
+ - `eval_batch_size = 32`
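The training parameters listed above could be grouped into a single configuration object. The following is a sketch under stated assumptions, not the author's training script: the field names and values mirror the README, but the `TrainingConfig` class and its `steps_per_epoch` helper are hypothetical, and the dataset size passed to the helper is a placeholder because the README does not state it.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in the README (the class itself is hypothetical)."""
    epochs: int = 5
    learning_rate: float = 5e-5
    seed_val: int = 42
    max_length: int = 128       # maximum number of tokens per input sequence
    batch_size: int = 32        # training batch size
    eval_batch_size: int = 32   # evaluation batch size

    def steps_per_epoch(self, num_examples: int) -> int:
        """Number of training batches per epoch, counting a final partial batch."""
        return math.ceil(num_examples / self.batch_size)


if __name__ == "__main__":
    config = TrainingConfig()
    # 1000 is an arbitrary placeholder, not the real size of the training split.
    print(config)
    print("steps per epoch for 1000 examples:", config.steps_per_epoch(1000))
```

Keeping the values in one frozen dataclass makes it harder to change a hyperparameter in one place and forget it in another, which helps when reproducing a run with a fixed seed.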