Committed by Zulkifli1409
Commit b4fc020 · verified · 1 parent: ed5ef3f

Create readme.md

Files changed (1): readme.md (added, +148 -0)
# 📊 Aduan Classification Model (IndoBERT)

This model is trained to **classify public complaint (aduan) texts** written in Indonesian, using **IndoBERT (indobenchmark/indobert-base-p1)**.
It assigns each complaint to one of 4 categories:

- **DARURAT** (emergency) → Emergency situations (fire, accidents, disasters)
- **PRIORITAS** (priority) → Needs prompt handling (damaged roads, sanitation, infrastructure)
- **UMUM** (general) → General information / questions
- **LAINNYA** (other) → Other complaints that do not fit the categories above

---

## 📂 Files

- `model.safetensors` → trained model (498 MB)
- `aduan_model.pt` → backup in pickle format
- `config.json`, `tokenizer.json`, `vocab.txt` → configuration and tokenizer
- `special_tokens_map.json`, `tokenizer_config.json` → tokenizer mappings

---

## 📊 Dataset & Training

- **Total data (raw)**: 3,373
  - Darurat: 900
  - Prioritas: 875
  - Umum: 880
  - Lainnya: 718
- **Augmentation** → 3,600 (balanced at 900 per class)
- **Split** → 80% Train (2,880) | 20% Validation (720)
- **Base model** → `indobenchmark/indobert-base-p1`
- **Training device** → NVIDIA RTX 3050 Laptop GPU (CUDA)
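
The counts above can be checked with a little arithmetic. The sketch below only reproduces the numbers; it does not perform any augmentation, since the README does not describe the method used. The constant names are illustrative, and the per-class validation size assumes a stratified split, which is consistent with each confusion-matrix row summing to 180.

```python
# Raw per-class counts as listed in the README.
raw_counts = {"DARURAT": 900, "PRIORITAS": 875, "UMUM": 880, "LAINNYA": 718}

TARGET_PER_CLASS = 900  # balanced class size after augmentation
VAL_FRACTION = 0.20     # 80/20 train/validation split

# Number of augmented samples needed to bring each class up to the target.
to_augment = {label: TARGET_PER_CLASS - n for label, n in raw_counts.items()}

total_raw = sum(raw_counts.values())
total_balanced = TARGET_PER_CLASS * len(raw_counts)

# Split sizes overall, and per class (assumption: the split is stratified).
val_size = round(total_balanced * VAL_FRACTION)
train_size = total_balanced - val_size
val_per_class = round(TARGET_PER_CLASS * VAL_FRACTION)

print("raw total:", total_raw)
print("augmented samples needed:", to_augment)
print("balanced total:", total_balanced)
print("train / validation:", train_size, "/", val_size)
print("validation samples per class:", val_per_class)
```

Most of the added samples go to the `LAINNYA` class (182 of the 227), which is the only class far below 900 in the raw data.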
33
+
34
+ ---
35
+
36
+ ## πŸ“ˆ Hasil Evaluasi
37
+
38
+ - **Best Epoch** β†’ 3
39
+ - **Validation Accuracy** β†’ **93.89%**
40
+ - **Macro F1-score** β†’ **0.9389**
41
+
42
+ ### πŸ“‘ Classification Report
43
+ | Label | Precision | Recall | F1-score |
44
+ |------------|-----------|--------|----------|
45
+ | Darurat | 0.9435 | 0.9278 | 0.9356 |
46
+ | Prioritas | 0.9257 | 0.9000 | 0.9127 |
47
+ | Umum | 0.9026 | 0.9778 | 0.9387 |
48
+ | Lainnya | 0.9884 | 0.9500 | 0.9688 |
49
+ | **Macro Avg** | 0.9401 | 0.9389 | 0.9389 |
50
+
### 🔢 Confusion Matrix

Rows are true labels and columns are predicted labels, in the same class order.

```
[[167  10   3   0]   # Darurat
 [  6 162  11   1]   # Prioritas
 [  1   2 176   1]   # Umum
 [  3   1   5 171]]  # Lainnya
```
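
As a cross-check, the figures in the classification report can be recomputed directly from the confusion matrix. The sketch below is plain Python and not the author's evaluation script; it reads rows as true labels and columns as predictions, a reading that reproduces the reported precision and recall values. The helper name `per_class_metrics` is illustrative.

```python
LABELS = ["Darurat", "Prioritas", "Umum", "Lainnya"]

# Confusion matrix from the README: rows = true label, columns = predicted label.
CM = [
    [167, 10, 3, 0],
    [6, 162, 11, 1],
    [1, 2, 176, 1],
    [3, 1, 5, 171],
]


def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    metrics = []
    for i in range(n):
        tp = cm[i][i]
        predicted_i = sum(cm[r][i] for r in range(n))  # column sum
        actual_i = sum(cm[i])                          # row sum
        precision = tp / predicted_i
        recall = tp / actual_i
        f1 = 2 * precision * recall / (precision + recall)
        metrics.append((precision, recall, f1))
    return metrics


metrics = per_class_metrics(CM)
accuracy = sum(CM[i][i] for i in range(len(CM))) / sum(map(sum, CM))
macro_f1 = sum(f1 for _, _, f1 in metrics) / len(metrics)

for label, (p, r, f1) in zip(LABELS, metrics):
    print(f"{label:<10} P={p:.4f} R={r:.4f} F1={f1:.4f}")
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f}")
```

The largest off-diagonal cells are Prioritas predicted as Umum (11) and Darurat predicted as Prioritas (10), which is why Umum has the lowest precision and Prioritas the lowest recall.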

---

## 🧪 Prediction Example

### Single Input
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "Zulkifli1409/aduan-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

# "There is a big fire on Jalan Sudirman, please send firefighters!"
text = "Ada kebakaran besar di jalan sudirman, tolong kirim pemadam!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():  # gradients are not needed for prediction
    outputs = model(**inputs)
probs = torch.nn.functional.softmax(outputs.logits, dim=1)

# Index of the most probable class for the single input
pred_idx = torch.argmax(probs, dim=1).item()
labels = ["DARURAT", "PRIORITAS", "UMUM", "LAINNYA"]

print("Prediction:", labels[pred_idx])
print("Probabilities:", probs.tolist())
```

### Output:

```
Prediction: DARURAT
Probabilities: [[0.9823, 0.0145, 0.0021, 0.0011]]
```

---

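## 📦 Advanced Prediction Tests

| Complaint text                            | English gloss                                         | Prediction | Confidence |
| ----------------------------------------- | ----------------------------------------------------- | ---------- | ---------- |
| ada kebakaran besar di pasar tolong cepat | there is a big fire at the market, please hurry       | DARURAT    | 60.62%     |
| jalan berlubang perlu diperbaiki          | the potholed road needs repair                        | PRIORITAS  | 78.47%     |
| mohon pencerahan tentang program desa     | please explain the village program                    | UMUM       | 72.09%     |
| ada orang kecelakaan parah butuh ambulans | someone had a serious accident and needs an ambulance | DARURAT    | 74.29%     |
| sampah menumpuk di jalan                  | garbage is piling up on the road                      | PRIORITAS  | 71.17%     |
| banjir tinggi merendam rumah warga        | high floodwater has inundated residents' homes        | DARURAT    | 58.01%     |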
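
The README does not say how the Confidence column was computed. A common convention, assumed here, is to report the highest softmax probability of each input. The sketch below turns rows of class probabilities (the shape returned by `probs.tolist()` in the single-input example) into a label and a confidence value. `top_prediction` is a hypothetical helper, and the label order is the one used in that example.

```python
LABELS = ["DARURAT", "PRIORITAS", "UMUM", "LAINNYA"]  # order from the README example


def top_prediction(prob_rows, labels=LABELS):
    """Return a (label, confidence) pair for each row of class probabilities."""
    results = []
    for row in prob_rows:
        # Index of the largest probability in this row.
        best = max(range(len(row)), key=row.__getitem__)
        results.append((labels[best], row[best]))
    return results


# Probabilities copied from the sample output of the single-input example.
sample = [[0.9823, 0.0145, 0.0021, 0.0011]]
for label, confidence in top_prediction(sample):
    print(f"{label}: {confidence:.2%}")
```

Passing several rows at once gives one pair per input, which is enough to fill a table like the one above.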

---

## 🚀 Deployment

The model is also available as an API hosted on Railway:

```
Base URL: https://api-klasifikasi-aduan.up.railway.app
```

Example request:

```bash
curl -X POST https://api-klasifikasi-aduan.up.railway.app/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Ada kebakaran di pasar"}'
```

Response:

```json
{
  "label": "DARURAT",
  "confidence": 0.9823,
  "all_scores": {
    "DARURAT": 0.9823,
    "PRIORITAS": 0.0145,
    "UMUM": 0.0021,
    "LAINNYA": 0.0011
  }
}
```
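
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the `/predict` endpoint and the response fields (`label`, `confidence`, `all_scores`) behave exactly as in the example above; whether the service is currently reachable is not verified here. The function names are illustrative.

```python
import json
import urllib.request

# Endpoint taken from the curl example in the README.
API_URL = "https://api-klasifikasi-aduan.up.railway.app/predict"


def build_request(text, url=API_URL):
    """Build a POST request whose JSON body has the form {"text": ...}."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw):
    """Extract (label, confidence, all_scores) from a JSON response body."""
    payload = json.loads(raw)
    return payload["label"], payload["confidence"], payload["all_scores"]


def classify(text, timeout=10.0):
    """Send one complaint text to the API. This performs a network call."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return parse_response(resp.read().decode("utf-8"))
```

Calling `classify("Ada kebakaran di pasar")` would send the request from the curl example. Building and parsing are kept in separate functions so they can be tested without network access.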

---

## 📧 Contact

Developed by **Zulkifli1409**
For questions or suggestions, please open an *issue* or get in touch via the [Hugging Face profile](https://huggingface.co/Zulkifli1409).

---

**© 2025 Klasifikasi Aduan Model**