title: Matcha Sentiment
emoji: 🍵
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
python_version: '3.12'
pinned: false
license: mit
Matcha Sentiment
Indonesian-language sentiment analysis for Matchaya/IKUYO reviews. The dataset was cleaned into a binary Negative/Positive classification task, then used to compare classical machine learning baselines against fine-tuning of 9 Indonesian Transformer models.
Summary
| Area | Result |
|---|---|
| Final dataset | 2028 reviews |
| Labels | 1014 Negative, 1014 Positive |
| Labels removed | 14 Neutral |
| Duplicates dropped | 219 texts |
| Best classical | TF-IDF + Linear SVM |
| Best Transformer | indolem/indobert-base-uncased |
| Runtime | Docker + NVIDIA GPU |
| Dashboard | Gradio, ready for Hugging Face Spaces |
Note on pushing: the best Transformer model is stored in `models/best_transformer` and tracked with Git LFS. Candidate weights under `models/transformers/*/model/model.safetensors` are git-ignored because they can be regenerated from the training pipeline.
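The cleaning summarized above (dropping Neutral reviews and removing duplicate texts) could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the `text` and `label` field names, the label strings, and the lowercase/whitespace normalization used to detect duplicates are all assumptions.

```python
from collections import Counter

def clean_reviews(rows):
    """Drop Neutral rows and duplicate texts; return the remaining rows.

    `rows` is an iterable of dicts with hypothetical keys "text" and "label".
    """
    seen = set()
    cleaned = []
    for row in rows:
        if row["label"] == "Neutral":
            continue  # keep only the binary Negative/Positive labels
        # Assumed duplicate key: lowercased text with collapsed whitespace.
        key = " ".join(row["text"].lower().split())
        if key in seen:
            continue  # keep the first occurrence of each text
        seen.add(key)
        cleaned.append(row)
    return cleaned

# Hypothetical sample rows for illustration only.
sample = [
    {"text": "Matcha-nya enak", "label": "Positive"},
    {"text": "matcha-nya  enak", "label": "Positive"},  # duplicate after normalization
    {"text": "Biasa saja", "label": "Neutral"},         # dropped: Neutral
    {"text": "Antrean lama", "label": "Negative"},
]
cleaned = clean_reviews(sample)
label_counts = Counter(row["label"] for row in cleaned)
print(len(cleaned), dict(label_counts))
```

Whether duplicates are detected on raw or normalized text changes how many rows are dropped, so the normalization step is worth checking against the real pipeline.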
Main Results
Transformer
| Model | Accuracy | Precision | Recall | F1 | ROC AUC |
|---|---|---|---|---|---|
| `indolem/indobert-base-uncased` | 0.9951 | 0.9902 | 1.0000 | 0.9951 | 0.9998 |
| `naufalihsan/indonesian-sbert-large` | 0.9901 | 0.9806 | 1.0000 | 0.9902 | 0.9998 |
| `flax-community/indonesian-roberta-base` | 0.9901 | 0.9806 | 1.0000 | 0.9902 | 0.9997 |
| `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` | 0.9901 | 0.9806 | 1.0000 | 0.9902 | 0.9996 |
| `indobenchmark/indobert-base-p1` | 0.9901 | 0.9806 | 1.0000 | 0.9902 | 0.9989 |
| `ChristopherA08/IndoELECTRA` | 0.9901 | 0.9806 | 1.0000 | 0.9902 | 0.9989 |
| `cahya/distilbert-base-indonesian` | 0.9901 | 0.9901 | 0.9901 | 0.9901 | 0.9998 |
| `indolem/indobertweet-base-uncased` | 0.9852 | 0.9804 | 0.9901 | 0.9852 | 0.9994 |
| `w11wo/indonesian-roberta-base-sentiment-classifier` | 0.9803 | 0.9619 | 1.0000 | 0.9806 | 1.0000 |
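For readers checking how the metric columns relate to each other, the sketch below computes accuracy, precision, recall, and F1 from confusion-matrix counts, treating Positive as the positive class. The counts are hypothetical: they were chosen only because they round to the values reported for `indolem/indobert-base-uncased`. The source does not state the test split size or the actual confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts (not taken from the project's evaluation output).
metrics = binary_metrics(tp=101, fp=1, tn=101, fn=0)
rounded = {name: round(value, 4) for name, value in metrics.items()}
print(rounded)
```

With counts of this size, a single misclassified review moves accuracy by roughly half a percentage point, which is worth keeping in mind when comparing rows that differ only in the third decimal place.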
The best model is saved at:
models/best_transformer
Classical Machine Learning
TF-IDF and Word2Vec features were evaluated with stratified 10-fold cross-validation. Full results:
| Feature | Model | Accuracy | Precision | Recall | F1 | ROC AUC |
|---|---|---|---|---|---|---|
| TF-IDF | Linear SVM | 0.9684 | 0.9788 | 0.9576 | 0.9681 | 0.9951 |
| Word2Vec | Logistic Regression | 0.9635 | 0.9663 | 0.9606 | 0.9634 | 0.9939 |
| Word2Vec | Extra Trees | 0.9625 | 0.9653 | 0.9596 | 0.9624 | 0.9940 |
| TF-IDF | Logistic Regression | 0.9610 | 0.9756 | 0.9458 | 0.9604 | 0.9933 |
| Word2Vec | Linear SVM | 0.9596 | 0.9660 | 0.9527 | 0.9593 | 0.9924 |
| Word2Vec | Random Forest | 0.9591 | 0.9632 | 0.9546 | 0.9589 | 0.9927 |
| Word2Vec | Gradient Boosting | 0.9571 | 0.9603 | 0.9536 | 0.9570 | 0.9933 |
| TF-IDF | Extra Trees | 0.9522 | 0.9580 | 0.9458 | 0.9519 | 0.9918 |
| TF-IDF | Random Forest | 0.9443 | 0.9353 | 0.9546 | 0.9449 | 0.9882 |
| TF-IDF | Gradient Boosting | 0.9147 | 0.9304 | 0.8964 | 0.9131 | 0.9735 |
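To illustrate what stratified 10-fold cross-validation does with a balanced dataset like this one, here is a minimal pure-Python sketch of stratified fold assignment. It is not the project's code (scikit-learn's `StratifiedKFold` is the usual tool for this); the function name, the seed, and the round-robin strategy are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_splits=10, seed=42):
    """Assign each sample index to a fold so class ratios stay roughly equal."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share per class.
        for position, index in enumerate(indices):
            folds[position % n_splits].append(index)
    return folds

# Same class balance as the final dataset: 1014 Negative, 1014 Positive.
labels = ["Negative"] * 1014 + ["Positive"] * 1014
folds = stratified_folds(labels)

for fold_id, test_indices in enumerate(folds):
    held_out = set(test_indices)
    # Everything outside the held-out fold is used for training in that round.
    train_indices = [i for i in range(len(labels)) if i not in held_out]
    positives = sum(labels[i] == "Positive" for i in test_indices)
    print(fold_id, len(train_indices), len(test_indices), positives)
```

Each review lands in exactly one held-out fold, so every reported score is an average over ten models that never saw their own test reviews during training.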
Visual Evaluation
Dashboard
Plot Details
Meaningful Keywords
Some of the words that most help indicate the direction of sentiment:
| Word | Positive Docs | Negative Docs | Dominant |
|---|---|---|---|
| `enak` (tasty) | 173 | 10 | Positive |
| `nyaman` (comfortable) | 54 | 10 | Positive |
| `ramah` (friendly) | 37 | 9 | Positive |
| `terbaik` (best) | 18 | 0 | Positive |
| `mahal` (expensive) | 10 | 24 | Negative |
| `harga` (price) | 10 | 28 | Negative |
| `buruk` (bad) | 0 | 19 | Negative |
| `antrean` (queue) | 0 | 19 | Negative |
| `lama` (long, slow) | 1 | 16 | Negative |
The full file is at:
artifacts/classical/keyword_counts.csv
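The per-label columns in the keyword table appear to be document frequencies: the number of reviews with a given label that contain the word at least once. A minimal sketch of that count is shown below, assuming simple lowercase whitespace tokenization and hypothetical sample reviews. The project's actual tokenizer and the way `keyword_counts.csv` is produced are not shown in the source.

```python
from collections import Counter, defaultdict

def keyword_doc_counts(reviews, keywords):
    """Count, per label, how many reviews contain each keyword at least once."""
    counts = defaultdict(Counter)
    wanted = set(keywords)
    for text, label in reviews:
        tokens = set(text.lower().split())  # a set, so repeats count once per review
        for word in tokens & wanted:
            counts[word][label] += 1
    return counts

# Hypothetical reviews for illustration only.
reviews = [
    ("matcha enak enak banget", "Positive"),
    ("tempat nyaman dan enak", "Positive"),
    ("harga mahal antrean lama", "Negative"),
    ("enak tapi mahal", "Negative"),
]
keywords = ["enak", "mahal", "lama"]
counts = keyword_doc_counts(reviews, keywords)
for word in keywords:
    pos, neg = counts[word]["Positive"], counts[word]["Negative"]
    dominant = "Positive" if pos > neg else "Negative"  # ties fall to Negative in this sketch
    print(word, pos, neg, dominant)
```

Counting documents rather than raw occurrences keeps one enthusiastic review that repeats a word from inflating its count.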
Project Structure
.
├── app.py
├── Dockerfile
├── docker-compose.yml
├── INSTALL_DOCKER.md
├── data/processed/matcha_sentiment_binary.csv
├── docs/images/
├── artifacts/
├── models/best_transformer/
├── models/classical/best_model.joblib
├── scripts/
└── src/matcha_sentiment/
Quick Start
docker build -t matcha-sentiment .
docker run --rm --gpus all -p 7860:7860 -v "${PWD}:/workspace" matcha-sentiment
Open:
http://localhost:7860
A guide covering everything from a clean setup through deployment is in INSTALL_DOCKER.md.
Notes
The evaluation scores are very high because the dataset is still small and the domain is narrow. The model is good enough for demos, dashboards, and sentiment analysis experiments on matcha reviews, but for production use across brands or product categories it should be trained on additional, more diverse data.
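One way to make this caveat concrete is to put a confidence interval around an accuracy measured on a small test set. The sketch below computes a 95% Wilson score interval. The test-set size (203 reviews, about 10% of the dataset) and the single error are hypothetical figures for illustration, since the source does not state the evaluation split.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

# Hypothetical: 202 of 203 test reviews classified correctly.
low, high = wilson_interval(correct=202, total=203)
print(f"accuracy 95% interval: {low:.4f} to {high:.4f}")
```

Under these assumed numbers the interval is a few percentage points wide, so small gaps between models evaluated on a set of this size are hard to distinguish from noise.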









