# Benchmark Comparison: Vietnamese Text Classification

## VNTC Dataset (10-topic News Classification)

| Model | Year | Accuracy | F1 (weighted) | Training Time | Inference | Size |
|---|---|---|---|---|---|---|
| N-gram LM (Vu et al.) | 2007 | 97.1% | - | ~79 min | - | - |
| SVM Multi (Vu et al.) | 2007 | 93.4% | - | ~79 min | - | - |
| sonar_core_1 (SVC) | - | 92.80% | 92.0% | ~54.6 min | - | ~75 MB |
| Sen-1 (LinearSVC) | 2026 | 92.49% | 92.40% | 37.6 s | 66K/sec | 2.4 MB |
| PhoBERT-base* | 2020 | ~95-97% | ~95% | Hours (GPU) | ~20/sec | ~400 MB |

*PhoBERT was not directly evaluated on VNTC in its original paper; figures are estimates from similar tasks.
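Sen-1 is listed as a LinearSVC. A minimal sketch of that model class, assuming TF-IDF word n-gram features (the actual Sen-1 feature pipeline is not specified here; the documents and labels below are toy data):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy Vietnamese news snippets with topic labels (illustrative only).
docs = [
    "giá vàng tăng mạnh",           # economy
    "đội tuyển thắng trận",         # sports
    "thị trường chứng khoán giảm",  # economy
    "cầu thủ ghi bàn",              # sports
]
labels = ["kinh_te", "the_thao", "kinh_te", "the_thao"]

# Unigram + bigram TF-IDF feeding a linear SVM: fast to train, tiny on disk.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(docs, labels)
print(clf.predict(["chứng khoán tăng"]))
```

Linear models like this keep only a weight vector per class over the vocabulary, which is why the resulting artifact stays in the low-megabyte range.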

## UTS2017_Bank Dataset (14-category Banking)

| Model | Accuracy | F1 (weighted) | F1 (macro) | Training Time |
|---|---|---|---|---|
| Sen-1 | 75.76% | 72.70% | 36.18% | 0.13 s |
| sonar_core_1 | 72.47% | 66.0% | - | ~5.3 s |
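The wide gap between Sen-1's weighted F1 (72.70%) and macro F1 (36.18%) points to class imbalance: macro averaging gives every category equal weight, so rare categories with poor F1 pull the score down, while weighted averaging scales each class by its support. A toy illustration of the effect (not the UTS2017_Bank data):

```python
from sklearn.metrics import f1_score

# One dominant class; the classifier never predicts the rare class.
y_true = [0] * 9 + [1]
y_pred = [0] * 10

macro = f1_score(y_true, y_pred, average="macro", zero_division=0)
weighted = f1_score(y_true, y_pred, average="weighted", zero_division=0)
print(macro, weighted)  # macro is dragged down by the rare class's F1 of 0
```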

## Vietnamese Pretrained Models Comparison

| Model | Architecture | Pre-training Data | Languages | Vietnamese Tasks |
|---|---|---|---|---|
| PhoBERT | RoBERTa | 20 GB Vietnamese | 1 (vi) | SOTA on POS, NER, NLI |
| ViSoBERT | XLM-R | Social media corpus | 1 (vi) | SOTA on social media tasks |
| vELECTRA | ELECTRA | 60 GB Vietnamese | 1 (vi) | Strong on tagging/classification |
| viBERT | BERT | 10 GB Vietnamese | 1 (vi) | Baseline Vietnamese BERT |
| XLM-R | RoBERTa | CC-100 (2.5 TB) | 100 | Strong multilingual baseline |
| mBERT | BERT | Wikipedia | 104 | Weakest on Vietnamese |

## SMTCE Benchmark Results (Best Model per Task)

| Task | Best Model | Score | Runner-up |
|---|---|---|---|
| UIT-VSMEC (Emotion) | PhoBERT | 65.44% F1 | viBERT4news |
| ViOCD (Complaint) | vELECTRA | 95.26% F1 | PhoBERT |
| ViHSD (Hate Speech) | PhoBERT | - | XLM-R |
| ViCTSD (Constructive) | PhoBERT | - | vELECTRA |
| UIT-VSFC (Sentiment) | PhoBERT | - | viBERT |

## Speed vs Accuracy Trade-off

```text
Accuracy (%)
 97 |                            * N-gram LM (Vu 2007, ~79 min)
 96 |
 95 |                                   * PhoBERT (estimated, hours)
 94 |
 93 |                            * SVM Multi (Vu 2007, ~79 min)
 92 |                * Sen-1 (37.6 s)
 91 |
    +--------+--------+--------+--------->
    0.01s    1s       1min     1hr    Training Time (log scale)
```

## Model Size vs Accuracy

| Model | Size | VNTC Accuracy | Ratio (Acc / MB) |
|---|---|---|---|
| Sen-1 | 2.4 MB | 92.49% | 38.5 |
| PhoBERT-base | ~400 MB | ~95% | 0.24 |
| XLM-R-base | ~1.1 GB | ~93% | 0.08 |

Sen-1 is ~160x more efficient in accuracy-per-MB than PhoBERT.
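The ratio column and the ~160x figure follow directly from accuracy divided by model size in MB (XLM-R's ~1.1 GB converted at 1024 MB/GB):

```python
sen1 = 92.49 / 2.4            # accuracy-per-MB for Sen-1
phobert = 95.0 / 400.0        # accuracy-per-MB for PhoBERT-base
xlmr = 93.0 / (1.1 * 1024)    # accuracy-per-MB for XLM-R-base

print(round(sen1, 1))             # -> 38.5
print(round(phobert, 2))          # -> 0.24
print(round(xlmr, 2))             # -> 0.08
print(round(sen1 / phobert))      # Sen-1 vs PhoBERT efficiency multiple
```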