# BERT-based Domain Classification for Japanese Complaint Texts
A BERT-based Japanese text classification model trained for domain classification of complaint texts.
## Model Details
- Architecture: BERT for Sequence Classification
- Language: Japanese
- Task: Multi-class domain classification
- Framework: Hugging Face Transformers
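The checkpoint can be loaded like any other `AutoModelForSequenceClassification` model. Below is a minimal inference sketch, not the author's published usage code: the repository id, the example input, and the assumption that label names are stored in `config.id2label` are placeholders.

```python
# Minimal inference sketch. The repo id below is a placeholder, not the actual
# model identifier; label names are assumed to be stored in the model config.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "shota-tokunaga/bert-domain-classification-ja"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "商品が届かないので返金してほしいです。"  # example complaint-style input
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
pred_id = logits.argmax(dim=-1).item()
print(model.config.id2label[pred_id])  # predicted domain label
```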
## Training Data
Training corpus: BERT-basedDomainClassification_ComplaintTexts_ja Dataset
Dataset split:
- Train: 90%
- Validation: 5%
- Test: 5%
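The 90/5/5 ratio above can be reproduced with the `datasets` library. The sketch below is a generic illustration under assumed file and column layout; the loading call is a placeholder for however the corpus is actually distributed.

```python
# Sketch of a 90/5/5 train/validation/test split with the `datasets` library.
# The CSV file name is a placeholder, not part of the model card.
from datasets import load_dataset

dataset = load_dataset("csv", data_files="complaints_ja.csv")["train"]  # placeholder source

# Carve off 10% for evaluation, then split that half into validation and test.
split = dataset.train_test_split(test_size=0.10, seed=42)
eval_split = split["test"].train_test_split(test_size=0.50, seed=42)

train_ds, val_ds, test_ds = split["train"], eval_split["train"], eval_split["test"]
print(len(train_ds), len(val_ds), len(test_ds))  # roughly 90% / 5% / 5%
```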
## Evaluation
Test Accuracy: 73.0%
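As a rough illustration of how such a figure could be recomputed, the sketch below scores the test split with the `evaluate` library, reusing `tokenizer`, `model`, and `test_ds` from the sketches above; the `text` and `label` column names are assumptions.

```python
# Accuracy sketch; reuses `tokenizer` and `model` from the inference sketch and
# `test_ds` from the split sketch. Column names "text" and "label" are assumed.
import evaluate
import torch

accuracy = evaluate.load("accuracy")
preds, refs = [], []
for example in test_ds:
    inputs = tokenizer(example["text"], return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    preds.append(pred)
    refs.append(example["label"])
print(accuracy.compute(predictions=preds, references=refs))  # reported test accuracy: 0.73
```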
### Performance Discussion
The model was trained primarily on formal written text (a Wikimedia-derived corpus), while evaluation was conducted on complaint-style texts.
The domain gap between formal written language and conversational complaint language likely contributed to the reduced accuracy.
## Intended Use
- Educational purposes
- Research prototyping
- Domain classification experiments
## Limitations
- No domain adaptation applied
- Performance is sensitive to the genre distribution of input texts (formal vs. conversational)
## Author
Independent implementation by Shota Tokunaga.