---
language:
- hu
license: mit
tags:
- sentiment-analysis
- xlm-roberta
- hungarian
- text-classification
datasets:
- custom
metrics:
- accuracy
- f1
pipeline_tag: text-classification
---

# Sentiment

Fine-tuned [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) for **Hungarian sentiment classification**.

## Model Details

- **Base model**: `xlm-roberta-base`
- **Task**: 3-class sentiment classification (negative / neutral / positive)
- **Language**: Hungarian
- **Training data**: ~37K sentences (stratified split from ~46K total)
- **Class weighting**: Balanced weights applied during training to handle class imbalance
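
Balanced class weighting is commonly computed as `n_samples / (n_classes * class_count)`, the convention behind scikit-learn's `class_weight="balanced"`. The card does not say how its weights were derived, but the values listed under Training Details are consistent with that formula: their reciprocals sum to roughly 3, the number of classes. The sketch below illustrates the formula on hypothetical class counts (not the real dataset) and, assuming the formula does apply, backs out the class shares the reported weights would imply.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Return one weight per class: n_samples / (num_classes * class_count).

    Assumes every class in range(num_classes) occurs at least once.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    return [n_samples / (num_classes * counts[c]) for c in range(num_classes)]


# Hypothetical label distribution, NOT the real training set.
# 0 = negative, 1 = neutral, 2 = positive
toy_labels = [0] * 50 + [1] * 30 + [2] * 20
toy_weights = balanced_class_weights(toy_labels, num_classes=3)
print([round(w, 4) for w in toy_weights])  # [0.6667, 1.1111, 1.6667]

# Weights reported in this card. If they follow the formula above,
# each class share is 1 / (num_classes * weight).
reported = [0.8114, 1.1219, 1.1413]
implied_shares = [1 / (3 * w) for w in reported]
print([round(s, 3) for s in implied_shares])
```

Under that assumption the implied shares come out at about 41% negative, 30% neutral, and 29% positive, so the most frequent class receives the smallest weight.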

## Labels

| Label ID | Label | Description |
|----------|-------|-------------|
| 0 | negative | Negative sentiment |
| 1 | neutral | Neutral sentiment |
| 2 | positive | Positive sentiment |
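
When working with raw model logits rather than the pipeline, the table above is what turns a predicted index into a label. A minimal sketch of that decoding step follows; `ID2LABEL`, `softmax`, and `decode` are illustrative names, and the logits are made up rather than real model output.

```python
import math

# Mirrors the label table above.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration only.
label, prob = decode([-1.2, 0.3, 2.5])
print(label, round(prob, 3))
```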

## Overall Results

| Metric | Value |
|--------|-------|
| Accuracy | 0.8442 |
| F1 (macro) | 0.8387 |
| F1 (weighted) | 0.8436 |
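
Macro F1 averages the per-class F1 scores with equal weight, while weighted F1 weights each class by its number of true examples. A macro score slightly below the weighted one, as reported here, generally means the larger classes score somewhat higher than the smaller ones. The sketch below shows both averages on toy labels; the data is invented for illustration and `f1_scores` is a hypothetical helper, not the evaluation code behind the table.

```python
def f1_scores(y_true, y_pred, num_classes=3):
    """Return (accuracy, macro_f1, weighted_f1) for integer class labels."""
    per_class_f1, supports = [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        supports.append(tp + fn)  # number of true examples of class c
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    macro = sum(per_class_f1) / num_classes
    weighted = sum(f * s for f, s in zip(per_class_f1, supports)) / n
    return accuracy, macro, weighted


# Toy labels for illustration only (0 = negative, 1 = neutral, 2 = positive).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0]
accuracy, macro_f1, weighted_f1 = f1_scores(y_true, y_pred)
print(round(accuracy, 4), round(macro_f1, 4), round(weighted_f1, 4))
# 0.7 0.6944 0.7
```

In the toy data the largest class (negative) has the highest F1, so the weighted average ends up above the macro average.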

## Per-Language Results

| Language | Samples | Accuracy | F1 (macro) | F1 (weighted) |
|----------|---------|----------|------------|---------------|
| hun | 4603 | 0.8442 | 0.8387 | 0.8436 |

## Usage

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
classifier = pipeline("text-classification", model="ringorsolya/Sentiment")

# "This is a fantastic day!"
classifier("Ez egy fantasztikus nap!")
# Example output (exact scores will vary):
# [{'label': 'positive', 'score': 0.95}]

# "The service was terrible."
classifier("Szörnyű volt a kiszolgálás.")
# Example output (exact scores will vary):
# [{'label': 'negative', 'score': 0.92}]
```
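
The example above returns only the top label for a single sentence. For batches, or to inspect the full three-way score distribution, something like the following sketch may help. It assumes a recent `transformers` release in which the text-classification pipeline accepts `top_k=None` and forwards `truncation` and `max_length` to the tokenizer. `top_label` and `classify_batch` are hypothetical helper names, and the scores in the demo are made up.

```python
def top_label(scores):
    """Pick the highest-scoring entry from a list of {'label', 'score'} dicts."""
    return max(scores, key=lambda s: s["score"])


def classify_batch(texts, model_id="ringorsolya/Sentiment"):
    """Return the scores for all three classes for each input text."""
    # Imported lazily so top_label stays usable without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id, top_k=None)
    # Truncate to the 128-token limit used during training.
    return classifier(list(texts), truncation=True, max_length=128)


# Hand-written scores shaped like one pipeline result (not real model output).
example_scores = [
    {"label": "negative", "score": 0.03},
    {"label": "neutral", "score": 0.07},
    {"label": "positive", "score": 0.90},
]
print(top_label(example_scores)["label"])  # positive
```

Calling `classify_batch(["Ez egy fantasztikus nap!", "Szörnyű volt a kiszolgálás."])` downloads the checkpoint on first use and should return one list of label/score dicts per input text.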

## Training Details

- **Epochs**: 5
- **Batch size**: 32
- **Learning rate**: 2e-05
- **Weight decay**: 0.01
- **Warmup ratio**: 0.1
- **Max sequence length**: 128
- **FP16**: True
- **Class weights**: [0.8114, 1.1219, 1.1413]
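
The hyperparameters above imply a rough training schedule. The sketch below works through the arithmetic under several assumptions the card does not confirm: exactly 37,000 training sentences (the card says "~37K"), a batch size of 32 that is the effective global batch with no gradient accumulation, and warmup steps rounded up. `schedule` is a hypothetical helper name.

```python
import math


def schedule(num_examples, batch_size=32, epochs=5, warmup_ratio=0.1):
    """Return (steps_per_epoch, total_steps, warmup_steps)."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    # Rounding up is an assumption; training frameworks may round differently.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps


# 37_000 is an assumed round figure; the card only says "~37K".
print(schedule(37_000))  # (1157, 5785, 579)
```

Under these assumptions the learning rate would ramp up over roughly the first half of epoch 1 before decaying for the rest of training.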