ringorsolya committed · Commit 204f411 · verified · 1 Parent(s): d9ac7cc

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +102 -3
README.md CHANGED
@@ -1,3 +1,102 @@
- ---
- license: cc-by-4.0
- ---
+ ---
+ language:
+ - hu
+ - en
+ - de
+ - cs
+ - fr
+ - pl
+ - sk
+ license: mit
+ tags:
+ - sentiment-analysis
+ - xlm-roberta
+ - multilingual
+ - text-classification
+ datasets:
+ - custom
+ metrics:
+ - accuracy
+ - f1
+ pipeline_tag: text-classification
+ model-index:
+ - name: Sentiment
+   results:
+   - task:
+       type: text-classification
+       name: Sentiment Analysis
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.4108175318619832
+     - name: F1 (macro)
+       type: f1
+       value: 0.1941274108021563
+ ---
+
+ # Sentiment
+
+ Fine-tuned [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) for **multilingual sentiment classification** across 7 languages.
+
+ ## Model Details
+
+ - **Base model**: `xlm-roberta-base`
+ - **Task**: 3-class sentiment classification (negative / neutral / positive)
+ - **Languages**: Hungarian, English, German, Czech, French, Polish, Slovak
+ - **Training data**: ~257K sentences (stratified split from ~322K total)
+ - **Class weighting**: Balanced weights applied during training to handle class imbalance
+
+ ## Labels
+
+ | Label ID | Label | Description |
+ |----------|-------|-------------|
+ | 0 | negative | Negative sentiment |
+ | 1 | neutral | Neutral sentiment |
+ | 2 | positive | Positive sentiment |
+
+ ## Overall Results
+
+ | Metric | Value |
+ |--------|-------|
+ | Accuracy | 0.4108175318619832 |
+ | F1 (macro) | 0.1941274108021563 |
+ | F1 (weighted) | 0.23925283131749744 |
+
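The overall figures in the table above can be cross-checked with a little arithmetic. As a minimal sketch, assume a hypothetical baseline that assigns every sample to a single class whose share of the evaluation set is `p`. Its accuracy is `p`, the predicted class gets precision `p` and recall 1, and the other two classes get an F1 of 0. The variable names are illustrative and not part of the model card.

```python
# Reported overall accuracy, copied from the Overall Results table.
reported_accuracy = 0.4108175318619832
num_classes = 3

# Hypothetical constant predictor: always outputs the class with share p.
p = reported_accuracy
f1_predicted_class = 2 * p / (1 + p)  # harmonic mean of precision p and recall 1

# Macro F1 averages over all classes; the two never-predicted classes score 0.
baseline_macro_f1 = f1_predicted_class / num_classes

# Weighted F1 weights each class F1 by its support share.
baseline_weighted_f1 = p * f1_predicted_class

print(f"baseline macro F1:    {baseline_macro_f1:.6f}")
print(f"baseline weighted F1: {baseline_weighted_f1:.6f}")
```

Under these assumptions the baseline values agree with the reported macro and weighted F1 to several decimal places. That agreement is only an observation about the numbers, but it suggests checking a confusion matrix to see whether predictions are concentrated on one class.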
+ ## Per-Language Results
+
+ | Language | Samples | Accuracy | F1 (macro) | F1 (weighted) |
+ |----------|---------|----------|------------|---------------|
+ | cz | 4602 | 0.4109 | 0.1942 | 0.2393 |
+ | en | 4596 | 0.4108 | 0.1941 | 0.2392 |
+ | fr | 4569 | 0.4108 | 0.1941 | 0.2392 |
+ | ger | 4599 | 0.4107 | 0.1941 | 0.2392 |
+ | hun | 4603 | 0.4108 | 0.1941 | 0.2393 |
+ | pl | 4603 | 0.4108 | 0.1941 | 0.2393 |
+ | sk | 4598 | 0.4108 | 0.1941 | 0.2393 |
+
+
+ ## Usage
+
+ ```python
+ from transformers import pipeline
+
+ classifier = pipeline("text-classification", model="ringorsolya/Sentiment")
+
+ # Hungarian
+ classifier("Ez egy fantasztikus nap!")
+ # English
+ classifier("This is a terrible product.")
+ # German
+ classifier("Das Wetter ist heute schön.")
+ ```
+
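The pipeline call above returns a label and score per input. When working with raw logits instead (for example from `AutoModelForSequenceClassification`), the Labels table gives the mapping from class index to name. The following is a minimal sketch, assuming the logits are ordered by Label ID as in that table. `ID2LABEL` and `decode_logits` are hypothetical helper names, and the example logits are made up rather than real model output.

```python
import math

# Label order taken from the Labels table of the model card.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def decode_logits(logits):
    """Return (best_label, probabilities) for one sample's raw class logits."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per label")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = {ID2LABEL[i]: e / total for i, e in enumerate(exps)}
    best = max(probs, key=probs.get)
    return best, probs


# Made-up logits for illustration only.
label, probs = decode_logits([-1.2, 0.3, 2.1])
print(label, probs)
```

Keeping the full probability dictionary rather than only the top label makes it easier to spot inputs where all three classes score similarly.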
+ ## Training Details
+
+ - **Epochs**: 3
+ - **Batch size**: 64
+ - **Learning rate**: 2e-05
+ - **Weight decay**: 0.01
+ - **Warmup ratio**: 0.1
+ - **Max sequence length**: 128
+ - **FP16**: True
+ - **Class weights**: [0.8114, 1.1219, 1.1413]
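The model card describes the class weights as "balanced" but does not state the formula. A common choice, and the one scikit-learn uses for `class_weight="balanced"`, is `w_c = N / (K * n_c)` for `N` samples, `K` classes and `n_c` samples in class `c`. The sketch below assumes that rule and inverts it to recover the class shares implied by the listed weights. The function names and the example counts are hypothetical.

```python
def balanced_class_weights(counts):
    """Weights w_c = N / (K * n_c) for per-class sample counts n_c."""
    total = sum(counts)
    num_classes = len(counts)
    return [total / (num_classes * n) for n in counts]


def implied_class_shares(weights):
    """Invert the balanced rule: n_c / N = 1 / (K * w_c)."""
    num_classes = len(weights)
    return [1.0 / (num_classes * w) for w in weights]


# Class weights as listed in the training details (negative, neutral, positive).
listed_weights = [0.8114, 1.1219, 1.1413]
shares = implied_class_shares(listed_weights)
print([round(s, 4) for s in shares], round(sum(shares), 4))

# Hypothetical class counts, for illustration only.
print(balanced_class_weights([500, 300, 200]))
```

If the assumed rule matches what was used in training, the implied shares should sum to roughly 1, and the share of class 0 indicates how large the negative class is relative to the other two.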