AMBJ24 committed on
Commit
22c9ac1
·
verified ·
1 Parent(s): ed0dc60

Create README.md

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - is
+ pipeline_tag: text-classification
+ library_name: transformers
+ tags:
+ - icelandic
+ - sentiment-analysis
+ - text-classification
+ - sequence-classification
+ - social-media
+ ---
+
+ **Task**: 3-class sentiment analysis → `["negative", "neutral", "positive"]`
+ **Base model**: `mideind/IceBERT-igc` (Icelandic RoBERTa)
+
+ > The label mapping is embedded in the model config:
+ > `id2label={0: "negative", 1: "neutral", 2: "positive"}`, with `label2id` as the inverse mapping.
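
To make the mapping concrete, here is a minimal sketch that turns a list of class probabilities into a label name. The `ID2LABEL` dictionary copies the values quoted above; the helper `top_label` is a hypothetical name used only for illustration and is not part of the model or of `transformers`.

```python
# Mapping copied from the model card; in practice it can be read from the model config.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}
# label2id is simply the inverse of id2label.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def top_label(probs):
    """Return the label whose probability is highest (hypothetical helper)."""
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best_index]


print(top_label([0.05, 0.15, 0.80]))
```

With a loaded model, the same lookup should be available through `model.config.id2label`, which avoids hard-coding the label order.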
+
+ ## TL;DR
+
+ A small Icelandic RoBERTa model fine-tuned for 3-way sentiment classification of non-ironic text. It works best **after** an irony gate: run the irony model first, and classify sentiment only when the result is `not_ironic`.
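
The gating step described above could be sketched as follows. This is an illustration under assumptions, not the author's pipeline: `gated_sentiment`, `classify_irony`, and `classify_sentiment` are hypothetical names, and the two classifiers are passed in as plain callables so the sketch does not depend on any particular irony model. The only label taken from the card is `not_ironic`.

```python
from typing import Callable, Optional


def gated_sentiment(
    text: str,
    classify_irony: Callable[[str], str],
    classify_sentiment: Callable[[str], str],
) -> Optional[str]:
    """Return a sentiment label, or None when the irony gate rejects the text."""
    # Anything other than "not_ironic" is treated as ironic and skipped.
    if classify_irony(text) != "not_ironic":
        return None
    return classify_sentiment(text)


# Stub classifiers stand in for the real models in this sketch.
result = gated_sentiment(
    "Þjónustan var frábær!",
    classify_irony=lambda t: "not_ironic",
    classify_sentiment=lambda t: "positive",
)
print(result)
```

Returning `None` for gated-out text keeps "no sentiment assigned" distinct from the `neutral` class.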
+
+ ---
+
+ ## How to use
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ model_id = "ambj24/icelandic-sentiment"
+ tok = AutoTokenizer.from_pretrained(model_id)
+ mod = AutoModelForSequenceClassification.from_pretrained(model_id)
+
+ text = "Þjónustan var frábær!"  # "The service was great!"
+ # Truncate to the 128-token length used in training
+ inputs = tok(text, return_tensors="pt", truncation=True, max_length=128)
+ with torch.no_grad():  # gradients are not needed for inference
+     probs = mod(**inputs).logits.softmax(-1).tolist()[0]
+
+ labels = ["negative", "neutral", "positive"]  # same order as id2label
+ print(dict(zip(labels, probs)))
+ ```
+
+ Input length: short posts; trained with a maximum length of about 128 tokens.
+
+ Data: social-media-style Icelandic.
+
+ Domain shift: trained on short, informal posts, so accuracy may drop on long or formal text.
+
+ Labels: positive, neutral, and negative; only examples judged not ironic were included.
+
+ Typical setup: 3 epochs, learning rate ≈ 2e-5, batch size ≈ 16, max length 128.
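
As a rough sketch, the typical setup listed above might map onto Hugging Face `TrainingArguments` as shown below. Only the four numbers come from the card; `HYPERPARAMS`, `MAX_LENGTH`, `build_training_args`, `tokenize_batch`, and the `output_dir` default are assumptions made for illustration, and the actual training script is not shown in this commit.

```python
# Values taken from the "Typical setup" line; everything else is assumed.
HYPERPARAMS = {
    "num_train_epochs": 3,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
}
MAX_LENGTH = 128  # applied at tokenization time, not by the trainer


def build_training_args(output_dir: str = "out"):
    """Build TrainingArguments from the values above (hypothetical helper).

    The import is local so this module loads even without transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


def tokenize_batch(tokenizer, texts):
    """Tokenize a list of posts, truncating each to MAX_LENGTH tokens."""
    return tokenizer(texts, truncation=True, max_length=MAX_LENGTH, padding=True)
```

The maximum length is kept separate from the trainer settings because truncation happens in the tokenizer call, while epochs, learning rate, and batch size are trainer options.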