AMBJ24/icelandic-sentiment

---
license: cc-by-nc-4.0
language:
- is
pipeline_tag: text-classification
library_name: transformers
tags:
- icelandic
- sentiment-analysis
- text-classification
- sequence-classification
- social-media
sources: >-
  Risamálheildin slices of forums/blogs, manually labelled by us, and our own
  small corpus made from samples gathered from social media
---


Task: 3-class sentiment analysis → `["negative", "neutral", "positive"]`

Base model: `mideind/IceBERT-igc` (Icelandic RoBERTa)

## TL;DR

A small Icelandic RoBERTa model fine-tuned for 3-way sentiment classification of non-ironic text. It pairs well with an irony gate placed in front of it: run the irony model first, and classify sentiment only when it predicts `not_ironic`.
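
The gating idea can be sketched as below. This is only an illustration, not the authors' pipeline: `gated_sentiment`, `fake_irony` and `fake_sentiment` are hypothetical names, the stand-in classifiers replace real models so the snippet runs without downloads, and the `ironic` label is an assumption (only `not_ironic` is named in this card).

```python
from typing import Callable, Optional


def gated_sentiment(
    text: str,
    irony_label: Callable[[str], str],
    sentiment_label: Callable[[str], str],
) -> Optional[str]:
    """Return a sentiment label only when the irony step says `not_ironic`."""
    if irony_label(text) != "not_ironic":
        return None  # ironic text is skipped rather than scored
    return sentiment_label(text)


# Stand-in classifiers (assumptions) so the sketch runs on its own.
def fake_irony(text: str) -> str:
    return "ironic" if "/s" in text else "not_ironic"


def fake_sentiment(text: str) -> str:
    return "positive"


print(gated_sentiment("Þjónustan var frábær!", fake_irony, fake_sentiment))
print(gated_sentiment("Frábært, enn ein seinkun /s", fake_irony, fake_sentiment))
```

In practice the two callables would wrap the irony model and this sentiment model; returning `None` for gated-out text keeps "not scored" distinct from `neutral`.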

---

## How to use

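```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "ambj24/icelandic-sentiment"
tok = AutoTokenizer.from_pretrained(model_id)
mod = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Þjónustan var frábær!"  # "The service was great!"
# Truncate to the length used in training (see the notes after this example).
inputs = tok(text, return_tensors="pt", truncation=True, max_length=128)

# Inference only, so skip gradient tracking.
with torch.no_grad():
    probs = mod(**inputs).logits.softmax(-1).tolist()[0]

# Label order as listed in this card.
labels = ["negative", "neutral", "positive"]
print(dict(zip(labels, probs)))
```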

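Input length: intended for short posts; trained with a maximum length of about 128 tokens, so longer inputs should be truncated or split.

Data: social-media-style Icelandic.

Domain shift: the training data are short, informal posts, so accuracy may drop on long or formal text.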
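
One possible way to handle inputs longer than the training length is to split the token ids into windows before classification. The sketch below is an assumption-laden illustration, not something this card prescribes: `chunk_token_ids` is a hypothetical helper, `num_special=2` assumes the tokenizer adds one token at each end of a window, and how to combine per-window predictions is left open.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int], max_length: int = 128, num_special: int = 2
) -> List[List[int]]:
    """Split token ids into consecutive windows that fit within max_length.

    num_special reserves room for the tokens added around each window
    (assumed to be 2: one at the start and one at the end).
    """
    window = max_length - num_special
    if window <= 0:
        raise ValueError("max_length must exceed num_special")
    return [token_ids[i:i + window] for i in range(0, len(token_ids), window)]


ids = list(range(300))  # stand-in for real token ids
chunks = chunk_token_ids(ids)
print([len(c) for c in chunks])  # [126, 126, 48]
```

Because the model was trained on short posts, plain truncation may be the safer default; windowing is mainly useful when the end of a long text matters.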

Labels: positive, neutral, negative; only examples judged not ironic were included.

Typical training setup: 3 epochs, LR ≈ 2e-5, batch size ≈ 16, max length 128.
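
For reference, the setup above can be written down as a small config, together with the number of optimizer steps it implies. This is a sketch under assumptions: the key names mirror Hugging Face `TrainingArguments` fields, the step count assumes a single device with no gradient accumulation, and the dataset size of 1000 is a made-up figure (the card does not state the real size).

```python
import math

# Values from the card; key names mirror TrainingArguments fields.
hparams = {
    "num_train_epochs": 3,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
}
MAX_LENGTH = 128  # tokenizer truncation length


def total_optimizer_steps(num_examples: int, epochs: int = 3, batch_size: int = 16) -> int:
    """Optimizer steps for one device with no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Hypothetical dataset of 1000 labelled posts.
print(total_optimizer_steps(1000))  # 189
```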