Model Card for KMMA
KMMA – LoRA adapters for GaMS-9B-Instruct (Slovene multi-task news classification)
Model type: PEFT/LoRA adapters (not a full model)
Base model: cjvt/GaMS-9B-Instruct (Gemma-2 family)
Language: Slovene (sl); may partially carry over to Croatian, Bosnian, and Serbian (hr/bs/sr) text, but this is untested
Tasks: multi-task (multi-head) classification of news articles
Library: 🤗 Transformers, PEFT, bitsandbytes, PyTorch
TL;DR
This repo contains LoRA adapters + a small classification head that turn cjvt/GaMS-9B-Instruct into a 4-task classifier for Slovene news:
CATEGORY: POLITIKA, DRUŽBA, GOSPODARSTVO, SVET, ŠPORT, KULTURA
SENTIMENT: NEGATIVEN, NEVTRALEN, POZITIVEN
BIAS: LEVO, NEVTRALEN, DESNO
CREDIBILITY (binary): VERODOSTOJNO, NEVERODOSTOJNO (derived from a numeric credibility score with a threshold, see below)
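The thresholding behind the binary CREDIBILITY label can be sketched as follows. This is only an illustration, not the training code: the function name, the assumption that higher scores mean more credible, and the cut-off of 0.5 are all hypothetical, since this card does not state the score range or the threshold actually used.

```python
def credibility_label(score: float, threshold: float = 0.5) -> str:
    """Map a numeric credibility score to the binary label.

    Assumptions (not taken from the model card): higher scores mean
    "more credible", and 0.5 is a placeholder threshold.
    """
    return "VERODOSTOJNO" if score >= threshold else "NEVERODOSTOJNO"


# Scores on either side of the (hypothetical) threshold.
high = credibility_label(0.9)
low = credibility_label(0.1)
```

Because the label is a thresholded score, articles whose score sits close to the cut-off can flip class with small changes in the score.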
The repo ships:
adapter_config.json + LoRA weights (PEFT)
multitask_head.pt (four linear layers for the heads)
label_maps.json (label order + max_length)
tokenizer files
⚠️ You must load the base model and then attach these adapters + head.
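Once the base model, adapters, and head are loaded, each head yields one logit vector per article, and the label order from label_maps.json turns those vectors into label names. The sketch below shows only that decoding step in plain Python. The JSON layout, the task keys, and the example logits are assumptions made for illustration; the real file may be structured differently (it also stores max_length), so check it before reusing this.

```python
import json
import math

# Hypothetical layout for label_maps.json; the real file may differ.
LABEL_MAPS_JSON = """
{
  "category": ["POLITIKA", "DRUŽBA", "GOSPODARSTVO", "SVET", "ŠPORT", "KULTURA"],
  "sentiment": ["NEGATIVEN", "NEVTRALEN", "POZITIVEN"],
  "bias": ["LEVO", "NEVTRALEN", "DESNO"],
  "credibility": ["VERODOSTOJNO", "NEVERODOSTOJNO"]
}
"""


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits_per_task, label_maps):
    """Return {task: (label, probability)} using the argmax of each head."""
    result = {}
    for task, logits in logits_per_task.items():
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        result[task] = (label_maps[task][best], probs[best])
    return result


label_maps = json.loads(LABEL_MAPS_JSON)

# Made-up logits standing in for the four head outputs for one article.
example_logits = {
    "category": [0.1, 0.2, 0.0, -1.0, 3.0, 0.5],
    "sentiment": [0.2, 1.5, -0.3],
    "bias": [-0.5, 2.0, 0.1],
    "credibility": [1.2, -0.7],
}
predictions = decode(example_logits, label_maps)
```

Keeping the probability next to the label makes it easy to flag low-confidence predictions for manual review later.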
Model Details
This repository contains LoRA adapters + a small classification head for the cjvt/GaMS-9B-Instruct model (Gemma-2 family), finetuned for multi-task classification of Slovene news.
The model predicts four attributes for a given article:
CATEGORY: POLITIKA, DRUŽBA, GOSPODARSTVO, SVET, ŠPORT, KULTURA
SENTIMENT: NEGATIVEN, NEVTRALEN, POZITIVEN
BIAS: LEVO, NEVTRALEN, DESNO
CREDIBILITY: VERODOSTOJNO, NEVERODOSTOJNO
Developed by: Jon Petek
Model type: PEFT/LoRA adapters + multitask head
Language(s) (NLP): Slovene (sl); may partially carry over to hr/bs/sr, but this is untested (see Bias, Risks, and Limitations)
License: Apache 2.0 (adapters and code); the base model is subject to the original GaMS/Gemma license
Finetuned from model: cjvt/GaMS-9B-Instruct
Direct Use
Automated classification of Slovene news and opinion pieces by category, sentiment, political bias, and a credibility proxy.
Intended for research, media monitoring, dashboards, or exploratory analysis.
Downstream Use
As a feature extractor for analytics (e.g., news dashboards, monitoring tools).
Can be fine-tuned further on domain-specific labeled data.
Out-of-Scope Use
High-stakes decision making (e.g., moderation, censorship, or factual verification).
Fine-grained stance detection or fact-checking (credibility labels are coarse and proxy-based).
Bias, Risks, and Limitations
BIAS labels are coarse (left/neutral/right). Mixed stance or quoted text may confuse the model.
CREDIBILITY is not fact-checking; it only mirrors dataset heuristics.
Errors may occur on very short, satirical, or code-switched texts.
Trained only on Slovene; cross-lingual performance is untested.
Recommendations
Treat predictions as signals, not ground truth.
Review critical outputs manually.
Use balanced accuracy and macro-F1 for fairer evaluation of minority classes.
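The two recommended metrics can be computed without extra dependencies. The sketch below is a plain-Python illustration, not the evaluation code used for this model. It follows the usual definitions: balanced accuracy is the mean recall over classes that occur in the gold labels, and macro-F1 is the unweighted mean of per-class F1. The function names and the toy labels are made up for the example.

```python
from collections import Counter


def _counts(y_true, y_pred):
    """Per-class true positives, false positives, and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1
            fn[gold] += 1
    return tp, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes present in y_true."""
    tp, _, fn = _counts(y_true, y_pred)
    classes = sorted(set(y_true))
    recalls = [tp[c] / (tp[c] + fn[c]) for c in classes]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen."""
    tp, fp, fn = _counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy BIAS example: a classifier that always answers NEVTRALEN.
y_true = ["LEVO", "NEVTRALEN", "NEVTRALEN", "NEVTRALEN", "DESNO", "NEVTRALEN"]
y_pred = ["NEVTRALEN"] * len(y_true)

plain_acc = sum(g == p for g, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

In this toy case plain accuracy is about 0.67, while balanced accuracy drops to about 0.33 and macro-F1 to about 0.27, which exposes that the minority classes LEVO and DESNO are never predicted.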
How to Get Started with the Model
Use the code from this GitHub repository to get started with the model: https://github.com/Truevoluhar/KMMA_finetuning_inference