GaMS-9B-KMMA / README.md
metadata
base_model: cjvt/GaMS-9B-Instruct
library_name: peft
tags:
  - base_model:adapter:cjvt/GaMS-9B-Instruct
  - lora
  - transformers

Model Card for GaMS-9B-KMMA

KMMA – LoRA adapters for GaMS-9B-Instruct (Slovene multi-task news classification)

Model type: PEFT/LoRA adapters (not a full model)

Base model: cjvt/GaMS-9B-Instruct (Gemma-2 family)

Language: Slovene (sl); may also work on closely related Croatian, Bosnian, and Serbian (hr/bs/sr) text, but this is untested

Tasks: multi-label / multi-head classification of news articles

Library: 🤗 Transformers, PEFT, bitsandbytes, PyTorch

TL;DR

This repo contains LoRA adapters + a small classification head that turn cjvt/GaMS-9B-Instruct into a 4-task classifier for Slovene news:

CATEGORY: POLITIKA, DRUŽBA, GOSPODARSTVO, SVET, ŠPORT, KULTURA

SENTIMENT: NEGATIVEN, NEVTRALEN, POZITIVEN

BIAS: LEVO, NEVTRALEN, DESNO

CREDIBILITY (binary): VERODOSTOJNO, NEVERODOSTOJNO (derived by applying a threshold to a numeric credibility score)
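The binary CREDIBILITY label can be pictured as a simple cut-off over the numeric score. The following is a minimal sketch, assuming a score where higher means more credible. The threshold value `0.5`, the score scale, and the function name `credibility_label` are illustrative placeholders, not values taken from this model's training setup.

```python
# Hypothetical cut-off: the threshold actually used for this model is not stated in this card.
CREDIBILITY_THRESHOLD = 0.5


def credibility_label(score: float, threshold: float = CREDIBILITY_THRESHOLD) -> str:
    """Map a numeric credibility score to the binary label.

    Assumes that a higher score means a more credible article.
    """
    return "VERODOSTOJNO" if score >= threshold else "NEVERODOSTOJNO"


if __name__ == "__main__":
    for score in (0.8, 0.2):
        print(score, credibility_label(score))
```

Keeping the threshold as a parameter makes it easy to trade precision against recall for the NEVERODOSTOJNO class on your own validation data.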

The repo ships:

adapter_config.json + LoRA weights (PEFT)

multitask_head.pt (four linear layers for the heads)

label_maps.json (label order + max_length)

tokenizer files

⚠️ You must load the base model and then attach these adapters + head.
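As a rough illustration of that loading order, the sketch below loads the base model, attaches the LoRA adapters with PEFT, and fetches the head file. It rests on several assumptions that this card does not confirm: that the adapters were trained against the `AutoModel` class (rather than, say, `AutoModelForCausalLM`), that 4-bit quantization is desired, and that `multitask_head.pt` holds a plain state dict. The `adapter_repo` argument is a placeholder for this repository's full Hub id, and `load_backbone` is a hypothetical helper name. The linked GitHub repository remains the authoritative reference.

```python
BASE_MODEL_ID = "cjvt/GaMS-9B-Instruct"  # base model named in this card


def load_backbone(adapter_repo: str, base_id: str = BASE_MODEL_ID):
    """Load the base model, attach the LoRA adapters, and fetch the head weights.

    `adapter_repo` is the full Hub id of this adapter repository (placeholder).
    """
    # Heavy dependencies are imported lazily, only when the function is called.
    import torch
    from huggingface_hub import hf_hub_download
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

    # The card states that tokenizer files ship with the adapters.
    tokenizer = AutoTokenizer.from_pretrained(adapter_repo)

    # Assumption: 4-bit loading via bitsandbytes; drop quantization_config for full precision.
    base = AutoModel.from_pretrained(
        base_id,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        device_map="auto",
    )

    # Attach the LoRA adapters on top of the frozen base model.
    model = PeftModel.from_pretrained(base, adapter_repo)
    model.eval()

    # Assumption: the file is a state dict; how it maps onto the four
    # linear heads is defined by the code in the linked GitHub repository.
    head_path = hf_hub_download(adapter_repo, "multitask_head.pt")
    head_state = torch.load(head_path, map_location="cpu")
    return tokenizer, model, head_state
```

If PEFT warns about missing or unexpected adapter keys, the base model class probably differs from the one used in training and should be changed accordingly.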

Model Details

This repository contains LoRA adapters + a small classification head for the cjvt/GaMS-9B-Instruct model (Gemma-2 family), fine-tuned for multi-task classification of Slovene news.

The model predicts four attributes for a given article:

CATEGORY: POLITIKA, DRUŽBA, GOSPODARSTVO, SVET, ŠPORT, KULTURA

SENTIMENT: NEGATIVEN, NEVTRALEN, POZITIVEN

BIAS: LEVO, NEVTRALEN, DESNO

CREDIBILITY: VERODOSTOJNO, NEVERODOSTOJNO
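To make the multi-head output concrete, here is a small sketch of turning per-head logits into labels with a confidence value. It assumes each head emits one logit per label and that indices follow the order listed in this card; the authoritative order lives in label_maps.json. CREDIBILITY is left out because it is described as a thresholded score. The logits in the example are made up for illustration, and `decode` is a hypothetical helper name.

```python
import math

# Label order copied from this card; label_maps.json is the authoritative source.
LABELS = {
    "CATEGORY": ["POLITIKA", "DRUŽBA", "GOSPODARSTVO", "SVET", "ŠPORT", "KULTURA"],
    "SENTIMENT": ["NEGATIVEN", "NEVTRALEN", "POZITIVEN"],
    "BIAS": ["LEVO", "NEVTRALEN", "DESNO"],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(head_logits):
    """Map {task: [logit per label]} to {task: (label, probability)}."""
    result = {}
    for task, logits in head_logits.items():
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        result[task] = (LABELS[task][best], probs[best])
    return result


if __name__ == "__main__":
    # Made-up logits for a single article, purely for illustration.
    example = {
        "CATEGORY": [0.1, 0.2, 2.5, 0.0, -1.0, 0.3],
        "SENTIMENT": [1.2, 0.4, -0.3],
        "BIAS": [-0.5, 1.0, 0.2],
    }
    for task, (label, prob) in decode(example).items():
        print(f"{task}: {label} ({prob:.2f})")
```

Returning the probability alongside the label lets downstream code route low-confidence predictions to manual review.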

Developed by: Jon Petek

Model type: PEFT/LoRA adapters + multitask head

Language(s) (NLP): Slovene (sl); may partially work on hr/bs/sr text (untested)

License: Apache 2.0 (adapters and code); the base model remains subject to the original GaMS/Gemma license

Finetuned from model: cjvt/GaMS-9B-Instruct

Direct Use

Automated classification of Slovene news and opinion pieces by category, sentiment, political bias, and a credibility proxy.

Intended for research, media monitoring, dashboards, and exploratory analysis.

Downstream Use

As a feature extractor for analytics (e.g., news dashboards, monitoring tools).

Can be fine-tuned further on domain-specific labeled data.

Out-of-Scope Use

High-stakes decision making (e.g., moderation, censorship, or factual verification).

Fine-grained stance detection or fact-checking (credibility labels are coarse and proxy-based).

Bias, Risks, and Limitations

BIAS labels are coarse (left/neutral/right). Mixed stance or quoted text may confuse the model.

CREDIBILITY is not fact-checking — it mirrors dataset heuristics.

Errors may occur on very short, satirical, or code-switched texts.

Trained only on Slovene; cross-lingual performance is untested.

Recommendations

Treat predictions as signals, not ground truth.

Review critical outputs manually.

Use balanced accuracy and macro-F1 for fairer evaluation of minority classes.
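The last recommendation can be illustrated with a dependency-free sketch of both metrics. Balanced accuracy is computed here as the mean per-class recall over classes present in the gold labels, and macro-F1 as the unweighted mean of per-class F1. The toy labels below are invented to show the effect of class imbalance and are not results from this model.

```python
def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    classes = sorted(set(y_true) | set(y_pred))
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in classes}
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            counts[gold]["tp"] += 1
        else:
            counts[gold]["fn"] += 1
            counts[pred]["fp"] += 1
    return counts


def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes that occur in y_true."""
    recalls = []
    for c in per_class_counts(y_true, y_pred).values():
        support = c["tp"] + c["fn"]
        if support:
            recalls.append(c["tp"] / support)
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (F1 is 0 when a class has no hits)."""
    f1_scores = []
    for c in per_class_counts(y_true, y_pred).values():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        f1_scores.append(2 * c["tp"] / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Toy, imbalanced BIAS labels: a classifier that always predicts NEVTRALEN.
    y_true = ["NEVTRALEN"] * 8 + ["LEVO", "DESNO"]
    y_pred = ["NEVTRALEN"] * 10
    accuracy = sum(g == p for g, p in zip(y_true, y_pred)) / len(y_true)
    print(f"accuracy={accuracy:.2f}")
    print(f"balanced_accuracy={balanced_accuracy(y_true, y_pred):.2f}")
    print(f"macro_f1={macro_f1(y_true, y_pred):.2f}")
```

In this toy case plain accuracy is 0.80 even though the minority classes are never predicted, while balanced accuracy (about 0.33) and macro-F1 (about 0.30) expose the failure.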

How to Get Started with the Model

Use the code from this GitHub repository to get started with the model: https://github.com/Truevoluhar/KMMA_finetuning_inference