	---
	license: mit
	language:
	- fi
	metrics:
	- f1
	- precision
	- recall
	- accuracy
	base_model:
	- google-bert/bert-base-uncased
	- TurkuNLP/bert-base-finnish-cased-v1
	pipeline_tag: text-classification
	tags:
	- classification
	- news
	---
	# News Relevancy Classifiers

	## FinBERT-ft-v3

	![FinBERT Badge](https://img.shields.io/badge/Model-FinBERT--ft--v3-blue)

	### Model Description
	- Purpose: This model was trained for a specific research task. It is not a commercial product and should not be used for for-profit purposes.
	- Architecture: `bert-base-finnish-cased-v1`
	- Fine-tuning task: Four-class Finnish news-headline relevancy classification
	- Dataset: ~225 Finnish headlines (2024–2025) manually labeled into:
	  - 0: Not Relevant
	  - 1: Least Relevant
	  - 2: Highly Relevant
	  - 3: Most Relevant
	- HF Repo: [`cloud0day3/finbert-ft-v3`](https://huggingface.co/cloud0day3/finbert-ft-v3) (latest v4 checkpoint, 6 June 2025)
	- Date Trained: 2025-06-06

	#### Model Inputs

	- A raw Finnish headline (string), truncated/padded to 96 tokens.
	- Tokenization handled by the bundled `vocab.txt` + `tokenizer_config.json` + `special_tokens_map.json`.
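The input handling described above can be sketched roughly as follows. This is a minimal example, not the author's inference script. It assumes the repo loads through the standard `transformers` auto classes as a four-way sequence classifier whose output order matches the label ids in this card. The helper names `logits_to_label` and `classify` are hypothetical.

```python
from typing import List, Sequence, Tuple

MODEL_ID = "cloud0day3/finbert-ft-v3"  # repo named in this card
MAX_LENGTH = 96  # token limit stated under Model Inputs

# Label names as listed in this card.
LABELS = {
    0: "Not Relevant",
    1: "Least Relevant",
    2: "Highly Relevant",
    3: "Most Relevant",
}


def logits_to_label(logits: Sequence[float]) -> Tuple[int, str]:
    """Pick the highest-scoring class index and its readable name."""
    label_id = max(range(len(logits)), key=lambda i: logits[i])
    return label_id, LABELS[label_id]


def classify(headlines: List[str], model_id: str = MODEL_ID) -> List[Tuple[int, str]]:
    """Tokenize headlines to 96 tokens and return (label_id, label_name) pairs.

    Assumption: the checkpoint loads as a four-class sequence classifier.
    """
    # Imported lazily so logits_to_label works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Truncate or pad every headline to exactly MAX_LENGTH tokens.
    batch = tokenizer(
        headlines,
        truncation=True,
        padding="max_length",
        max_length=MAX_LENGTH,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**batch).logits  # shape: (len(headlines), 4)
    return [logits_to_label(row) for row in logits.tolist()]
```

Calling `classify(["..."])` with one or more Finnish headlines downloads the checkpoint on first use, so it needs network access plus the `torch` and `transformers` packages.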

	#### Model Outputs

	- A single integer label (0–3), mapped to human-readable categories:
	```python
	LABELS = {
	    0: "Not Relevant",
	    1: "Least Relevant",
	    2: "Highly Relevant",
	    3: "Most Relevant",
	}
	```


	#### Intended Use
	- Primary: Automatically assign a relevancy score to Finnish news headlines so that downstream pipelines (e.g., filtering, ranking) can operate without manual triage.

	#### Examples of Use

	- Pre-filtering a news aggregation feed.

	- Prioritizing headlines for editorial review.

	- Input to summarization/retrieval pipelines.
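As one illustration of the pre-filtering and prioritization uses listed above, a downstream step might keep only headlines at or above a chosen label and put the most relevant first. The sketch below is hypothetical: the `prefilter` helper, the `min_label` threshold of 2, and the placeholder headlines with their labels are all assumptions made for illustration, not model output.

```python
from typing import List, Tuple

# (headline, label_id) pairs, e.g. produced by an upstream classification step.
Scored = Tuple[str, int]


def prefilter(scored: List[Scored], min_label: int = 2) -> List[Scored]:
    """Keep headlines at or above min_label, most relevant first."""
    kept = [item for item in scored if item[1] >= min_label]
    # sorted() is stable, so equally ranked headlines keep their feed order.
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Placeholder headlines with made-up labels (0-3), for illustration only.
feed = [
    ("Otsikko A", 0),
    ("Otsikko B", 3),
    ("Otsikko C", 2),
    ("Otsikko D", 1),
    ("Otsikko E", 3),
]

for headline, label_id in prefilter(feed):
    print(label_id, headline)
```

Raising `min_label` to 3 would pass only "Most Relevant" headlines to editorial review. The right threshold depends on how much manual triage the pipeline can absorb.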

	#### Out-of-Scope Uses
	- Any non-Finnish text (e.g., English, Swedish).

	- Multi-sentence inputs or full articles (this model is tuned on single-sentence headlines).

	- Tasks other than relevancy (e.g., sentiment analysis, topic modeling).

	- High-risk decision making without human oversight (e.g., emergency alerts).