	# Floxoris Harmony v1.1

	Floxoris Harmony v1.1 is a lightweight moderation model for fast binary toxicity detection in Russian and Ukrainian text.

	This version continues fine-tuning from Floxoris Harmony v1. It focuses on improving detection of mild toxicity, short rude phrases, and everyday aggressive messages while keeping the model compact and fast.

	The model is intended for scenarios where low latency, small size, and simple deployment matter, such as Telegram bots, chat moderation systems, AI assistants, community tools, and first-pass safety filters.

	## What Is New In v1.1

	Compared to `Floxoris Harmony v1`, this release focuses on better handling of short and mild toxic messages.

	Targeted improvements:

	- better detection of short rude phrases
	- improved sensitivity to mild toxicity
	- stronger Russian and Ukrainian moderation behavior
	- better handling of direct insults and aggressive commands

	Unchanged from v1:

	- fast binary classification
	- the same two output labels: `safe` and `toxic`

	This version was trained as a behavioral patch on top of v1, not as a new architecture.

	## Model Task

	The model performs binary text classification:

	| Class | Label |
	|---|---|
	| `0` | `safe` |
	| `1` | `toxic` |

	It is designed to answer a simple question:

	> Is this message safe or toxic?
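As a minimal illustration of the class mapping in the table above, the sketch below turns a pair of raw logits into a label using a softmax followed by an argmax. The helper name `logits_to_label`, the `ID2LABEL` dictionary, and the sample logits are hypothetical and chosen for illustration only. They are not part of the model's published API.

```python
import math

# Class index -> label, as listed in the table above.
ID2LABEL = {0: "safe", 1: "toxic"}


def logits_to_label(logits):
    """Hypothetical helper: softmax over [safe_logit, toxic_logit], then argmax."""
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # On an exact tie, max() keeps the first index, so the result is "safe".
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[class_id], probs


# Made-up logits for illustration; real values come from the model.
label, probs = logits_to_label([-1.2, 2.3])
print(label, [round(p, 4) for p in probs])
```

Resolving ties toward `safe` matches a decision rule that labels a message `toxic` only when the toxic score is strictly greater than the safe score.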

	## Intended Use

	Floxoris Harmony v1.1 is suitable for:

	- Telegram bot moderation
	- chat message filtering
	- AI assistant safety checks
	- community moderation tools
	- lightweight API moderation
	- first-stage toxicity detection
	- Russian/Ukrainian text safety classification

	It works best as a fast first-pass classifier before more complex moderation logic.
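One way to use a first-pass classifier is to act directly on confident scores and pass uncertain messages to a later stage. The sketch below assumes the caller already has the toxic-class probability for a message. The function name `route_message`, the three actions, and both thresholds are hypothetical and not tuned values for this model.

```python
def route_message(toxic_score, block_at=0.9, allow_below=0.3):
    """Hypothetical routing rule; thresholds are illustrative, not tuned values."""
    if not 0.0 <= toxic_score <= 1.0:
        raise ValueError("toxic_score must be a probability in [0, 1]")
    if toxic_score >= block_at:
        return "block"      # confident toxic: act immediately
    if toxic_score < allow_below:
        return "allow"      # confident safe: let the message through
    return "escalate"       # uncertain: hand off to heavier moderation logic


for score in (0.05, 0.55, 0.97):
    print(score, route_message(score))
```

Keeping the uncertain band explicit lets the thresholds be adjusted per community without changing the classifier itself.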

	## Example Usage

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	model_id = "floxoris/harmony-v1.1"

	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForSequenceClassification.from_pretrained(model_id)

	text = "заткнись"  # Russian: "shut up"

	# Tokenize a single message; inputs longer than 128 tokens are truncated.
	inputs = tokenizer(
	    text,
	    return_tensors="pt",
	    truncation=True,
	    padding=True,
	    max_length=128,
	)

	# Inference only, so gradient tracking is disabled.
	with torch.no_grad():
	    outputs = model(**inputs)
	    # Softmax over the two logits: index 0 = safe, index 1 = toxic.
	    probs = torch.softmax(outputs.logits, dim=-1)[0]

	safe_score = probs[0].item()
	toxic_score = probs[1].item()

	# Label as toxic only when the toxic score is strictly higher.
	label = "toxic" if toxic_score > safe_score else "safe"

	print({
	    "label": label,
	    "safe_score": round(safe_score, 4),
	    "toxic_score": round(toxic_score, 4),
	})