Floxoris Harmony v1.1

Floxoris Harmony v1.1 is a lightweight moderation model for fast binary toxicity detection in Russian and Ukrainian text.

This version continues fine-tuning from Floxoris Harmony v1, with a focus on detecting mild toxicity, short rude phrases, and everyday aggressive messages while keeping the model compact and fast.

The model is intended for scenarios where low latency, small size, and simple deployment matter, such as Telegram bots, chat moderation systems, AI assistants, community tools, and first-pass safety filters.

What Is New In v1.1

Compared to Floxoris Harmony v1, this release focuses on better handling of short and mild toxic messages.

Targeted improvements:

  • better detection of short rude phrases
  • improved sensitivity to mild toxicity
  • stronger Russian and Ukrainian moderation behavior
  • better handling of direct insults and aggressive commands

Unchanged from v1:

  • fast binary classification
  • the same two output labels: safe and toxic

This version was trained as a behavioral patch to v1, not as a new architecture.

Model Task

The model performs binary text classification:

| Class | Label |
|-------|-------|
| 0     | safe  |
| 1     | toxic |

It is designed to answer a simple question:

Is this message safe or toxic?
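
As a rough illustration of how the two classes relate to the model's raw output, the sketch below turns a pair of logits into a label using the class table above. The helper name `logits_to_label` and the sample logits are made up for this example; only the 0 = safe, 1 = toxic mapping comes from this model card.

```python
import math

# Class index to label, as listed in the table above.
ID2LABEL = {0: "safe", 1: "toxic"}


def logits_to_label(logits):
    """Apply softmax to two raw scores and return (label, probabilities).

    Hypothetical helper for illustration; not part of the released model.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Pick the index with the highest probability.
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[class_id], probs


# Made-up logits, only to show the call shape.
label, probs = logits_to_label([-1.2, 2.3])
print(label, [round(p, 4) for p in probs])
```

With the real model, the logits would come from `outputs.logits` as in the usage example further down.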

Intended Use

Floxoris Harmony v1.1 is suitable for:

  • Telegram bot moderation
  • chat message filtering
  • AI assistant safety checks
  • community moderation tools
  • lightweight API moderation
  • first-stage toxicity detection
  • Russian/Ukrainian text safety classification

It works best as a fast first-pass classifier before more complex moderation logic.
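
One way to wire a first-pass classifier into a larger pipeline is to route each message by its toxic score: clear cases are handled immediately, and borderline ones are passed to slower logic or a human. The sketch below assumes this setup; the function name `route_message`, the action names, and the thresholds 0.5 and 0.9 are illustrative assumptions, not values recommended by this model card, and should be tuned on your own data.

```python
def route_message(toxic_score, review_at=0.5, block_at=0.9):
    """Map a toxic-class probability to a first-pass moderation action.

    Hypothetical routing policy; the thresholds are placeholders.
    """
    if not 0.0 <= toxic_score <= 1.0:
        raise ValueError("toxic_score must be a probability in [0, 1]")
    if toxic_score >= block_at:
        return "block"   # confident enough to act without a second stage
    if toxic_score >= review_at:
        return "review"  # hand off to heavier moderation logic or a human
    return "allow"


# Example scores, invented to show the three outcomes.
for score in (0.05, 0.62, 0.97):
    print(score, route_message(score))
```

Keeping the thresholds as parameters makes it easy to trade false positives for false negatives per chat or community.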

Example Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "floxoris/harmony-v1.1"

# Load the tokenizer and the classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "заткнись"  # "shut up"

# Tokenize a single message, truncating to at most 128 tokens.
inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    padding=True,
    max_length=128
)

# Run inference without tracking gradients and convert logits to probabilities.
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)[0]

# Index 0 is the safe class, index 1 is the toxic class.
safe_score = probs[0].item()
toxic_score = probs[1].item()

label = "toxic" if toxic_score > safe_score else "safe"

print({
    "label": label,
    "safe_score": round(safe_score, 4),
    "toxic_score": round(toxic_score, 4)
})
```