# Floxoris Harmony v0

**Floxoris Harmony v0** is a lightweight binary toxicity moderation model for **Russian and Ukrainian** text. It is designed for fast, low-cost inference in production environments such as Telegram bots, AI assistants, chat filters, and message pre-moderation pipelines.

Built on top of [`gravitee-io/bert-tiny-toxicity`](https://huggingface.co/gravitee-io/bert-tiny-toxicity), the model focuses on practical toxicity detection with a very small footprint of roughly **40-50 MB**, making it suitable for lightweight deployment scenarios.

## Features

- Binary toxicity moderation
- Supports **Russian** and **Ukrainian**
- Very small and fast for inference
- Suitable for real-time moderation pipelines
- Easy to deploy in lightweight production systems
- Designed for Telegram bots, assistants, and chat filtering

## Model Details

- **Task:** Binary text classification
- **Base model:** `gravitee-io/bert-tiny-toxicity`
- **Languages:** Russian, Ukrainian
- **Classes:** `not_toxic`, `toxic`
- **Model size:** ~40-50 MB
- **License:** Apache License 2.0

## Labels

The model returns one of two classes:

- `0` = `not_toxic`
- `1` = `toxic`

## Training Details

The model was fine-tuned for binary toxicity classification on a merged multilingual moderation dataset built from:

- `ru.parquet`
- `uk.parquet`
- `big-ru.parquet`

### Data Correction

In `big-ru.parquet`, the labels were originally inverted relative to the convention used by this model:

- `0` = toxic
- `1` = safe

This issue was corrected before final training.
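
The correction described above amounts to swapping `0` and `1` in the affected file. The sketch below illustrates the idea on plain Python dicts to stay dependency-free; the `label` key, the `text` key, and the helper name `flip_binary_labels` are assumptions, not names confirmed by the dataset.

```python
def flip_binary_labels(rows, label_key="label"):
    """Return new rows with binary labels inverted (0 <-> 1).

    `label_key` is a hypothetical field name; adjust it to the real schema.
    """
    flipped = []
    for row in rows:
        label = row[label_key]
        if label not in (0, 1):
            raise ValueError(f"expected a binary label, got {label!r}")
        # 1 - label maps 0 -> 1 and 1 -> 0 without touching other fields.
        flipped.append({**row, label_key: 1 - label})
    return flipped


# Hypothetical rows in the inverted convention (0 = toxic, 1 = safe).
big_ru_rows = [
    {"text": "example toxic message", "label": 0},
    {"text": "example safe message", "label": 1},
]

fixed_rows = flip_binary_labels(big_ru_rows)
print([row["label"] for row in fixed_rows])  # [1, 0]
```

With a pandas DataFrame, the same flip could be written as `df["label"] = 1 - df["label"]`, again assuming a column named `label`.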

### Final Dataset

After label correction, the datasets were merged, cleaned, and balanced.

- **Total rows:** 122,254 (61,127 × 2)
- **Toxic:** 61,127
- **Not toxic:** 61,127
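
The card does not state how the classes were balanced. One common approach is to downsample every class to the size of the smallest one, sketched below on toy data. The function name, the `label` key, and the fixed seed are all illustrative assumptions rather than the procedure actually used for this model.

```python
import random
from collections import defaultdict


def balance_by_downsampling(rows, label_key="label", seed=42):
    """Downsample every class to the size of the smallest class."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)  # avoid long runs of a single class
    return balanced


# Toy data: 5 toxic rows and 8 non-toxic rows.
rows = [{"text": f"toxic {i}", "label": 1} for i in range(5)]
rows += [{"text": f"safe {i}", "label": 0} for i in range(8)]

balanced = balance_by_downsampling(rows)
counts = {label: sum(r["label"] == label for r in balanced) for label in (0, 1)}
print(counts)  # {0: 5, 1: 5}
```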

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "floxoris/harmony-v0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

id2label = {
    0: "not_toxic",
    1: "toxic",
}

texts = [
    "дарова, как день?",  # informal greeting: "hey, how's your day?"
    "ты дибил?",  # insult (misspelled): "are you a moron?"
]

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)
    preds = torch.argmax(probs, dim=-1)

for text, pred, prob in zip(texts, preds, probs):
    label = id2label[pred.item()]
    confidence = prob[pred.item()].item()
    print(f"{text} -> {label} ({confidence:.4f})")
```

## Example Outputs

Example model behavior on simple test inputs:

```text
"дарова, как день?"
-> not_toxic (~0.91)

"ты дибил?"
-> toxic (~0.80)
```

These examples are illustrative and should not be treated as a full benchmark.

## Intended Use

Floxoris Harmony v0 is intended for fast and lightweight toxicity moderation in:

- Telegram bots
- AI assistants
- Chat filtering systems
- Message pre-moderation pipelines
- Lightweight production deployments

Typical use cases include:

- filtering incoming user messages before they reach a model or operator
- flagging potentially toxic content for review
- reducing moderation cost in high-volume chat environments
- adding a first-pass safety layer to conversational systems
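
As a rough illustration of the "flag for review" use case, the probability of the `toxic` class (index `1`) can be mapped to an action with two thresholds. The function `route_message`, the action names, and the threshold values `0.5` and `0.9` are hypothetical; suitable values would need to be tuned on your own data.

```python
def route_message(toxic_prob, review_threshold=0.5, block_threshold=0.9):
    """Map a toxic-class probability to a moderation action.

    Both thresholds are illustrative defaults, not values from the model card.
    """
    if not 0.0 <= toxic_prob <= 1.0:
        raise ValueError("toxic_prob must be between 0 and 1")
    if review_threshold > block_threshold:
        raise ValueError("review_threshold must not exceed block_threshold")

    if toxic_prob >= block_threshold:
        return "block"   # confident enough to reject automatically
    if toxic_prob >= review_threshold:
        return "review"  # uncertain: hand over to a human moderator
    return "allow"


for prob in (0.09, 0.80, 0.97):
    print(prob, route_message(prob))
```

Keeping a middle "review" band is one way to limit automated false positives while still catching the most obvious cases without human involvement.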

## Limitations

- This is a **binary** moderation model and does not classify toxicity types
- It may miss subtle harassment, sarcasm, or context-dependent abuse
- It may produce false positives on slang, irony, or emotionally charged messages
- Performance may degrade on domain-specific jargon, mixed-language text, or heavily misspelled input
- It is intended as a lightweight moderation layer, not a full safety system
- Human review is still recommended for high-stakes moderation decisions
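
Because of these limitations, it may be worth measuring false positives and false negatives on a small labeled sample from your own domain before deployment. The sketch below computes precision, recall, and F1 for the `toxic` class in plain Python; the helper name and the toy labels are made up for illustration and are not benchmark results for this model.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (toxic) class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positives": fp,
        "false_negatives": fn,
    }


# Toy labels only: 1 = toxic, 0 = not_toxic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```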

## License

This model is released under the **Apache License 2.0**.

## Future Versions

Planned directions for future releases:

- **v1:** improved accuracy and calibration
- **v2:** broader multilingual coverage and more robust edge-case handling
- Future iterations may add better handling of slang and implicit toxicity, as well as context-aware moderation

## Summary

Floxoris Harmony v0 is a compact toxicity moderation model optimized for practical deployment where **speed, cost, and simplicity** matter. It is best suited as a lightweight first-stage moderation component for Russian and Ukrainian text pipelines.