Floxoris Harmony v0
Floxoris Harmony v0 is a lightweight binary toxicity classifier for moderating Russian and Ukrainian text. It is designed for fast, low-cost inference in production environments such as Telegram bots, AI assistants, chat filters, and message pre-moderation pipelines.
Built on top of gravitee-io/bert-tiny-toxicity, the model focuses on practical toxicity detection with a very small footprint of roughly 40-50 MB, making it suitable for lightweight deployment scenarios.
Features
- Binary toxicity moderation
- Supports Russian and Ukrainian
- Very small and fast for inference
- Suitable for real-time moderation pipelines
- Easy to deploy in lightweight production systems
- Designed for Telegram bots, assistants, and chat filtering
Model Details
- Task: Binary text classification
- Base model: gravitee-io/bert-tiny-toxicity
- Languages: Russian, Ukrainian
- Classes: not_toxic, toxic
- Model size: ~40-50 MB
- License: Apache License 2.0
Labels
The model returns one of two classes:
- 0 = not_toxic
- 1 = toxic
Training Details
The model was fine-tuned for binary toxicity classification on a merged multilingual moderation dataset built from:
- ru.parquet
- uk.parquet
- big-ru.parquet
Data Correction
In big-ru.parquet, labels were originally inverted:
- 0 = toxic
- 1 = safe
This issue was corrected before final training.
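A minimal sketch of what such a correction can look like, assuming each row is a dict with a hypothetical `label` field (the actual column names in big-ru.parquet are not stated here). The helper `flip_binary_labels` is illustrative, not the code used for training. It inverts 0 and 1 so that 1 means toxic, matching the model's label scheme.

```python
def flip_binary_labels(rows, label_key="label"):
    """Return copies of the rows with binary labels inverted (0 -> 1, 1 -> 0)."""
    flipped = []
    for row in rows:
        label = row[label_key]
        if label not in (0, 1):
            raise ValueError(f"expected a binary label, got {label!r}")
        fixed = dict(row)  # copy so the input rows are left untouched
        fixed[label_key] = 1 - label
        flipped.append(fixed)
    return flipped


# Hypothetical rows using the inverted scheme (0 = toxic, 1 = safe).
big_ru_rows = [
    {"text": "example toxic message", "label": 0},
    {"text": "example safe message", "label": 1},
]

# After the flip, 1 = toxic and 0 = not_toxic.
corrected = flip_binary_labels(big_ru_rows)
```

Applying the flip to only the affected file before merging keeps the other two sources unchanged.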
Final Dataset
After label correction, the datasets were merged, cleaned, and balanced.
- Total rows: ~122,000
- Toxic: 61,127
- Safe / Not toxic: 61,127
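The balancing method is not described above. One common way to reach equal class counts is to downsample every class to the size of the smallest one; the sketch below assumes that approach, along with a hypothetical `label` field and helper name.

```python
import random
from collections import defaultdict


def balance_by_downsampling(rows, label_key="label", seed=0):
    """Downsample every class to the size of the smallest class."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    if not by_label:
        return []

    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible

    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)  # avoid long runs of a single class
    return balanced


# Hypothetical imbalanced data: 5 toxic rows and 8 safe rows.
rows = [{"text": f"toxic {i}", "label": 1} for i in range(5)]
rows += [{"text": f"safe {i}", "label": 0} for i in range(8)]

balanced = balance_by_downsampling(rows)
toxic_count = sum(1 for r in balanced if r["label"] == 1)
safe_count = sum(1 for r in balanced if r["label"] == 0)
```

With equal counts per class, plain accuracy becomes a more meaningful metric than it would be on a skewed dataset.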
Example Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "floxoris/harmony-v0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Class index -> label name
id2label = {
    0: "not_toxic",
    1: "toxic",
}

texts = [
    "дарова, как день?",  # informal greeting: "hey, how's your day?"
    "ты дибил?",  # misspelled insult: "are you a moron?"
]

# Tokenize the batch, padding to the longest text
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Inference only, so gradients are not needed
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)  # per-class probabilities
preds = torch.argmax(probs, dim=-1)  # most likely class per text

for text, pred, prob in zip(texts, preds, probs):
    label = id2label[pred.item()]
    confidence = prob[pred.item()].item()
    print(f"{text} -> {label} ({confidence:.4f})")
Example Outputs
Example model behavior on simple test inputs:
"дарова, как день?" -> not_toxic (~0.91)
"ты дибил?" -> toxic (~0.80)
These examples are illustrative and should not be treated as a full benchmark.
Intended Use
Floxoris Harmony v0 is intended for fast, lightweight toxicity moderation in:
- Telegram bots
- AI assistants
- Chat filtering systems
- Message pre-moderation pipelines
- Lightweight production deployments
Typical use cases include:
- filtering incoming user messages before they reach a model or operator
- flagging potentially toxic content for review
- reducing moderation cost in high-volume chat environments
- adding a first-pass safety layer to conversational systems
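As a rough illustration of the first-pass filtering and flag-for-review use cases, the sketch below routes a message based on the model's probability for the toxic class. The function name and both thresholds are hypothetical and untuned; in practice they would be chosen on held-out data for the target chat environment.

```python
def route_message(toxic_prob, review_threshold=0.5, block_threshold=0.9):
    """Map the toxic-class probability to a moderation action.

    Returns "allow", "review" (send to a human moderator), or "block".
    Both thresholds are illustrative assumptions, not tuned values.
    """
    if not 0.0 <= toxic_prob <= 1.0:
        raise ValueError("toxic_prob must be between 0 and 1")
    if review_threshold > block_threshold:
        raise ValueError("review_threshold must not exceed block_threshold")

    if toxic_prob >= block_threshold:
        return "block"
    if toxic_prob >= review_threshold:
        return "review"
    return "allow"


# toxic_prob would come from the softmax output for class index 1 (toxic).
actions = [route_message(p) for p in (0.09, 0.80, 0.95)]
```

Keeping a middle "review" band lets uncertain predictions go to a human instead of being silently blocked or allowed.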
Limitations
- This is a binary moderation model and does not classify toxicity types
- It may miss subtle harassment, sarcasm, or context-dependent abuse
- It may produce false positives on slang, irony, or emotionally charged messages
- Performance may degrade on domain-specific jargon, mixed-language text, or heavily misspelled input
- It is intended as a lightweight moderation layer, not a full safety system
- Human review is still recommended for high-stakes moderation decisions
License
This model is released under the Apache License 2.0.
Future Versions
Planned directions for future releases:
- v1: improved accuracy and calibration
- v2: broader multilingual coverage and more robust edge-case handling
- future iterations may include better handling of slang, implicit toxicity, and context-aware moderation
Summary
Floxoris Harmony v0 is a compact toxicity moderation model optimized for practical deployment where speed, cost, and simplicity matter. It is best suited as a lightweight first-stage moderation component for Russian and Ukrainian text pipelines.