---
language: en
license: cc-by-nc-4.0
library_name: transformers
tags:
- moderation
- toxicity
- content-moderation
- safety
- quark
- multi-label-classification
- jigsaw
- hate-speech
- italian-ai
pipeline_tag: text-classification
metrics:
- f1
- macro-f1
base_model: ThingAI/Quark-135m
model_name: Quark-Mod-v0.1
pretty_name: Quark-Mod-v0.1
size_categories: 135M
task_categories:
- text-classification
---
# Quark-Mod-v0.1
**A 135M-parameter content moderation model fine-tuned from Quark-135M**

[Website](https://things-ai.org) · [License](LICENSE) · [Python](https://python.org) · [Transformers](https://huggingface.co/docs/transformers)

---
## 📋 Model Overview
| Property | Value |
|----------|-------|
| **Full name** | Quark-Mod-v0.1 |
| **Base model** | [Quark-135M](https://huggingface.co/ThingAI/Quark-135m) (pretrained from scratch) |
| **Architecture** | Decoder-only, GQA (9:3), SwiGLU, RoPE, RMSNorm |
| **Parameters** | 135M |
| **Context length** | 2048 tokens |
| **Task** | Multi-label content moderation (9 classes) |
| **Language** | English (v0.1) |
---
## 🎯 Intended Use
This model is designed to **classify toxic and harmful content** across 9 categories. It is intended for:
- ✅ Social media moderation
- ✅ Comment filtering systems
- ✅ Content safety pipelines
- ✅ Research on efficient moderation models
### Limitations
- ⚠️ English only (v0.1)
- ⚠️ May struggle with subtle sarcasm or highly contextual toxicity
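- ⚠️ Lower performance on rare classes (validation F1 between 0.319 and 0.498 for `severe_toxic`, `identity_hate`, `hate_speech`, and `threat`), attributed to dataset imbalance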
- ⚠️ Not recommended for high-stakes decisions without human review
---
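The last limitation above recommends human review for high-stakes decisions. One common way to build that in is a two-threshold triage: labels with a high score are flagged automatically, and labels in a middle band are queued for a reviewer. The sketch below is illustrative only. The names `triage`, `REVIEW_AT`, and `FLAG_AT` are hypothetical, and the threshold values are placeholders rather than settings tuned for this model.

```python
from typing import Dict, List

# Hypothetical thresholds on per-label probabilities (placeholders, not tuned values).
REVIEW_AT = 0.30
FLAG_AT = 0.80

def triage(probs: Dict[str, float]) -> Dict[str, List[str]]:
    """Split labels into an auto-flag bucket and a human-review bucket."""
    result: Dict[str, List[str]] = {"flag": [], "review": []}
    for label, p in probs.items():
        if p >= FLAG_AT:
            result["flag"].append(label)      # confident enough to act on
        elif p >= REVIEW_AT:
            result["review"].append(label)    # uncertain: send to a person
    return result

# Made-up probabilities for three labels.
print(triage({"toxic": 0.92, "threat": 0.45, "obscene": 0.05}))
# {'flag': ['toxic'], 'review': ['threat']}
```

A wider review band may be worth considering for the rare classes, where the reported F1 is lowest.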
## 🏷️ Labels (9 classes)
| Label | Description | Training examples (% of training set) |
|-------|-------------|-------------------|
| `toxic` | General toxic content | 32,263 (19.4%) |
| `severe_toxic` | Severe toxicity | 1,423 (0.9%) |
| `obscene` | Obscene/profane language | 7,567 (4.6%) |
| `threat` | Direct threats | 445 (0.3%) |
| `insult` | Insulting content | 7,065 (4.3%) |
| `identity_hate` | Hate targeting identity | 1,263 (0.8%) |
| `hate_speech` | Explicit hate speech | 1,265 (0.8%) |
| `offensive` | Offensive language | 17,274 (10.4%) |

**Note:** This is a multi-label task, so several classes can be active for the same input.

---
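In a multi-label setup each class is scored on its own, so the scores do not compete and need not sum to 1, unlike a single-label classifier that picks exactly one class. The sketch below illustrates this with made-up logits. It assumes one logit per label passed through a sigmoid, which is the usual multi-label arrangement but is not stated explicitly in this card; `active_labels` is a hypothetical helper.

```python
import math
from typing import List, Sequence

def active_labels(logits: Sequence[float], names: Sequence[str],
                  threshold: float = 0.5) -> List[str]:
    """Score each label independently and keep those at or above the threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # sigmoid per label
    return [name for name, p in zip(names, probs) if p >= threshold]

# Made-up logits for three of the labels above.
names = ["toxic", "insult", "threat"]
logits = [2.0, 0.4, -1.5]
print(active_labels(logits, names))  # ['toxic', 'insult']
```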
## 📊 Evaluation Results
**Validation set:** 18,436 examples

| Class | F1 Score |
|-------|----------|
| `toxic` | **0.909** |
| `offensive` | **0.938** |
| `obscene` | **0.796** |
| `insult` | **0.721** |
| `severe_toxic` | **0.498** |
| `identity_hate` | **0.415** |
| `hate_speech` | **0.319** |
| `threat` | **0.372** |

| Metric | Score |
|--------|-------|
| **Macro F1** | **0.552** |
| **Validation Loss** | **0.037** |
---
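Macro F1 is the unweighted mean of the per-class F1 scores, so rare classes weigh as much as frequent ones. The check below recomputes it from the per-class table. The mean of the eight listed scores is 0.621, while the reported 0.552 matches the same sum divided by 9. That would be consistent with the macro average covering a ninth class that scores 0, but this reading is an inference from the numbers, not something the card states.

```python
# Per-class validation F1 scores copied from the table above.
f1 = {
    "toxic": 0.909, "offensive": 0.938, "obscene": 0.796, "insult": 0.721,
    "severe_toxic": 0.498, "identity_hate": 0.415, "hate_speech": 0.319, "threat": 0.372,
}

total = sum(f1.values())
mean_over_listed = total / len(f1)  # average over the 8 listed classes
mean_over_nine = total / 9          # assumes a ninth class contributing F1 = 0

print(round(mean_over_listed, 3))  # 0.621
print(round(mean_over_nine, 3))    # 0.552
```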
## 🚀 Usage Example
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load model and tokenizer
model_name = "ThingsAI/Quark-Mod-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode (disables dropout)

# Label names, assumed to follow the order of the model's output logits
labels = ["toxic", "severe_toxic", "obscene", "threat", "insult",
          "identity_hate", "hate_speech", "offensive"]

# Predict
def moderate(text):
    """Return the labels detected in `text`, or ["clean"] if none fire."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
    with torch.no_grad():
        outputs = model(**inputs)
    # A logit above 0 corresponds to a sigmoid probability above 0.5
    predictions = (outputs.logits > 0).int()[0]
    # Skip any output index that has no name in `labels`
    detected = [labels[i] for i, v in enumerate(predictions)
                if v == 1 and i < len(labels)]
    return detected if detected else ["clean"]

# Test
print(moderate("I love this community!"))  # ['clean']
print(moderate("You are an idiot and should die"))  # ['toxic', 'insult']
print(moderate("Nice post, thanks for sharing"))  # ['clean']