 ---
+language: en
+license: cc-by-nc-4.0
+library_name: transformers
+tags:
+- moderation
+- toxicity
+- content-moderation
+- safety
+- quark
+- multi-label-classification
+- jigsaw
+- hate-speech
+- italian-ai
+pipeline_tag: text-classification
+metrics:
+- f1
+- macro-f1
+base_model: ThingAI/Quark-135m
+model_name: Quark-Mod-v0.1
+pretty_name: Quark-Mod-v0.1
+size_categories: 135M
+task_categories:
+- text-classification
 ---
+# Quark-Mod-v0.1
+<div align="center">
+
+**A 135M-parameter content moderation model fine-tuned from Quark-135M**
+
+[![ThingsAI](https://img.shields.io/badge/🛡️%20ThingsAI-Research-blue)](https://things-ai.org)
+[![License](https://img.shields.io/badge/License-CC_BY_NC_4.0-lightgrey)](LICENSE)
+[![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](https://python.org)
+[![Transformers](https://img.shields.io/badge/🤗%20Transformers-latest-orange)](https://huggingface.co/docs/transformers)
+
+</div>
+
+---
+## 📋 Model Overview
+| Property | Value |
+|----------|-------|
+| **Full name** | Quark-Mod-v0.1 |
+| **Base model** | [Quark-135M](https://huggingface.co/ThingAI/Quark-135m) (pretrained from scratch) |
+| **Architecture** | Decoder-only, GQA (9:3), SwiGLU, RoPE, RMSNorm |
+| **Parameters** | 135M |
+| **Context length** | 2048 tokens |
+| **Task** | Multi-label content moderation (9 classes) |
+| **Language** | English (v0.1) |
+---
+## 🎯 Intended Use
+This model is designed to **classify toxic and harmful content** across 9 categories. It is intended for:
+- ✅ Social media moderation
+- ✅ Comment filtering systems
+- ✅ Content safety pipelines
+- ✅ Research on efficient moderation models
+### Limitations
+- ⚠️ English only (v0.1)
+- ⚠️ May struggle with subtle sarcasm or highly contextual toxicity
+- ⚠️ Lower performance on rare classes due to dataset imbalance
+- ⚠️ Not recommended for high-stakes decisions without human review
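One way to act on the human-review limitation is to route uncertain or severe predictions to a moderator instead of acting on them automatically. The sketch below is illustrative only: the `triage` function, the `HIGH_STAKES` set, and the `REVIEW_BAND` values are hypothetical choices, not part of the released model, and it assumes you already have one sigmoid probability per label.

```python
# Hypothetical routing policy; thresholds and label grouping are assumptions.
REVIEW_BAND = (0.35, 0.65)  # probabilities in this range count as uncertain
HIGH_STAKES = {"threat", "severe_toxic", "identity_hate", "hate_speech"}


def triage(probabilities):
    """Map {label: probability} to 'allow', 'flag', or 'human_review'."""
    low, high = REVIEW_BAND
    flagged = {label for label, p in probabilities.items() if p > 0.5}
    borderline = any(low <= p <= high for p in probabilities.values())
    # Severe labels and uncertain scores always go to a person.
    if flagged & HIGH_STAKES or borderline:
        return "human_review"
    return "flag" if flagged else "allow"


print(triage({"toxic": 0.95, "insult": 0.90}))
print(triage({"toxic": 0.90, "threat": 0.80}))
print(triage({"toxic": 0.05}))
```

The severe labels are sent to review here because they are also the classes with the fewest training examples, so automatic decisions on them are the least reliable.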
+---
+## 🏷️ Labels (9 classes)
+| Label | Description | Training examples |
+|-------|-------------|-------------------|
+| `toxic` | General toxic content | 32,263 (19.4%) |
+| `severe_toxic` | Severe toxicity | 1,423 (0.9%) |
+| `obscene` | Obscene/profane language | 7,567 (4.6%) |
+| `threat` | Direct threats | 445 (0.3%) |
+| `insult` | Insulting content | 7,065 (4.3%) |
+| `identity_hate` | Hate targeting identity | 1,263 (0.8%) |
+| `hate_speech` | Explicit hate speech | 1,265 (0.8%) |
+| `offensive` | Offensive language | 17,274 (10.4%) |
+
+**Note:** This is multi-label classification, so several labels can be active for the same input.
+
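To make the multi-label behaviour concrete: each label gets its own sigmoid probability and its own yes/no decision, so any number of labels can fire at once. The following is a minimal standard-library sketch with made-up logits; the helper name `decode_multilabel` and the 0.5 threshold are assumptions for illustration.

```python
import math

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult",
          "identity_hate", "hate_speech", "offensive"]


def sigmoid(x):
    """Squash a logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose probability exceeds the threshold."""
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) > threshold]


# Hypothetical logits, one per label, in the order of LABELS.
example_logits = [2.1, -3.0, 0.4, -4.2, 1.3, -2.5, -1.9, 0.8]
active = decode_multilabel(example_logits)
print(active)  # ['toxic', 'obscene', 'insult', 'offensive']
```

With a threshold of 0.5 this is the same decision as checking whether the raw logit is positive. Raising the threshold for a given label trades recall for precision.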
+---
+## 📊 Evaluation Results
+**Validation set:** 18,436 examples
+| Class | F1 Score |
+|-------|----------|
+| `toxic` | **0.909** |
+| `offensive` | **0.938** |
+| `obscene` | **0.796** |
+| `insult` | **0.721** |
+| `severe_toxic` | **0.498** |
+| `identity_hate` | **0.415** |
+| `hate_speech` | **0.319** |
+| `threat` | **0.372** |
+
+| Metric | Score |
+|--------|-------|
+| **Macro F1** | **0.552** |
+| **Validation Loss** | **0.037** |
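Macro F1 is the unweighted mean of the per-class F1 scores, so it can be rechecked from the table above. The sketch below does that arithmetic. The eight listed scores average to about 0.621, while their sum divided by nine is about 0.552, which matches the reported value. One possible reading (an assumption, not confirmed by this card) is that the reported figure averages over nine classes, with a class that is not listed contributing zero.

```python
# Per-class F1 scores copied from the evaluation table.
per_class_f1 = {
    "toxic": 0.909,
    "offensive": 0.938,
    "obscene": 0.796,
    "insult": 0.721,
    "severe_toxic": 0.498,
    "identity_hate": 0.415,
    "hate_speech": 0.319,
    "threat": 0.372,
}


def macro_f1(scores, num_classes=None):
    """Unweighted mean of F1 scores; missing classes count as zero."""
    scores = list(scores)
    n = num_classes if num_classes is not None else len(scores)
    return sum(scores) / n


mean_over_listed = macro_f1(per_class_f1.values())
mean_over_nine = macro_f1(per_class_f1.values(), num_classes=9)  # assumption
print(f"{mean_over_listed:.3f} {mean_over_nine:.3f}")
```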
+---
+## 🚀 Usage Example
+```python
+import torch
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+# Load model and tokenizer from the Hub
+model_name = "ThingAI/Quark-Mod"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+
+# Label names, assumed to follow the order of the model's output logits
+labels = ["toxic", "severe_toxic", "obscene", "threat", "insult",
+          "identity_hate", "hate_speech", "offensive"]
+
+def moderate(text):
+    """Return the labels whose logit is positive, or ["clean"] if none is."""
+    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
+    with torch.no_grad():
+        outputs = model(**inputs)
+    # A logit above 0 is equivalent to a sigmoid probability above 0.5
+    predictions = (outputs.logits > 0).int()[0]
+    # zip stops at the shorter sequence, so an output without a name in
+    # `labels` is ignored instead of raising an IndexError
+    detected = [name for name, v in zip(labels, predictions) if v == 1]
+    return detected if detected else ["clean"]
+
+# Test
+print(moderate("I love this community!"))        # ['clean']
+print(moderate("You are an idiot and should die")) # ['toxic', 'insult']
+print(moderate("Nice post, thanks for sharing")) # ['clean']