---
language: en
license: cc-by-nc-4.0
library_name: transformers
tags:
- moderation
- toxicity
- content-moderation
- safety
- quark
- multi-label-classification
- jigsaw
- hate-speech
- italian-ai
pipeline_tag: text-classification
metrics:
- f1
- macro-f1
base_model: ThingAI/Quark-135m
model_name: Quark-Mod-v0.1
pretty_name: Quark-Mod-v0.1
size_categories: 135M
task_categories:
- text-classification
---

# Quark-Mod-v0.1

<div align="center">

**A 135M parameter content moderation model fine-tuned from Quark-135M**

</div>

---

## Model Overview

| Property | Value |
|----------|-------|
| **Full name** | Quark-Mod-v0.1 |
| **Base model** | [Quark-135M](https://huggingface.co/ThingAI/Quark-135m) (pretrained from scratch) |
| **Architecture** | Decoder-only, GQA (9:3), SwiGLU, RoPE, RMSNorm |
| **Parameters** | 135M |
| **Context length** | 2048 tokens |
| **Task** | Multi-label content moderation (9 classes) |
| **Language** | English (v0.1) |
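
The "GQA (9:3)" entry above can be pictured with a small sketch. It assumes, since the card does not spell this out, that the ratio means 9 query heads sharing 3 key/value heads, and that heads are grouped contiguously as in common grouped-query attention implementations. The constant and function names are hypothetical.

```python
# Assumed reading of "GQA (9:3)": 9 query heads, 3 key/value heads.
NUM_QUERY_HEADS = 9
NUM_KV_HEADS = 3


def kv_head_for(query_head: int) -> int:
    """Return the key/value head index that a given query head shares."""
    group_size = NUM_QUERY_HEADS // NUM_KV_HEADS  # 3 query heads per KV head
    return query_head // group_size


# Map every query head to the key/value head it reads from.
mapping = {q: kv_head_for(q) for q in range(NUM_QUERY_HEADS)}
print(mapping)
# {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2}
```

Under this assumption, the key/value cache holds a third as many heads as full multi-head attention would, which is the usual motivation for grouped-query attention in small models.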

---

## Intended Use

This model is designed to **classify toxic and harmful content** across 9 categories. It is intended for:

- Social media moderation
- Comment filtering systems
- Content safety pipelines
- Research on efficient moderation models

### Limitations

- English only (v0.1)
- May struggle with subtle sarcasm or highly contextual toxicity
- Lower performance on rare classes due to dataset imbalance (for example, F1 of 0.319 on `hate_speech` and 0.372 on `threat`)
- Not recommended for high-stakes decisions without human review
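
One way to act on the last two limitations is to send uncertain or rare-class predictions to a human instead of acting on them automatically. The sketch below is illustrative only: the `route` function, both thresholds, and the choice of "rare" labels (the four classes with F1 below 0.5 in this card's evaluation table) are assumptions, not part of the model or its recommended deployment.

```python
from typing import Dict

# Labels whose reported F1 in this card is below 0.5 (assumed "rare" set).
RARE_LABELS = {"severe_toxic", "threat", "identity_hate", "hate_speech"}


def route(scores: Dict[str, float],
          threshold: float = 0.5,
          auto_threshold: float = 0.9) -> str:
    """Decide what to do with one text given per-label probabilities.

    Both thresholds are hypothetical values chosen for illustration.
    """
    predicted = {label for label, p in scores.items() if p >= threshold}
    if not predicted:
        return "allow"
    # Rare classes are the least reliable, so a person always checks them.
    if predicted & RARE_LABELS:
        return "human_review"
    # Act automatically only when every predicted label is high-confidence.
    if all(scores[label] >= auto_threshold for label in predicted):
        return "auto_flag"
    return "human_review"


print(route({"toxic": 0.95, "insult": 0.10}))  # auto_flag
print(route({"toxic": 0.60}))                  # human_review
print(route({"toxic": 0.95, "threat": 0.70}))  # human_review
print(route({"toxic": 0.05}))                  # allow
```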

---

## Labels (9 classes)

| Label | Description | Training examples |
|-------|-------------|-------------------|
| `toxic` | General toxic content | 32,263 (19.4%) |
| `severe_toxic` | Severe toxicity | 1,423 (0.9%) |
| `obscene` | Obscene/profane language | 7,567 (4.6%) |
| `threat` | Direct threats | 445 (0.3%) |
| `insult` | Insulting content | 7,065 (4.3%) |
| `identity_hate` | Hate targeting identity | 1,263 (0.8%) |
| `hate_speech` | Explicit hate speech | 1,265 (0.8%) |
| `offensive` | Offensive language | 17,274 (10.4%) |

**Note:** This is multi-label classification: multiple classes can be active simultaneously. The table lists eight labels; the ninth class is not named in this card.
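
The multi-label note can be made concrete with a small decoding sketch. Each label gets its own sigmoid, so the labels do not compete the way they would under a softmax. The logits below are made up for illustration, the `decode` helper is hypothetical, and the label order is assumed to follow the table above.

```python
import math

# Label names as listed in this card (order assumed to match the model output).
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult",
          "identity_hate", "hate_speech", "offensive"]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits, threshold=0.5):
    """Return every label whose probability exceeds the threshold."""
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) > threshold]


# Made-up logits for a single input text.
example_logits = [2.1, -3.0, 0.4, -4.2, 1.3, -2.5, -2.8, 0.9]

print(decode(example_logits))                 # ['toxic', 'obscene', 'insult', 'offensive']
print(decode(example_logits, threshold=0.7))  # ['toxic', 'insult', 'offensive']
```

A threshold of 0.5 on the probability is the same test as a logit greater than 0. Raising the threshold per label is one possible way to trade recall for precision.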

---

## Evaluation Results

**Validation set:** 18,436 examples

| Class | F1 Score |
|-------|----------|
| `toxic` | **0.909** |
| `offensive` | **0.938** |
| `obscene` | **0.796** |
| `insult` | **0.721** |
| `severe_toxic` | **0.498** |
| `identity_hate` | **0.415** |
| `hate_speech` | **0.319** |
| `threat` | **0.372** |

| Metric | Score |
|--------|-------|
| **Macro F1** | **0.552** |
| **Validation Loss** | **0.037** |
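
Macro F1 is the unweighted mean of the per-class F1 scores, so it can be checked against the table above. The sketch below uses only the eight scores listed in this card. Averaging over those eight gives 0.621, while dividing the same sum by nine gives the reported 0.552. That suggests, as an inference the card does not confirm, that the average was taken over all nine classes with the unlisted class contributing 0.

```python
# Per-class F1 scores copied from the evaluation table in this card.
per_class_f1 = {
    "toxic": 0.909,
    "offensive": 0.938,
    "obscene": 0.796,
    "insult": 0.721,
    "severe_toxic": 0.498,
    "identity_hate": 0.415,
    "hate_speech": 0.319,
    "threat": 0.372,
}

total = sum(per_class_f1.values())

# Unweighted mean over the eight listed classes.
macro_f1_listed = round(total / len(per_class_f1), 3)

# Same sum divided by nine, i.e. assuming a ninth class with F1 = 0.
macro_f1_nine = round(total / 9, 3)

print(macro_f1_listed)  # 0.621
print(macro_f1_nine)    # 0.552
```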

---

## Usage Example

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load model and tokenizer
model_name = "ThingAI/Quark-Mod"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Label names in output order, as listed in this card. The card names only
# eight of the nine classes, so any extra output index is reported generically.
labels = ["toxic", "severe_toxic", "obscene", "threat", "insult",
          "identity_hate", "hate_speech", "offensive"]

# Predict
def moderate(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
    with torch.no_grad():
        outputs = model(**inputs)
    # A logit above 0 corresponds to a sigmoid probability above 0.5
    predictions = (outputs.logits > 0).int()[0].tolist()
    detected = [labels[i] if i < len(labels) else f"label_{i}"
                for i, v in enumerate(predictions) if v == 1]
    return detected if detected else ["clean"]

# Test
print(moderate("I love this community!"))  # ['clean']
print(moderate("You are an idiot and should die"))  # ['toxic', 'insult']
print(moderate("Nice post, thanks for sharing"))  # ['clean']
```