ThingsAI committed on
Commit e79d18b · verified · 1 Parent(s): 89c5b42

Update README.md

Files changed (1)
  1. README.md +137 -1
README.md CHANGED
@@ -1,3 +1,139 @@
  ---
- license: apache-2.0
+ language: en
+ license: cc-by-nc-4.0
+ library_name: transformers
+ tags:
+ - moderation
+ - toxicity
+ - content-moderation
+ - safety
+ - quark
+ - multi-label-classification
+ - jigsaw
+ - hate-speech
+ - italian-ai
+ pipeline_tag: text-classification
+ metrics:
+ - f1
+ - macro-f1
+ base_model: ThingAI/Quark-135m
+ model_name: Quark-Mod-v0.1
+ pretty_name: Quark-Mod-v0.1
+ size_categories: 135M
+ task_categories:
+ - text-classification
  ---
+
+ # Quark-Mod-v0.1
+
+ <div align="center">
+
+ **A 135M-parameter content moderation model fine-tuned from Quark-135M**
+
+ [![ThingsAI](https://img.shields.io/badge/🛡️%20ThingsAI-Research-blue)](https://things-ai.org)
+ [![License](https://img.shields.io/badge/License-CC_BY_NC_4.0-lightgrey)](LICENSE)
+ [![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](https://python.org)
+ [![Transformers](https://img.shields.io/badge/🤗%20Transformers-latest-orange)](https://huggingface.co/docs/transformers)
+
+ </div>
+
+ ---
+
+ ## 📋 Model Overview
+
+ | Property | Value |
+ |----------|-------|
+ | **Full name** | Quark-Mod-v0.1 |
+ | **Base model** | [Quark-135M](https://huggingface.co/ThingAI/Quark-135m) (pretrained from scratch) |
+ | **Architecture** | Decoder-only, GQA (9:3), SwiGLU, RoPE, RMSNorm |
+ | **Parameters** | 135M |
+ | **Context length** | 2048 tokens |
+ | **Task** | Multi-label content moderation (9 classes) |
+ | **Language** | English (v0.1) |
+
+ ---
+
+ ## 🎯 Intended Use
+
+ This model is designed to **classify toxic and harmful content** across 9 categories. It is intended for:
+
+ - ✅ Social media moderation
+ - ✅ Comment filtering systems
+ - ✅ Content safety pipelines
+ - ✅ Research on efficient moderation models
+
+ ### Limitations
+ - ⚠️ English only (v0.1)
+ - ⚠️ May struggle with subtle sarcasm or highly contextual toxicity
+ - ⚠️ Lower performance on rare classes due to dataset imbalance
+ - ⚠️ Not recommended for high-stakes decisions without human review
+
+ ---
+
+ ## 🏷️ Labels (9 classes)
+
+ | Label | Description | Training examples |
+ |-------|-------------|-------------------|
+ | `toxic` | General toxic content | 32,263 (19.4%) |
+ | `severe_toxic` | Severe toxicity | 1,423 (0.9%) |
+ | `obscene` | Obscene/profane language | 7,567 (4.6%) |
+ | `threat` | Direct threats | 445 (0.3%) |
+ | `insult` | Insulting content | 7,065 (4.3%) |
+ | `identity_hate` | Hate targeting identity | 1,263 (0.8%) |
+ | `hate_speech` | Explicit hate speech | 1,265 (0.8%) |
+ | `offensive` | Offensive language | 17,274 (10.4%) |
+
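The percentages in the label table show how skewed the training data is. As a rough illustration, the snippet below converts them into negative-to-positive ratios, a quantity that is often used as a per-label positive weight in a binary cross-entropy loss. The `positive_rate` dictionary and the `neg_pos_ratio` helper are illustrative names, and nothing here is confirmed to be part of the actual training recipe.

```python
# Share of training examples that are positive for each label,
# copied from the percentages in the label table.
positive_rate = {
    "toxic": 0.194,
    "severe_toxic": 0.009,
    "obscene": 0.046,
    "threat": 0.003,
    "insult": 0.043,
    "identity_hate": 0.008,
    "hate_speech": 0.008,
    "offensive": 0.104,
}


def neg_pos_ratio(rate: float) -> float:
    """Number of negative examples per positive example for one label."""
    return (1.0 - rate) / rate


ratios = {label: neg_pos_ratio(rate) for label, rate in positive_rate.items()}

# Print labels from most to least imbalanced.
for label, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<14} {ratio:7.1f} negatives per positive")
```

`threat` comes out at roughly 332 negatives per positive, against about 4 for `toxic`.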
+ **Note:** Multi-label classification: multiple classes can be active simultaneously.
+
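Because each label is an independent yes/no decision, multi-label classifiers are typically read out with one sigmoid per logit rather than a softmax over all labels. The sketch below uses made-up logits to show that thresholding the sigmoid probability at 0.5 is the same as checking whether the logit is positive. The label subset, the logit values, and the `active_labels` helper are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def active_labels(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every label whose sigmoid probability exceeds the threshold."""
    return [name for name, logit in logits.items() if sigmoid(logit) > threshold]


# Made-up logits for a single comment (not real model output).
example_logits = {"toxic": 3.0, "insult": 0.4, "threat": -3.1, "obscene": -0.2}

print(active_labels(example_logits))                 # default cut-off of 0.5
print(active_labels(example_logits, threshold=0.9))  # stricter cut-off
```

Raising the threshold above 0.5 flags fewer labels, which trades recall for precision.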
+ ---
+
+ ## 📊 Evaluation Results
+
+ **Validation set:** 18,436 examples
+
+ | Class | F1 Score |
+ |-------|----------|
+ | `toxic` | **0.909** |
+ | `offensive` | **0.938** |
+ | `obscene` | **0.796** |
+ | `insult` | **0.721** |
+ | `severe_toxic` | **0.498** |
+ | `identity_hate` | **0.415** |
+ | `hate_speech` | **0.319** |
+ | `threat` | **0.372** |
+
+ | Metric | Score |
+ |--------|-------|
+ | **Macro F1** | **0.552** |
+ | **Validation Loss** | **0.037** |
+
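Macro F1 is the unweighted mean of the per-class F1 scores, so it can be checked directly against the per-class table. The arithmetic below is a sketch: the eight listed scores average to about 0.621, while dividing their sum by nine gives the reported 0.552. That would be consistent with a ninth class scoring 0, but this is an inference from the numbers, not something the card states.

```python
# Per-class F1 scores copied from the evaluation table.
per_class_f1 = {
    "toxic": 0.909,
    "offensive": 0.938,
    "obscene": 0.796,
    "insult": 0.721,
    "severe_toxic": 0.498,
    "identity_hate": 0.415,
    "hate_speech": 0.319,
    "threat": 0.372,
}

total = sum(per_class_f1.values())

# Unweighted mean over the eight classes that are listed.
mean_over_listed = total / len(per_class_f1)

# Same sum divided by nine, the class count stated on the card
# (assumes the unlisted class contributes an F1 of 0).
mean_over_nine = total / 9

print(f"mean over 8 listed classes: {mean_over_listed:.3f}")
print(f"sum divided by 9 classes:   {mean_over_nine:.3f}")
```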
110
+ ---
111
+
112
+ ## πŸš€ Usage Example
113
+
114
+ ```python
115
+ import torch
116
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
117
+
118
+ # Load model and tokenizer
119
+ model_name = "ThingsAI/Quark-Mod-v0.1"
120
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
121
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
122
+
123
+ # Labels
124
+ labels = ["toxic", "severe_toxic", "obscene", "threat", "insult",
125
+ "identity_hate", "hate_speech", "offensive"]
126
+
127
+ # Predict
128
+ def moderate(text):
129
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
130
+ with torch.no_grad():
131
+ outputs = model(**inputs)
132
+ predictions = (outputs.logits > 0).int()[0]
133
+ detected = [labels[i] for i, v in enumerate(predictions) if v == 1]
134
+ return detected if detected else ["clean"]
135
+
136
+ # Test
137
+ print(moderate("I love this community!")) # ['clean']
138
+ print(moderate("You are an idiot and should die")) # ['toxic', 'insult']
139
+ print(moderate("Nice post, thanks for sharing")) # ['clean']