File size: 7,903 Bytes
1aabcde
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
---

language: en
license: apache-2.0
tags:
- toxicity-detection
- text-classification
- dendritic-optimization
- perforated-backpropagation
- bert-tiny
- efficient-ml
datasets:
- jigsaw-toxic-comment-classification-challenge
metrics:
- f1
- accuracy
- precision
- recall
model-index:
- name: dendritic-bert-tiny-toxicity
  results:
  - task:
      type: text-classification
      name: Toxicity Detection
    dataset:
      name: Civil Comments
      type: jigsaw_toxicity_pred
    metrics:
    - type: f1
      value: 0.358
      name: F1 Score (Toxic Class)
    - type: accuracy
      value: 0.918
      name: Accuracy
    - type: inference_time
      value: 2.25
      name: Inference Latency (ms)
---


# Dendritic BERT-Tiny for Toxicity Detection

## Model Description

This model applies **Perforated Backpropagation with Dendritic Optimization** to enhance a compact BERT-Tiny model (4.8M parameters) for toxicity classification. It achieves performance comparable to BERT-Base (109M parameters) while maintaining **17.8x faster inference speed**.

**Key Features:**
-  **22.8x smaller** than BERT-Base (4.8M vs 109M parameters)
-  **17.8x faster** inference (2.25ms vs 40ms)
-  **Superior toxic detection** (F1=0.358 vs BERT-Base F1=0.05)
-  **Dendritic optimization** using PerforatedAI
-  **Edge-ready** for real-time deployment

## Model Details

- **Model Type:** BERT-Tiny with Dendritic Optimization
- **Language:** English
- **Task:** Binary Text Classification (Toxic/Non-Toxic)
- **Base Model:** [prajjwal1/bert-tiny](https://huggingface.co/prajjwal1/bert-tiny)
- **Framework:** PyTorch 2.9.1 + PerforatedAI 3.0.7
- **Parameters:** 4,798,468
- **Model Size:** 19.3 MB

## Performance Metrics

| Metric | Dendritic BERT-Tiny | BERT-Base | Improvement |
|--------|---------------------|-----------|-------------|
| **Parameters** | 4.8M | 109M | 22.8x smaller |
| **F1 Score (Toxic)** | 0.358 | 0.050 | 7.16x better |
| **Accuracy** | 91.8% | 91.0% | +0.8% |
| **Inference Time** | 2.25ms | 40.1ms | 17.8x faster |
| **Throughput** | 444 samples/s | 25 samples/s | 17.8x higher |

## Intended Use

### Primary Use Cases
- Real-time content moderation in online forums and social media
- Edge device deployment for resource-constrained environments
- High-throughput toxicity screening systems
- Research in efficient NLP and dendritic optimization

### Limitations
- Trained on Civil Comments dataset (English only)
- May not generalize to all forms of toxicity or cultural contexts
- Class imbalance handling (94% non-toxic samples)
- Best suited for binary toxicity detection

## How to Use

### Installation

```bash

pip install transformers torch perforatedai

```

### Quick Start

```python

from transformers import AutoTokenizer, AutoModelForSequenceClassification

import torch



# Load model and tokenizer

model_name = "your-username/dendritic-bert-tiny-toxicity"

tokenizer = AutoTokenizer.from_pretrained("prajjwal1/bert-tiny")

model = AutoModelForSequenceClassification.from_pretrained(model_name)



# Prepare input

text = "This is a sample comment to classify"

inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True, padding=True)



# Inference

model.eval()

with torch.no_grad():

    outputs = model(**inputs)

    logits = outputs.logits

    prediction = torch.argmax(logits, dim=-1)

    

# Get result

label = "toxic" if prediction.item() == 1 else "non-toxic"

confidence = torch.softmax(logits, dim=-1)[0][prediction].item()

print(f"Prediction: {label} (confidence: {confidence:.2%})")

```

### Batch Processing for High Throughput

```python

import torch

from transformers import AutoTokenizer, AutoModelForSequenceClassification



model_name = "your-username/dendritic-bert-tiny-toxicity"

tokenizer = AutoTokenizer.from_pretrained("prajjwal1/bert-tiny")

model = AutoModelForSequenceClassification.from_pretrained(model_name)



# Example batch of comments

comments = [

    "This is a great discussion!",

    "You are absolutely terrible",

    "I disagree but respect your opinion",

]



# Tokenize batch

inputs = tokenizer(comments, return_tensors="pt", max_length=128, 

                   truncation=True, padding=True)



# Batch inference

model.eval()

with torch.no_grad():

    outputs = model(**inputs)

    predictions = torch.argmax(outputs.logits, dim=-1)

    probabilities = torch.softmax(outputs.logits, dim=-1)



# Display results

for comment, pred, probs in zip(comments, predictions, probabilities):

    label = "toxic" if pred.item() == 1 else "non-toxic"

    confidence = probs[pred].item()

    print(f"'{comment}' -> {label} ({confidence:.2%})")

```

### Edge Deployment with Quantization

```python

import torch

from transformers import AutoModelForSequenceClassification



model = AutoModelForSequenceClassification.from_pretrained(

    "your-username/dendritic-bert-tiny-toxicity"

)



# Dynamic quantization for edge devices

quantized_model = torch.quantization.quantize_dynamic(

    model, {torch.nn.Linear}, dtype=torch.qint8

)



# Model is now ~75% smaller and faster on CPU

```

## Training Details

### Training Dataset
- **Source:** Civil Comments / Jigsaw Toxicity Dataset
- **Train samples:** 5,000 (4.54% toxic)
- **Validation samples:** 1,000 (6.60% toxic)
- **Test samples:** 1,000 (9.00% toxic)

### Training Procedure
- **Optimizer:** AdamW (lr=2e-5, weight_decay=0.01)

- **Epochs:** 10

- **Batch size:** 32

- **Max sequence length:** 128 tokens

- **Loss function:** Cross-entropy with class weights (1.0, 21.0)

- **Warmup steps:** 500

- **Early stopping:** Patience 3 epochs



### Dendritic Optimization

This model uses **Perforated Backpropagation** with dendritic nodes to enhance learning capacity:

- Correlation threshold: 0.95

- Perforated AI version: 3.0.7

- Configuration: 3D tensor output dimensions for transformer layers



## Technical Architecture



```

Input Text (max 128 tokens)

    ↓

BERT-Tiny Tokenizer

    ↓

Embedding Layer (vocab_size: 30522, hidden: 128)
    ↓

Transformer Layer 1 (2 attention heads) + Dendritic Nodes

    ↓

Transformer Layer 2 (2 attention heads) + Dendritic Nodes

    ↓

Pooler Layer (CLS token)

    ↓

Classification Head (128 → 2)

    ↓

Output: [non-toxic_logit, toxic_logit]

```


## Ethical Considerations

### Bias and Fairness
- Model may inherit biases from the Civil Comments dataset
- Performance may vary across demographic groups and cultural contexts
- Should not be the sole decision-maker in content moderation

### Recommended Practices
- Use as part of a human-in-the-loop moderation system
- Regularly evaluate for fairness across user demographics
- Combine with other signals for critical decisions
- Provide appeal mechanisms for users

## Citation

If you use this model in your research, please cite:

```bibtex

@misc{dendritic-bert-tiny-2026,

  title={Giant-Killer NLP: Dendritic Optimization for Toxicity Classification},

  author={PROJECT-Z Team},

  year={2026},

  publisher={HuggingFace Hub},

  howpublished={\url{https://huggingface.co/your-username/dendritic-bert-tiny-toxicity}},

  note={PyTorch Dendritic Optimization Hackathon}

}

```

## Acknowledgments

- **Base Model:** BERT-Tiny by [prajjwal1](https://huggingface.co/prajjwal1)
- **Optimization Framework:** [PerforatedAI](https://perforatedai.com/)
- **Dataset:** Jigsaw/Google Civil Comments
- **Hackathon:** PyTorch Dendritic Optimization Challenge

## License

Apache 2.0

## Contact

For questions, issues, or collaboration opportunities, please open an issue on the [GitHub repository](https://github.com/your-username/dendritic-bert-tiny-toxicity).

---

**Developed for the PyTorch Dendritic Optimization Hackathon - January 2026**