NiklasKoch committed on
Commit 1b9f350 · verified · 1 Parent(s): e23e6ce

Upload folder using huggingface_hub

Files changed (3):
  1. README.md +164 -0
  2. adapter_config.json +48 -0
  3. adapter_model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,164 @@
---
base_model: answerdotai/ModernBERT-base
library_name: peft
tags:
- text-classification
- reddit
- conversation-analysis
- constructive-dialogue
- modernbert
- lora
- transformers
- lightweight
- high-throughput
language:
- en
datasets:
- reddit
pipeline_tag: text-classification
---

# ModernBERT Reddit Discussion Classifier

A lightweight, high-throughput ModernBERT-based model for classifying constructive versus non-constructive conversations in online forums such as Reddit. Optimized for efficiently processing large volumes of Reddit discussion data.

## Model Description

This model is a QLoRA (Quantized LoRA) fine-tuned version of `answerdotai/ModernBERT-base`, designed as a **lightweight** solution for large-scale Reddit discussion analysis.

- **Model Type**: Text Classification (Binary)
- **Base Model**: answerdotai/ModernBERT-base
- **Training Method**: QLoRA with self-training
- **Task**: Binary classification of conversation constructiveness
- **Language**: English

## Intended Uses

### Primary Use Cases
- Classifying Reddit discussions as constructive or non-constructive
- Content moderation assistance
- Large-scale conversation quality analysis
- Social media research

### Direct Use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model_name = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    base_model_name,
    num_labels=2
)

# Load the fine-tuned adapters
model = PeftModel.from_pretrained(model, "NiklasKoch/modernbert-discussion-classifier")
model.eval()

# Classify text (optimized for batch processing)
def classify_text(text):
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=4096
    )

    # Move inputs to the same device as the model (important for GPU usage)
    inputs = {k: v.to(next(model.parameters()).device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # 0 = non-constructive, 1 = constructive
    predicted_class = torch.argmax(predictions, dim=-1).item()
    confidence = predictions[0][predicted_class].item()

    return {
        'class': 'constructive' if predicted_class == 1 else 'non-constructive',
        'confidence': confidence,
        'scores': {
            'non-constructive': predictions[0][0].item(),
            'constructive': predictions[0][1].item()
        }
    }

# Example usage: a flattened Reddit discussion
text = "[author0] LEGO: What do you think you're doing?!? [author1] I don't get it did he reveal bionicle reboot or smthn? [author2] Not really, he did announce something but was super vague, seems like a sort of passion project we wants to do with the community, he even said it might not even be bionicle. [author1] So is that image fan made or is it one of his passion projects [author2] Those pictures are real and on his insta, he did a stream talking about it I'm sure you can find somewhere, search up Fabre bionicle stream 2020 or something. [author1] OK thanks"
result = classify_text(text)
print(result)
```

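The example input flattens a multi-author thread into `[authorN]`-tagged turns. A minimal helper for producing that format from a list of `(author, text)` turns could look like the sketch below; `flatten_thread` is a hypothetical name, not part of this repository:

```python
def flatten_thread(turns):
    """Join (author, text) turns into a single '[authorN] text ...' string,
    numbering authors in order of first appearance."""
    index = {}   # author name -> stable authorN index
    parts = []
    for author, text in turns:
        if author not in index:
            index[author] = len(index)
        parts.append(f"[author{index[author]}] {text}")
    return " ".join(parts)

thread = [
    ("user_a", "What do you think you're doing?!?"),
    ("user_b", "I don't get it, what was announced?"),
    ("user_a", "Not much, he was super vague."),
]
print(flatten_thread(thread))
# [author0] What do you think you're doing?!? [author1] I don't get it, what was announced? [author0] Not much, he was super vague.
```

Numbering by first appearance keeps the same speaker mapped to the same tag across the whole thread, matching the example above.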
## Training Details

### Training Data
- **Source**: https://archive.org/download/pushshift_reddit_200506_to_202212/
- **Size**: ~1.4 million Reddit threads, filtered for English language and a minimum of 2 authors
- **Labels**: Binary (constructive/non-constructive conversations)
- **Additional Data**: YNACC and IAC datasets for initial supervised training

### Training Procedure
- **Training Method**: Self-training
- **Quantization**: 4-bit QLoRA for efficiency
- **LoRA Config**:
  - `r`: 16
  - `lora_alpha`: 32
  - `lora_dropout`: 0.1
  - Target modules: `Wqkv`, `Wo`, `Wi`, `dense`
- **Loss Function**: Focal Loss with class weighting
- **Max Sequence Length**: 4096 tokens
- **Batch Size**: 64
- **Learning Rate**: 2e-6

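The training script itself is not included in this repository. For reference, focal loss with class weighting along the lines described above is commonly implemented like this (a sketch, not the authors' exact code; the `gamma=2.0` default is an assumption):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    """Cross-entropy scaled by (1 - p_t)^gamma so easy examples contribute less.
    `alpha` is an optional per-class weight tensor (the class weighting)."""
    def __init__(self, alpha=None, gamma=2.0):
        super().__init__()
        self.alpha = alpha
        self.gamma = gamma

    def forward(self, logits, targets):
        # Per-example weighted cross-entropy, kept unreduced
        ce = F.cross_entropy(logits, targets, weight=self.alpha, reduction="none")
        # Probability of the true class (exact when alpha is None)
        pt = torch.exp(-ce)
        return ((1.0 - pt) ** self.gamma * ce).mean()
```

Because `(1 - p_t)^gamma <= 1`, the focal term never exceeds plain cross-entropy; it shifts gradient mass toward hard, misclassified examples, which helps with imbalanced constructive/non-constructive labels.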
### Training Hardware
- 48 hours on 4x NVIDIA A100 40GB GPUs

## Performance

### Evaluation Results

| Dataset | Accuracy | Precision | F1-Score |
|---------|----------|-----------|----------|
| YNACC   | 0.63     | 0.63      | 0.65     |
| IAC     | 0.79     | 0.85      | 0.87     |
| Reddit  | 0.57     | 0.74      | 0.67     |

## Limitations and Bias

- **Language**: English only
- **Bias**: May reflect biases present in Reddit discussions and in the training data

## Ethical Considerations

- Human oversight is recommended for consequential moderation decisions

## Technical Specifications

- **Model Architecture**: ModernBERT + classification head
- **Parameters**: ~150M base + LoRA adapters + classification head
- **Precision**: 4-bit quantized base model with full-precision adapters
- **Framework**: PyTorch, Transformers, PEFT (recent versions; loading the adapters may emit harmless warnings about unrecognized configuration parameters)

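The quick-start above loads the base model in full precision. To mirror the 4-bit setup described here, the base model can instead be loaded through a `BitsAndBytesConfig` before attaching the adapters. This is a sketch assuming `bitsandbytes` is installed and a CUDA device is available; NF4 quantization and bfloat16 compute are typical QLoRA choices, not confirmed settings from this repository:

```python
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config (NF4 + bfloat16 compute are common QLoRA defaults)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Quantized base model
base = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base",
    num_labels=2,
    quantization_config=bnb_config,
    device_map="auto",
)

# Full-precision LoRA adapters on top of the quantized base
model = PeftModel.from_pretrained(base, "NiklasKoch/modernbert-discussion-classifier")
model.eval()
```

This trades a small amount of accuracy for a much smaller memory footprint, which matters at the throughput this model targets.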
158
+ ## Model Card Authors
159
+
160
+ Niklas Koch, Georg August University of Göttingen
161
+
162
+ ## Model Card Contact
163
+
164
+ niklas.koch01@stud.uni-goettingen.de
adapter_config.json ADDED
@@ -0,0 +1,48 @@
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "answerdotai/ModernBERT-base",
  "bias": "none",
  "corda_config": null,
  "eva_config": null,
  "exclude_modules": null,
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 32,
  "lora_bias": false,
  "lora_dropout": 0.1,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": [
    "classifier",
    "classifier",
    "score",
    "classifier",
    "score",
    "classifier",
    "score",
    "classifier",
    "score"
  ],
  "peft_type": "LORA",
  "qalora_group_size": 16,
  "r": 16,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "Wi",
    "Wqkv",
    "dense",
    "Wo"
  ],
  "task_type": "SEQ_CLS",
  "trainable_token_indices": null,
  "use_dora": false,
  "use_qalora": false,
  "use_rslora": false
}
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:46e2e62b7c20299da64c9049672738806def82c7684aa3b0bdefa34be0438ac8
size 13643392