---
license: mit
language:
- en
metrics:
- accuracy
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
tags:
- text-classification
- ai-detection
- academic-text
- ai-generated-text-detection
model-index:
- name: bert-ai-text-detector
  results:
  - task:
      type: text-classification
      name: AI-Generated Text Detection
    dataset:
      name: Custom Academic Text Dataset
      type: custom
    metrics:
    - type: accuracy
      value: 0.9957
    - type: f1
      value: 0.9958
    - type: precision
      value: 0.9923
    - type: recall
      value: 0.9994
---
# BERT-based AI-Generated Academic Text Detector

A fine-tuned BERT model for detecting AI-generated academic text. It reaches **99.57% accuracy** on a test set of 185,930 paragraph-level samples.

## Online Demo

🌐 **Try the model online**: [https://followsci.com/ai-detection](https://followsci.com/ai-detection)

Free web interface with real-time detection, no installation or API key required.

## Model Details

### Model Description

- **Model Type**: BERT-base-uncased fine-tuned for binary text classification
- **Architecture**: BERT-base-uncased (110M parameters)
- **Task**: Binary classification (Human-written vs AI-generated text)
- **Input**: Academic text paragraphs of up to 512 tokens (the usage examples truncate longer inputs)
- **Output**: Binary label (0 = Human-written, 1 = AI-generated) with confidence scores

### Training Information

- **Training Samples**: 1,487,400 paragraph-level samples
- **Validation Samples**: 185,930 paragraph-level samples
- **Test Samples**: 185,930 paragraph-level samples
- **Total Dataset**: 1,859,260 paragraphs
- **Training Data**:
  - Human-written: Academic papers from arXiv
  - AI-generated: Text generated by various large language models (GPT, Claude, etc.)

## Performance

### Test Set Results

| Metric | Value |
|--------|-------|
| **Accuracy** | **99.57%** |
| **F1-Score** | **99.58%** |
| Precision | 99.23% |
| Recall | 99.94% |
| False Positive Rate | 0.82% |
| False Negative Rate | 0.06% |

### Confusion Matrix (Test Set)

| | Predicted: Human | Predicted: AI |
|---|---|---|
| **Actual: Human** | 89,740 (TN) | 740 (FP) |
| **Actual: AI** | 60 (FN) | 95,390 (TP) |

**Inference Speed:** ~20,900 samples/second on RTX 3090 (batch size 64)
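
The reported metrics can be re-derived from the confusion matrix counts. The short check below uses only the four counts from the table above; the variable names are illustrative.

```python
# Counts copied from the test-set confusion matrix in this card.
tn, fp, fn, tp = 89_740, 740, 60, 95_390

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)          # share of "AI" predictions that are correct
recall = tp / (tp + fn)             # share of AI paragraphs that are caught
f1 = 2 * precision * recall / (precision + recall)
fpr = fp / (fp + tn)                # human paragraphs wrongly flagged as AI
fnr = fn / (fn + tp)                # AI paragraphs missed

for name, value in [
    ("Accuracy", accuracy),
    ("Precision", precision),
    ("Recall", recall),
    ("F1", f1),
    ("False positive rate", fpr),
    ("False negative rate", fnr),
]:
    print(f"{name}: {value:.2%}")
```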

## Usage

### Quick Start

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load model and tokenizer
model_name = "followsci/bert-ai-text-detector"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

# Detect AI text
text = "Your academic paragraph here..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    ai_prob = probs[0][1].item() * 100
    human_prob = probs[0][0].item() * 100
    
    print(f"AI-generated probability: {ai_prob:.1f}%")
    print(f"Human-written probability: {human_prob:.1f}%")
    
    if ai_prob > 50:
        print("Prediction: AI-generated")
    else:
        print("Prediction: Human-written")
```

### Batch Processing

```python
import torch

# Reuses `tokenizer` and `model` loaded in the Quick Start example
texts = [
    "First paragraph...",
    "Second paragraph...",
    # ... more texts
]

inputs = tokenizer(
    texts,
    return_tensors="pt",
    truncation=True,
    max_length=512,
    padding=True
)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    
    for i, prob in enumerate(probs):
        ai_prob = prob[1].item() * 100
        print(f"Text {i+1}: AI probability = {ai_prob:.1f}%")
```
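
The batch example tokenizes every text in one call, which can exhaust memory for long lists. A minimal sketch of one way to process a large list in fixed-size chunks follows. The helper names `batched` and `classify_in_batches`, and the `predict_batch` callable, are hypothetical and not part of this model or of `transformers`; the default chunk size of 64 mirrors the batch size quoted in this card.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def classify_in_batches(
    texts: Sequence[str],
    predict_batch: Callable[[List[str]], List[float]],
    batch_size: int = 64,
) -> List[float]:
    """Return one AI probability per text, calling `predict_batch` chunk by chunk.

    `predict_batch` is a hypothetical callable that takes a list of texts and
    returns a list of AI probabilities in the same order.
    """
    ai_probs: List[float] = []
    for chunk in batched(texts, batch_size):
        ai_probs.extend(predict_batch(list(chunk)))
    return ai_probs
```

In practice `predict_batch` could wrap the tokenizer and model calls from the batch example and return `probs[:, 1].tolist()`.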

### Using with Transformers Pipeline

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="followsci/bert-ai-text-detector",
    tokenizer="followsci/bert-ai-text-detector"
)

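# Truncate long inputs to the model's 512-token limit
result = classifier("Your text here...", truncation=True, max_length=512)
print(result)  # a list with one dict containing "label" and "score"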
```

## Training Details

### Training Configuration

- **Base Model**: `bert-base-uncased`
- **Batch Size**: 64
- **Learning Rate**: 5e-5 (with linear warmup)
- **Warmup Steps**: 5,000
- **Max Sequence Length**: 512
- **Optimizer**: AdamW
- **Epochs**: 3
- **Training Time**: ~11 hours (on RTX 3090)
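
To give a sense of scale for these settings, the sketch below works out the number of optimizer steps and the learning rate at a given step. It rests on assumptions this card does not state: one optimizer step per batch of 64 (no gradient accumulation), the last partial batch kept, and linear decay to zero after warmup (the shape produced by `get_linear_schedule_with_warmup` in `transformers`). The card only says "linear warmup", so the decay phase is a guess.

```python
import math

# Values taken from this card.
TRAIN_SAMPLES = 1_487_400
BATCH_SIZE = 64
EPOCHS = 3
PEAK_LR = 5e-5
WARMUP_STEPS = 5_000

# Assumption: one optimizer step per batch, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then assumed linear decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - WARMUP_STEPS)


print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"lr halfway through warmup: {lr_at(WARMUP_STEPS // 2):.2e}")
print(f"lr at end of warmup: {lr_at(WARMUP_STEPS):.2e}")
```

Under these assumptions the 5,000 warmup steps cover roughly 7% of training.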

### Dataset Distribution

| Split | Total Samples | Human (Label 0) | AI (Label 1) |
|-------|--------------|-----------------|--------------|
| Train | 1,487,400 | 723,780 (48.7%) | 763,620 (51.3%) |
| Validation | 185,930 | 90,470 (48.7%) | 95,460 (51.3%) |
| Test | 185,930 | 90,480 (48.7%) | 95,450 (51.3%) |

## Limitations

1. **Domain Specificity**: The model is trained primarily on academic text. Performance may degrade on:
   - Casual text or social media content
   - Technical documentation
   - Creative writing

2. **Binary Classification**: The model only distinguishes between "human" and "AI" text, without:
   - Identifying which AI model generated the text
   - Providing confidence intervals
   - Detecting partially AI-assisted text

3. **Paragraph-Level Detection**: The model is optimized for paragraph-level samples:
   - Performance on sentence-level or full-document level may vary
   - Best results achieved with structured academic paragraphs

4. **False Positives**: The test-set false positive rate is about 0.82% (740 of 90,480 human-written paragraphs), so some human-written text will be flagged as AI-generated.
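
Because the model is tuned for paragraphs (limitation 3), one way to apply it to a full document is to split the text into paragraphs and score each one separately. The sketch below is illustrative only. The function names, the blank-line splitting rule, and the `min_words` cutoff of 20 are assumptions, and `predict_ai_prob` is a hypothetical callable that returns the AI probability (0 to 1) for a single paragraph. The 0.5 threshold matches the Quick Start example.

```python
import re
from typing import Callable, Dict, List


def split_paragraphs(document: str, min_words: int = 20) -> List[str]:
    """Split on blank lines and drop short fragments such as headings or captions."""
    blocks = re.split(r"\n\s*\n", document)
    paragraphs = [" ".join(block.split()) for block in blocks]  # normalize whitespace
    return [p for p in paragraphs if len(p.split()) >= min_words]


def summarize_document(
    document: str,
    predict_ai_prob: Callable[[str], float],
    threshold: float = 0.5,
    min_words: int = 20,
) -> Dict[str, object]:
    """Score each paragraph and report which ones exceed the threshold."""
    paragraphs = split_paragraphs(document, min_words)
    probs = [predict_ai_prob(p) for p in paragraphs]
    flagged = [i for i, prob in enumerate(probs) if prob > threshold]
    return {
        "paragraphs": len(paragraphs),
        "flagged_indices": flagged,
        "ai_probs": probs,
    }
```

Reporting per-paragraph scores rather than a single document verdict keeps the output closer to what the model was evaluated on and leaves the final judgment to a human reviewer.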

## Ethical Considerations

- **Use Case**: This model is intended as a tool for academic integrity and research purposes
- **Bias**: The model may reflect biases present in the training data
- **Misuse**: Should not be used as the sole criterion for academic misconduct decisions
- **Transparency**: Results should be interpreted with context and domain expertise


## License

This model is licensed under the MIT License.

## Contact

- **Email**: raffoduanedonnenfeld@gmail.com

---

<p align="center">
  Made with ❤️ for Academic Integrity
</p>