Commit 7e4fb88 by codelion (verified) · 1 parent: 36ec374

AI text detector trained on EditLens ICLR 2026 dataset

README.md CHANGED
@@ -1,121 +1,74 @@
  ---
- language: en
+ language: multilingual
  tags:
  - adaptive-classifier
  - text-classification
- - ai-detection
- - ai-generated-text
  - continuous-learning
  license: apache-2.0
- datasets:
- - pangram/editlens_iclr
- - adaptive-classifier/ai-detector-data
- base_model: TrustSafeAI/RADAR-Vicuna-7B
- metrics:
- - accuracy
- - f1
- pipeline_tag: text-classification
- model-index:
- - name: adaptive-classifier/ai-detector
-   results:
-   - task:
-       type: text-classification
-       name: AI Text Detection (Binary)
-     dataset:
-       name: EditLens ICLR 2026
-       type: pangram/editlens_iclr
-       split: test
-     metrics:
-     - type: accuracy
-       value: 73.5
-       name: Accuracy
-     - type: f1
-       value: 72.1
-       name: Macro F1
  ---

- # AI Text Detector (adaptive-classifier)
+ # Adaptive Classifier

- A binary AI text detector that classifies text as **human-written** or **AI-generated/edited**, built with [adaptive-classifier](https://github.com/codelion/adaptive-classifier) on the [EditLens ICLR 2026](https://huggingface.co/datasets/pangram/editlens_iclr) benchmark.
-
- ## How It Works
-
- Uses frozen embeddings from [TrustSafeAI/RADAR-Vicuna-7B](https://huggingface.co/TrustSafeAI/RADAR-Vicuna-7B) (a RoBERTa-large model adversarially trained for AI detection) as a feature extractor, with adaptive-classifier's prototype memory + neural head for classification.
-
- ```
- Text → RADAR backbone (frozen, 355M) → 1024-dim embedding → adaptive-classifier head → human / ai
- ```
+ This model is an instance of an [adaptive-classifier](https://github.com/codelion/adaptive-classifier) that allows for continuous learning and dynamic class addition.

  ## Installation

+ **IMPORTANT:** To use this model, you must first install the `adaptive-classifier` library. You do **NOT** need `trust_remote_code=True`.
+
  ```bash
  pip install adaptive-classifier
  ```

- ## Usage
-
- ```python
- from adaptive_classifier import AdaptiveClassifier
-
- classifier = AdaptiveClassifier.from_pretrained("adaptive-classifier/ai-detector")
+ ## Model Details

- predictions = classifier.predict("Your text here")
- # Returns: [('ai', 0.85), ('human', 0.15)]
+ - Base Model: TrustSafeAI/RADAR-Vicuna-7B
+ - Number of Classes: 2
+ - Total Examples: 400
+ - Embedding Dimension: 1024

- # Batch prediction
- results = classifier.predict_batch(["text 1", "text 2"], k=2)
+ ## Class Distribution

- # Continuous learning — add new examples without retraining
- classifier.add_examples(
-     ["new human text example", "new ai text example"],
-     ["human", "ai"]
- )
+ ```
+ ai: 200 examples (50.0%)
+ human: 200 examples (50.0%)
  ```

- ## Results
+ ## Usage

- Evaluated on the [EditLens ICLR 2026](https://huggingface.co/datasets/pangram/editlens_iclr) test splits.
+ After installing the `adaptive-classifier` library, you can load and use this model:

- ### Binary Classification (Human vs AI)
+ ```python
+ from adaptive_classifier import AdaptiveClassifier

- | Model | Method | Test F1 |
- |-------|--------|---------|
- | EditLens Mistral-Small 24B | QLoRA fine-tuned | 95.6 |
- | Pangram v2 | Proprietary | 83.7 |
- | Binoculars | Perplexity ratio | 81.4 |
- | FastDetectGPT | Log-prob based | 80.5 |
- | **This model** | **Frozen RADAR + adaptive-classifier** | **72.1** |
+ # Load the model (no trust_remote_code needed!)
+ classifier = AdaptiveClassifier.from_pretrained("adaptive-classifier/model-name")

- ### Per-Split Results
+ # Make predictions
+ text = "Your text here"
+ predictions = classifier.predict(text)
+ print(predictions) # List of (label, confidence) tuples

- | Split | Accuracy | Macro-F1 | AI F1 | Human F1 |
- |-------|----------|----------|-------|----------|
- | test (in-distribution) | 73.5% | 72.1 | 78.3 | 65.9 |
- | test_enron (OOD domain) | 73.5% | 64.1 | 82.5 | 45.7 |
- | test_llama (OOD model) | 76.1% | 74.7 | 80.7 | 68.8 |
+ # Add new examples for continuous learning
+ texts = ["Example 1", "Example 2"]
+ labels = ["class1", "class2"]
+ classifier.add_examples(texts, labels)
+ ```

- The model generalizes well to unseen AI models (Llama 3.3-70B), achieving higher F1 on OOD text than in-distribution.
+ **Note:** This model uses the `adaptive-classifier` library distributed via PyPI. You do **NOT** need to set `trust_remote_code=True` - just install the library first.

  ## Training Details

- - **Backbone**: [TrustSafeAI/RADAR-Vicuna-7B](https://huggingface.co/TrustSafeAI/RADAR-Vicuna-7B) (frozen, 355M params)
- - **Dataset**: [pangram/editlens_iclr](https://huggingface.co/datasets/pangram/editlens_iclr) train split
- - **Examples**: 1,000 per class (2,000 total), stratified sample
- - **Classes**: `human` (human_written), `ai` (ai_edited + ai_generated)
- - **Embedding dim**: 1024
- - **Prototype weight**: 0.3, Neural weight: 0.7
- - **Training time**: ~6 minutes on CPU
-
- ## Live Predictions Dataset
-
- Predictions made through the [hosted Space](https://huggingface.co/spaces/adaptive-classifier/ai-detector) are continuously logged to [adaptive-classifier/ai-detector-data](https://huggingface.co/datasets/adaptive-classifier/ai-detector-data) — a public dataset of real-world predictions with optional user feedback (Correct / Incorrect). This dataset grows over time and can be used to track model performance, find failure cases, and drive future retraining.
+ - Training Steps: 6
+ - Examples per Class: See distribution above
+ - Prototype Memory: Active
+ - Neural Adaptation: Active

  ## Limitations

- - Binary only (human vs AI) — does not distinguish AI-edited from AI-generated
- - Relies on frozen RADAR embeddings; cannot learn new text patterns beyond what RADAR captures
- - Minimum ~50 words of text recommended for reliable detection
- - Trained on English text from specific domains (reviews, news, creative writing, academic)
+ This model:
+ - Requires at least 3 examples per class
+ - Has a maximum of 1000 examples per class
+ - Updates prototypes every 100 examples

  ## Citation
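
Reviewer note on the README usage: `predict` is documented there as returning a list of `(label, confidence)` tuples. The following is a minimal sketch of how a caller might reduce that list to one decision. The helper `top_label` and the 0.6 threshold are hypothetical and not part of the adaptive-classifier library; the sample list mirrors the example output quoted in the removed README.

```python
from typing import List, Optional, Tuple


def top_label(
    predictions: List[Tuple[str, float]], min_confidence: float = 0.6
) -> Optional[str]:
    """Return the highest-confidence label, or None if it is below the threshold."""
    if not predictions:
        return None
    label, confidence = max(predictions, key=lambda pair: pair[1])
    return label if confidence >= min_confidence else None


# Sample shaped like the output quoted in the removed README.
sample = [("ai", 0.85), ("human", 0.15)]
print(top_label(sample))
```

Returning `None` for low-confidence inputs lets the caller treat borderline text as undecided instead of forcing a label.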
 
config.json CHANGED
@@ -41,9 +41,9 @@
    },
    "library_name": "adaptive-classifier",
    "model_name": "TrustSafeAI/RADAR-Vicuna-7B",
-   "train_steps": 4,
+   "train_steps": 6,
    "training_history": {
-     "ai": 1000,
-     "human": 1000
+     "ai": 1030,
+     "human": 1030
    }
  }
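
To make the size of this config change explicit, here is a small sketch that computes the per-class delta in `training_history` and the number of added training steps. The dictionaries copy only the fields visible in this hunk; `history_delta` is a hypothetical helper written for this note.

```python
from typing import Dict

# Values copied from the hunk above; all other config keys are omitted.
old_config = {"train_steps": 4, "training_history": {"ai": 1000, "human": 1000}}
new_config = {"train_steps": 6, "training_history": {"ai": 1030, "human": 1030}}


def history_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Per-label change in example counts; labels missing on one side count as 0."""
    labels = sorted(set(before) | set(after))
    return {label: after.get(label, 0) - before.get(label, 0) for label in labels}


delta = history_delta(old_config["training_history"], new_config["training_history"])
steps_added = new_config["train_steps"] - old_config["train_steps"]
print(delta, steps_added)
```

The counts suggest the commit added a small, balanced batch of examples to both classes over two further training steps.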
examples.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bd9d6b340651ee19eeb9562b300ea8c05613467144502b8e40b90a2b5616c4b5
- size 13044274
+ oid sha256:b7095775294d931100f970844dbb8b3239f4253e327aef53e9e3f2e7072e8af0
+ size 13126989
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d71cf4f74418cf384a41c80d46ddce1efc9c9798d9abaaab34fe257f82249929
+ oid sha256:bdfed368896614a1418adc175d9021e05dcfddf687a96c2b169798097ee1fb10
  size 6310624
onnx/config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "architectures": [
+     "RobertaForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "roberta",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "transformers_version": "4.57.6",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 50265
+ }
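
As a sanity check on this config, the sketch below estimates the parameter count of the embeddings plus the 24 encoder layers from the values above. It assumes the standard RoBERTa layer layout (four attention projections, a two-layer feed-forward block, two LayerNorms per layer) and deliberately ignores any pooler or classification head and any ONNX graph overhead. `estimate_encoder_params` is a hypothetical helper, not a library function.

```python
from typing import Dict

# Only the fields needed for the estimate, copied from onnx/config.json above.
config: Dict[str, int] = {
    "hidden_size": 1024,
    "intermediate_size": 4096,
    "max_position_embeddings": 514,
    "num_hidden_layers": 24,
    "type_vocab_size": 1,
    "vocab_size": 50265,
}


def estimate_encoder_params(cfg: Dict[str, int]) -> int:
    """Count weights and biases in the embeddings and encoder layers only."""
    h = cfg["hidden_size"]
    ffn = cfg["intermediate_size"]
    layer_norm = 2 * h  # scale + bias
    embeddings = (
        cfg["vocab_size"] * h
        + cfg["max_position_embeddings"] * h
        + cfg["type_vocab_size"] * h
        + layer_norm
    )
    attention = 4 * (h * h + h)  # query, key, value, output projections
    feed_forward = (h * ffn + ffn) + (ffn * h + h)
    per_layer = attention + feed_forward + 2 * layer_norm
    return embeddings + cfg["num_hidden_layers"] * per_layer


n_params = estimate_encoder_params(config)
print(f"{n_params:,} parameters, about {n_params * 4 / 1e9:.2f} GB in fp32")
```

The estimate comes out at roughly 354M parameters, in line with the "355M" figure quoted for the backbone in the removed README.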
onnx/merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:27ad02e9be07902436c772f338da833ba3fa601bd514992c7785a4e5f9b800b9
+ size 1417623186
onnx/model_quantized.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:637e5e684517c84282f969ff324087f3fd2f180484bc433118a11bb48bc02261
+ size 356067671
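
The two Git LFS pointers for `onnx/model.onnx` and `onnx/model_quantized.onnx` give enough information to check the compression ratio. This short sketch uses only the `size` fields from those pointers; reading the ratio as fp32 weights stored as int8 is an interpretation, not something the commit states.

```python
# Sizes in bytes, copied from the two LFS pointer files.
FP32_ONNX_BYTES = 1_417_623_186       # onnx/model.onnx
QUANTIZED_ONNX_BYTES = 356_067_671    # onnx/model_quantized.onnx

ratio = FP32_ONNX_BYTES / QUANTIZED_ONNX_BYTES
saved_mib = (FP32_ONNX_BYTES - QUANTIZED_ONNX_BYTES) / (1024 ** 2)

print(f"quantized model is {ratio:.2f}x smaller, saving {saved_mib:.0f} MiB")
```

A ratio just under 4 is what one would expect if most 4-byte weights became 1-byte weights while a small part of the graph stayed in higher precision.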
onnx/ort_config.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "one_external_file": true,
+   "opset": null,
+   "optimization": {},
+   "quantization": {
+     "activations_dtype": "QUInt8",
+     "activations_symmetric": false,
+     "format": "QOperator",
+     "is_static": false,
+     "mode": "IntegerOps",
+     "nodes_to_exclude": [],
+     "nodes_to_quantize": [],
+     "operators_to_quantize": [
+       "Conv",
+       "MatMul",
+       "Attention",
+       "LSTM",
+       "Gather",
+       "Transpose",
+       "EmbedLayerNormalization"
+     ],
+     "per_channel": false,
+     "qdq_add_pair_to_weight": false,
+     "qdq_dedicated_pair": false,
+     "qdq_op_type_per_channel_support_to_axis": {
+       "MatMul": 1
+     },
+     "reduce_range": false,
+     "weights_dtype": "QInt8",
+     "weights_symmetric": true
+   },
+   "use_external_data_format": false
+ }
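
This config describes dynamic quantization (`is_static: false`) with symmetric, per-tensor signed 8-bit weights (`weights_symmetric: true`, `per_channel: false`, `weights_dtype: QInt8`). The sketch below illustrates what symmetric per-tensor int8 quantization means in general. It is a plain-Python illustration under those assumptions and not the ONNX Runtime implementation; both function names are hypothetical.

```python
from typing import List, Tuple


def quantize_symmetric_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to int8 with one shared scale and a zero point fixed at 0."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


weights = [-1.0, -0.3, 0.0, 0.25, 1.0]
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)
print(codes, scale)
```

Because the scale is derived from the largest absolute weight, each restored value differs from the original by at most half a quantization step.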
onnx/special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
onnx/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
onnx/tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50264": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "extra_special_tokens": {},
+   "mask_token": "<mask>",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }
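
One detail worth flagging in this tokenizer config: `model_max_length` is far larger than any real sequence and reads as an "unset" placeholder, so the tokenizer alone will not truncate inputs. The practical limit comes from `max_position_embeddings: 514` in `onnx/config.json`. The sketch below derives a usable token budget under the assumption that position ids follow the usual RoBERTa convention of starting at `pad_token_id + 1`; `effective_max_tokens` is a hypothetical helper.

```python
from typing import Dict

# Values copied from onnx/tokenizer_config.json and onnx/config.json.
tokenizer_config: Dict[str, int] = {"model_max_length": 1000000000000000019884624838656}
model_config: Dict[str, int] = {"max_position_embeddings": 514, "pad_token_id": 1}


def effective_max_tokens(tok_cfg: Dict[str, int], model_cfg: Dict[str, int]) -> int:
    """Smallest of the tokenizer limit and the usable position slots."""
    # Assumption: position ids start at pad_token_id + 1 (RoBERTa convention),
    # so the first pad_token_id + 1 position slots are never used for tokens.
    usable_positions = model_cfg["max_position_embeddings"] - model_cfg["pad_token_id"] - 1
    return min(tok_cfg["model_max_length"], usable_positions)


max_tokens = effective_max_tokens(tokenizer_config, model_config)
print(max_tokens)
```

Callers of the ONNX export would therefore want to pass an explicit truncation length of this size, including the `<s>` and `</s>` special tokens, when tokenizing.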
onnx/vocab.json ADDED
The diff for this file is too large to render. See raw diff