@@ -1,74 +1,116 @@
 ---
-language: multilingual
 tags:
 - adaptive-classifier
 - text-classification
 - continuous-learning
 license: apache-2.0
 ---
-# Adaptive Classifier
-This model is an instance of an [adaptive-classifier](https://github.com/codelion/adaptive-classifier) that allows for continuous learning and dynamic class addition.
-## Installation
-**IMPORTANT:** To use this model, you must first install the `adaptive-classifier` library. You do **NOT** need `trust_remote_code=True`.
 ```bash
 pip install adaptive-classifier
 ```
-## Model Details
-- Base Model: TrustSafeAI/RADAR-Vicuna-7B
-- Number of Classes: 2
-- Total Examples: 10
-- Embedding Dimension: 1024
-## Class Distribution
-```
-ai: 5 examples (50.0%)
-human: 5 examples (50.0%)
 ```
-## Usage
-After installing the `adaptive-classifier` library, you can load and use this model:
-```python
-from adaptive_classifier import AdaptiveClassifier
-# Load the model (no trust_remote_code needed!)
-classifier = AdaptiveClassifier.from_pretrained("adaptive-classifier/model-name")
-# Make predictions
-text = "Your text here"
-predictions = classifier.predict(text)
-print(predictions)  # List of (label, confidence) tuples
-# Add new examples for continuous learning
-texts = ["Example 1", "Example 2"]
-labels = ["class1", "class2"]
-classifier.add_examples(texts, labels)
-```
-**Note:** This model uses the `adaptive-classifier` library distributed via PyPI. You do **NOT** need to set `trust_remote_code=True` - just install the library first.
 ## Training Details
-- Training Steps: 4
-- Examples per Class: See distribution above
-- Prototype Memory: Active
-- Neural Adaptation: Active
 ## Limitations
-This model:
-- Requires at least 3 examples per class
-- Has a maximum of 1000 examples per class
-- Updates prototypes every 100 examples
 ## Citation
@@ -80,4 +122,18 @@ This model:
   publisher = {GitHub},
   url = {https://github.com/codelion/adaptive-classifier}
 }
 ```

 ---
+language: en
 tags:
 - adaptive-classifier
 - text-classification
+- ai-detection
+- ai-generated-text
 - continuous-learning
 license: apache-2.0
+datasets:
+- pangram/editlens_iclr
+base_model: TrustSafeAI/RADAR-Vicuna-7B
+metrics:
+- accuracy
+- f1
+pipeline_tag: text-classification
+model-index:
+- name: adaptive-classifier/ai-detector
+  results:
+  - task:
+      type: text-classification
+      name: AI Text Detection (Binary)
+    dataset:
+      name: EditLens ICLR 2026
+      type: pangram/editlens_iclr
+      split: test
+    metrics:
+    - type: accuracy
+      value: 73.5
+      name: Accuracy
+    - type: f1
+      value: 72.1
+      name: Macro F1
 ---
+# AI Text Detector (adaptive-classifier)
+A binary AI text detector that classifies text as **human-written** or **AI-generated/edited**. It is built with [adaptive-classifier](https://github.com/codelion/adaptive-classifier) and trained and evaluated on the [EditLens ICLR 2026](https://huggingface.co/datasets/pangram/editlens_iclr) dataset.
+## How It Works
+The detector uses frozen embeddings from [TrustSafeAI/RADAR-Vicuna-7B](https://huggingface.co/TrustSafeAI/RADAR-Vicuna-7B) (a RoBERTa-large model adversarially trained for AI detection) as a feature extractor. adaptive-classifier's prototype memory and neural head then classify those embeddings.
+```
+Text → RADAR backbone (frozen, 355M) → 1024-dim embedding → adaptive-classifier head → human / ai
+```
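As a rough illustration of the pipeline above, the sketch below scores a toy embedding against per-class prototypes and blends the result with a stand-in neural-head output. It is not the library's implementation: the function names, the 3-dimensional vectors, the cosine-plus-softmax scoring, and the linear blend are all assumptions. Only the 0.3/0.7 weights come from the Training Details of this card.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def softmax(scores):
    """Turn a {label: score} dict into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def blend(embedding, prototypes, neural_probs, proto_weight=0.3, neural_weight=0.7):
    """Hypothetical blend of prototype-similarity and neural-head probabilities."""
    proto_probs = softmax({label: cosine(embedding, p) for label, p in prototypes.items()})
    combined = {
        label: proto_weight * proto_probs[label] + neural_weight * neural_probs[label]
        for label in prototypes
    }
    # Highest confidence first, like the (label, confidence) list shown in Usage.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

# Toy 3-dim stand-ins for the 1024-dim RADAR embeddings (made-up numbers).
prototypes = {"human": [1.0, 0.0, 0.0], "ai": [0.0, 1.0, 0.0]}
embedding = [0.2, 0.9, 0.1]
neural_probs = {"human": 0.25, "ai": 0.75}  # pretend output of the neural head

ranked = blend(embedding, prototypes, neural_probs)
print(ranked)
```

Because both inputs are probability distributions and the weights sum to 1, the blended scores also sum to 1 in this sketch.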
+## Installation
 ```bash
 pip install adaptive-classifier
 ```
+## Usage
+```python
+from adaptive_classifier import AdaptiveClassifier
+classifier = AdaptiveClassifier.from_pretrained("adaptive-classifier/ai-detector")
+predictions = classifier.predict("Your text here")
+# Returns: [('ai', 0.85), ('human', 0.15)]
+# Batch prediction
+results = classifier.predict_batch(["text 1", "text 2"], k=2)
+# Continuous learning — add new examples without retraining
+classifier.add_examples(
+    ["new human text example", "new ai text example"],
+    ["human", "ai"]
+)
 ```
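A small helper can turn that list into a single decision. The sketch below assumes `predict` returns `(label, confidence)` tuples as in the comment above; `decide` and the 0.6 threshold are hypothetical choices for illustration, not part of the library.

```python
def decide(predictions, threshold=0.6):
    """Return the top label, or "uncertain" if its confidence is below threshold."""
    label, confidence = max(predictions, key=lambda pair: pair[1])
    return label if confidence >= threshold else "uncertain"

# Example output copied from the usage snippet above.
example = [("ai", 0.85), ("human", 0.15)]
print(decide(example))

# A made-up borderline case that falls under the threshold.
borderline = [("ai", 0.55), ("human", 0.45)]
print(decide(borderline))
```

An abstain option like this may be useful for short inputs, where the card notes that detection is less reliable.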
+## Results
+Evaluated on the [EditLens ICLR 2026](https://huggingface.co/datasets/pangram/editlens_iclr) test splits.
+### Binary Classification (Human vs AI)
+| Model | Method | Test F1 |
+|-------|--------|---------|
+| EditLens Mistral-Small 24B | QLoRA fine-tuned | 95.6 |
+| Pangram v2 | Proprietary | 83.7 |
+| Binoculars | Perplexity ratio | 81.4 |
+| FastDetectGPT | Log-prob based | 80.5 |
+| **This model** | **Frozen RADAR + adaptive-classifier** | **72.1** |
+### Per-Split Results
+| Split | Accuracy | Macro-F1 | AI F1 | Human F1 |
+|-------|----------|----------|-------|----------|
+| test (in-distribution) | 73.5% | 72.1 | 78.3 | 65.9 |
+| test_enron (OOD domain) | 73.5% | 64.1 | 82.5 | 45.7 |
+| test_llama (OOD model) | 76.1% | 74.7 | 80.7 | 68.8 |
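+The model generalizes well to text from an unseen generator (Llama 3.3-70B): macro-F1 on test_llama (74.7) is higher than on the in-distribution test split (72.1). It transfers less well to an unseen domain: on test_enron, macro-F1 drops to 64.1 and human F1 to 45.7.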
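The Macro-F1 column can be checked against the per-class columns, since macro-F1 is the unweighted mean of the per-class F1 scores. A quick sanity check using only the numbers in the per-split table (the helper name `macro_f1` is illustrative):

```python
def macro_f1(per_class_f1):
    """Macro-F1 is the unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1) / len(per_class_f1)

# (AI F1, Human F1, reported Macro-F1) copied from the per-split table.
splits = {
    "test": (78.3, 65.9, 72.1),
    "test_enron": (82.5, 45.7, 64.1),
    "test_llama": (80.7, 68.8, 74.7),
}

for name, (ai_f1, human_f1, reported) in splits.items():
    computed = macro_f1([ai_f1, human_f1])
    # The table values are already rounded, so compare with 0.1 of slack.
    consistent = abs(computed - reported) <= 0.1
    print(f"{name}: computed {computed:.2f}, reported {reported}, consistent={consistent}")
```

The gap between AI F1 and Human F1 on test_enron is what pulls that split's macro-F1 down even though its accuracy matches the in-distribution split.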
 ## Training Details
+- **Backbone**: [TrustSafeAI/RADAR-Vicuna-7B](https://huggingface.co/TrustSafeAI/RADAR-Vicuna-7B) (frozen, 355M params)
+- **Dataset**: [pangram/editlens_iclr](https://huggingface.co/datasets/pangram/editlens_iclr) train split
+- **Examples**: 1,000 per class (2,000 total), stratified sample
+- **Classes**: `human` (human_written), `ai` (ai_edited + ai_generated)
+- **Embedding dim**: 1024
+- **Prototype weight**: 0.3
+- **Neural weight**: 0.7
+- **Training time**: ~6 minutes on CPU
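The class mapping and per-class sampling listed above might look like the following sketch. The row format (dicts with `text` and `label` keys), the helper name `sample_per_class`, and the seed are assumptions for illustration; the actual dataset schema and sampling code are not shown in this card.

```python
import random
from collections import defaultdict

# Mapping from the source labels named in this card to the two classes.
LABEL_MAP = {"human_written": "human", "ai_edited": "ai", "ai_generated": "ai"}

def sample_per_class(rows, n_per_class, seed=0):
    """Group rows by binary class, then draw up to n_per_class from each group."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[LABEL_MAP[row["label"]]].append(row["text"])
    rng = random.Random(seed)
    texts, labels = [], []
    for cls, items in sorted(by_class.items()):
        chosen = rng.sample(items, min(n_per_class, len(items)))
        texts.extend(chosen)
        labels.extend([cls] * len(chosen))
    return texts, labels

# Tiny made-up rows standing in for the train split.
rows = [
    {"text": f"doc {i}", "label": lab}
    for i, lab in enumerate(["human_written", "ai_edited", "ai_generated"] * 4)
]
texts, labels = sample_per_class(rows, n_per_class=3)
print(labels)
```

The returned lists have the shape expected by `add_examples(texts, labels)` in the Usage section; with `n_per_class=1000` this would correspond to the 2,000-example setup described above.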
 ## Limitations
+- Binary only (human vs AI) — does not distinguish AI-edited from AI-generated
+- Relies on frozen RADAR embeddings; cannot learn new text patterns beyond what RADAR captures
+- Minimum ~50 words of text recommended for reliable detection
+- Trained on English text from specific domains (reviews, news, creative writing, academic)
 ## Citation
   publisher = {GitHub},
   url = {https://github.com/codelion/adaptive-classifier}
 }
+@inproceedings{thai2026editlens,
+  title = {EditLens: Quantifying the Extent of AI Editing in Text},
+  author = {Thai, Katherine and Emi, Bradley and Masrour, Elyas and Iyyer, Mohit},
+  booktitle = {ICLR},
+  year = {2026}
+}
+@article{hu2023radar,
+  title = {RADAR: Robust AI-Text Detection via Adversarial Learning},
+  author = {Hu, Xiaomeng and Chen, Pin-Yu and Ho, Tsung-Yi},
+  journal = {arXiv preprint arXiv:2307.03838},
+  year = {2023}
+}
 ```