# voight-kampff-pan2024-classifier
A calibrated Linear SVM classifier for AI-generated text detection, built for the Voight-Kampff Generative AI Shared Task (PAN @ CLEF 2024).
This classifier operates on embeddings produced by Alejandro-Pardo/voight-kampff-pan2024-gte-en-v1.5. Given an embedding of a text chunk (~64 tokens), it classifies it as human-written (0) or AI-generated (1) with calibrated probability scores.
## Model Details
| Parameter | Value |
|---|---|
| Algorithm | LinearSVC (scikit-learn) |
| Calibration | CalibratedClassifierCV with cv='prefit' |
| Kernel | Linear |
| Input | Embeddings from voight-kampff-pan2024-gte-en-v1.5 |
| Output | Calibrated probabilities for human (0) and AI (1) |
| Format | Python pickle (.pkl) |
## Training Data
The SVM was trained on embeddings generated from:
- PAN 2024 competition dataset plus Ollama-augmented texts (Llama 3.2 1B, Qwen 2.5 1B, Gemma 2 2B); the combined train + validation sets were used for training, with the test set held out for calibration
- 18 different LLM sources (GPT-3.5, GPT-4, LLaMA, Mistral, Alpaca, Qwen, Gemma, etc.)
- Text chunks of ~64 tokens with 15% leetspeak noise injection
Note: The pre-computed embeddings used for training are not included. They can be regenerated using the embedding model and the training data available in the GitHub repository.
## Usage

### Quick Start
```python
from sentence_transformers import SentenceTransformer
import pickle

# 1. Load the embedding model
embeddings_model = SentenceTransformer(
    'Alejandro-Pardo/voight-kampff-pan2024-gte-en-v1.5',
    trust_remote_code=True
)

# 2. Load the calibrated SVM classifier
with open('nlp_classifier_20.pkl', 'rb') as f:
    classifier = pickle.load(f)

# 3. Classify a text chunk
text = "Some text chunk of approximately 64 tokens..."
embedding = embeddings_model.encode([text])
prediction = classifier.predict(embedding)           # 0 = human, 1 = AI
probabilities = classifier.predict_proba(embedding)  # [[p_human, p_ai]]
```
### Full Text-Pair Classification

For classifying full text pairs (the competition format), the pipeline:
- Chunks both texts into ~64 token segments
- Generates embeddings for all chunks using the embedding model
- Classifies each chunk with the SVM
- Averages chunk probabilities per text
- Combines the two per-text scores into a single pair-level decision

See the full implementation in the GitHub repository.
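The chunking and averaging steps above can be sketched as follows (whitespace tokenization and the helper names are illustrative simplifications; the repository's implementation is authoritative):

```python
def chunk_text(text: str, chunk_size: int = 64) -> list[str]:
    """Split a text into ~chunk_size-token segments (crude whitespace tokens)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + chunk_size])
            for i in range(0, len(tokens), chunk_size)]

def score_text(text: str, embedder, classifier) -> float:
    """Mean per-chunk probability that the text is AI-generated."""
    chunks = chunk_text(text)
    embeddings = embedder.encode(chunks)          # one vector per chunk
    probs = classifier.predict_proba(embeddings)[:, 1]  # p(AI) per chunk
    return float(probs.mean())
```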
## Evaluation Results

### Chunk-Level Classification (~64-token chunks)
| Metric | Score |
|---|---|
| F1 | ~0.80 |
| Accuracy | ~0.80 |
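These chunk-level numbers correspond to standard scikit-learn metrics over per-chunk labels and predictions; a minimal sketch (variable names illustrative):

```python
from sklearn.metrics import accuracy_score, f1_score

def chunk_level_scores(y_true, y_pred):
    """Chunk-level accuracy and F1, with label 1 = AI-generated."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
    }
```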
### Full Text-Pair Classification (with chunk averaging)
On our PAN 2024 test split (with noise); all metrics are higher-is-better, with Brier reported as its complement following PAN convention:
| Metric | Score |
|---|---|
| ROC-AUC | 0.993 |
| Brier | 0.924 |
| C@1 | 0.951 |
| F1 | 0.951 |
| F0.5u | 0.953 |
| Mean | 0.955 |
On the external Kaggle "AI vs Human Text" dataset:
| Metric | Score |
|---|---|
| ROC-AUC | 0.948 |
| Brier | 0.872 |
| C@1 | 0.878 |
| F1 | 0.878 |
| F0.5u | 0.877 |
| Mean | 0.891 |
### Comparison with the PAN 2024 Competition Leaderboard
| # | Team | ROC-AUC | Brier | C@1 | F1 | F0.5u | Mean |
|---|---|---|---|---|---|---|---|
| 1 | marsan | 0.961 | 0.928 | 0.912 | 0.884 | 0.932 | 0.924 |
| 2 | you-shun-you-de | 0.931 | 0.926 | 0.928 | 0.905 | 0.913 | 0.921 |
| 3 | baselineavengers | 0.925 | 0.869 | 0.882 | 0.875 | 0.869 | 0.886 |
| - | Baseline | 0.751 | 0.780 | 0.734 | 0.720 | 0.720 | 0.741 |
Note: Our results are evaluated on our own test split and are not directly comparable to the official competition leaderboard.
## Retraining the Classifier
If you want to retrain the SVM (e.g., with additional data), you can regenerate the embeddings from the raw data:
```python
from sentence_transformers import SentenceTransformer
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

# Load the embedding model
model = SentenceTransformer(
    'Alejandro-Pardo/voight-kampff-pan2024-gte-en-v1.5',
    trust_remote_code=True
)

# Generate embeddings from your data
train_embeddings = model.encode(train_texts, show_progress_bar=True, batch_size=32)
test_embeddings = model.encode(test_texts, show_progress_bar=True, batch_size=32)

# Train the SVM, then calibrate the already-fitted model on held-out data.
# Note: the `estimator` keyword requires scikit-learn >= 1.2, and
# cv='prefit' is deprecated from 1.6 in favor of FrozenEstimator.
svm = LinearSVC(random_state=42, max_iter=10000)
svm.fit(train_embeddings, train_labels)
calibrated_svm = CalibratedClassifierCV(estimator=svm, cv='prefit')
calibrated_svm.fit(test_embeddings, test_labels)
```
The training data is available in the GitHub repository.
## Limitations
- Requires the embedding model: This classifier only works with embeddings from voight-kampff-pan2024-gte-en-v1.5.
- English only: Trained on English text data only.
- Short texts: Performance degrades on very short texts (1-2 chunks), where chunk-level accuracy (~80%) is the bottleneck.
- Pickle format: Loading requires a compatible scikit-learn version.
- Evaluation caveat: Results are on our own test split, not the official PAN 2024 evaluation set.
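Given the pickle caveat above, it can help to fail fast on a bad or incompatible load. A defensive sketch (the file name and check are illustrative; consult the repository for the scikit-learn version used to train):

```python
import pickle

def load_classifier(path: str):
    """Load the pickled classifier and sanity-check its interface."""
    with open(path, "rb") as f:
        clf = pickle.load(f)
    # A calibrated classifier must expose predict_proba; anything else
    # indicates a wrong file or an incompatible unpickling.
    if not hasattr(clf, "predict_proba"):
        raise TypeError("expected a calibrated classifier with predict_proba")
    return clf
```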
## Authors
- Alejandro Pardo Bascuñana - Universidad Politécnica de Madrid
- Pedro Amaya Moreno - Universidad Politécnica de Madrid
Developed as part of the NLP course in the Master's program *Aprendizaje Automático y Datos Masivos* (Machine Learning and Massive Data) at UPM (2024-2025).
## Citation

```bibtex
@misc{pardo2025voightkampff,
  title={Voight-Kampff: Contrastive Embedding Learning for AI-Generated Text Detection},
  author={Pardo-Bascu{\~n}ana, Alejandro and Amaya-Moreno, Pedro},
  year={2025},
  url={https://github.com/Alejandro-Pardo/voight-kampff-pan2024/}
}
```
## Links
- GitHub: https://github.com/Alejandro-Pardo/voight-kampff-pan2024/
- Embedding Model: Alejandro-Pardo/voight-kampff-pan2024-gte-en-v1.5
- Competition: PAN @ CLEF 2024 - Generative AI Authorship Verification
- Base Model: Alibaba-NLP/gte-base-en-v1.5