---
license: mit
language:
- en
base_model:
- Qwen/Qwen3-Embedding-0.6B
pipeline_tag: text-classification
---

# Argus

**AI-Generated Text Detection Classifier**

Argus is a binary text classifier that predicts whether a text was written by a human or generated by AI. It is fine-tuned from [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) using a classification head with last-token pooling.

## Features

- **High Accuracy**: 98.86% accuracy on the validation set used during training
- **Long Context**: Supports sequences up to 4,096 tokens, with automatic chunking for longer texts
- **Fast Inference**: Optimized with Flash Attention 2 and bfloat16 precision on CUDA
- **Batch Processing**: Parallel tokenization and batched inference for high throughput

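The chunking mentioned under Long Context is not shown in this card. The following is a minimal sketch of one way it could work, assuming fixed-size windows over token IDs and a plain mean of the per-chunk AI probabilities. The names `chunk_token_ids` and `aggregate_ai_probability`, the `overlap` parameter, and the averaging rule are all hypothetical and are not taken from Argus.

```python
from typing import List, Sequence

MAX_TOKENS = 4096  # context limit stated in this card


def chunk_token_ids(token_ids: Sequence[int], max_len: int = MAX_TOKENS,
                    overlap: int = 0) -> List[List[int]]:
    """Split token IDs into windows of at most max_len tokens.

    `overlap` (hypothetical) is how many tokens consecutive windows share.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    if not token_ids:
        return []
    step = max_len - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(list(token_ids[start:start + max_len]))
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the text
    return chunks


def aggregate_ai_probability(chunk_probs: Sequence[float]) -> float:
    """Average the per-chunk P(ai) values (an assumed aggregation rule)."""
    if not chunk_probs:
        raise ValueError("need at least one chunk probability")
    return sum(chunk_probs) / len(chunk_probs)
```

Each chunk would be classified separately and the resulting probabilities combined. Averaging is only one option; a length-weighted mean or a maximum could suit some use cases better.
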
## Installation

```bash
pip install torch transformers
```

## Minimal Inference Example

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer, AutoConfig


class Qwen3ForSequenceClassification(nn.Module):
    """Qwen3-Embedding with classification head using last-token pooling."""

    def __init__(self, model_name="Qwen/Qwen3-Embedding-0.6B", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(
            model_name,
            torch_dtype=torch.bfloat16,
            trust_remote_code=True,
        )
        hidden_size = AutoConfig.from_pretrained(model_name).hidden_size
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Last-token pooling (not CLS token). Indexing with -1 relies on
        # left padding, so the final position is always a real token.
        pooled = outputs.last_hidden_state[:, -1]
        # Cast to the head's dtype so a bfloat16 encoder output and a
        # float32 linear layer do not raise a dtype mismatch.
        pooled = pooled.to(self.classifier.weight.dtype)
        return self.classifier(pooled)


# Load model and weights
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Qwen3ForSequenceClassification()
model.load_state_dict(torch.load("weights/model.pt", map_location=device, weights_only=True))
model.to(device).eval()

# Load tokenizer (left padding keeps the last position a real token)
tokenizer = AutoTokenizer.from_pretrained("weights/", padding_side="left", trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Inference
text = "Your text to classify here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    logits = model(inputs["input_ids"], inputs["attention_mask"])
    probs = F.softmax(logits.float(), dim=-1)[0]

# Index 0 is "human", index 1 is "ai"
label = "ai" if probs[1] > probs[0] else "human"
confidence = probs[1].item() if label == "ai" else probs[0].item()

print(f"Label: {label}, Confidence: {confidence:.2%}")
# Example output: Label: human, Confidence: 94.32%
```

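The last step of the example turns two logits into a label and a confidence. As a dependency-free illustration of that arithmetic, the sketch below reproduces it in plain Python. The helper name `decide` and the sample logits are made up for illustration; index 0 is taken as `human` and index 1 as `ai`, matching the example.

```python
import math
from typing import List, Sequence, Tuple

LABELS = ("human", "ai")  # index 0 = human, index 1 = ai


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits: Sequence[float]) -> Tuple[str, float]:
    """Return (label, confidence); confidence is the winning class probability."""
    probs = softmax(logits)
    # Ties go to "human", mirroring the strict `>` comparison in the example.
    index = 1 if probs[1] > probs[0] else 0
    return LABELS[index], probs[index]


label, confidence = decide([2.0, -1.0])  # made-up logits, not model output
print(f"Label: {label}, Confidence: {confidence:.2%}")
```

With two classes the reported confidence is always at least 50%, because it is the larger of the two probabilities.
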
## Model Architecture

| Component | Details |
|-----------|---------|
| Base Model | Qwen/Qwen3-Embedding-0.6B |
| Hidden Size | 1024 |
| Parameters | ~600M |
| Pooling | Last-token (not CLS) |
| Classification Head | Linear (1024 → 2) |
| Precision | bfloat16 (CUDA) / float32 (CPU) |

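Last-token pooling takes the hidden state at the final real token of each sequence. Indexing the last position directly, as `[:, -1]` does, is only correct when padding is on the left, which is why the tokenizer is loaded with `padding_side="left"`. The sketch below, in plain Python over attention-mask rows, shows where the last real token sits for either padding side. `last_token_index` and `last_token_indices` are hypothetical helpers, not part of Argus or transformers.

```python
from typing import List, Sequence


def last_token_index(mask_row: Sequence[int]) -> int:
    """Index of the last position whose attention mask is 1."""
    for i in range(len(mask_row) - 1, -1, -1):
        if mask_row[i] == 1:
            return i
    raise ValueError("attention mask has no real tokens")


def last_token_indices(attention_mask: Sequence[Sequence[int]]) -> List[int]:
    """Apply last_token_index to every row of a batch mask."""
    return [last_token_index(row) for row in attention_mask]


# Two sequences of length 3 and 5, padded to length 5.
left_padded = [[0, 0, 1, 1, 1], [1, 1, 1, 1, 1]]
right_padded = [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]

print(last_token_indices(left_padded))   # [4, 4]
print(last_token_indices(right_padded))  # [2, 4]
```

With left padding every row ends at the same index, so plain `-1` indexing works. With right padding the index differs per row and would have to be gathered from the attention mask.
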
## Performance

| Metric | Score |
|--------|-------|
| Accuracy | 98.86% |

*Note: This metric was measured on the validation set during training. Real-world performance may vary depending on the domain and on the AI models used to generate the text.*

## Training Data

Trained on a combination of:
- [RAID](https://huggingface.co/datasets/liamdugan/raid) - Multi-domain, multi-model dataset
- [HC3](https://huggingface.co/datasets/Hello-SimpleAI/HC3) - Human vs ChatGPT responses
- [AI-human-text](https://huggingface.co/datasets/andythetechnerd03/AI-human-text)
- [AI Text Detection Pile](https://huggingface.co/datasets/artem9k/ai-text-detection-pile)

## License

MIT License

## Citation

If you use Argus in your research, please cite:

```bibtex
@software{,
  title={Argus: AI-Generated Text Detection Classifier},
  author={Xi Nai Lai},
  year={2026},
  url={https://huggingface.co/johnbean393/argus/}
}
```