RoBERTa-Large CV AI Detection Model

Model Overview

This model is a fine-tuned RoBERTa-Large transformer trained to detect whether a curriculum vitae (CV) was written by a human, generated by AI, or produced through a combination of both.

The model was trained using the dataset:

jamal-ibrahim/ai-cv-detection-dataset

The task is formulated as a three-class classification problem:

| Label | Meaning |
|---|---|
| human | CV written entirely by a human |
| mixed | CV containing both human-written and AI-generated content |
| ai_generated | CV generated primarily by an AI system |
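As a rough illustration of how the three labels might be used at inference time, the sketch below defines a label mapping and a small helper around the Hugging Face `pipeline` API. The integer order of the labels, the repository id `jamal-ibrahim/roberta-large-cv-detector`, and the helper name `classify_cv` are assumptions made for the example. The label string actually returned depends on the checkpoint's configuration.

```python
# Label names come from the table above; the integer order is an assumption.
LABELS = ["human", "mixed", "ai_generated"]
ID2LABEL = {idx: name for idx, name in enumerate(LABELS)}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

# Assumed Hub repository id for the fine-tuned checkpoint.
MODEL_ID = "jamal-ibrahim/roberta-large-cv-detector"


def classify_cv(text: str, model_id: str = MODEL_ID) -> dict:
    """Return the top prediction for one CV as {'label': ..., 'score': ...}."""
    # Imported lazily so the label mapping can be used without transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # Truncate long CVs to 512 tokens, the maximum input length of roberta-large.
    return classifier(text, truncation=True, max_length=512)[0]
```

Calling `classify_cv("...")` downloads the checkpoint on first use. CVs longer than 512 tokens are truncated, so only the beginning of a long document influences the prediction.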

The goal of this experiment was to explore whether modern transformer models can identify stylistic differences between human-written and AI-generated professional documents.


Experimental Setup

Model Architecture

Base model:

roberta-large

RoBERTa-Large contains approximately 355 million parameters and is known to perform well on tasks involving linguistic style, authorship detection, and semantic classification.

Dataset

Dataset used for training:

jamal-ibrahim/ai-cv-detection-dataset

Total dataset size: approximately 1500 CV documents.

The dataset was split using a stratified strategy to maintain class balance.

Split configuration:

| Split | Percentage |
|---|---|
| Train | 70% |
| Validation | 15% |
| Test | 15% |

Each class (human, mixed, ai_generated) is represented equally in each split.
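The split described above can be sketched in plain Python as follows. This is a minimal illustration, not the script used to build the dataset. The function name `stratified_split`, the seed, and the assumption of exactly 500 documents per class are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split example indices into train/validation/test, class by class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test split
    return train, val, test


# Hypothetical balanced corpus: 500 documents per class, 1500 in total.
labels = ["human"] * 500 + ["mixed"] * 500 + ["ai_generated"] * 500
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 1050 225 225
```

Under these assumptions each class contributes 350 training, 75 validation, and 75 test documents.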


Training Configuration

Training parameters:

| Parameter | Value |
|---|---|
| Epochs | 10 |
| Batch size | 16 |
| Learning rate | 5e-6 |
| Optimizer | AdamW |
| Max sequence length | 512 |
| Evaluation strategy | per epoch |

Training was performed using the Hugging Face Trainer API.
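To make the scale of this run concrete, the sketch below restates the configuration as a dictionary and derives the number of optimizer steps. The key names follow `TrainingArguments` naming where an equivalent exists. The training-set size of 1050, the single device, and the absence of gradient accumulation are assumptions, not details confirmed by the training run.

```python
import math

# Values from the training configuration. Key names follow Hugging Face
# TrainingArguments where an equivalent exists; "max_seq_length" is a
# tokenizer-side setting, not a TrainingArguments field.
config = {
    "num_train_epochs": 10,
    "per_device_train_batch_size": 16,
    "learning_rate": 5e-6,
    "optim": "adamw_torch",    # AdamW; the exact variant is an assumption
    "eval_strategy": "epoch",  # "evaluation_strategy" in older transformers releases
    "max_seq_length": 512,
}

# Assumed: 70% of roughly 1500 documents, one device, no gradient accumulation.
train_examples = 1050

steps_per_epoch = math.ceil(train_examples / config["per_device_train_batch_size"])
total_steps = steps_per_epoch * config["num_train_epochs"]
print(steps_per_epoch, total_steps)  # 66 660
```

With these numbers the model sees only a few hundred updates, which is one reason a small learning rate and per-epoch evaluation are a reasonable pairing here.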


Results

Evaluation was performed on the held-out test set.

Confusion Matrix

| True / Predicted | Human | Mixed | AI |
|---|---|---|---|
| Human | 75 | 0 | 0 |
| Mixed | 0 | 74 | 1 |
| AI Generated | 0 | 0 | 75 |

Only one of the 225 test documents was misclassified: a mixed CV was predicted as AI-generated.

Metrics

| Metric | Value |
|---|---|
| Accuracy | ~0.995 |
| Weighted F1 | ~0.996 |
| Macro F1 | ~0.995 |

The model demonstrates near-perfect classification performance on this dataset.
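The reported metrics can be rechecked directly from the confusion matrix. The sketch below is a plain-Python recomputation using the standard definitions of precision, recall, and F1. It is not the evaluation code used for the experiment.

```python
# Rows are true classes, columns are predicted classes (human, mixed, ai_generated).
confusion = [
    [75, 0, 0],
    [0, 74, 1],
    [0, 0, 75],
]

n = len(confusion)
total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(n)) / total

f1_scores, supports = [], []
for i in range(n):
    true_pos = confusion[i][i]
    predicted = sum(confusion[r][i] for r in range(n))  # column sum
    actual = sum(confusion[i])                          # row sum
    precision = true_pos / predicted
    recall = true_pos / actual
    f1_scores.append(2 * precision * recall / (precision + recall))
    supports.append(actual)

macro_f1 = sum(f1_scores) / n
weighted_f1 = sum(f * s for f, s in zip(f1_scores, supports)) / total

print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
# accuracy=0.9956 macro_f1=0.9956 weighted_f1=0.9956
```

Because every class has the same support (75 documents), macro and weighted F1 coincide. All three values come to about 0.9956, consistent with the rounded figures reported.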


Interpretation of Results

The very high test accuracy suggests that the model learned stylistic patterns that differentiate:

  • human-authored CVs
  • AI-generated CVs
  • hybrid documents

Several factors may contribute to this performance.

Structural Patterns in AI-Generated CVs

Large language models often produce documents with consistent structural patterns such as:

  • uniform sentence lengths
  • repeated phrasing patterns
  • symmetrical bullet point structures
  • consistent grammatical style

These patterns may create detectable stylistic signatures.
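Two of these signals, sentence-length uniformity and repeated phrasing, can be approximated with simple statistics. The sketch below is purely illustrative. The function `style_features` and its feature names are hypothetical, and nothing here reflects what the fine-tuned model actually computes internally.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def style_features(text: str) -> dict:
    """Compute a few crude style statistics for a block of CV text."""
    # Naive sentence split on ., !, ? or line breaks; real CVs need better handling.
    sentences = [s.strip() for s in re.split(r"[.!?\n]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]

    # Count how often each sentence-opening word is reused.
    openers = Counter(s.split()[0].lower() for s in sentences)
    repeated = sum(count for count in openers.values() if count > 1)

    return {
        "mean_sentence_length": mean(lengths) if lengths else 0.0,
        "sentence_length_std": pstdev(lengths) if lengths else 0.0,
        "repeated_opener_ratio": repeated / len(sentences) if sentences else 0.0,
    }


sample = "Led a team of five. Led a migration to AWS. Built dashboards for sales."
print(style_features(sample))
```

A low `sentence_length_std` together with a high `repeated_opener_ratio` would correspond to the uniform, repetitive style described above, although such hand-built features are far weaker than a fine-tuned transformer.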

Human Writing Variability

Human-written CVs often contain:

  • irregular formatting
  • inconsistent sentence structures
  • varied vocabulary
  • domain-specific phrasing

This variability may make human documents distinguishable from AI-generated ones.

Mixed Documents as a Transitional Style

Documents labeled as mixed appear to occupy an intermediate stylistic space between human and AI-generated text.

The single observed misclassification (mixed → AI-generated) suggests that documents heavily edited by AI may resemble fully generated content.


Limitations

Although the results are strong, several limitations should be considered.

Dataset Size

The dataset contains approximately 1500 samples, which is relatively small for fine-tuning a model with roughly 355 million parameters.

High performance may partly reflect the limited diversity of the dataset.

Potential Template Effects

If AI-generated CVs were produced using similar prompts or generation patterns, the model may learn those specific templates rather than generalizable AI-writing signals.

Domain Specificity

This model was trained specifically on CV-style documents and may not generalize well to other types of text such as essays, emails, or reports.


Future Work

Future experiments could explore several directions:

  • cross-validation to measure performance stability
  • testing on unseen CV datasets
  • evaluating different LLM-generated CV styles
  • reducing model size using distillation or quantization
  • hybrid approaches combining stylistic features with transformer embeddings
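For the cross-validation item, one possible starting point is a stratified k-fold partition that keeps the three classes balanced in every fold. The sketch below assumes five folds and a hypothetical corpus of 500 documents per class. The function name `stratified_kfold` is illustrative, and model training inside the loop is left out.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs with per-class balance."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class's examples round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical balanced corpus: 500 documents per class.
labels = ["human"] * 500 + ["mixed"] * 500 + ["ai_generated"] * 500
for train_idx, test_idx in stratified_kfold(labels):
    print(len(train_idx), len(test_idx))  # 1200 300 for each of the 5 folds
```

Reporting the mean and spread of accuracy across folds would show how much the single-split result depends on which 225 documents happened to land in the test set.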

Intended Use

Potential applications include:

  • research on AI-generated text detection
  • analysis of AI assistance in professional writing
  • experimentation with authorship attribution in structured documents

The model should not be used as a definitive tool for determining authorship in real-world hiring decisions.


Citation

@model{roberta_large_cv_detector,
  title     = {RoBERTa-Large CV AI Detection Model},
  author    = {Jamal Ibrahim},
  year      = {2026},
  dataset   = {jamal-ibrahim/ai-cv-detection-dataset},
  publisher = {Hugging Face}
}
