---
license: mit
tags:
- text-classification
- ai-detection
- human-vs-ai
- binary-classification
- ensemble-methods
- bert
- lstm
- xgboost
- machine-learning
- deep-learning
datasets:
- hc3
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
- roc-auc
library_name: scikit-learn
pipeline_tag: text-classification
---

# Human vs. AI Text Classifier

[Python](https://www.python.org/downloads/) · [scikit-learn](https://scikit-learn.org/) · [PyTorch](https://pytorch.org/) · [TensorFlow](https://tensorflow.org/) · [MIT License](https://opensource.org/licenses/MIT) · [GitHub](https://github.com/huzaifanasir95/Human-vs-AI-Classifier)

## Model Description

A comprehensive ensemble-based text classification system that distinguishes human-written from AI-generated text with high accuracy. The implementation combines **traditional machine learning** (Logistic Regression, Random Forest, SVM, XGBoost) and **deep learning** (BiLSTM with attention, BERT) using advanced ensemble techniques.

**Key Features:**
- 6 diverse classifiers (4 traditional ML + 2 deep learning)
- 5,015-dimensional hybrid feature space (5,000 TF-IDF + 15 linguistic features)
- 4 ensemble strategies (hard/soft voting, weighted average, stacking)
- 99.59% F1-score with the weighted ensemble
- Balanced performance (99.59% precision, recall, and accuracy)
- Trained on 52,452 samples from the HC3 dataset

## Model Architecture

```
Input Text
    ↓
[Feature Engineering]
    ├─→ TF-IDF Vectorization (5,000 features)
    │     - Unigrams & Bigrams
    │     - Max DF: 0.95, Min DF: 2
    │
    └─→ Linguistic Features (15 features)
          - Text length, word count, sentence count
          - Lexical diversity (TTR)
          - Stopword/punctuation ratios
          - Statistical text properties
    ↓
Multi-Modal Feature Vector (5,015 dimensions)
    ↓
┌──────────────────────────────────────────────┐
│         Base Classifiers (Parallel)          │
├──────────────────────────────────────────────┤
│  Traditional ML         │  Deep Learning     │
├─────────────────────────┼────────────────────┤
│ • Logistic Regression   │ • BERT             │
│ • Random Forest (200)   │   (bert-base)      │
│ • SVM (RBF kernel)      │ • BiLSTM+Attention │
│ • XGBoost (200 trees)   │   (64 units)       │
└─────────────────────────┴────────────────────┘
    ↓
[Ensemble Aggregation]
    ├─→ Hard Voting (Majority vote)
    ├─→ Soft Voting (Probability averaging)
    ├─→ Weighted Average (Optimized weights)
    └─→ Stacking (Meta-learner: Logistic Regression)
    ↓
Final Prediction: Human (0) or AI (1)
```

**Individual Model Specifications:**

| Model | Type | Parameters | Key Configuration |
|-------|------|------------|-------------------|
| **Logistic Regression** | Linear | 5,015 | C=1.0, L2 regularization, LBFGS solver |
| **Random Forest** | Ensemble Trees | - | 200 estimators, unlimited depth |
| **SVM** | Kernel Method | - | RBF kernel, C=1.0, gamma=scale |
| **XGBoost** | Gradient Boosting | - | 200 trees, depth=7, LR=0.1 |
| **BiLSTM** | Recurrent NN | ~500K | 64 units/dir, attention, dropout=0.5 |
| **BERT** | Transformer | 110M | bert-base-uncased, max_len=128 |

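For concreteness, here is a minimal sketch of how the four traditional base classifiers could be instantiated with the hyperparameters from the table (scikit-learn and XGBoost APIs). `max_iter` and `probability=True` are added assumptions: the former for convergence on the 5,015-dimensional inputs, the latter so the SVM can emit probabilities for soft voting. The two deep models are omitted here.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

base_classifiers = {
    # C=1.0, L2 regularization, LBFGS solver (max_iter raised for convergence)
    "logistic_regression": LogisticRegression(
        C=1.0, penalty="l2", solver="lbfgs", max_iter=1000),
    # 200 estimators, unlimited depth
    "random_forest": RandomForestClassifier(n_estimators=200, max_depth=None),
    # RBF kernel; probability=True so the ensemble can average probabilities
    "svm": SVC(kernel="rbf", C=1.0, gamma="scale", probability=True),
    # 200 trees, depth=7, learning rate 0.1
    "xgboost": XGBClassifier(n_estimators=200, max_depth=7, learning_rate=0.1),
}
```
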
## Performance

### Individual Models

| Model | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
|-------|----------|-----------|--------|----------|---------|
| **XGBoost** | 0.9903 | 0.9838 | 0.9970 | 0.9904 | 0.9994 |
| **Logistic Regression** | 0.9897 | 0.9827 | 0.9970 | 0.9898 | 0.9996 |
| **SVM** | 0.9867 | 0.9807 | 0.9929 | 0.9867 | 0.9991 |
| **BERT** | 0.9727 | 0.9510 | 0.9967 | 0.9733 | 0.9975 |
| **BiLSTM** | 0.9710 | 0.9668 | 0.9756 | 0.9712 | 0.9963 |
| **Random Forest** | 0.9573 | 0.9571 | 0.9576 | 0.9573 | 0.9922 |

### Ensemble Methods

| Method | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
|--------|----------|-----------|--------|----------|---------|
| **Weighted Average** ⭐ | **0.9959** | **0.9959** | **0.9959** | **0.9959** | **0.9998** |
| **Stacking** | 0.9956 | 0.9947 | 0.9964 | 0.9956 | 0.9998 |
| **Soft Voting** | 0.9945 | 0.9937 | 0.9954 | 0.9945 | 0.9998 |
| **Hard Voting** | 0.9921 | 0.9944 | 0.9898 | 0.9921 | 0.9998 |

**Optimized Ensemble Weights:**
- XGBoost: 0.25
- Logistic Regression: 0.20
- BERT: 0.20
- SVM: 0.15
- Random Forest: 0.10
- BiLSTM: 0.10

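As a concrete illustration of the weighted-average strategy, here is a minimal sketch that combines the six models' probabilities with the weights above (a dict of fitted models exposing a `predict_proba`-style interface is assumed):

```python
# Optimized ensemble weights from the list above, keyed by model name
WEIGHTS = {
    "xgboost": 0.25, "logistic_regression": 0.20, "bert": 0.20,
    "svm": 0.15, "random_forest": 0.10, "bilstm": 0.10,
}

def weighted_ensemble_proba(models, X):
    """Weighted average of each model's P(AI) for the samples in X.

    `models` maps the names in WEIGHTS to fitted classifiers whose
    predict_proba(X) returns an (n_samples, 2) array; weights sum to 1.
    """
    return sum(w * models[name].predict_proba(X)[:, 1]
               for name, w in WEIGHTS.items())

# Final label: AI (1) if the weighted P(AI) exceeds 0.5, else Human (0)
# predictions = (weighted_ensemble_proba(models, X) >= 0.5).astype(int)
```
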
**Confusion Matrix (Weighted Ensemble):**
```
                 Predicted
                Human    AI
Actual  Human    3918    16
        AI         16  3918
```
- Total Errors: 32 / 7,868 (0.41%)
- False Positives: 16 (0.20%)
- False Negatives: 16 (0.20%)

## Training Details

**Dataset:**
- **Name**: HC3 (Human-ChatGPT Comparison Corpus)
- **Total Samples**: 52,452 (balanced human/AI pairs)
- **Training**: 36,716 (70%)
- **Validation**: 7,868 (15%)
- **Test**: 7,868 (15%)
- **Domains**: Finance, Medicine, Open QA, Reddit ELI5, Wikipedia CS/AI
- **Minimum Length**: 50 characters
- **Balance**: 50-50 (Human-AI)

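A minimal sketch of that stratified 70/15/15 split using scikit-learn (`texts` and `labels` are placeholders for the loaded corpus, and the fixed `random_state` is an assumption for reproducibility):

```python
from sklearn.model_selection import train_test_split

# texts: list of documents; labels: 0 = Human, 1 = AI (assumed to exist)
# First carve off the 70% training set, stratifying on the label so the
# 50-50 human/AI balance is preserved in every split.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    texts, labels, train_size=0.70, stratify=labels, random_state=42)

# Split the remaining 30% evenly into validation and test (15% each overall).
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)
```
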
**Feature Engineering:**
- **TF-IDF**: 5,000 dimensions (unigrams + bigrams)
- **Linguistic**: 15 handcrafted features
  - Text statistics (length, word/sentence counts)
  - Lexical diversity (Type-Token Ratio)
  - Character ratios (stopwords, punctuation, digits, capitals)
  - Structural patterns (long/short words, question/exclamation marks)

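A minimal sketch of how the hybrid 5,015-dimensional vector can be assembled, using scikit-learn's `TfidfVectorizer` with the settings above; the handcrafted features shown are an illustrative subset of the 15, not the exact implementation:

```python
import string
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer

# TF-IDF: unigrams + bigrams, 5,000 features, max_df=0.95, min_df=2
tfidf = TfidfVectorizer(max_features=5000, ngram_range=(1, 2),
                        max_df=0.95, min_df=2)

def linguistic_features(text):
    """Illustrative subset of the 15 handcrafted features."""
    words = text.split()
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)
    return [
        len(text),                                            # text length
        len(words),                                           # word count
        text.count(".") + text.count("!") + text.count("?"),  # ~sentence count
        len({w.lower() for w in words}) / n_words,            # type-token ratio
        sum(c in string.punctuation for c in text) / n_chars, # punctuation ratio
    ]

def build_features(texts, fit=False):
    tfidf_part = tfidf.fit_transform(texts) if fit else tfidf.transform(texts)
    ling_part = csr_matrix(np.array([linguistic_features(t) for t in texts]))
    # Final shape: (n_samples, 5000 + number of handcrafted features)
    return hstack([tfidf_part, ling_part]).tocsr()
```
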
**Training Configuration:**

*Traditional ML Models:*
- Framework: scikit-learn 1.3+
- Cross-validation: 5-fold (for stacking; see the sketch below)
- Class balance: Maintained via stratified splitting

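A minimal sketch of the stacking configuration under these settings, using scikit-learn's `StackingClassifier`. Only the scikit-learn-compatible base models are shown; in the full system the two deep models contribute their out-of-fold probabilities as well.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from xgboost import XGBClassifier

stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(C=1.0, max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=200)),
        ("svm", SVC(kernel="rbf", C=1.0, probability=True)),
        ("xgb", XGBClassifier(n_estimators=200, max_depth=7, learning_rate=0.1)),
    ],
    final_estimator=LogisticRegression(),  # meta-learner
    cv=5,                        # 5-fold CV for out-of-fold meta-features
    stack_method="predict_proba",
)
# stack.fit(X_train, y_train); stack.predict(X_test)
```
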
*Deep Learning Models:*
- **BiLSTM**: 10 epochs (early stopped at 4), batch=64, Adam optimizer (LR=1e-3)
- **BERT**: 2 epochs, batch=16, AdamW optimizer (LR=2e-5), warmup=500 steps

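A minimal sketch of the BERT fine-tuning setup with these hyperparameters, using the Hugging Face `transformers` API. The data pipeline is omitted, and the steps-per-epoch arithmetic assumes the 36,716 training samples and batch size 16 stated above.

```python
from torch.optim import AdamW
from transformers import (BertForSequenceClassification, BertTokenizerFast,
                          get_linear_schedule_with_warmup)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # binary: Human (0) vs. AI (1)

optimizer = AdamW(model.parameters(), lr=2e-5)  # AdamW, LR=2e-5
steps_per_epoch = 36716 // 16                   # training samples / batch size
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=500,                       # warmup=500 steps
    num_training_steps=2 * steps_per_epoch)     # 2 epochs

# Inputs are tokenized to max_length=128, e.g.:
# batch = tokenizer(texts, truncation=True, padding="max_length",
#                   max_length=128, return_tensors="pt")
```
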
**Hardware:**
- Training: CPU/GPU compatible
- BiLSTM training time: 3,406 seconds (4 epochs)
- BERT training time: variable (depends on GPU)

## Usage

### Installation

```bash
git clone https://github.com/huzaifanasir95/Human-vs-AI-Classifier.git
cd Human-vs-AI-Classifier
pip install -r requirements.txt
```

### Download Models

```python
from huggingface_hub import hf_hub_download
import pickle

REPO_ID = "huzaifanasirrr/human-vs-ai-text-classifier"

# Download the traditional ML models, keeping each one keyed by name
models = {}
for model_name in ['logistic_regression', 'random_forest', 'svm', 'xgboost']:
    model_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=f"models/{model_name}.pkl"
    )
    with open(model_path, 'rb') as f:
        models[model_name] = pickle.load(f)

# Download the deep learning checkpoints
bert_path = hf_hub_download(repo_id=REPO_ID, filename="models/bert_best.pt")
bilstm_path = hf_hub_download(repo_id=REPO_ID, filename="models/bilstm_best.h5")
```

### Inference (Weighted Ensemble)

```python
from src.feature_extractor import FeatureExtractor
from src.models.ensemble import WeightedEnsemble

# Initialize the feature extractor (5,000 TF-IDF + 15 linguistic features)
feature_extractor = FeatureExtractor(
    max_features=5000,
    ngram_range=(1, 2)
)

# Extract features from text
text = "Your text to classify here..."
features = feature_extractor.extract(text)  # shape: (5015,)

# Build the ensemble; the weights follow the model order
# (LR, RF, SVM, XGBoost, BERT, BiLSTM). bert_model and bilstm_model are
# assumed to be loaded from the checkpoints downloaded above.
ensemble = WeightedEnsemble(
    models=[models['logistic_regression'], models['random_forest'],
            models['svm'], models['xgboost'], bert_model, bilstm_model],
    weights=[0.20, 0.10, 0.15, 0.25, 0.20, 0.10]
)

# Predict: 0 = Human, 1 = AI
prediction = ensemble.predict(features)
probability = ensemble.predict_proba(features)

if prediction == 0:
    print(f"Human-written (confidence: {probability[0]:.2%})")
else:
    print(f"AI-generated (confidence: {probability[1]:.2%})")
```

### Single Model Inference

```python
# Using XGBoost (the best individual model)
xgb_model = models['xgboost']
xgb_prediction = xgb_model.predict(features.reshape(1, -1))
xgb_proba = xgb_model.predict_proba(features.reshape(1, -1))

print(f"Prediction: {'AI' if xgb_prediction[0] else 'Human'}")
print(f"Confidence: {xgb_proba[0][xgb_prediction[0]]:.2%}")
```

## Key Innovations

1. **Hybrid Feature Engineering**: Combines vocabulary-based TF-IDF with linguistic style features
2. **Multi-Paradigm Ensemble**: Integrates linear models, tree ensembles, kernel methods, and neural networks
3. **Optimized Weighting**: Performance-based weight assignment for ensemble members
4. **Balanced Performance**: Equal precision and recall (99.59%) indicates no systematic bias toward either class
5. **Domain Diversity**: Trained across 5 different text domains for robust generalization

## Feature Importance

Based on XGBoost analysis:

| Feature Type | Importance |
|--------------|-----------|
| TF-IDF Features | 89.2% |
| Average Sentence Length | 4.3% |
| Lexical Diversity (TTR) | 2.7% |
| Unique Words Ratio | 1.5% |
| Average Word Length | 1.1% |
| Others | 1.2% |

**Insight**: Vocabulary patterns dominate, but the linguistic features provide crucial complementary information.

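A minimal sketch of how such a breakdown can be computed from the fitted XGBoost model, assuming the feature layout above (5,000 TF-IDF columns first, then the 15 handcrafted features); the index-to-name mapping is omitted:

```python
import numpy as np

# Normalized importance per input column of a fitted XGBClassifier
importances = xgb_model.feature_importances_   # shape: (5015,)

tfidf_share = importances[:5000].sum()         # all TF-IDF columns pooled
linguistic = importances[5000:]                # the 15 handcrafted features

print(f"TF-IDF features (pooled): {tfidf_share:.1%}")
for idx in np.argsort(linguistic)[::-1]:       # most important first
    print(f"linguistic feature #{idx}: {linguistic[idx]:.1%}")
```
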
## Limitations

- **Dataset Specificity**: Trained on ChatGPT-generated text; may not generalize to other LLMs (GPT-4, Claude, Gemini)
- **Domain Dependency**: Best performance on domains similar to the training data
- **Temporal Drift**: As LLMs evolve, the detection patterns learned here may become obsolete
- **Adversarial Vulnerability**: Not evaluated against deliberate evasion attempts
- **Language**: English-only (no multilingual support)
- **Computational Cost**: The full ensemble requires running all 6 models at inference time

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{nasir2025humanaiclassifier,
  title={Human vs. AI Text Classification: A Comprehensive Study Using Machine Learning and Deep Learning Approaches},
  author={Nasir, Huzaifa},
  year={2025},
  note={National University of Computer and Emerging Sciences, Pakistan. Hugging Face: https://huggingface.co/huzaifanasirrr/human-vs-ai-text-classifier}
}
```

**HC3 Dataset:**
```bibtex
@article{guo2023hc3,
  title={How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection},
  author={Guo, Biyang and Zhang, Xin and Wang, Ziyuan and Jiang, Minqi and Nie, Jinran and Ding, Yuxuan and ... and Wu, Yupeng},
  journal={arXiv preprint arXiv:2301.07597},
  year={2023}
}
```

## Model Files

- `models/*.pkl` - Traditional ML models (Logistic Regression, Random Forest, SVM, XGBoost)
- `models/bert_best.pt` - Fine-tuned BERT model checkpoint
- `models/bilstm_best.h5` - BiLSTM with Attention model
- `results/*.json` - Comprehensive performance metrics
- `data/feature_info.json` - Feature vocabulary and metadata
- `visualizations/*.png` - Training curves, confusion matrices, ROC curves, comparisons
- `config.yaml` - Configuration settings
- `research_paper.tex` - Full research paper (Springer LNCS format)

## Ethical Considerations

⚠️ **Important Notice:**

This model is designed for research and educational purposes. When deploying it for real-world applications:

- **Transparency**: Inform users when text is subject to AI detection
- **Fairness**: Evaluate for bias against non-native speakers or specific writing styles
- **Privacy**: Respect user privacy and data protection regulations
- **Accuracy**: Do not use as definitive proof; false positives (0.2%) can occur
- **Context**: Use as one signal among many, not as sole evidence
- **Appeals**: Provide mechanisms for users to contest decisions

Detection systems should support human judgment, not replace it.

## Author

**Huzaifa Nasir**
📧 nasirhuzaifa95@gmail.com
🎓 National University of Computer and Emerging Sciences (FAST-NUCES), Pakistan
🔗 [GitHub Repository](https://github.com/huzaifanasir95/Human-vs-AI-Classifier)
🆔 ORCID: [0009-0000-1482-3268](https://orcid.org/0009-0000-1482-3268)

## License

MIT License - see the LICENSE file for details.

## Acknowledgments

This project builds upon:
- **HC3 Dataset**: Human-ChatGPT comparison corpus ([Guo et al., 2023](https://arxiv.org/abs/2301.07597))
- **BERT**: Pre-trained language model ([Devlin et al., 2018](https://arxiv.org/abs/1810.04805))
- **XGBoost**: Gradient boosting framework ([Chen & Guestrin, 2016](https://arxiv.org/abs/1603.02754))
- **scikit-learn**: Machine learning library ([Pedregosa et al., 2011](https://jmlr.org/papers/v12/pedregosa11a.html))

Research conducted at FAST-NUCES Islamabad. Special thanks to the open-source community.

---

**Status**: ✅ Production-ready | Last updated: January 2025