---
title: Capstone Project
emoji: ⚡
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
short_description: AI vs Human text classifier
---
# 🤖 AI vs Human Text Classifier (RoBERTa)

This project fine-tunes **RoBERTa** to classify text as either:

- 🧑 Human-Written
- 🤖 AI-Generated

It was developed as a **Capstone Project** to explore the power of transformer-based models in detecting AI-generated content.

---

## 📌 Project Overview

With the rapid rise of LLMs such as GPT and other AI text generators, distinguishing human-written from AI-generated text is becoming crucial for education, research, and online authenticity.

This project leverages **RoBERTa**, a transformer-based model, to build a binary text classifier.

---
## 🛠️ Features

- Fine-tuned **RoBERTa-base** model
- Binary classification: `Human (0)` vs `AI (1)`
- Deployed with **Gradio** for easy interaction (see the sketch below)
- Model hosted on the **Hugging Face Model Hub**
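
For reference, here is a minimal sketch of what the Gradio app can look like. The repo id `your-username/ai-vs-human-roberta` and the `LABEL_0`/`LABEL_1` mapping are placeholders; the actual `app.py` in this Space may differ.

```python
# app.py — minimal Gradio demo (illustrative sketch, not necessarily the deployed app)
import gradio as gr
from transformers import pipeline

# Placeholder Hub repo id; replace with the actual fine-tuned model.
MODEL_ID = "your-username/ai-vs-human-roberta"

classifier = pipeline("text-classification", model=MODEL_ID)

# Assumed label mapping: LABEL_0 = Human, LABEL_1 = AI.
LABELS = {"LABEL_0": "🧑 Human-Written", "LABEL_1": "🤖 AI-Generated"}

def predict(text: str) -> dict:
    # Return {label: score} so gr.Label can render confidence bars for both classes.
    results = classifier(text, truncation=True, top_k=None)
    return {LABELS.get(r["label"], r["label"]): r["score"] for r in results}

demo = gr.Interface(
    fn=predict,
    inputs=gr.Textbox(lines=6, label="Text"),
    outputs=gr.Label(num_top_classes=2, label="Prediction"),
    title="AI vs Human Text Classifier (RoBERTa)",
)

if __name__ == "__main__":
    demo.launch()
```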
---

## 📂 Dataset

The dataset used for training contains two columns:

- **Text** → the input text sample
- **Generated** → the label (`0 = Human`, `1 = AI`)
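
A short sketch of how a two-column CSV like this could be loaded and split with 🤗 `datasets`; the file name `dataset.csv` is a placeholder, not the actual dataset path.

```python
# Sketch: load the Text / Generated columns and create a train/validation split.
# "dataset.csv" is a placeholder file name.
from datasets import load_dataset

dataset = load_dataset("csv", data_files="dataset.csv")["train"]
dataset = dataset.rename_columns({"Text": "text", "Generated": "label"})

# 90/10 split; label 0 = Human, 1 = AI.
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
print(train_ds[0])
```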
---

## 🚀 Training

The model was fine-tuned on Google Colab using the Hugging Face `transformers` library. A condensed sketch of the pipeline follows the steps below.

**Steps:**

1. Load the dataset (`Text`, `Generated`)
2. Preprocess with the Hugging Face `AutoTokenizer`
3. Fine-tune RoBERTa with the `Trainer` API
4. Evaluate with Accuracy, Precision, Recall, and F1-score
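
An illustrative sketch of steps 2–4, reusing the `train_ds` / `eval_ds` splits from the dataset sketch above. The hyperparameters are placeholders; the exact values used in the Colab notebook may differ.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

def tokenize(batch):
    # Pad/truncate to RoBERTa's 512-token limit.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)

# train_ds / eval_ds come from the dataset-loading sketch above.
train_tok = train_ds.map(tokenize, batched=True)
eval_tok = eval_ds.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    # Accuracy, Precision, Recall, F1 for the binary Human-vs-AI task.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average="binary")
    return {"accuracy": accuracy_score(labels, preds),
            "precision": precision, "recall": recall, "f1": f1}

args = TrainingArguments(
    output_dir="roberta-ai-detector",   # placeholder output directory
    num_train_epochs=3,                 # illustrative hyperparameters
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    report_to="none",
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_tok, eval_dataset=eval_tok,
                  compute_metrics=compute_metrics)
trainer.train()
print(trainer.evaluate())
```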
---

## 📊 Results

Validation accuracy achieved: **~99%**

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference