metadata

title: TextTector - AI Text Detector
emoji: 🤖
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.30.0
app_file: app.py
pinned: false
license: mit

AI Text Detector

A Streamlit-based application that uses a machine learning classifier to help identify whether text was generated by AI or written by a human.

Features

Real-time text classification
Minimum word count validation (100 words)
User-friendly web interface
Text preprocessing pipeline
Clear visual feedback for results

Demo

The application provides a simple interface for checking text. Here's how it works:

1. Input Text

The main interface features a large text area where you can paste or type the text you want to check. The application requires a minimum of 100 words for accurate classification.
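The 100-word minimum can be checked before the text reaches the classifier. The following is a minimal sketch, assuming a "word" is any whitespace-separated token. The names `MIN_WORDS`, `count_words`, and `has_enough_words` are hypothetical and may not match the application's own code.

```python
MIN_WORDS = 100  # threshold stated in this README


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def has_enough_words(text: str, minimum: int = MIN_WORDS) -> bool:
    """Return True when the text meets the minimum word count."""
    return count_words(text) >= minimum
```

In a Streamlit app, a check like this would typically run when the user submits the text, with a warning shown instead of a prediction when it fails.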

2. Results

After submitting the text, the application will process it and display whether it appears to be human-written or AI-generated. The results are shown with clear visual indicators and informative messages.

Setup

Create and activate a virtual environment:

# Create virtual environment
python -m venv venv

# Activate virtual environment
# Windows
.\venv\Scripts\activate
# Linux/MacOS
source venv/bin/activate

Install the required dependencies:

pip install -r requirements.txt

Run the application:

python run.py

Open your web browser and navigate to http://localhost:8501

Technical Details

The application uses a machine learning model trained to distinguish between AI-generated and human-written text. The preprocessing pipeline includes:

Lowercasing
Punctuation removal
Stopword removal
URL and email removal
Number removal
Non-printable character removal
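The steps above can be sketched with the standard library alone. This is an illustrative approximation, not the application's actual pipeline: the regular expressions, the tiny `STOPWORDS` set, the function name `preprocess`, and the order of the steps are all assumptions. URLs and emails are removed before punctuation here, because stripping punctuation first would make them harder to recognize.

```python
import re
import string

# Tiny illustrative stopword set (assumption); a real pipeline would likely
# use a fuller list from a library such as NLTK or scikit-learn.
STOPWORDS = {"a", "an", "and", "at", "in", "is", "me", "of", "or", "the", "to"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
NUMBER_RE = re.compile(r"\d+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> str:
    """Apply the README's cleaning steps and return space-joined tokens."""
    text = text.lower()                    # lowercasing
    text = URL_RE.sub(" ", text)           # URL removal
    text = EMAIL_RE.sub(" ", text)         # email removal
    text = NUMBER_RE.sub(" ", text)        # number removal
    text = text.translate(PUNCT_TABLE)     # punctuation removal
    # Drop non-printable characters but keep whitespace for tokenizing.
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    tokens = [t for t in text.split() if t not in STOPWORDS]  # stopword removal
    return " ".join(tokens)
```

For example, `preprocess("The CAT is 3 years old.")` returns `"cat years old"`.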

Model Training

The machine learning model used in this application was trained using the Jupyter notebook generated-text-classification.ipynb.

The trained model is saved as models/best_model.joblib and is loaded automatically when the application starts.

The model achieves 100% accuracy and an F1-score of 100%, but only on data similar to its training dataset. It struggles to generalize to other kinds of text. Within that domain, it distinguishes AI-generated from human-written text very well.

Requirements

Python 3.8+
pip
All dependencies listed in requirements.txt

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.