Social Media Bot Detection (Metadata-based)

This project focuses on detecting automated social media accounts using structured profile and behavioral metadata. Instead of relying on tweet content or NLP techniques, the model analyzes account-level and activity-based features to identify bot-like patterns.

Dataset Overview

The dataset consists of user profile and activity metadata collected at the account level. Each record represents a user and includes structured numerical and boolean attributes, along with a binary label indicating whether the account is automated (bot) or human-operated.

Example Features Used

  • Follower count and following count
  • Follower–following ratio
  • Posting activity (status count)
  • Account age (in days)
  • Profile attributes (verified status, default profile settings)
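
As a rough illustration of how such features could be assembled from raw account metadata, here is a minimal sketch. The field names (`followers_count`, `created_at`, and so on) and the `+ 1` smoothing in the ratio are assumptions made for this example, not the project's confirmed schema or formula.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AccountMetadata:
    # Hypothetical field names; the dataset's real column names may differ.
    followers_count: int
    following_count: int
    statuses_count: int
    created_at: datetime
    verified: bool
    default_profile: bool


def build_feature_vector(account: AccountMetadata, as_of: datetime) -> list[float]:
    """Turn raw account metadata into a flat numeric feature vector."""
    # Assumption: add 1 to the denominator so accounts that follow
    # nobody do not cause a division by zero.
    ratio = account.followers_count / (account.following_count + 1)
    # Account age in whole days, measured up to the reference date.
    account_age_days = (as_of - account.created_at).days
    return [
        float(account.followers_count),
        float(account.following_count),
        ratio,
        float(account.statuses_count),
        float(account_age_days),
        float(account.verified),         # boolean encoded as 0.0 / 1.0
        float(account.default_profile),  # boolean encoded as 0.0 / 1.0
    ]


# Made-up example account, for illustration only.
example = AccountMetadata(
    followers_count=150,
    following_count=299,
    statuses_count=1200,
    created_at=datetime(2020, 1, 1, tzinfo=timezone.utc),
    verified=False,
    default_profile=True,
)
features = build_feature_vector(
    example, as_of=datetime(2021, 1, 1, tzinfo=timezone.utc)
)
print(features)
```

Passing the reference date explicitly (`as_of`) keeps the account-age calculation reproducible instead of depending on when the code runs.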

Modeling Approach

  • Preprocessing: Cleaned and standardized structured metadata features.
  • Feature Engineering: Derived behavioral indicators such as follower–following ratio and account age.
  • Modeling: Trained a Random Forest classifier to distinguish bot and human accounts.
  • Explainability: Used feature importance to interpret which attributes influence predictions.
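
The modeling and explainability steps above might look roughly like the following sketch with scikit-learn. The data here is randomly generated with an arbitrary labeling rule, purely so the example runs on its own; it is not the project's dataset, and the feature names and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in data, NOT the project's dataset.
followers = rng.integers(0, 5000, size=n)
following = rng.integers(0, 5000, size=n)
statuses = rng.integers(0, 20000, size=n)
age_days = rng.integers(1, 4000, size=n)
ratio = followers / (following + 1)  # assumed smoothing to avoid division by zero

feature_names = ["followers", "following", "ratio", "statuses", "age_days"]
X = np.column_stack([followers, following, ratio, statuses, age_days])

# Arbitrary toy rule (1 = bot) so the classifier has a pattern to learn.
y = ((ratio < 0.5) & (age_days < 2000)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hyperparameters are illustrative, not the values used in the project.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importances: one score per feature, summing to 1.
importances = dict(zip(feature_names, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

On this toy data the ratio and account age should dominate the ranking, since the labels were built from them. On real data the ranking indicates which attributes the forest relied on most, not a causal relationship.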

Evaluation

Model evaluation was performed offline using standard classification metrics such as accuracy and recall. The Streamlit application focuses on inference and explainability rather than live metric reporting.
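
For reference, the two metrics named above can be computed directly from predicted and true labels. The sketch below assumes bots are encoded as 1 and humans as 0, and uses made-up labels rather than results from this project.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) for binary labels where 1 = bot, 0 = human."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bots caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # humans cleared
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bots missed
    accuracy = (tp + tn) / len(pairs)
    # Recall: the share of actual bots that the model flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

Recall is worth reporting alongside accuracy because bot datasets are often imbalanced: a model that labels every account as human can still score high accuracy while catching no bots.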

Application Demo

A lightweight Streamlit interface is provided to:

  • Input account metadata
  • Generate bot or human predictions
  • Visualize feature importance for interpretability

Notes

This project is a prototype intended to demonstrate machine learning workflows, feature engineering, and model interpretability on structured data. It is not designed for production-scale deployment.