Social Media Bot Detection (Metadata-based)
This project focuses on detecting automated social media accounts using structured profile and behavioral metadata. Instead of relying on tweet content or NLP techniques, the model analyzes account-level and activity-based features to identify bot-like patterns.
Dataset Overview
The dataset consists of user profile and activity metadata collected at the account level. Each record represents a user and includes structured numerical and boolean attributes, along with a binary label indicating whether the account is automated (bot) or human-operated.
Example Features Used
- Follower count and following count
- Follower–following ratio
- Posting activity (status count)
- Account age (in days)
- Profile attributes (verified status, default profile settings)
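As a rough illustration of how the features listed above could be derived from raw account metadata, the sketch below computes the follower–following ratio and account age using only the standard library. The field names (`followers_count`, `statuses_count`, `created_at`, and so on) and the `derive_features` helper are assumptions made for this example; the project's actual column names may differ.

```python
from datetime import datetime, timezone


def derive_features(account, now=None):
    """Turn raw account metadata into a flat numeric feature dict.

    The input keys are hypothetical, not taken from the project's dataset.
    """
    now = now or datetime.now(timezone.utc)
    followers = account["followers_count"]
    following = account["following_count"]
    # Guard against division by zero for accounts that follow nobody.
    ratio = followers / max(following, 1)
    age_days = (now - account["created_at"]).days
    return {
        "followers_count": followers,
        "following_count": following,
        "follower_following_ratio": ratio,
        "statuses_count": account["statuses_count"],
        "account_age_days": age_days,
        # Booleans are encoded as 0/1 so every feature is numeric.
        "verified": int(account["verified"]),
        "default_profile": int(account["default_profile"]),
    }


# Made-up example account, for illustration only.
example = {
    "followers_count": 120,
    "following_count": 480,
    "statuses_count": 3500,
    "created_at": datetime(2023, 1, 1, tzinfo=timezone.utc),
    "verified": False,
    "default_profile": True,
}
features = derive_features(example, now=datetime(2024, 1, 1, tzinfo=timezone.utc))
print(features)
```

Passing `now` explicitly keeps the account-age calculation reproducible; in an interactive app it would normally default to the current time.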
Modeling Approach
- Preprocessing: Cleaned and standardized structured metadata features.
- Feature Engineering: Derived behavioral indicators such as follower–following ratio and account age.
- Modeling: Trained a Random Forest classifier to distinguish bot and human accounts.
- Explainability: Used feature importance to interpret which attributes influence predictions.
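The modeling and explainability steps above might look roughly like the following scikit-learn sketch. It is not the project's training script: the data is synthetic and randomly generated, the labeling rule is invented so the forest has a pattern to learn, and the feature names and hyperparameters are assumptions. Cleaning and standardization are not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature names, chosen to mirror the features described above.
FEATURES = [
    "followers_count",
    "following_count",
    "follower_following_ratio",
    "statuses_count",
    "account_age_days",
    "verified",
    "default_profile",
]

# Synthetic placeholder data, NOT the project's dataset.
rng = np.random.default_rng(0)
n = 400
followers = rng.integers(0, 5000, n)
following = rng.integers(1, 5000, n)
statuses = rng.integers(0, 20000, n)
age_days = rng.integers(1, 4000, n)
verified = rng.integers(0, 2, n)
default_profile = rng.integers(0, 2, n)

X = np.column_stack([
    followers, following, followers / following,
    statuses, age_days, verified, default_profile,
])
# Invented rule for illustration: label very high posting rates as "bot" (1).
y = (statuses / age_days > 5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Impurity-based importances, sorted from most to least influential.
importances = sorted(
    zip(FEATURES, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in importances:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances are convenient because they come for free with the fitted forest, but they can overstate high-cardinality numeric features; permutation importance is a common cross-check.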
Evaluation
Model evaluation was performed offline using standard classification metrics such as accuracy and recall. The Streamlit application focuses on inference and explainability rather than live metric reporting.
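For reference, accuracy and recall can be computed from true and predicted labels as in the minimal sketch below. The `accuracy_and_recall` helper and the label lists are made up for illustration and do not reflect the project's actual results; bots are assumed to be the positive class (label 1).

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall) for binary labels.

    Recall is measured on the `positive` class, assumed here to mean "bot".
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    false_neg = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    # If there are no positive examples, recall is undefined; report 0.0.
    actual_pos = true_pos + false_neg
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Made-up labels: 1 = bot, 0 = human.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

Recall matters for bot detection because a missed bot (false negative) lowers recall even when overall accuracy stays high.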
Application Demo
A lightweight Streamlit interface is provided to:
- Input account metadata
- Generate bot or human predictions
- Visualize feature importance for interpretability
Notes
This project is a prototype that demonstrates machine learning workflows, feature engineering, and model interpretability on structured data. It is not intended for production-scale deployment.