nahiar committed on Nov 27, 2025

Commit

df39e77

verified

1 Parent(s): 8255c74

Upload folder using huggingface_hub

Files changed (20)

.gitattributes +10 -0
README.md +200 -242
images/01_class_distribution.png +3 -0
images/02_feature_correlation.png +3 -0
images/03_correlation_matrix.png +3 -0
images/04_baseline_confusion_matrix.png +0 -0
images/05_baseline_roc_curve.png +3 -0
images/06_baseline_precision_recall.png +0 -0
images/07_baseline_feature_importance.png +3 -0
images/08_cross_validation.png +3 -0
images/09_tuned_confusion_matrix.png +0 -0
images/10_tuned_roc_curve.png +3 -0
images/11_tuned_precision_recall.png +3 -0
images/12_tuned_feature_importance.png +3 -0
images/13_model_comparison.png +3 -0
twitter_bot_detection_v2.pkl +3 -0
twitter_features_v2.json +14 -0
twitter_metrics_v2.txt +39 -0
twitter_model_comparison.csv +7 -0
twitter_scaler_v2.pkl +3 -0

.gitattributes CHANGED

	@@ -1 +1,11 @@
 *.pkl filter=lfs diff=lfs merge=lfs -text
+images/01_class_distribution.png filter=lfs diff=lfs merge=lfs -text
+images/02_feature_correlation.png filter=lfs diff=lfs merge=lfs -text
+images/03_correlation_matrix.png filter=lfs diff=lfs merge=lfs -text
+images/05_baseline_roc_curve.png filter=lfs diff=lfs merge=lfs -text
+images/07_baseline_feature_importance.png filter=lfs diff=lfs merge=lfs -text
+images/08_cross_validation.png filter=lfs diff=lfs merge=lfs -text
+images/10_tuned_roc_curve.png filter=lfs diff=lfs merge=lfs -text
+images/11_tuned_precision_recall.png filter=lfs diff=lfs merge=lfs -text
+images/12_tuned_feature_importance.png filter=lfs diff=lfs merge=lfs -text
+images/13_model_comparison.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,315 +1,273 @@
 ---
-language: en
-license: mit
 tags:
-  - bot-detection
-  - twitter
-  - random-forest
-  - sklearn
-  - social-media
-  - classification
-metrics:
-  - accuracy
-  - precision
-  - recall
-  - f1
-  - roc-auc
-library_name: scikit-learn
 ---
-# Twitter Bot Detection Model
-## Model Description
-This Random Forest classifier is designed to detect bot accounts on Twitter/X based on profile features and behavioral patterns. The model analyzes various account characteristics to determine whether an account is likely automated (bot) or genuine (human).
-## Model Details
-- **Model Type**: Random Forest Classifier
-- **Framework**: scikit-learn
-- **Task**: Binary Classification (Bot vs Human)
-- **Language**: Python
-- **License**: MIT
-## Performance Metrics
-The model achieves strong performance on the test dataset with optimized hyperparameters:
-- **High Accuracy**: Excellent accuracy in distinguishing bots from legitimate accounts
-- **Robust Classification**: Trained with cross-validation for reliable performance
-- **Version**: v2 (improved and optimized)
-The model has been fine-tuned specifically for Twitter's unique features and bot patterns.
-## Features Used
-The model uses the following features for prediction:
-1. **IsPrivate** - Whether the account is protected/private
-2. **IsVerified** - Whether the account has a verification badge (blue checkmark)
-3. **HasProfilePic** - Whether the account has a profile picture
-4. **FollowingCount** - Number of accounts being followed
-5. **FollowerCount** - Number of followers
-6. **HasLocation** - Whether location information is provided
-7. **HasDescription** - Whether the account has a bio/description
-8. **TweetsCount** - Total number of tweets posted
-9. **FollowToFollowerRatio** - Ratio of following to followers
-10. **AccountAge** - Age of the account (if available)
-11. **HasUrl** - Whether there's a URL in the profile
-12. **DefaultProfileImage** - Whether using default profile image
-## Intended Use
-### Primary Uses
-- Identifying potential bot accounts on Twitter/X
-- Content moderation and platform integrity
-- Research on social media bot behavior and misinformation campaigns
-- Automated account screening for spam detection
-- Election integrity and political bot detection
-### Out-of-Scope Uses
-- This model is specifically trained for Twitter/X and should not be used for other platforms without retraining
-- Should not be the sole basis for account suspension decisions
-- Not designed for real-time detection without proper infrastructure
-- Not suitable for detecting state-sponsored advanced persistent threats without additional features
-- Should not be used to target legitimate users based on behavior patterns
-## How to Use
-### Installation
-```bash
-pip install scikit-learn pandas numpy joblib
-```
-### Loading the Model
-```python
-import joblib
-import pandas as pd
-import numpy as np
-from sklearn.preprocessing import MinMaxScaler
-# Load the model
-model = joblib.load('Twitter_BOT_Detection_Model_v1.pkl')
-# Prepare your data
-features = ['IsPrivate', 'IsVerified', 'HasProfilePic', 'FollowingCount',
-            'FollowerCount', 'HasLocation', 'HasDescription', 'TweetsCount',
-            'FollowToFollowerRatio', 'AccountAge', 'HasUrl', 'DefaultProfileImage']
-# Example account data
-account_data = {
-    'IsPrivate': 0,
-    'IsVerified': 0,
-    'HasProfilePic': 1,
-    'FollowingCount': 5000,
-    'FollowerCount': 50,
-    'HasLocation': 0,
-    'HasDescription': 0,
-    'TweetsCount': 10000,
-    'FollowToFollowerRatio': 100.0,
-    'AccountAge': 30,  # days
-    'HasUrl': 1,
-    'DefaultProfileImage': 0
-}
-# Create DataFrame
-df = pd.DataFrame([account_data])
-# Scale features (use the same scaler as training)
-scaler = MinMaxScaler()
-# Note: In production, you should save and load the scaler from training
-df_scaled = scaler.fit_transform(df[features])
-# Make prediction
-prediction = model.predict(df_scaled)
-probability = model.predict_proba(df_scaled)
-print(f"Prediction: {'Bot' if prediction[0] == 1 else 'Human'}")
-print(f"Confidence - Human: {probability[0][0]:.2%}, Bot: {probability[0][1]:.2%}")
-```
-### Batch Prediction with Threshold
-```python
-# For multiple accounts
-accounts_df = pd.read_csv('twitter_accounts_to_check.csv')
-accounts_scaled = scaler.transform(accounts_df[features])
-predictions = model.predict(accounts_scaled)
-probabilities = model.predict_proba(accounts_scaled)
-# Add results to DataFrame
-accounts_df['is_bot'] = predictions
-accounts_df['bot_probability'] = probabilities[:, 1]
-# Filter by confidence threshold
-high_confidence_bots = accounts_df[accounts_df['bot_probability'] > 0.9]
-suspected_bots = accounts_df[(accounts_df['bot_probability'] > 0.7) &
-                              (accounts_df['bot_probability'] <= 0.9)]
-```
-### Integration Example
-```python
-class TwitterBotDetector:
-    def __init__(self, model_path):
-        self.model = joblib.load(model_path)
-        self.scaler = MinMaxScaler()
-        self.features = ['IsPrivate', 'IsVerified', 'HasProfilePic',
-                        'FollowingCount', 'FollowerCount', 'HasLocation',
-                        'HasDescription', 'TweetsCount', 'FollowToFollowerRatio',
-                        'AccountAge', 'HasUrl', 'DefaultProfileImage']
-    def predict(self, account_features):
-        """Predict if an account is a bot"""
-        df = pd.DataFrame([account_features])
-        df_scaled = self.scaler.fit_transform(df[self.features])
-        prediction = self.model.predict(df_scaled)[0]
-        probability = self.model.predict_proba(df_scaled)[0]
-        return {
-            'is_bot': bool(prediction),
-            'bot_probability': float(probability[1]),
-            'human_probability': float(probability[0])
-        }
-# Usage
-detector = TwitterBotDetector('Twitter_BOT_Detection_Model_v1.pkl')
-result = detector.predict(account_data)
-print(result)
-```
-## Training Data
-The model was trained on a comprehensive dataset of Twitter accounts with labeled bot/human classifications. The dataset includes:
-- Balanced distribution of bot and human accounts
-- Various bot types (spam bots, political bots, engagement bots, etc.)
-- Diverse account types, ages, and activity levels
-- Features extracted from public profile information
-**Note**: The training data is proprietary and not included in this repository.
-## Training Procedure
-### Preprocessing
-1. Feature extraction from Twitter account profiles via API
-2. Calculation of derived features (FollowToFollowerRatio, AccountAge)
-3. Handling of missing values and outliers
-4. MinMax normalization of all features to [0, 1] range
-5. Train-test split with stratification to maintain class balance
-### Hyperparameters
-- **Algorithm**: Random Forest Classifier
-- **Version**: v2 (optimized)
-- **Normalization**: MinMaxScaler
-- **Cross-validation**: Stratified K-Fold
-- **Feature Selection**: Based on domain knowledge and feature importance analysis
-The model was trained using scikit-learn's RandomForestClassifier with optimized hyperparameters selected through extensive cross-validation and grid search.
-## Limitations and Bias
-### Limitations
-- Model performance depends on the quality and accuracy of input features
-- May not generalize to new bot patterns not seen during training
-- Requires access to Twitter API for feature extraction
-- Performance may degrade over time as bot behaviors evolve rapidly
-- Limited to profile-level features; does not analyze tweet content deeply
-- May struggle with sophisticated bots that mimic human behavior closely
-- Requires regular updates due to platform changes (Twitter → X)
-### Potential Biases
-- May be biased toward bot patterns present in the training data
-- Could have temporal biases based on when training data was collected
-- May misclassify legitimate accounts with unusual behavior patterns
-- Potential bias against new accounts or accounts with low activity
-- Could reflect biases in the original labeling process
-- May have difficulty with non-English accounts if training data is primarily English
-### Recommendations
-- Regularly retrain the model with new data to capture evolving bot patterns
-- Use as part of a multi-layered detection system including content analysis
-- Implement human review for high-stakes decisions
-- Monitor for false positives and adjust classification thresholds based on use case
-- Combine with tweet content analysis, network analysis, and temporal patterns
-- Consider context (political events, trending topics) when interpreting results
-- Validate performance across different account types and languages
-## Ethical Considerations
-- This model should be used responsibly and not for harassment, doxxing, or targeting
-- Consider privacy implications when analyzing user accounts
-- Ensure compliance with Twitter/X's terms of service and relevant privacy laws (GDPR, CCPA, etc.)
-- Implement appropriate safeguards against misuse
-- Provide transparency to users about automated detection systems
-- Allow for appeals and manual review processes
-- Be aware of potential for false accusations
-- Consider impact on freedom of speech and legitimate automated accounts (news bots, etc.)
-- Monitor for discriminatory outcomes across different user groups
-## Known Issues
-- Twitter's API changes may affect feature availability
-- Platform rebranding (Twitter → X) may introduce new bot patterns
-- Changes in verification system may affect IsVerified feature utility
-## Model Card Authors
-This model card was created as part of the Bot Detection project for social media platforms.
-## Citation
-If you use this model in your research, please cite:
-```bibtex
-@misc{twitter_bot_detection_2024,
-  title={Twitter Bot Detection Model v2},
-  author={Your Name/Organization},
-  year={2024},
-  publisher={Hugging Face},
-  howpublished={\url{https://huggingface.co/your-username/twitter-bot-detection}}
-}
-```
-## Related Models
-- [TikTok Bot Detection](https://huggingface.co/your-username/tiktok-bot-detection)
-- [Instagram Bot Detection](https://huggingface.co/your-username/instagram-bot-detection)
-## Contact
-For questions or feedback about this model, please open an issue in the repository or contact the maintainers.
-## Updates and Maintenance
-- **Version**: 2.0
-- **Last Updated**: November 2024
-- **Status**: Active
-### Changelog
-- **v2.0**: Improved hyperparameters, better cross-validation, optimized for current Twitter/X platform
-- **v1.0**: Initial release
-### Future Updates
-Future updates may include:
-- Improved feature engineering based on new platform features
-- Additional training data with recent bot patterns
-- Deep learning approaches for complex bot detection
-- Integration of tweet content analysis (NLP features)
-- Network graph analysis for coordinated bot detection
-- Temporal pattern analysis
-- Support for multilingual accounts
-- Real-time feature extraction pipeline

 ---
+language: "en"
+license: "apache-2.0"
+library_name: "scikit-learn"
 tags:
+  - "bot-detection"
+  - "twitter"
+  - "classification"
+  - "scikit-learn"
+  - "random-forest"
 ---
+# Twitter Bot Detection Model
+## Overview
+This directory contains a trained Random Forest classifier for detecting bot accounts on Twitter.
+**Model Version:** v2
+**Training Date:** 2025-11-27 12:08:54
+**Framework:** scikit-learn 1.5.2
+**Algorithm:** Random Forest Classifier with GridSearchCV Hyperparameter Tuning
+---
+## 📊 Model Performance
+### Final Metrics (Test Set)
+| Metric                | Score           |
+| --------------------- | --------------- |
+| **Accuracy**          | 0.8771 (87.71%) |
+| **Precision**         | 0.8595 (85.95%) |
+| **Recall**            | 0.7558 (75.58%) |
+| **F1-Score**          | 0.8043 (80.43%) |
+| **ROC-AUC**           | 0.9354 (93.54%) |
+| **Average Precision** | 0.9008 (90.08%) |
+### Model Improvement
+- **Baseline ROC-AUC:** 0.9314
+- **Tuned ROC-AUC:** 0.9354
+- **Improvement:** +0.0040 absolute (0.43% relative)
+---
+## 🗂️ Files
+| File                           | Description                            |
+| ------------------------------ | -------------------------------------- |
+| `twitter_bot_detection_v2.pkl` | Trained Random Forest model            |
+| `twitter_scaler_v2.pkl`        | MinMaxScaler for feature normalization |
+| `twitter_features_v2.json`     | List of features used by the model     |
+| `twitter_metrics_v2.txt`       | Detailed performance metrics report    |
+| `images/`                      | All visualization plots (13 images)    |
+| `README.md`                    | This file                              |
+---
+## 🎯 Dataset Information
+### Training Configuration
+- **Training Samples:** 29,951
+- **Test Samples:** 7,487
+- **Total Samples:** 37,438
+- **Number of Features:** 12
+- **Cross-Validation Folds:** 5
+- **Random State:** 42
+### Class Distribution
+**Training Set:**
+- Human (0): 20,028 (66.87%)
+- Bot (1): 9,923 (33.13%)
+**Test Set:**
+- Human (0): 4,985 (66.58%)
+- Bot (1): 2,502 (33.42%)
+---
+## 🔧 Features (12)
+1. `has_custom_cover_image`
+2. `description_length`
+3. `favourites_count`
+4. `followers_count`
+5. `friends_count`
+6. `followers_to_friends_ratio`
+7. `has_location`
+8. `username_digit_count`
+9. `username_length`
+10. `statuses_count`
+11. `is_verified`
+12. `account_age_days`
+---
+## 🏆 Top 5 Most Important Features
+1. **followers_count** - 0.1895
+2. **favourites_count** - 0.1813
+3. **friends_count** - 0.1494
+4. **statuses_count** - 0.1244
+5. **account_age_days** - 0.1010
+---
+## ⚙️ Hyperparameters
+### Best Parameters (from GridSearchCV)
+- **class_weight:** balanced
+- **max_depth:** 20
+- **max_features:** sqrt
+- **min_samples_leaf:** 1
+- **min_samples_split:** 2
+- **n_estimators:** 300
+### Parameter Search Space
+- **n_estimators:** [100, 200, 300]
+- **max_depth:** [10, 15, 20, None]
+- **min_samples_split:** [2, 5, 10]
+- **min_samples_leaf:** [1, 2, 4]
+- **max_features:** ['sqrt', 'log2']
+- **bootstrap:** [True, False]
+**Total combinations tested:** 540
+---
+## 📈 Cross-Validation Results
+### Mean Scores (5-Fold Stratified CV)
+- **Accuracy:** 0.8750 (±0.0053)
+- **Precision:** 0.8658 (±0.0089)
+- **Recall:** 0.7368 (±0.0113)
+- **F1-Score:** 0.7961 (±0.0092)
+- **ROC-AUC:** 0.9325 (±0.0037)
+---
+## 🖼️ Visualizations
+All visualizations are saved in the `images/` directory:
+1. **01_class_distribution.png** - Training/Test set class distribution
+2. **02_feature_correlation.png** - Feature correlation with target variable
+3. **03_correlation_matrix.png** - Feature correlation heatmap
+4. **04_baseline_confusion_matrix.png** - Baseline model confusion matrix
+5. **05_baseline_roc_curve.png** - Baseline ROC curve
+6. **06_baseline_precision_recall.png** - Baseline Precision-Recall curve
+7. **07_baseline_feature_importance.png** - Baseline feature importance
+8. **08_cross_validation.png** - Cross-validation score distribution
+9. **09_tuned_confusion_matrix.png** - Tuned model confusion matrix
+10. **10_tuned_roc_curve.png** - Tuned ROC curve
+11. **11_tuned_precision_recall.png** - Tuned Precision-Recall curve
+12. **12_tuned_feature_importance.png** - Tuned feature importance
+13. **13_model_comparison.png** - Baseline vs Tuned comparison
+---
+## 🚀 Usage Example
+```python
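+import joblib
+import pandas as pd
+# Load the trained model and the MinMaxScaler fitted during training
+model = joblib.load('twitter_bot_detection_v2.pkl')
+scaler = joblib.load('twitter_scaler_v2.pkl')
+# Prepare your data: raw (unscaled) values, one entry per feature.
+# The 0.5 values below are placeholders, not a realistic account.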
+data = {
+    'has_custom_cover_image': 0.5,
+    'description_length': 0.5,
+    'favourites_count': 0.5,
+    'followers_count': 0.5,
+    'friends_count': 0.5,
+    'followers_to_friends_ratio': 0.5,
+    'has_location': 0.5,
+    'username_digit_count': 0.5,
+    'username_length': 0.5,
+    'statuses_count': 0.5,
+    'is_verified': 0.5,
+    'account_age_days': 0.5,
+}
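+# Create a one-row DataFrame with columns in the training feature order
+df = pd.DataFrame([data])
+# Scale with the loaded scaler (transform only; do not refit on new data)
+df_scaled = scaler.transform(df)
+# Predict the label (0 = Human, 1 = Bot) and the class probabilities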
+prediction = model.predict(df_scaled)[0]
+probability = model.predict_proba(df_scaled)[0]
+print(f"Prediction: {'Bot' if prediction == 1 else 'Human'}")
+print(f"Bot Probability: {probability[1]:.4f}")
+print(f"Human Probability: {probability[0]:.4f}")
+```
+---
+## 📋 Confusion Matrix Breakdown
+### Tuned Model (Test Set)
+```
+                Predicted
+              Human    Bot
+Actual Human     4676     309
+       Bot        611    1891
+```
+- **True Negatives (TN):** 4,676 (Correctly identified humans)
+- **False Positives (FP):** 309 (Humans incorrectly classified as bots)
+- **False Negatives (FN):** 611 (Bots incorrectly classified as humans)
+- **True Positives (TP):** 1,891 (Correctly identified bots)
+---
+## 🔍 Model Interpretation
+### Strengths
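+- High ROC-AUC (0.9354): the model ranks bots above humans reliably across thresholds
+- Precision (0.8595) exceeds recall (0.7558) for the bot class: about 6% of humans are flagged as bots (309 of 4,985), while about 24% of bots are missed (611 of 2,502)
+- Stable cross-validation performance: the spread (±) across the 5 folds is at most 0.0113 on every metric
+### Key Insights
+1. The five most important features (`followers_count`, `favourites_count`, `friends_count`, `statuses_count`, `account_age_days`) account for about 75% of total feature importance
+2. GridSearchCV improved ROC-AUC over the baseline by 0.0040 (0.43% relative); recall rose from 0.7350 to 0.7558 while precision fell slightly from 0.8679 to 0.8595
+3. Test-set scores are close to the cross-validation means (for example, ROC-AUC 0.9354 vs 0.9325), which suggests little overfitting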
+---
+## 📝 Notes
+- **Feature Scaling:** All features are scaled using MinMaxScaler to [0, 1] range
+- **Missing Values:** Filled with 0 during preprocessing
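+- **Class Balance:** Imbalanced dataset (about 67% human, 33% bot); the tuned model uses `class_weight='balanced'`
+- **Model Type:** Random Forest, a bagged ensemble of decision trees that is less prone to overfitting than a single tree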
+---
+## 🔄 Model Updates
+To retrain the model:
+1. Place new training data in `../data/train_twitter.csv`
+2. Run the training notebook: `5_enhanced_training.ipynb`
+3. Update this README with new metrics
+---
+## 📧 Contact & Support
+For questions or issues regarding this model, please refer to the main project documentation.
+---
+**Generated:** 2025-11-27 12:08:54
+**Notebook:** `5_enhanced_training.ipynb`
+**Platform:** Twitter
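
The test-set metrics reported in the README can be cross-checked against its confusion matrix. The following is a minimal sketch, not part of the commit: the function name `classification_metrics` is hypothetical, and the four counts are copied from the "Confusion Matrix Breakdown" section, with bot as the positive class.

```python
# Counts copied from the tuned-model confusion matrix (test set) in the README.
tn, fp, fn, tp = 4676, 309, 611, 1891


def classification_metrics(tn, fp, fn, tp):
    """Return accuracy, precision, recall and F1 for the positive (bot) class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)  # share of predicted bots that are real bots
    recall = tp / (tp + fn)     # share of real bots that were caught
    return {
        "total": total,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = classification_metrics(tn, fp, fn, tp)
for name in ("accuracy", "precision", "recall", "f1"):
    print(f"{name}: {metrics[name]:.4f}")
```

Rounded to four decimals, the results agree with the README table (0.8771, 0.8595, 0.7558, 0.8043), and the counts sum to the 7,487 test samples.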

images/01_class_distribution.png ADDED

Git LFS Details

SHA256: 28e8b369a102f23ebc8a38d7a7750ee3cf228891f50fb6ee27ff63137f23e189
Pointer size: 131 Bytes
Size of remote file: 126 kB

images/02_feature_correlation.png ADDED

Git LFS Details

SHA256: 8a63a4b708b69990192735645821748b2193a0c78c387919b879f8ea56fa04c3
Pointer size: 131 Bytes
Size of remote file: 149 kB

images/03_correlation_matrix.png ADDED

Git LFS Details

SHA256: db5060fb105b50996edecf5e7580bb0bc61e358ba36a5882cfed9cc9fbe96d31
Pointer size: 131 Bytes
Size of remote file: 417 kB

images/04_baseline_confusion_matrix.png ADDED

images/05_baseline_roc_curve.png ADDED

Git LFS Details

SHA256: b47274a0ea64edd968129d481608ec407bff807ed8699d45c808d6db8a4599f0
Pointer size: 131 Bytes
Size of remote file: 149 kB

images/06_baseline_precision_recall.png ADDED

images/07_baseline_feature_importance.png ADDED

Git LFS Details

SHA256: 085137d8353282ddc379f6a332f3e86c416e481dc44f018f70de213f731d4a54
Pointer size: 131 Bytes
Size of remote file: 150 kB

images/08_cross_validation.png ADDED

Git LFS Details

SHA256: f3146b19219aec9729fc14fdf7e3c32dc03cee0cfad38fec09d373088fe6d1e0
Pointer size: 131 Bytes
Size of remote file: 109 kB

images/09_tuned_confusion_matrix.png ADDED

images/10_tuned_roc_curve.png ADDED

Git LFS Details

SHA256: c272ba78d06a22f2854fcd106311301508821fb89747a921d0b2b817e89a2d01
Pointer size: 131 Bytes
Size of remote file: 150 kB

images/11_tuned_precision_recall.png ADDED

Git LFS Details

SHA256: 3219b7fc470ed3a16c4c0ce46c6666ab61fd08ba84c24d09ce4330c034c97f48
Pointer size: 131 Bytes
Size of remote file: 101 kB

images/12_tuned_feature_importance.png ADDED

Git LFS Details

SHA256: 8a0e401621b1cf2f79783ec0e158dbf80515d7befd29c2dc3e79b606e6f3ad86
Pointer size: 131 Bytes
Size of remote file: 147 kB

images/13_model_comparison.png ADDED

Git LFS Details

SHA256: c64122d691fb9949705ecba91504d49d1e07a8dbb35611ef3322aa8f870dbcac
Pointer size: 131 Bytes
Size of remote file: 137 kB

twitter_bot_detection_v2.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8a77a8bb9af5ce0909ae5dd72f18176d12795e3f2ccc658fb4e519d325db19b2
+size 144062585

twitter_features_v2.json ADDED

	@@ -0,0 +1,14 @@

+[
+  "has_custom_cover_image",
+  "description_length",
+  "favourites_count",
+  "followers_count",
+  "friends_count",
+  "followers_to_friends_ratio",
+  "has_location",
+  "username_digit_count",
+  "username_length",
+  "statuses_count",
+  "is_verified",
+  "account_age_days"
+]
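
Because the scaler and model were fitted on these 12 columns, inference input should presumably follow the same order. The sketch below shows one way to enforce that, using only the standard library. The helper `to_feature_row` and the account values are hypothetical, and the JSON is inlined here only to keep the example self-contained; in practice it would be read from `twitter_features_v2.json`.

```python
import json

# Inline copy of twitter_features_v2.json (read the file instead in practice).
FEATURES_JSON = """[
  "has_custom_cover_image", "description_length", "favourites_count",
  "followers_count", "friends_count", "followers_to_friends_ratio",
  "has_location", "username_digit_count", "username_length",
  "statuses_count", "is_verified", "account_age_days"
]"""


def to_feature_row(account, feature_names):
    """Order one account's raw values to match the training feature order."""
    missing = [name for name in feature_names if name not in account]
    if missing:
        # Fail loudly instead of silently guessing a value.
        raise KeyError(f"missing features: {missing}")
    return [float(account[name]) for name in feature_names]


feature_names = json.loads(FEATURES_JSON)

# Hypothetical raw account values, with keys deliberately out of order.
account = {
    "account_age_days": 412, "is_verified": 0, "statuses_count": 1530,
    "username_length": 11, "username_digit_count": 4, "has_location": 0,
    "followers_to_friends_ratio": 0.04, "friends_count": 2500,
    "followers_count": 100, "favourites_count": 12,
    "description_length": 0, "has_custom_cover_image": 1,
}

row = to_feature_row(account, feature_names)
print(dict(zip(feature_names, row)))
```

The resulting list can be wrapped as a single row, for example `pd.DataFrame([row], columns=feature_names)`, before calling `scaler.transform`. Raising on a missing key is a design choice for this sketch; the README states that missing values were filled with 0 during preprocessing, so a caller may prefer to default to 0 instead.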

twitter_metrics_v2.txt ADDED

	@@ -0,0 +1,39 @@

+======================================================================
+TWITTER Bot Detection Model - Performance Report
+======================================================================
+Date: 2025-11-27 12:08:54.462391
+Training Configuration:
+  - Platform: twitter
+  - Train samples: 29951
+  - Test samples: 7487
+  - Features: 12
+  - CV Folds: 5
+  - Random State: 42
+Best Hyperparameters:
+  - class_weight: balanced
+  - max_depth: 20
+  - max_features: sqrt
+  - min_samples_leaf: 1
+  - min_samples_split: 2
+  - n_estimators: 300
+Performance Metrics (Test Set):
+  - Accuracy: 0.8771
+  - Precision: 0.8595
+  - Recall: 0.7558
+  - F1: 0.8043
+  - Roc_auc: 0.9354
+  - Avg_precision: 0.9008
+Cross-Validation Results:
+  - Mean ROC-AUC: 0.9352
+Feature Importance (Top 5):
+  - followers_count: 0.1895
+  - favourites_count: 0.1813
+  - friends_count: 0.1494
+  - statuses_count: 0.1244
+  - account_age_days: 0.1010
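
The size of a grid search is the product of the number of options per parameter. As a rough check, the sketch below multiplies out the "Parameter Search Space" exactly as listed in the README. The dictionary is copied from that list; everything else is an assumption for illustration, and this is not the notebook's actual search code.

```python
from math import prod

# Search space exactly as listed in the README.
search_space = {
    "n_estimators": [100, 200, 300],
    "max_depth": [10, 15, 20, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2"],
    "bootstrap": [True, False],
}

# Number of distinct parameter combinations in the grid.
n_candidates = prod(len(options) for options in search_space.values())

# With 5-fold cross-validation, each candidate is fitted once per fold.
cv_folds = 5
n_fits = n_candidates * cv_folds

print(n_candidates)  # 432
print(n_fits)        # 2160
```

The listed grid yields 432 combinations, not the 540 stated in the README. The list also omits `class_weight`, which appears among the best parameters, so the grid actually searched presumably differed from the one documented. The source does not say how.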

twitter_model_comparison.csv ADDED

	@@ -0,0 +1,7 @@

+Metric,Baseline,Tuned,Improvement,Improvement %
+Accuracy,0.8740483504741552,0.8771203419260051,0.003071991451849887,0.3514669926650382
+Precision,0.8678621991505427,0.8595454545454545,-0.008316744605088244,-0.9583024370952686
+Recall,0.7350119904076738,0.7557953637090328,0.02078337330135893,2.827623708537251
+F1-Score,0.7959316165332179,0.8043385793279455,0.008406962794727635,1.0562418454169766
+ROC-AUC,0.9314009975570197,0.9353899828983353,0.003988985341315643,0.4282779760574006
+Avg Precision,0.8949209792592716,0.9007641701676647,0.005843190908393137,0.6529281404520834
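
The `Improvement %` column appears to be the relative change `(Tuned - Baseline) / Baseline * 100`. The sketch below recomputes it for three rows copied from the CSV above. The helper name `relative_improvement` is hypothetical, and the formula is inferred from the numbers rather than taken from the training notebook.

```python
import csv
import io

# Three rows copied verbatim from twitter_model_comparison.csv.
CSV_TEXT = """Metric,Baseline,Tuned,Improvement,Improvement %
Precision,0.8678621991505427,0.8595454545454545,-0.008316744605088244,-0.9583024370952686
Recall,0.7350119904076738,0.7557953637090328,0.02078337330135893,2.827623708537251
ROC-AUC,0.9314009975570197,0.9353899828983353,0.003988985341315643,0.4282779760574006
"""


def relative_improvement(baseline, tuned):
    """Percentage change of the tuned score relative to the baseline score."""
    return (tuned - baseline) / baseline * 100.0


recomputed = {}
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    baseline, tuned = float(row["Baseline"]), float(row["Tuned"])
    recomputed[row["Metric"]] = relative_improvement(baseline, tuned)
    recorded = float(row["Improvement %"])
    print(f"{row['Metric']}: {recomputed[row['Metric']]:+.2f}% (file: {recorded:+.2f}%)")
```

Under this reading, tuning traded roughly one percent of precision for nearly three percent more recall, and the 0.43% ROC-AUC figure quoted in the README is a relative change, not percentage points.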

twitter_scaler_v2.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c8725c0a395abc30b368dfe7a64051f0095fc7acfb4cfb1ac4229ffe73f02a32
+size 1623