Upload 3 files
- README.md +33 -0
- app.py +162 -0
- requirements.txt +7 -0
README.md
CHANGED
@@ -10,3 +10,36 @@ pinned: false
 ---
 
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+# AI Ad Server Optimizer
+
+## Overview
+The AI Ad Server Optimizer is a Python-based application designed to enhance ad performance through data-driven insights. It utilizes machine learning models to cluster user behavior and provides actionable insights to optimize ad targeting and engagement.
+
+## Features
+- **Cluster Prediction**: Predicts user behavior clusters based on session data.
+- **Ad Performance Analytics**: Summarizes key ad performance metrics like Click-Through Rate (CTR), Conversion Rate, and Bounce Rate.
+
+## Technology Stack
+- Python 3.8+
+- Pandas for data manipulation
+- Scikit-learn for machine learning tasks
+- Gradio for creating interactive web interfaces
+
+## Installation
+Ensure you have Python 3.8 or higher installed. Clone the repository and install the required dependencies:
+```bash
+git clone https://github.com/trilogy-group/Skyvera-AI.git
+cd Skyvera-AI
+pip install -r scripts/AIAdServerOptimizer/requirements.txt
+```
+
+## Usage
+From the repository root, run the application with:
+```bash
+python scripts/AIAdServerOptimizer/app.py
+```
+Navigate to the provided local URL to access the interactive web interface.
+
+## Contributing
+Contributions are welcome! Please fork the repository and submit pull requests with your enhancements. Ensure you follow the existing code style.
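The README names Click-Through Rate, Conversion Rate, and Bounce Rate without saying how they are derived. As a minimal sketch, assuming the definitions given in the app's analytics tab (clicks per ad view, conversions per click, single-page visits per visit), the ratios could be computed from raw event counts as follows. The function names and the count parameters are hypothetical and do not appear in the commit.

```python
def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def ad_metrics(impressions: int, clicks: int, conversions: int,
               sessions: int, single_page_sessions: int) -> dict:
    """Compute the three README metrics from raw (hypothetical) event counts."""
    return {
        "Click Through Rate": rate(clicks, impressions),
        "Conversion Rate": rate(conversions, clicks),
        "Bounce Rate": rate(single_page_sessions, sessions),
    }


# Example counts are made up for illustration only.
metrics = ad_metrics(impressions=1000, clicks=150, conversions=30,
                     sessions=400, single_page_sessions=80)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

The zero-denominator guard is a design choice for the sketch: a campaign with no impressions reports 0% instead of raising an error.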
app.py
ADDED
@@ -0,0 +1,162 @@
+import pandas as pd
+from sklearn.cluster import KMeans
+from sklearn.preprocessing import StandardScaler, OneHotEncoder
+from sklearn.compose import ColumnTransformer
+from sklearn.pipeline import Pipeline
+import logging
+import gradio as gr
+
+# Configure logging
+logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
+
+# Expanded sample data
+data = pd.DataFrame({
+    'User ID': [1, 2, 3, 4, 5],
+    'Session Duration': [300, 450, 200, 600, 350],
+    'Pages Visited': [5, 8, 3, 12, 7],
+    'Ads Clicked': [2, 1, 0, 3, 2],
+    'User Interests': ['technology', 'sports', 'technology', 'arts', 'sports'],
+    'Engagement Score': [0.8, 0.5, 0.3, 0.9, 0.7],
+    'Device Type': ['mobile', 'desktop', 'mobile', 'tablet', 'desktop'],
+    'Time of Day': ['morning', 'afternoon', 'evening', 'morning', 'afternoon'],
+    'Time Spent per Page': [30, 25, 45, 20, 50],
+    'Click Through Rate': [0.1, 0.2, 0.05, 0.3, 0.15],
+    'Conversion Rate': [0.05, 0.1, 0, 0.2, 0.1],
+    'Frequency of Visits': [10, 20, 5, 15, 10],
+    'Bounce Rate': [0.2, 0.1, 0.5, 0.05, 0.3]
+})
+
+logging.info("Sample data prepared.")
+
+# Updated preprocessing
+preprocessor = ColumnTransformer(
+    transformers=[
+        ('num', StandardScaler(), ['Session Duration', 'Pages Visited', 'Ads Clicked', 'Engagement Score',
+                                   'Time Spent per Page', 'Click Through Rate', 'Conversion Rate',
+                                   'Frequency of Visits', 'Bounce Rate']),
+        ('cat', OneHotEncoder(), ['User Interests', 'Device Type', 'Time of Day'])
+    ])
+
+logging.info("Preprocessor setup complete.")
+
+# Clustering
+kmeans = KMeans(n_clusters=3, random_state=42)
+logging.info("KMeans clustering configured.")
+
+# Define the pipeline
+pipeline = Pipeline([
+    ('preprocessor', preprocessor),
+    ('cluster', kmeans)
+])
+
+logging.info("Pipeline created.")
+
+# Fit the pipeline to the data
+pipeline.fit(data)
+
+def generate_insights(cluster_characteristics):
+    # Example insights based on hypothetical thresholds
+    insights = []
+    if cluster_characteristics['Engagement Score'] > 0.7 and cluster_characteristics['Conversion Rate'] < 0.1:
+        insights.append("High engagement but low conversion: Consider optimizing the checkout process or providing targeted offers.")
+    if cluster_characteristics['Click Through Rate'] > 0.2:
+        insights.append("High click-through rate: Users are interacting well with ads. Increase ad relevance to boost conversions.")
+    if cluster_characteristics['Bounce Rate'] > 0.3:
+        insights.append("High bounce rate: Review landing page design and content relevance to improve user retention.")
+    return " ".join(insights)
+
+def predict_cluster(session_duration, pages_visited, ads_clicked, engagement_score, user_interests, device_type, time_of_day, time_spent_per_page, click_through_rate, conversion_rate, frequency_of_visits, bounce_rate):
+    input_df = pd.DataFrame({
+        'Session Duration': [session_duration],
+        'Pages Visited': [pages_visited],
+        'Ads Clicked': [ads_clicked],
+        'Engagement Score': [engagement_score],
+        'User Interests': [user_interests],
+        'Device Type': [device_type],
+        'Time of Day': [time_of_day],
+        'Time Spent per Page': [time_spent_per_page],
+        'Click Through Rate': [click_through_rate],
+        'Conversion Rate': [conversion_rate],
+        'Frequency of Visits': [frequency_of_visits],
+        'Bounce Rate': [bounce_rate]
+    })
+    cluster = pipeline.predict(input_df)[0]
+    centroids = pipeline.named_steps['cluster'].cluster_centers_
+    cluster_characteristics = centroids[cluster]
+
+    # Decode features for insights (.tolist() yields plain Python values for display)
+    num_features = ['Session Duration', 'Pages Visited', 'Ads Clicked', 'Engagement Score', 'Time Spent per Page', 'Click Through Rate', 'Conversion Rate', 'Frequency of Visits', 'Bounce Rate']
+    scaled_features = cluster_characteristics[:9]
+    original_num_values = pipeline.named_steps['preprocessor'].named_transformers_['num'].inverse_transform([scaled_features])[0].tolist()
+    cat_features = ['User Interests', 'Device Type', 'Time of Day']
+    encoded_features = cluster_characteristics[9:]
+    original_cat_values = pipeline.named_steps['preprocessor'].named_transformers_['cat'].inverse_transform([encoded_features])[0].tolist()
+
+    # Combine numerical and categorical features into a dictionary
+    cluster_characteristics = dict(zip(num_features, original_num_values))
+    cluster_characteristics.update(dict(zip(cat_features, original_cat_values)))
+
+    # Generate actionable insights
+    insights = generate_insights(cluster_characteristics)
+
+    return f"Predicted Cluster: {cluster}\nCharacteristics: {cluster_characteristics}\nActionable Insights: {insights}"
+
+def ad_performance_analytics():
+    # Calculate average metrics
+    avg_ctr = data['Click Through Rate'].mean()
+    avg_conversion_rate = data['Conversion Rate'].mean()
+    avg_bounce_rate = data['Bounce Rate'].mean()
+
+    # Prepare the analytics report
+    report = f"Average Click Through Rate: {avg_ctr:.2%}\n"
+    report += f"Average Conversion Rate: {avg_conversion_rate:.2%}\n"
+    report += f"Average Bounce Rate: {avg_bounce_rate:.2%}"
+
+    return report
+
+with gr.Blocks() as demo:
+    with gr.Tab("Cluster Prediction"):
+        with gr.Row():
+            gr.Markdown("**This form allows you to input user session data to predict which cluster the user belongs to and provides actionable insights based on their behavior.**")
+        session_duration = gr.Number(label="Session Duration", value=300)  # Set initial value
+        pages_visited = gr.Number(label="Pages Visited", value=5)  # Set initial value
+        ads_clicked = gr.Number(label="Ads Clicked", value=2)  # Set initial value
+        engagement_score = gr.Slider(0, 1, label="Engagement Score", value=0.5)  # Set initial value
+        user_interests = gr.Dropdown(['technology', 'sports', 'arts'], label="User Interests", value='technology')  # Set initial value
+        device_type = gr.Radio(['mobile', 'desktop', 'tablet'], label="Device Type", value='mobile')  # Set initial value
+        time_of_day = gr.Radio(['morning', 'afternoon', 'evening'], label="Time of Day", value='morning')  # Set initial value
+        time_spent_per_page = gr.Number(label="Time Spent per Page", value=30)  # Set initial value
+        click_through_rate = gr.Slider(0, 1, step=0.01, label="Click Through Rate", value=0.1)  # Set initial value
+        conversion_rate = gr.Slider(0, 1, step=0.01, label="Conversion Rate", value=0.05)  # Set initial value
+        frequency_of_visits = gr.Number(label="Frequency of Visits", value=10)  # Set initial value
+        bounce_rate = gr.Slider(0, 1, step=0.01, label="Bounce Rate", value=0.2)  # Set initial value
+        predict_button = gr.Button("Predict")
+        output_textbox = gr.Textbox(label="Prediction Output")
+        predict_button.click(
+            predict_cluster,
+            inputs=[
+                session_duration, pages_visited, ads_clicked, engagement_score, user_interests, device_type,
+                time_of_day, time_spent_per_page, click_through_rate, conversion_rate, frequency_of_visits, bounce_rate
+            ],
+            outputs=output_textbox
+        )
+
+    with gr.Tab("Ad Performance Analytics"):
+        gr.Markdown("""
+        **This form provides a summary of key performance metrics for ads.**
+
+        - **Average Click-Through Rate (CTR):** Measures the percentage of ad views that result in clicks. Higher values indicate more effective ad engagement.
+        - **Average Conversion Rate:** Indicates the percentage of clicks that convert into actions, such as purchases or sign-ups. This metric helps assess the effectiveness of ad targeting and the overall conversion potential.
+        - **Average Bounce Rate:** Reflects the percentage of single-page visits. Lower bounce rates suggest that the landing pages are relevant to the visitors' interests.
+
+        Understanding these metrics can help optimize ad strategies and improve overall campaign performance.
+        """)
+        analytics_button = gr.Button("Analyze Ad Performance")
+        analytics_output = gr.Textbox(label="Analytics Output")
+        analytics_button.click(
+            ad_performance_analytics,
+            outputs=analytics_output
+        )
+
+demo.launch()
+
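`predict_cluster` decodes a K-Means centroid by slicing it at a hard-coded index (9) and calling `inverse_transform` on each fitted transformer. The sketch below isolates that step on a smaller, hypothetical dataset so the mechanics are easier to follow. The helper name `decode_centroid`, the reduced column set, and the explicit `n_init=10` are assumptions for illustration, not part of the commit. The one-hot coordinates of a centroid are per-category proportions within the cluster, and in current scikit-learn releases `OneHotEncoder.inverse_transform` resolves them to the category with the largest value.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny hypothetical dataset; column names mirror a subset of app.py's sample data.
sessions = pd.DataFrame({
    "Session Duration": [300, 450, 200, 600, 350, 220],
    "Bounce Rate": [0.2, 0.1, 0.5, 0.05, 0.3, 0.45],
    "Device Type": ["mobile", "desktop", "mobile", "tablet", "desktop", "mobile"],
})
num_cols = ["Session Duration", "Bounce Rate"]
cat_cols = ["Device Type"]

pipe = Pipeline([
    ("preprocessor", ColumnTransformer([
        ("num", StandardScaler(), num_cols),
        ("cat", OneHotEncoder(), cat_cols),
    ])),
    ("cluster", KMeans(n_clusters=2, n_init=10, random_state=42)),
])
pipe.fit(sessions)


def decode_centroid(pipe, cluster_id, num_cols, cat_cols):
    """Map one centroid back to original units and category labels."""
    pre = pipe.named_steps["preprocessor"]
    centroid = pipe.named_steps["cluster"].cluster_centers_[cluster_id]
    # The numeric block comes first because "num" is listed before "cat".
    n_num = len(num_cols)
    nums = pre.named_transformers_["num"].inverse_transform([centroid[:n_num]])[0]
    # Remaining coordinates are the one-hot block for the categorical columns.
    cats = pre.named_transformers_["cat"].inverse_transform([centroid[n_num:]])[0]
    decoded = dict(zip(num_cols, nums.tolist()))
    decoded.update(zip(cat_cols, cats.tolist()))
    return decoded


for cluster_id in range(2):
    print(cluster_id, decode_centroid(pipe, cluster_id, num_cols, cat_cols))
```

Deriving the split point from `len(num_cols)` instead of a literal keeps the slice correct if the numeric column list changes, though it still depends on the transformer order in the `ColumnTransformer`.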
requirements.txt
ADDED
@@ -0,0 +1,7 @@
+# Third-party packages imported by app.py, listed by pip distribution name.
+# The sklearn.cluster, sklearn.preprocessing, sklearn.compose, and
+# sklearn.pipeline modules all ship in the scikit-learn distribution.
+# logging is part of the Python standard library, so it is not listed.
+pandas
+scikit-learn
+gradio
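pip reads each requirements entry as a distribution name, not an import path, so the `sklearn.*` imports in app.py correspond to the `scikit-learn` distribution and `logging`, being standard library, needs no entry. As a rough sketch, a pre-flight check like the one below could report which distributions are missing before the app starts. The mapping and the helper name `missing_distributions` are assumptions made for illustration.

```python
import importlib.util

# Import name -> pip distribution name for the modules app.py imports.
# None marks a standard-library module that pip does not install.
IMPORT_TO_DISTRIBUTION = {
    "pandas": "pandas",
    "sklearn": "scikit-learn",
    "gradio": "gradio",
    "logging": None,
}


def missing_distributions(mapping):
    """Return pip names for top-level modules that cannot be found."""
    missing = []
    for module_name, dist_name in mapping.items():
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(module_name) is None and dist_name is not None:
            missing.append(dist_name)
    return missing


print("Missing packages:", missing_distributions(IMPORT_TO_DISTRIBUTION) or "none")
```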