Upload 5 files

- README.md +99 -12
- app.py +49 -0
- cear_model.py +69 -0
- platform_weights.json +9 -0
- requirements.txt +7 -0
README.md
CHANGED

@@ -1,12 +1,99 @@
- ---
- title:
- emoji:
- colorFrom:
- colorTo:
- sdk: gradio
---
title: Cultural Exposure and Algorithmic Risk Model
emoji: "🧭"
colorFrom: "blue"
colorTo: "green"
sdk: gradio
app_file: app.py
---

# Cultural Exposure & Algorithmic Risk (CEAR) Baseline v1.0
## Model Description

The **Cultural Exposure & Algorithmic Risk (CEAR) Model** is an **analytic, rule-based scoring system** designed to help users and researchers interpret social media usage in terms of its potential impact on cultural awareness and algorithmic vulnerability.

This version is a V1 baseline: it is **deterministic** (theory-driven, with fixed rules and weights) and does not rely on supervised machine learning or proprietary user data.

### 🎯 Key Outputs

1. **Cultural Connectedness Score (C-Score):** Estimates exposure to viral and trending content, modeled with diminishing returns on time.
2. **Algorithmic Risk Score (A-Risk):** Quantifies the vulnerability incurred by concentrated time on high-intensity, opaque algorithmic feeds.
3. **Platform Diversity Index (D-Index):** Measures the concentration or spread of usage across platforms, computed as $1/\text{HHI}$.
4. **Cultural Efficiency:** Per-platform estimate of C-Score gained per minute spent.
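To make the D-Index concrete, here is a minimal sketch of the $1/\text{HHI}$ computation; the minute values are invented for illustration only:

```python
# D-Index = 1 / HHI, where HHI is the sum of squared time shares.
minutes = {"TikTok": 300, "YouTube": 300, "Reddit": 100}

total = sum(minutes.values())
hhi = sum((m / total) ** 2 for m in minutes.values())
d_index = 1 / hhi

# All time on one platform -> D-Index = 1;
# an even split across N platforms -> D-Index = N.
print(round(d_index, 2))  # 2.58: usage behaves like ~2.6 evenly used platforms
```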
## ⚙️ Analytic Basis & Scoring Logic

The model is defined by transparent assumptions encoded in the Python code (`cear_model.py`) and the platform weights (`platform_weights.json`).

### Core Formulas

The key to the C-Score is the **diminishing-returns function** ($f_{DR}$), which prevents the C-Score from growing linearly with time, acknowledging that the first hour is likely more valuable than the tenth.

$$f_{DR}(\text{Min}) = \log_{10}(\text{Min} + 1)$$

The final scores are calculated as:

$$C_{Score} = \sum_{i} \left[ W_{C,i} \times f_{DR}(\text{Min}_i) \right]$$

$$A_{Risk} = \sum_{i} \left[ W_{A,i} \times \text{Min}_i \right]$$

*(Where $W_{C}$ is the trend-density weight and $W_{A}$ is the algorithmic-risk weight, both defined in `platform_weights.json`.)*
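As a small worked example of the two formulas (the `f_dr` helper and the sample minutes are illustrative; the two weight pairs match the TikTok and Reddit entries in `platform_weights.json`):

```python
from math import log10

# f_DR(min) = log10(min + 1): diminishing returns on time
def f_dr(minutes):
    return log10(minutes + 1)

# Two illustrative usage rows
platforms = [
    {"name": "TikTok", "W_C": 0.95, "W_A": 0.90, "minutes": 450},
    {"name": "Reddit", "W_C": 0.60, "W_A": 0.40, "minutes": 50},
]

C_Score = sum(p["W_C"] * f_dr(p["minutes"]) for p in platforms)
A_Risk = sum(p["W_A"] * p["minutes"] for p in platforms)

# A_Risk is exactly 0.90*450 + 0.40*50 = 425.0;
# C_Score is 0.95*log10(451) + 0.60*log10(51) ≈ 3.55
```

Note how doubling TikTok's minutes would double its A-Risk contribution but only nudge its C-Score contribution, which is the point of the log transform.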
## 🚀 Deployment & Usage (Hugging Face Space)

This repository contains the core logic (`cear_model.py`) and the application interface (`app.py`) for a Hugging Face Space.

### Model Integration (The Engine)

The core logic can be imported and run in any environment:

```python
import pandas as pd
from cear_model import CEARModel

# Example input data
user_data = pd.DataFrame([
    {'platform_name': 'TikTok', 'minutes_per_week': 450},
    {'platform_name': 'YouTube', 'minutes_per_week': 200},
    {'platform_name': 'Reddit', 'minutes_per_week': 50},
])

model = CEARModel()
results = model.calculate_scores(user_data)
# With the default weights: {'C_Score': 5.16, 'A_Risk': 575.0, ...} (C_Score rounded)
```
### Application Interface (The App: `app.py`)

The `app.py` script uses the Gradio library to create an interactive web interface. It handles:

1. Collecting user input via a table component.
2. Calling the `CEARModel.calculate_scores()` method.
3. Generating a qualitative natural-language summary based on the quadrant implied by the C-Score and A-Risk (e.g., "High C, Low A").
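The quadrant mapping itself is not spelled out in the repository; here is a sketch of how such a summary could work, where the cut-off values are placeholder assumptions rather than calibrated thresholds:

```python
def quadrant_label(c_score, a_risk, c_cut=3.0, a_cut=400.0):
    """Map (C-Score, A-Risk) to a qualitative quadrant label.

    The default cut-offs are illustrative placeholders, not part of CEAR v1.0.
    """
    c = "High C" if c_score >= c_cut else "Low C"
    a = "High A" if a_risk >= a_cut else "Low A"
    return f"{c}, {a}"

print(quadrant_label(3.75, 150.0))  # "High C, Low A"
print(quadrant_label(1.2, 560.0))   # "Low C, High A"
```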
## ⚠️ Limitations and Ethical Considerations

1. **Theoretical, not validated:** The scores rest on fixed, theoretical assumptions about platform design. They are not calibrated against real-world user survey data or outcomes (e.g., actual cultural literacy, actual regret). Scores are relative estimates only.

2. **No content analysis:** The model uses only time and platform. It cannot distinguish a productive hour watching educational content from an unproductive hour scrolling low-quality content.

3. **Future work:** This deterministic model serves as a foundation. Future versions are intended to use the same input schema to train supervised machine learning models that directly predict outcomes (e.g., user-reported "felt caught up" or "post-scroll regret").

---

## `requirements.txt` (For Deployment)

This file lists the Python packages the Gradio Space needs to run the model and interface:

```text
# requirements.txt

# Core Model Dependencies
pandas
numpy

# Gradio Space Dependencies
# Gradio is used to build the simple web application interface (app.py)
gradio
```
app.py
ADDED

@@ -0,0 +1,49 @@
```python
# app.py (Simplified Gradio code)

import gradio as gr
import pandas as pd

from cear_model import CEARModel  # cear_model loads platform_weights.json on import

# Instantiate the model globally
cear_analyzer = CEARModel()

def analyze_user_data(input_table):
    # 1. Convert Gradio input (list of lists) to a DataFrame
    user_data_df = pd.DataFrame(input_table, columns=['platform_name', 'minutes_per_week'])
    user_data_df['minutes_per_week'] = pd.to_numeric(user_data_df['minutes_per_week'], errors='coerce').fillna(0)

    # 2. Call the core model
    raw_scores = cear_analyzer.calculate_scores(user_data_df)

    # 3. Format output for the user (the "app" layer)
    summary = f"""
## 📊 Analysis Summary
- **Cultural Connectedness Score (C-Score):** **{raw_scores['C_Score']:.2f}**
- **Algorithmic Risk Score (A-Risk):** **{raw_scores['A_Risk']:.2f}**
- **Platform Diversity Index (D-Index):** **{raw_scores['D_Index']:.2f}**

---
### 📝 Interpretation
*Your C-Score is based on logarithmically scaled time, reflecting diminishing returns. Your A-Risk is based on raw time, reflecting concentrated attention.*
"""

    # Return the formatted string plus a per-platform efficiency table
    return summary, pd.DataFrame(raw_scores['Per_Platform_Efficiency'])

# Define the Gradio interface
iface = gr.Interface(
    fn=analyze_user_data,
    inputs=gr.Dataframe(
        headers=['platform_name', 'minutes_per_week'],
        row_count=5,
        col_count=(2, 'fixed'),
        label="Weekly Screen Time Input (source data from an OS screen-time tracker)"
    ),
    outputs=[
        gr.Markdown(label="Score Results"),
        gr.Dataframe(label="Per-Platform Cultural Efficiency")
    ],
    title="CEAR Baseline: Cultural Exposure & Algorithmic Risk Analyzer"
)

iface.launch()
```
cear_model.py
ADDED

@@ -0,0 +1,69 @@
```python
# cear_model.py
import json
import os  # needed to locate the JSON file next to this script

import numpy as np
import pandas as pd

# --- 1. Load the PLATFORM_WEIGHTS variable from JSON ---
PLATFORM_WEIGHTS = {}  # default value, used if the file is missing

try:
    # Resolve platform_weights.json relative to this script's directory
    script_dir = os.path.dirname(os.path.abspath(__file__))
    json_path = os.path.join(script_dir, 'platform_weights.json')

    with open(json_path, 'r') as f:
        PLATFORM_WEIGHTS = json.load(f)

except FileNotFoundError:
    # Useful for debugging; the empty dict defined above is used instead
    print("WARNING: platform_weights.json not found. Using empty weights.")

# --- 2. Define the Model Class ---
class CEARModel:
    def __init__(self, weights=PLATFORM_WEIGHTS):
        # Defaults to the module-level configuration loaded above
        self.weights = weights

    def _diminishing_returns(self, minutes):
        # f_DR(min) = log10(min + 1)
        return np.log10(minutes + 1)

    def calculate_scores(self, user_input_df: pd.DataFrame):
        # 1. Merge weights with user input
        df = user_input_df.merge(
            pd.DataFrame.from_dict(self.weights, orient='index'),
            left_on='platform_name',
            right_index=True,
            how='left'
        ).fillna(0)  # platforms without defined weights contribute 0

        total_mins = df['minutes_per_week'].sum()

        # 2. Calculate core scores
        df['C_Contrib'] = df['W_C'] * self._diminishing_returns(df['minutes_per_week'])
        df['A_Contrib'] = df['W_A'] * df['minutes_per_week']

        C_Score = df['C_Contrib'].sum()
        A_Risk = df['A_Contrib'].sum()

        # 3. Calculate D-Index (platform diversity, 1 / HHI of time shares)
        if total_mins > 0:
            df['Min_Share'] = df['minutes_per_week'] / total_mins
            D_Index = 1 / (df['Min_Share'] ** 2).sum()
        else:
            D_Index = 0

        # 4. Calculate cultural efficiency (C contribution per minute; avoid div by zero)
        df['Cultural_Efficiency'] = df['C_Contrib'] / df['minutes_per_week'].replace(0, np.nan)

        return {
            "C_Score": C_Score,
            "A_Risk": A_Risk,
            "D_Index": D_Index,
            "Per_Platform_Efficiency": df[['platform_name', 'Cultural_Efficiency']].dropna().to_dict('records')
        }

# Example usage:
# user_data = pd.DataFrame([{'platform_name': 'TikTok', 'minutes_per_week': 300}, ...])
# model = CEARModel()
# model.calculate_scores(user_data)
```
platform_weights.json
ADDED

@@ -0,0 +1,9 @@
```json
{
  "TikTok": {"W_C": 0.95, "W_A": 0.90},
  "Instagram": {"W_C": 0.85, "W_A": 0.85},
  "YouTube": {"W_C": 0.70, "W_A": 0.75},
  "X/Twitter": {"W_C": 0.80, "W_A": 0.70},
  "Facebook": {"W_C": 0.50, "W_A": 0.60},
  "Reddit": {"W_C": 0.60, "W_A": 0.40},
  "LinkedIn": {"W_C": 0.10, "W_A": 0.20}
}
```
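As a quick sanity check on this schema (the snippet below is an illustrative addition, not part of the upload; the inlined JSON is a two-entry subset of the file):

```python
import json

# Subset of platform_weights.json, inlined so the check is self-contained;
# in practice the file itself would be loaded with json.load().
raw = '{"TikTok": {"W_C": 0.95, "W_A": 0.90}, "LinkedIn": {"W_C": 0.10, "W_A": 0.20}}'
weights = json.loads(raw)

for platform, w in weights.items():
    # Every platform must define exactly W_C and W_A, each in [0, 1]
    assert set(w) == {"W_C", "W_A"}, platform
    assert all(0.0 <= v <= 1.0 for v in w.values()), platform

print("weights OK")
```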
requirements.txt
ADDED

@@ -0,0 +1,7 @@
```text
# Core Model Dependencies
pandas
numpy

# Gradio Space Dependencies
# Gradio is used to build the simple web application interface (app.py)
gradio
```