Upload 5 files

Files changed (5)

README.md +91 -0
app.py +49 -0
cear_model.py +69 -0
platform_weights.json +9 -0
requirements.txt +7 -0

README.md ADDED

	@@ -0,0 +1,91 @@

+# Cultural Exposure & Algorithmic Risk (CEAR) Baseline v1.0
+## Model Description
+The **Cultural Exposure & Algorithmic Risk (CEAR) Model** is an **analytic, rule-based scoring system** designed to help users and researchers interpret social media usage in terms of its potential impact on cultural awareness and algorithmic vulnerability.
+This version is a V1 Baseline: it is **deterministic**, meaning that scores follow from fixed, theory-driven rules and weights, and it does not rely on supervised machine learning or proprietary user data.
+### 🎯 Key Outputs
+1.  **Cultural Connectedness Score (C-Score):** Estimates exposure to viral and trending content, modeled with diminishing returns on time.
+2.  **Algorithmic Risk Score (A-Risk):** Quantifies vulnerability incurred from concentrated time on high-intensity, opaque algorithmic feeds.
+3.  **Platform Diversity Index (D-Index):** Measures how concentrated or spread out usage is across platforms, computed as the inverse Herfindahl-Hirschman Index ($1/\text{HHI}$) of per-platform time shares.
+4.  **Cultural Efficiency:** Per-platform estimates of C-Score gained per minute spent.
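As a minimal sketch of the D-Index idea (not the repository's code): the HHI is the sum of squared usage shares, so its inverse can be read as an "effective number of platforms". The helper name `diversity_index` is hypothetical and used only for illustration.

```python
def diversity_index(minutes_by_platform):
    """Inverse HHI of usage shares; returns 0.0 when there is no usage."""
    total = sum(minutes_by_platform.values())
    if total <= 0:
        return 0.0
    # HHI = sum of squared time shares across platforms
    hhi = sum((m / total) ** 2 for m in minutes_by_platform.values())
    return 1.0 / hhi


# Time split evenly over three platforms vs. all time on one platform
even = diversity_index({"TikTok": 100, "YouTube": 100, "Reddit": 100})
single = diversity_index({"TikTok": 300})
print(even, single)
```

An even split over three platforms gives a value close to 3, while a single platform gives 1, so a higher D-Index means usage is spread more widely.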
+## ⚙️ Analytic Basis & Scoring Logic
+The model is defined by transparent assumptions encoded in the Python code (`cear_model.py`) and the platform weights (`platform_weights.json`).
+### Core Formulas
+The key to the C-Score is the **Diminishing Returns Function** ($f_{DR}$), which prevents the C-Score from increasing linearly with time, acknowledging that the first hour is likely more valuable than the tenth.
+$$f_{DR}(\text{Min}) = \log_{10}(\text{Min} + 1)$$
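To make the diminishing-returns behaviour concrete, here is a small sketch that evaluates the formula above for one hour and for ten hours. The function name `f_dr` is an illustrative stand-in for the formula, not a name from the repository.

```python
import math


def f_dr(minutes):
    """Diminishing-returns transform: log10(minutes + 1)."""
    return math.log10(minutes + 1)


one_hour = f_dr(60)     # log10(61)
ten_hours = f_dr(600)   # log10(601)
gain = ten_hours - one_hour
print(round(one_hour, 2), round(ten_hours, 2), round(gain, 2))
```

Under this transform, multiplying the time spent by ten adds slightly less than one unit to the transformed value, which is why the C-Score grows slowly at high usage.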
+The final scores are calculated as:
+$$C_{Score} = \sum_{i} \left[ W_{C,i} \times f_{DR}(\text{Min}_i) \right]$$
+$$A_{Risk} = \sum_{i} \left[ W_{A,i} \times \text{Min}_i \right]$$
+*(Where $W_{C}$ is the Trend Density Weight and $W_{A}$ is the Algorithmic Risk Weight, defined in `platform_weights.json`.)*
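A worked example of the two sums, written as plain Python rather than with the `CEARModel` class. The weights are copied from `platform_weights.json` for three platforms; the weekly minutes are example values only.

```python
import math

# Weights copied from platform_weights.json for the platforms used here
weights = {
    "TikTok": {"W_C": 0.95, "W_A": 0.90},
    "YouTube": {"W_C": 0.70, "W_A": 0.75},
    "Reddit": {"W_C": 0.60, "W_A": 0.40},
}
# Example weekly minutes per platform (illustrative input)
minutes = {"TikTok": 450, "YouTube": 200, "Reddit": 50}

# C-Score: weighted sum of log-scaled minutes
c_score = sum(weights[p]["W_C"] * math.log10(m + 1) for p, m in minutes.items())
# A-Risk: weighted sum of raw minutes
a_risk = sum(weights[p]["W_A"] * m for p, m in minutes.items())

print(round(c_score, 2), round(a_risk, 2))  # prints: 5.16 575.0
```

Because A-Risk is linear in minutes while the C-Score is logarithmic, the two scores live on very different scales and should not be compared directly.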
+## 🚀 Deployment & Usage (Hugging Face Space)
+This repository contains the core logic (`cear_model.py`) and the application interface (`app.py`) for a Hugging Face Space.
+### Model Integration (The Engine)
+The core logic can be imported and run in any environment:
+```python
+import pandas as pd
+from cear_model import CEARModel
+# Example Input Data
+user_data = pd.DataFrame([
+    {'platform_name': 'TikTok', 'minutes_per_week': 450},
+    {'platform_name': 'YouTube', 'minutes_per_week': 200},
+    {'platform_name': 'Reddit', 'minutes_per_week': 50},
+])
+model = CEARModel()
+results = model.calculate_scores(user_data)
+# With the weights in platform_weights.json, approximately:
+# {'C_Score': 5.16, 'A_Risk': 575.0, 'D_Index': 2.0, ...}
+```
+### Application Interface (The App - app.py)
+The `app.py` script uses the Gradio library to create an interactive web interface. It handles:
+1.  Collecting user input via a table component.
+2.  Calling the `CEARModel.calculate_scores()` method.
+3.  Generating a qualitative natural language summary based on the quadrant of the C-Score and A-Risk (e.g., "High C, Low A").
+## ⚠️ Limitations and Ethical Considerations
+1.  **Theoretical, Not Validated:** The scores are based on fixed, theoretical assumptions about platform design. They are not calibrated against real-world user survey data or outcomes (e.g., actual cultural literacy, actual regret). Scores are relative estimates only.
+2.  **No Content Analysis:** The model only uses time and platform. It cannot distinguish between a productive hour watching educational content and an unproductive hour scrolling low-quality content.
+3.  **Future Work:** This deterministic model serves as a foundation. Future versions are intended to use the same input schema to train supervised machine learning models that directly predict outcomes (e.g., predicting user-reported "felt caught up" or "post-scroll regret").
+---
+## `requirements.txt` (For Deployment)
+This file lists the Python packages the Gradio Space needs to run the model and the interface.
+```text
+# requirements.txt
+# Core Model Dependencies
+pandas
+numpy
+# Gradio Space Dependencies
+# Gradio is used to build the simple web application interface (app.py)
+gradio
+```

app.py ADDED

	@@ -0,0 +1,49 @@

+# app.py (Simplified Gradio code)
+import gradio as gr
+from cear_model import CEARModel
+import pandas as pd
+# ... (include logic to load PLATFORM_WEIGHTS)
+# Instantiate the model globally
+cear_analyzer = CEARModel()
+def analyze_user_data(input_table):
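+    # 1. Normalize the Gradio table input into a DataFrame with the expected columns
+    #    (gr.Dataframe passes a pandas DataFrame by default; a list of lists also works here)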
+    user_data_df = pd.DataFrame(input_table, columns=['platform_name', 'minutes_per_week'])
+    user_data_df['minutes_per_week'] = pd.to_numeric(user_data_df['minutes_per_week'], errors='coerce').fillna(0)
+    # 2. Call the core model
+    raw_scores = cear_analyzer.calculate_scores(user_data_df)
+    # 3. Format output for the user (The "App" layer)
+    summary = f"""
+    ## 📊 Analysis Summary
+    - **Cultural Connectedness Score (C-Score):** **{raw_scores['C_Score']:.2f}**
+    - **Algorithmic Risk Score (A-Risk):** **{raw_scores['A_Risk']:.2f}**
+    - **Platform Diversity Index (D-Index):** **{raw_scores['D_Index']:.2f}**
+    ---
+    ### 📝 Interpretation
+    *Your C-Score is based on logarithmically scaled time, reflecting diminishing returns. Your A-Risk is based on raw time, reflecting concentrated attention.*
+    """
+    # Return the Markdown summary and the per-platform efficiency table
+    return summary, pd.DataFrame(raw_scores['Per_Platform_Efficiency'])
+# Define the Gradio interface
+iface = gr.Interface(
+    fn=analyze_user_data,
+    inputs=gr.Dataframe(
+        headers=['platform_name', 'minutes_per_week'],
+        row_count=5,
+        col_count=(2, 'fixed'),
+        label="Weekly Screen Time Input (Source data from OS Tracker)"
+    ),
+    outputs=[
+        gr.Markdown(label="Score Results"),
+        gr.Dataframe(label="Per-Platform Cultural Efficiency")
+    ],
+    title="CEAR Baseline: Cultural Exposure & Algorithmic Risk Analyzer"
+)
+iface.launch()
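The README describes a qualitative summary based on the quadrant of the C-Score and A-Risk (e.g., "High C, Low A"), which this simplified `app.py` does not show. A minimal sketch of that step might look like the following. The function name `quadrant_label` and both threshold defaults are assumptions made for illustration; the source does not define where the quadrant boundaries lie.

```python
def quadrant_label(c_score, a_risk, c_threshold=4.0, a_threshold=400.0):
    """Map a (C-Score, A-Risk) pair to one of four quadrant labels.

    The threshold defaults are illustrative placeholders, not calibrated values.
    """
    c_part = "High C" if c_score >= c_threshold else "Low C"
    a_part = "High A" if a_risk >= a_threshold else "Low A"
    return f"{c_part}, {a_part}"


# Example: scores for a heavy, TikTok-dominated week
label = quadrant_label(5.16, 575.0)
print(label)
```

The returned label could then be appended to the Markdown summary string built in `analyze_user_data`.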

cear_model.py ADDED

	@@ -0,0 +1,69 @@

+# cear_model.py
+import numpy as np
+import pandas as pd
+import json
+import os # Necessary for finding the JSON file
+# --- 1. Load PLATFORM_WEIGHTS variable from JSON ---
+PLATFORM_WEIGHTS = {} # Default value
+try:
+    # Get the directory of the current script (cear_model.py)
+    script_dir = os.path.dirname(os.path.abspath(__file__))
+    json_path = os.path.join(script_dir, 'platform_weights.json')
+    with open(json_path, 'r') as f:
+        # Load the configuration data into the global variable
+        PLATFORM_WEIGHTS = json.load(f)
+except FileNotFoundError:
+    # This warning is useful for debugging if the file is missing
+    print("FATAL ERROR: platform_weights.json not found! Using empty weights.")
+    # The default empty {} dict is used if the file is missing
+# --- 2. Define the Model Class ---
+# The class can now safely reference the global PLATFORM_WEIGHTS variable
+class CEARModel:
+    def __init__(self, weights=PLATFORM_WEIGHTS):
+        # The weights dictionary is passed as a default parameter
+        self.weights = weights
+    def _diminishing_returns(self, minutes):
+        # f_DR(Min) = log10(Min + 1): each additional minute adds less than the one before
+        return np.log10(minutes + 1)
+    def calculate_scores(self, user_input_df: pd.DataFrame):
+        # 1. Merge weights with user input
+        df = user_input_df.merge(
+            pd.DataFrame.from_dict(self.weights, orient='index'),
+            left_on='platform_name',
+            right_index=True,
+            how='left'
+        ).fillna(0) # Fills missing weights with 0 for platforms not in list
+        total_mins = df['minutes_per_week'].sum()
+        # 2. Calculate Core Scores
+        # Vectorized column arithmetic (equivalent to the row-wise version, without df.apply)
+        df['C_Contrib'] = df['W_C'] * self._diminishing_returns(df['minutes_per_week'])
+        df['A_Contrib'] = df['W_A'] * df['minutes_per_week']
+        C_Score = df['C_Contrib'].sum()
+        A_Risk = df['A_Contrib'].sum()
+        # 3. Calculate D-Index (Platform Diversity)
+        df['Min_Share'] = df['minutes_per_week'] / total_mins
+        D_Index = 1 / (df['Min_Share']**2).sum() if total_mins > 0 else 0
+        # 4. Calculate Cultural Efficiency
+        df['Cultural_Efficiency'] = df['C_Contrib'] / df['minutes_per_week'].replace(0, np.nan) # Avoid div by zero
+        return {
+            "C_Score": C_Score,
+            "A_Risk": A_Risk,
+            "D_Index": D_Index,
+            "Per_Platform_Efficiency": df[['platform_name', 'Cultural_Efficiency']].dropna().to_dict('records')
+        }
+# Example Usage:
+# user_data = pd.DataFrame([{'platform_name': 'TikTok', 'minutes_per_week': 300}, ...])
+# model = CEARModel()
+# model.calculate_scores(user_data)
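Because `calculate_scores` uses a left merge followed by `fillna(0)`, a platform name that is not a key in the weights (for example, a misspelling or a different letter case) silently contributes zero to both scores. A minimal sketch of a pre-check is shown below. The helper `find_unknown_platforms` is hypothetical and not part of the repository.

```python
import pandas as pd


def find_unknown_platforms(user_input_df, weights):
    """Return sorted platform names from the input that have no entry in `weights`."""
    names = user_input_df["platform_name"].astype(str).str.strip()
    # Ignore empty rows, keep names that are not keys of the weights mapping
    unknown = names[(names != "") & ~names.isin(list(weights.keys()))]
    return sorted(unknown.unique())


# Two-platform subset of the weights, for illustration only
weights = {"TikTok": {"W_C": 0.95, "W_A": 0.90}, "Reddit": {"W_C": 0.60, "W_A": 0.40}}
user_data = pd.DataFrame([
    {"platform_name": "TikTok", "minutes_per_week": 450},
    {"platform_name": "tiktok", "minutes_per_week": 30},    # wrong letter case
    {"platform_name": "Mastodon", "minutes_per_week": 20},  # not in the weights
])
unknown = find_unknown_platforms(user_data, weights)
print(unknown)
```

Surfacing these names to the user is one option; another would be to normalize names before the merge, which is a design choice the source leaves open.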

platform_weights.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "TikTok": {"W_C": 0.95, "W_A": 0.90},
+    "Instagram": {"W_C": 0.85, "W_A": 0.85},
+    "YouTube": {"W_C": 0.70, "W_A": 0.75},
+    "X/Twitter": {"W_C": 0.80, "W_A": 0.70},
+    "Facebook": {"W_C": 0.50, "W_A": 0.60},
+    "Reddit": {"W_C": 0.60, "W_A": 0.40},
+    "LinkedIn": {"W_C": 0.10, "W_A": 0.20}
+}
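Since every score depends on this file, a small sanity check at load time can catch malformed entries early. The sketch below assumes each weight is meant to lie in [0, 1]; all values shown above fall in that range, but the source does not state it as a rule. The helper name `validate_weights` is hypothetical.

```python
import json


def validate_weights(weights):
    """Return a list of problems found in a platform-weights mapping."""
    problems = []
    for platform, entry in weights.items():
        for key in ("W_C", "W_A"):
            value = entry.get(key)
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{platform}: {key} is missing or not numeric")
            elif not 0.0 <= value <= 1.0:
                # Assumed range; adjust if weights outside [0, 1] are intended
                problems.append(f"{platform}: {key}={value} is outside [0, 1]")
    return problems


# One valid entry and one deliberately broken entry
sample = json.loads('{"TikTok": {"W_C": 0.95, "W_A": 0.90}, "Broken": {"W_C": 1.5}}')
problems = validate_weights(sample)
print(problems)
```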

requirements.txt ADDED

	@@ -0,0 +1,7 @@

+# Core Model Dependencies
+pandas
+numpy
+# Gradio Space Dependencies
+# Gradio is used to build the simple web application interface (app.py)
+gradio