dotoking committed on
Commit 87cdc48 · verified · 1 Parent(s): 90197c9

Upload 5 files

Files changed (5)
  1. README.md +91 -0
  2. app.py +49 -0
  3. cear_model.py +69 -0
  4. platform_weights.json +9 -0
  5. requirements.txt +7 -0
README.md ADDED
@@ -0,0 +1,91 @@
+ # Cultural Exposure & Algorithmic Risk (CEAR) Baseline v1.0
+
+ ## Model Description
+
+ The **Cultural Exposure & Algorithmic Risk (CEAR) Model** is an **analytic, rule-based scoring system** designed to help users and researchers interpret social media usage in terms of its potential impact on cultural awareness and algorithmic vulnerability.
+
+ This version is a V1 baseline: it is **deterministic** (driven by fixed, theory-based rules and weights) and does not rely on supervised machine learning or proprietary user data.
+
+ ### 🎯 Key Outputs
+
+ 1. **Cultural Connectedness Score (C-Score):** Estimates exposure to viral and trending content, modeled with diminishing returns on time.
+ 2. **Algorithmic Risk Score (A-Risk):** Quantifies vulnerability incurred from concentrated time on high-intensity, opaque algorithmic feeds.
+ 3. **Platform Diversity Index (D-Index):** Measures how widely usage is spread across platforms, computed as the inverse Herfindahl-Hirschman Index ($1/\text{HHI}$) of the per-platform time shares.
+ 4. **Cultural Efficiency:** Per-platform estimates of C-Score gained per minute spent.
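As a rough illustration of the D-Index described above, the sketch below computes the inverse HHI from a list of weekly minutes. The helper name `diversity_index` is hypothetical and is not part of `cear_model.py`; it only mirrors the $1/\text{HHI}$ definition, including the zero fallback used when no time is recorded.

```python
def diversity_index(minutes_per_platform):
    """Inverse Herfindahl-Hirschman Index of time shares (hypothetical helper)."""
    total = sum(minutes_per_platform)
    if total <= 0:
        return 0.0  # no usage recorded; mirrors the zero fallback in cear_model.py
    shares = [m / total for m in minutes_per_platform]
    hhi = sum(s * s for s in shares)  # sum of squared time shares
    return 1.0 / hhi

print(diversity_index([450, 200, 50]))        # uneven split across three platforms
print(diversity_index([100, 100, 100, 100]))  # even split across four platforms
print(diversity_index([700]))                 # all time on a single platform
```

An even split across n platforms yields n, and a single platform yields 1, so the index can be read as an "effective number of platforms".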
+
+ ## ⚙️ Analytic Basis & Scoring Logic
+
+ The model is defined by transparent assumptions encoded in the Python code (`cear_model.py`) and the platform weights (`platform_weights.json`).
+
+ ### Core Formulas
+
+ The key to the C-Score is the **Diminishing Returns Function** ($f_{DR}$), which keeps the C-Score from growing linearly with time, on the assumption that the first hour on a platform is more valuable than the tenth.
+
+ $$f_{DR}(\text{Min}) = \log_{10}(\text{Min} + 1)$$
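To make the diminishing-returns behaviour concrete, the following sketch evaluates $f_{DR}$ at one hour and at ten hours. The function name `f_dr` and the chosen minute values are illustrative only.

```python
import math

def f_dr(minutes):
    """Diminishing-returns transform from the README: log10(minutes + 1)."""
    return math.log10(minutes + 1)

one_hour = f_dr(60)
ten_hours = f_dr(600)

# Ten times the minutes yields far less than ten times the transformed value.
print(round(one_hour, 3), round(ten_hours, 3), round(ten_hours / one_hour, 2))
# prints: 1.785 2.779 1.56
```

The `+ 1` inside the logarithm keeps the function defined at zero minutes, where it evaluates to 0.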
+
+ The final scores are calculated as:
+
+ $$C_{Score} = \sum_{i} \left[ W_{C,i} \times f_{DR}(\text{Min}_i) \right]$$
+
+ $$A_{Risk} = \sum_{i} \left[ W_{A,i} \times \text{Min}_i \right]$$
+
+ *(Where $\text{Min}_i$ is the weekly minutes on platform $i$, $W_{C,i}$ is its Trend Density Weight, and $W_{A,i}$ is its Algorithmic Risk Weight, both defined in `platform_weights.json`.)*
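The two sums can be checked by hand with a few lines of standard-library Python. This is a minimal sketch, not the `CEARModel` implementation: the weights are copied from `platform_weights.json` in this commit, and the usage figures are the TikTok/YouTube/Reddit example minutes from the README.

```python
import math

# Weights copied from platform_weights.json in this commit
WEIGHTS = {
    "TikTok": {"W_C": 0.95, "W_A": 0.90},
    "YouTube": {"W_C": 0.70, "W_A": 0.75},
    "Reddit": {"W_C": 0.60, "W_A": 0.40},
}

usage = {"TikTok": 450, "YouTube": 200, "Reddit": 50}  # minutes per week

# C-Score: weighted sum of log-scaled minutes
c_score = sum(WEIGHTS[p]["W_C"] * math.log10(m + 1) for p, m in usage.items())
# A-Risk: weighted sum of raw minutes
a_risk = sum(WEIGHTS[p]["W_A"] * m for p, m in usage.items())

print(f"C_Score = {c_score:.2f}, A_Risk = {a_risk:.1f}")
# prints: C_Score = 5.16, A_Risk = 575.0
```

Because A-Risk is linear in minutes while the C-Score is logarithmic, adding time to an already heavily used platform raises risk much faster than it raises connectedness.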
+
+ ## 🚀 Deployment & Usage (Hugging Face Space)
+
+ This repository contains the core logic (`cear_model.py`) and the application interface (`app.py`) for a Hugging Face Space.
+
+ ### Model Integration (The Engine)
+
+ The core logic can be imported and run in any environment:
+
+ ```python
+ import pandas as pd
+ from cear_model import CEARModel
+
+ # Example Input Data
+ user_data = pd.DataFrame([
+     {'platform_name': 'TikTok', 'minutes_per_week': 450},
+     {'platform_name': 'YouTube', 'minutes_per_week': 200},
+     {'platform_name': 'Reddit', 'minutes_per_week': 50},
+ ])
+
+ model = CEARModel()
+ results = model.calculate_scores(user_data)
+ # Approximately: {'C_Score': 5.16, 'A_Risk': 575.0, 'D_Index': 2.0, ...}
+ ```
+ ### Application Interface (The App - `app.py`)
+
+ The `app.py` script uses the Gradio library to create an interactive web interface. It handles:
+
+ - Collecting user input via a table component.
+
+ - Calling the `CEARModel.calculate_scores()` method.
+
+ - Generating a qualitative natural language summary based on the quadrant of the C-Score and A-Risk (e.g., "High C, Low A").
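The simplified `app.py` in this commit does not show the quadrant logic, so the following is only a sketch of how such a label could be derived. The function name `quadrant_label` and both cutoff values are hypothetical; the source does not state where "high" begins for either score.

```python
def quadrant_label(c_score, a_risk, c_cutoff=4.0, a_cutoff=400.0):
    """Map a (C-Score, A-Risk) pair to a quadrant label.

    The cutoffs are placeholders chosen for illustration only; they are
    assumptions, not values taken from the CEAR model.
    """
    c_level = "High C" if c_score >= c_cutoff else "Low C"
    a_level = "High A" if a_risk >= a_cutoff else "Low A"
    return f"{c_level}, {a_level}"

print(quadrant_label(5.2, 120.0))  # prints: High C, Low A
print(quadrant_label(2.1, 575.0))  # prints: Low C, High A
```

Since the scores are uncalibrated relative estimates, any real cutoffs would need to be chosen and justified separately.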
+
+ ## ⚠️ Limitations and Ethical Considerations
+
+ 1. **Theoretical, Not Validated:** The scores are based on fixed, theoretical assumptions about platform design. They are not calibrated against real-world user survey data or outcomes (e.g., actual cultural literacy, actual regret). Scores are relative estimates only.
+
+ 2. **No Content Analysis:** The model only uses time and platform. It cannot distinguish between a productive hour watching educational content and an unproductive hour scrolling low-quality content.
+
+ 3. **Future Work:** This deterministic model serves as a foundation. Future versions are intended to use the same input schema to train supervised machine learning models that directly predict outcomes (e.g., predicting user-reported "felt caught up" or "post-scroll regret").
+
+ ---
+
+ ## `requirements.txt` (For Deployment)
+
+ This file lists the Python packages the Gradio Space needs to run the model and interface.
+
+ ```text
+ # requirements.txt
+
+ # Core Model Dependencies
+ pandas
+ numpy
+
+ # Gradio Space Dependencies
+ # Gradio is used to build the simple web application interface (app.py)
+ gradio
+ ```
app.py ADDED
@@ -0,0 +1,49 @@
+ # app.py (Simplified Gradio code)
+
+ import gradio as gr
+ from cear_model import CEARModel
+ import pandas as pd
+ # PLATFORM_WEIGHTS is loaded inside cear_model.py, so no extra loading logic is needed here.
+
+ # Instantiate the model globally
+ cear_analyzer = CEARModel()
+
+ def analyze_user_data(input_table):
+     # 1. Convert the Gradio table input to a DataFrame with the expected columns
+     user_data_df = pd.DataFrame(input_table, columns=['platform_name', 'minutes_per_week'])
+     user_data_df['minutes_per_week'] = pd.to_numeric(user_data_df['minutes_per_week'], errors='coerce').fillna(0)
+
+     # 2. Call the core model
+     raw_scores = cear_analyzer.calculate_scores(user_data_df)
+
+     # 3. Format output for the user (The "App" layer)
+     summary = f"""
+ ## 📊 Analysis Summary
+ - **Cultural Connectedness Score (C-Score):** **{raw_scores['C_Score']:.2f}**
+ - **Algorithmic Risk Score (A-Risk):** **{raw_scores['A_Risk']:.2f}**
+ - **Platform Diversity Index (D-Index):** **{raw_scores['D_Index']:.2f}**
+ ---
+ ### 📝 Interpretation
+ *Your C-Score is based on logarithmically scaled time, reflecting diminishing returns. Your A-Risk is based on raw time, reflecting concentrated attention.*
+ """
+
+     # Return the formatted string and a table of per-platform efficiency
+     return summary, pd.DataFrame(raw_scores['Per_Platform_Efficiency'])
+
+ # Define the Gradio interface
+ iface = gr.Interface(
+     fn=analyze_user_data,
+     inputs=gr.Dataframe(
+         headers=['platform_name', 'minutes_per_week'],
+         row_count=5,
+         col_count=(2, 'fixed'),
+         label="Weekly Screen Time Input (Source data from OS Tracker)"
+     ),
+     outputs=[
+         gr.Markdown(label="Score Results"),
+         gr.Dataframe(label="Per-Platform Cultural Efficiency")
+     ],
+     title="CEAR Baseline: Cultural Exposure & Algorithmic Risk Analyzer"
+ )
+
+ iface.launch()
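The input-cleaning step in `analyze_user_data` turns blank or malformed minute cells into 0 so that unused table rows do not break the calculation. The sketch below shows the same idea without pandas; `coerce_minutes` is a hypothetical helper that roughly mirrors `pd.to_numeric(..., errors='coerce').fillna(0)` and may differ from pandas in edge cases.

```python
def coerce_minutes(value):
    """Return value as a float, or 0.0 when it cannot be parsed (hypothetical helper)."""
    try:
        minutes = float(value)
    except (TypeError, ValueError):
        return 0.0  # blank cells, None, and non-numeric text all count as zero minutes
    # float("nan") parses successfully, so treat NaN like a missing value
    return 0.0 if minutes != minutes else minutes

rows = [["TikTok", "450"], ["YouTube", ""], ["Reddit", None], ["X/Twitter", 30]]
cleaned = [(name, coerce_minutes(raw)) for name, raw in rows]
print(cleaned)
# prints: [('TikTok', 450.0), ('YouTube', 0.0), ('Reddit', 0.0), ('X/Twitter', 30.0)]
```

Treating bad input as zero keeps the app responsive, at the cost of silently ignoring typos; surfacing a warning for coerced cells would be a reasonable extension.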
cear_model.py ADDED
@@ -0,0 +1,69 @@
+ # cear_model.py
+ import numpy as np
+ import pandas as pd
+ import json
+ import os  # Used to locate the JSON file next to this script
+
+ # --- 1. Load PLATFORM_WEIGHTS variable from JSON ---
+ PLATFORM_WEIGHTS = {}  # Default value
+
+ try:
+     # Get the directory of the current script (cear_model.py)
+     script_dir = os.path.dirname(os.path.abspath(__file__))
+     json_path = os.path.join(script_dir, 'platform_weights.json')
+
+     with open(json_path, 'r') as f:
+         # Load the configuration data into the global variable
+         PLATFORM_WEIGHTS = json.load(f)
+
+ except FileNotFoundError:
+     # Not fatal: the default empty {} dict is kept, and the message helps debug a missing file
+     print("WARNING: platform_weights.json not found! Using empty weights.")
+
+ # --- 2. Define the Model Class ---
+ # The class can now safely reference the global PLATFORM_WEIGHTS variable
+ class CEARModel:
+     def __init__(self, weights=PLATFORM_WEIGHTS):
+         # The weights dictionary is passed as a default parameter
+         self.weights = weights
+
+     def _diminishing_returns(self, minutes):
+         # f_DR(Min) = log10(Min + 1): each extra minute adds less than the one before
+         return np.log10(minutes + 1)
+
+     def calculate_scores(self, user_input_df: pd.DataFrame):
+         # 1. Merge weights with user input
+         df = user_input_df.merge(
+             pd.DataFrame.from_dict(self.weights, orient='index'),
+             left_on='platform_name',
+             right_index=True,
+             how='left'
+         ).fillna(0)  # Platforms missing from the weights file get weights of 0
+
+         total_mins = df['minutes_per_week'].sum()
+
+         # 2. Calculate Core Scores (vectorized over all platform rows)
+         df['C_Contrib'] = df['W_C'] * self._diminishing_returns(df['minutes_per_week'])
+         df['A_Contrib'] = df['W_A'] * df['minutes_per_week']
+
+         C_Score = df['C_Contrib'].sum()
+         A_Risk = df['A_Contrib'].sum()
+
+         # 3. Calculate D-Index (Platform Diversity) as the inverse HHI of time shares
+         df['Min_Share'] = df['minutes_per_week'] / total_mins
+         D_Index = (1 / (df['Min_Share']**2).sum()) if total_mins > 0 else 0
+
+         # 4. Calculate Cultural Efficiency
+         df['Cultural_Efficiency'] = df['C_Contrib'] / df['minutes_per_week'].replace(0, np.nan)  # Avoid div by zero
+
+         return {
+             "C_Score": C_Score,
+             "A_Risk": A_Risk,
+             "D_Index": D_Index,
+             "Per_Platform_Efficiency": df[['platform_name', 'Cultural_Efficiency']].dropna().to_dict('records')
+         }
+
+ # Example Usage:
+ # user_data = pd.DataFrame([{'platform_name': 'TikTok', 'minutes_per_week': 300}, ...])
+ # model = CEARModel()
+ # model.calculate_scores(user_data)
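Step 4 of `calculate_scores` divides each platform's C-Score contribution by its minutes. The sketch below isolates that calculation in plain Python to show how efficiency falls as time grows. `cultural_efficiency` is a hypothetical helper, the TikTok weight (0.95) comes from `platform_weights.json`, and the minute values are made up for illustration.

```python
import math

def cultural_efficiency(w_c, minutes):
    """C-Score contribution per minute; None when no time was spent."""
    if minutes <= 0:
        return None  # cear_model.py drops zero-minute rows rather than dividing by zero
    return w_c * math.log10(minutes + 1) / minutes

# W_C for TikTok from platform_weights.json; minute values are illustrative
for minutes in (30, 300, 3000):
    print(minutes, round(cultural_efficiency(0.95, minutes), 4))
```

Each tenfold increase in minutes lowers the per-minute figure substantially, which is the per-platform view of the diminishing returns built into the C-Score.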
platform_weights.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "TikTok": {"W_C": 0.95, "W_A": 0.90},
+   "Instagram": {"W_C": 0.85, "W_A": 0.85},
+   "YouTube": {"W_C": 0.70, "W_A": 0.75},
+   "X/Twitter": {"W_C": 0.80, "W_A": 0.70},
+   "Facebook": {"W_C": 0.50, "W_A": 0.60},
+   "Reddit": {"W_C": 0.60, "W_A": 0.40},
+   "LinkedIn": {"W_C": 0.10, "W_A": 0.20}
+ }
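Because the model trusts this file without checking it, a small validation pass can catch missing keys or out-of-range weights before scoring. The sketch below is one possible check; `validate_weights` is a hypothetical helper, and the `[0, 1]` range is an assumption inferred from the values in this commit rather than a documented constraint.

```python
import json

def validate_weights(weights):
    """Return a list of problems found in a platform-weights mapping (hypothetical helper)."""
    problems = []
    for platform, entry in weights.items():
        for key in ("W_C", "W_A"):
            value = entry.get(key) if isinstance(entry, dict) else None
            if not isinstance(value, (int, float)) or isinstance(value, bool):
                problems.append(f"{platform}: {key} is missing or not a number")
            elif not 0.0 <= value <= 1.0:
                # The [0, 1] range is an assumption based on the values in this commit
                problems.append(f"{platform}: {key}={value} is outside [0, 1]")
    return problems

sample = json.loads('{"TikTok": {"W_C": 0.95, "W_A": 0.90}, "Broken": {"W_C": 1.5}}')
print(validate_weights(sample))
# prints: ['Broken: W_C=1.5 is outside [0, 1]', 'Broken: W_A is missing or not a number']
```

An empty result means every platform entry has both weights and each lies within the assumed range.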
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ # Core Model Dependencies
+ pandas
+ numpy
+
+ # Gradio Space Dependencies
+ # Gradio is used to build the simple web application interface (app.py)
+ gradio