Gmrock committed on
Commit fef9b59 · verified · 1 Parent(s): 3dfd68e

Upload 5 files

Files changed (5)
  1. README.md +245 -9
  2. app.py +338 -0
  3. fetch_data.py +178 -0
  4. posthog_impact_data.csv +107 -0
  5. requirements.txt +4 -0
README.md CHANGED
@@ -1,12 +1,248 @@
  ---
- title: Engineer Impact
- emoji: 👁
- colorFrom: blue
- colorTo: green
- sdk: docker
- pinned: false
- license: afl-3.0
- short_description: Most impactful engineer in public repo
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🏛️ Engineering Impact Dashboard
+
+ A hybrid quantitative–qualitative engineering leadership engine that moves beyond naive developer analytics (such as counting commits or lines of code) and instead measures **engineering leverage, intent, and team citizenship**.
+
+ This framework scales telemetry relative to active team baselines, filters out low-signal automated activity, rewards high-leverage structural work, and incorporates qualitative leadership impact.
+
+ ---
+
+ # 🧭 Core Philosophy & Pillars
+
+ Traditional engineering trackers are often easy to game and can alienate developers. This project evaluates engineering value across four strategic pillars:
+
+ ## 📦 Execution Baseline
+
+ Measures operational scope, complex feature delivery, and high-priority issue resolution.
+
+ The engine scans pull request metadata for:
+
+ * Critical indicators
+ * Bug labels and fix signals
+ * Architectural modifications
+ * Scope and delivery patterns
+
+ ---
+
+ ## 💬 Collaboration & Mentorship
+
+ Quantifies engineering leverage and team citizenship.
+
+ The framework analyzes code review behavior using a **Substantive Word Filter (>15 words)** to isolate meaningful engineering feedback from low-signal approvals such as:
+
+ * "LGTM"
+ * "Looks good"
+ * Rubber-stamp reviews
+
+ This helps surface engineers contributing thoughtful mentorship and review depth.
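
A minimal sketch of the word filter described above, mirroring the threshold used in `fetch_data.py` (a comment contributes words only when it has more than 15 whitespace-separated words). The function name `substantive_word_count` and the sample comments are illustrative assumptions, not part of the project.

```python
def substantive_word_count(body, min_words=15):
    """Return the comment's word count if it exceeds min_words, else 0."""
    word_count = len((body or "").split())
    return word_count if word_count > min_words else 0


# Hypothetical review comments: two rubber stamps and one substantive note.
comments = [
    "LGTM",
    "Looks good",
    "Consider extracting this query into a helper so the retry logic "
    "lives in one place and the caller no longer needs to know about "
    "the pagination details of the API.",
]

total = sum(substantive_word_count(c) for c in comments)
print(total)  # only the third comment (30 words) is counted
```

Short approvals still count as review actions in the pipeline; the filter only decides whether their words add to the depth metric.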
+
+ ---
+
+ ## 🛑 System Quality
+
+ Tracks production stability and defensive engineering practices.
+
+ The system introduces a structural accountability layer by applying deduction penalties for:
+
+ * Triggered Git reverts
+ * Avoidable regressions
+ * Stability-related disruptions
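
As a rough illustration of the deduction, the `app.py` in this commit derives the quality strength from revert counts alone: one minus the engineer's reverts divided by the highest revert count in the cohort. The sketch below restates that rule; the function name `quality_strength` and the engineer names are hypothetical.

```python
def quality_strength(reverts, cohort_max_reverts):
    """1.0 means no reverts; 0.0 means the most reverts in the cohort."""
    # Guard against a cohort with zero reverts, as app.py does.
    ceiling = cohort_max_reverts if cohort_max_reverts > 0 else 1
    return 1.0 - reverts / ceiling


# Hypothetical revert counts for three engineers.
reverts_by_engineer = {"alice": 0, "bob": 1, "carol": 2}
peak = max(reverts_by_engineer.values())
scores = {name: quality_strength(n, peak) for name, n in reverts_by_engineer.items()}
print(scores)  # {'alice': 1.0, 'bob': 0.5, 'carol': 0.0}
```

Because the penalty is relative to the cohort peak, a single revert costs the full pillar when nobody else has any.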
+
+ ---
+
+ ## 🤝 Human Touch
+
+ A qualitative layer completed by engineering managers to capture high-value leadership signals that repositories cannot measure directly, including:
+
+ * Architectural planning
+ * Team leadership
+ * Mentorship
+ * Incident responsiveness
+ * Availability during unscripted operational escalations
+
+ ---
+
+ # 📐 How the Scoring Engine Works
+
+ The scoring model avoids rigid quotas by using **Peer Cohort Normalization**.
+
+ Instead of evaluating engineers against fixed thresholds, each raw metric is scaled relative to the highest value any contributor reached for that metric (**Peak**) inside a rolling **90-day window**.
+
+ This ensures performance expectations adapt naturally to:
+
+ * Team velocity
+ * Product lifecycle stage
+ * Organizational priorities
+
+ ### Pillar Component Ratio
+
+ ```math
+ \text{Pillar Component Ratio} =
+ \frac{\text{Individual Raw Value}}{\text{Cohort Max Ceiling (90-day Peak)}}
+ ```
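
The ratio can be sketched in a few lines of plain Python. The merged-PR counts are taken from `posthog_impact_data.csv` in this commit; the helper name `normalize_to_peak` is an assumption made for illustration.

```python
def normalize_to_peak(raw_values):
    """Scale each raw value by the cohort maximum (falls back to 1 if the peak is 0)."""
    peak = max(raw_values.values(), default=0)
    ceiling = peak if peak > 0 else 1
    return {name: value / ceiling for name, value in raw_values.items()}


# Merged PR counts for three engineers from the committed CSV.
prs_merged = {"pauldambra": 51, "sampennington": 49, "rnegron": 21}
ratios = normalize_to_peak(prs_merged)
print({name: round(ratio, 3) for name, ratio in ratios.items()})
```

The engineer holding the peak always scores 1.0 for that metric, and everyone else lands between 0 and 1.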
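+
+ ### Impact Score Formula
+
+ An engineer's final score is the weighted sum of all pillar strengths. Because each strength lies between 0 and 1 and the strategy weights are normalized to sum to 1, the score cannot exceed **100 points**.
+
+ ```math
+ \text{Impact Score} =
+ 100 \times \sum_{p \in \text{pillars}} \left( \text{Normalized Pillar Strength}_p \times \text{Strategy Weight}_p \right)
+ ```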
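
A small worked example of the formula, using the default weights from `app.py` (0.35, 0.35, 0.20, 0.10). The pillar strengths are made-up values for illustration, and `impact_score` is a hypothetical helper that assumes the weights already sum to 1.

```python
def impact_score(strengths, weights):
    """Weighted sum of pillar strengths (each in 0..1), scaled to 0..100."""
    return 100 * sum(strengths[pillar] * weights[pillar] for pillar in strengths)


# Default strategy weights from app.py; they already sum to 1.
weights = {"execution": 0.35, "collaboration": 0.35, "quality": 0.20, "human": 0.10}

# Hypothetical pillar strengths for one engineer.
strengths = {"execution": 0.8, "collaboration": 0.5, "quality": 1.0, "human": 0.85}

score = impact_score(strengths, weights)
print(round(score, 1))  # 28.0 + 17.5 + 20.0 + 8.5 = 74.0
```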
+
+ ---
+
+ # 🛠️ System Architecture
+
+ The ecosystem consists of a lightweight two-tier telemetry pipeline:
+
+ ```text
+ [ GitHub API Engine ]
+              │
+              ▼
+ (Extracts Raw Telemetry & Text Filters)
+ ┌──────────────────────────┐
+ │ fetch_data.py │
+ └──────────────────────────┘
+              │
+              ▼
+ (Persists Metrics Matrix)
+ ┌──────────────────────────┐
+ │ posthog_impact_data.csv │
+ └──────────────────────────┘
+              │
+              ▼
+ (Dynamic Weights & Normalization Engine)
+ ┌──────────────────────────┐
+ │ app.py (Streamlit UI) │
+ └──────────────────────────┘
+ ```
+
+ ## `fetch_data.py`
+
+ The ingestion pipeline.
+
+ Responsibilities include:
+
+ * Connecting to repository APIs
+ * Parsing pull request labels
+ * Tracking merge timelines
+ * Measuring code review comment depth
+ * Detecting revert activity
+ * Persisting telemetry into:
+
+ ```text
+ posthog_impact_data.csv
+ ```
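
To make the persisted format concrete, here is a standard-library sketch that parses two rows copied from the CSV in this commit and checks for the columns `fetch_data.py` writes. The loader `load_telemetry` is a hypothetical helper; the dashboard itself reads the file with pandas.

```python
import csv
import io

EXPECTED_COLUMNS = {
    "engineer", "prs_merged", "bug_fixes", "reverts_triggered",
    "review_actions", "review_words_written", "multiplier_impact",
}

# Header and first two data rows from posthog_impact_data.csv.
sample_csv = """engineer,prs_merged,bug_fixes,reverts_triggered,review_actions,review_words_written,multiplier_impact
sampennington,49,0,0,161,7318,2
cat-ph,12,0,0,4,75,0
"""


def load_telemetry(text):
    """Parse telemetry CSV text into dicts with integer metric values."""
    reader = csv.DictReader(io.StringIO(text))
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for row in reader:
        rows.append({
            key: (value if key == "engineer" else int(value))
            for key, value in row.items()
        })
    return rows


rows = load_telemetry(sample_csv)
print(rows[0]["engineer"], rows[0]["review_words_written"])
```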
+
+ ## `app.py`
+
+ The interactive leadership dashboard built using Streamlit.
+
+ Responsibilities include:
+
+ * Reading telemetry matrices
+ * Applying normalization logic
+ * Dynamically adjusting strategy weights
+ * Re-scoring engineers in real time based on business priorities
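
The effect of re-scoring can be sketched without Streamlit. The two engineers, their pillar strengths, and both weight profiles below are invented for illustration; only the scoring rule (weights normalized to sum to 1, then a weighted sum scaled to 100) follows `app.py`.

```python
# Hypothetical pillar strengths (0..1) for two engineers.
engineers = {
    "shipper": {"execution": 0.9, "collaboration": 0.2, "quality": 0.5, "human": 0.85},
    "reviewer": {"execution": 0.3, "collaboration": 0.9, "quality": 1.0, "human": 0.85},
}


def rank(weights):
    """Return engineer names ordered by impact score under the given weights."""
    total = sum(weights.values())
    scores = {
        name: 100 * sum(pillars[p] * w / total for p, w in weights.items())
        for name, pillars in engineers.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


feature_sprint = {"execution": 0.6, "collaboration": 0.2, "quality": 0.1, "human": 0.1}
stability_freeze = {"execution": 0.2, "collaboration": 0.3, "quality": 0.4, "human": 0.1}

print(rank(feature_sprint))    # ['shipper', 'reviewer']
print(rank(stability_freeze))  # ['reviewer', 'shipper']
```

The same raw telemetry produces a different ordering once the weights shift, which is the behavior the sidebar sliders expose.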
+
+ ---
+
+ # 🚀 Quick Start & Installation
+
+ ## 1. Clone the Repository
+
+ ```bash
+ git clone https://github.com/gmrock/engineer-impact.git
+ cd engineer-impact
+ ```
+
+ ## 2. Install Dependencies
+
+ Ensure you have **Python 3.9+** installed.
+
+ Then install the required packages:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ## 3. Generate Telemetry Cache
+
+ Run the ingestion pipeline:
+
+ ```bash
+ python fetch_data.py
+ ```
+
+ This step populates:
+
+ ```text
+ posthog_impact_data.csv
+ ```
+
+ with the underlying telemetry baseline variables.
+
+ Set a `GITHUB_TOKEN` environment variable (or add it to a `.env` file) so that requests to the GitHub API are authenticated. Without it, the script falls back to unauthenticated requests, which GitHub rate-limits heavily.
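
For reference, `fetch_data.py` reads the credential from the `GITHUB_TOKEN` environment variable and sends it as a bearer token. The sketch below isolates that step; `build_headers` is a hypothetical helper name, while the header values match the script.

```python
import os


def build_headers(token):
    """Build GitHub REST API headers, adding auth only when a token is present."""
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    if token:
        # Strip stray whitespace or quotes picked up from shell exports.
        cleaned = token.strip().strip('"').strip("'")
        headers["Authorization"] = f"Bearer {cleaned}"
    return headers


headers = build_headers(os.getenv("GITHUB_TOKEN"))
print("authenticated" if "Authorization" in headers else "unauthenticated")
```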
+
+ ## 4. Launch the Dashboard
+
+ Start the Streamlit application locally:
+
+ ```bash
+ streamlit run app.py
+ ```
+
+ ---
+
+ # ⚙️ Strategic Priority Alignment in Practice
+
+ Instead of enforcing a rigid definition of engineering impact, the dashboard gives leadership dynamic control through adjustable strategy weights.
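
The sliders in `app.py` do not need to sum to 1: the app divides each weight by the total, and falls back to an equal split when every slider is at zero. A minimal sketch of that rule, with `normalize_weights` as a hypothetical helper name and an illustrative slider setting:

```python
import math


def normalize_weights(weights):
    """Rescale weights to sum to 1; use an equal split if all are zero."""
    total = sum(weights.values())
    if math.isclose(total, 0.0, abs_tol=1e-9):
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


# Illustrative slider values that add up to 1.35 rather than 1.
freeze = normalize_weights(
    {"execution": 0.50, "collaboration": 0.35, "quality": 0.40, "human": 0.10}
)
print({name: round(weight, 3) for name, weight in freeze.items()})
```

One consequence is that raising a single slider to a suggested threshold only changes its effective share if the other sliders are lowered or left small.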
+
+ ## 🚀 Feature Shipping Sprint
+
+ Increase **Execution Weight** (`0.50+`) to prioritize:
+
+ * Feature throughput
+ * Fast iteration cycles
+ * Delivery velocity
+
+ ---
+
+ ## 🛡️ System Stability Freeze
+
+ Increase **System Quality Weight** (`0.40+`) when reliability becomes the top priority.
+
+ This shifts rewards toward engineers who:
+
+ * Stabilize production systems
+ * Reduce regressions
+ * Prevent reverts
+ * Slow feature development to improve reliability
+
  ---
+
+ ## 👥 Mentorship & Onboarding Focus
+
+ Increase **Collaboration Weight** to recognize engineers investing time in:
+
+ * Detailed code reviews
+ * Technical mentoring
+ * Structural engineering guidance
+ * Onboarding support
+
  ---
 
+ # 🎯 Why This Exists
+
+ Most engineering metrics systems optimize for **activity**.
+
+ This framework optimizes for **impact**.
+
+ Rather than rewarding sheer output volume, it attempts to surface engineers who:
+
+ * Create leverage
+ * Improve system reliability
+ * Mentor teammates
+ * Make thoughtful architectural contributions
+ * Increase overall engineering effectiveness
+
app.py ADDED
@@ -0,0 +1,338 @@
+ import streamlit as st
+ import pandas as pd
+ import numpy as np
+ from datetime import datetime, timedelta
+
+ # Set page layout to wide for dashboard tracking
+ st.set_page_config(layout="wide", page_title="PostHog Engineering Impact Dashboard")
+
+ # -------------------------------------------------------------
+ # 🎯 INJECTED CSS: HIDES STREAMLIT ROW-SELECTION BUTTONS COLUMN
+ # -------------------------------------------------------------
+ st.html("""
+ <style>
+     /* Target and completely hide the data grid's row-selection column wrapper */
+     div[data-testid="stDataFrame"] [class*="gdg-row-header"],
+     div[data-testid="stDataFrame"] .glide-data-grid-row-header-container,
+     div[data-testid="stDataFrame"] th[class*="row-header"] {
+         display: none !important;
+         width: 0px !important;
+     }
+ </style>
+ """)
+
+ # Load the data generated by fetch_data.py
+ try:
+     df = pd.read_csv("posthog_impact_data.csv")
+ except FileNotFoundError:
+     st.error("❌ Data file 'posthog_impact_data.csv' not found. Please run 'python fetch_data.py' first to collect telemetry.")
+     st.stop()
+
+ # -------------------------------------------------------------
+ # DYNAMIC TIMELINE DETECTOR
+ # -------------------------------------------------------------
+ end_date = datetime.now()
+ start_date = end_date - timedelta(days=90)
+ date_string = f"🗓️ Duration: {start_date.strftime('%b %d, %Y')} – {end_date.strftime('%b %d, %Y')} (Past 90 Days)"
+
+ # -------------------------------------------------------------
+ # SIDEBAR: CORE PILLARS PHILOSOPHY & CONTROLS
+ # -------------------------------------------------------------
+ st.sidebar.title("🏛️ Impact Framework Definitions")
+
+ st.sidebar.markdown("""
+ **📦 1. Execution:**
+ Measures operational scope and the handling of complex features. Blends merged PR volume with bug-fix labels and with title or label matches for library, infrastructure, core, architecture, critical, P0, and P1 work.
+ ***
+ **💬 2. Collaboration:**
+ Quantifies engineering leverage and team citizenship. Blends *Review Actions* with a *Rubber-Stamp Filter* (>15 words) to isolate meaningful mentorship.
+ ***
+ **🛑 3. System Quality:**
+ Tracks production stability and defensive coding. Evaluates long-term stability by applying a deduction penalty for triggered *Git Reverts*.
+ ***
+ **🤝 4. Human Touch:**
+ Captures critical qualitative value provided through direct team leadership, presence during incident escalation triage, and guidance in design/planning syncs.
+ """)
+
+ st.sidebar.markdown("---")
+ st.sidebar.header("⚖️ Strategic Priority Weights")
+ st.sidebar.markdown("Adjust macro priorities based on organizational needs:")
+
+ # Default weights: 0.35, 0.35, 0.20, 0.10
+ exec_w = st.sidebar.slider("Execution Weight", 0.0, 1.0, 0.35, 0.05)
+ collab_w = st.sidebar.slider("Collaboration Weight", 0.0, 1.0, 0.35, 0.05)
+ quality_w = st.sidebar.slider("System Quality Weight", 0.0, 1.0, 0.20, 0.05)
+ human_w = st.sidebar.slider("Human Touch Weight", 0.0, 1.0, 0.10, 0.05)
+
+ # Defensive Zero-Weight Divide-by-Zero Guard
+ total_weight = exec_w + collab_w + quality_w + human_w
+
+ if np.isclose(total_weight, 0.0):
+     exec_w_norm = 0.25
+     collab_w_norm = 0.25
+     quality_w_norm = 0.25
+     human_w_norm = 0.25
+     st.sidebar.info("ℹ️ All weights set to 0. Defaulting to an equal split (25% each) to prevent math errors.")
+ else:
+     exec_w_norm = exec_w / total_weight
+     collab_w_norm = collab_w / total_weight
+     quality_w_norm = quality_w / total_weight
+     human_w_norm = human_w / total_weight
+
+ # -------------------------------------------------------------
+ # CORE METRICS ENGINE: Peer Cohort Normalization
+ # -------------------------------------------------------------
+ max_prs = df['prs_merged'].max() if df['prs_merged'].max() > 0 else 1
+ max_bugs = df['bug_fixes'].max() if df['bug_fixes'].max() > 0 else 1
+ max_mult = df['multiplier_impact'].max() if df['multiplier_impact'].max() > 0 else 1
+ max_actions = df['review_actions'].max() if df['review_actions'].max() > 0 else 1
+ max_words = df['review_words_written'].max() if df['review_words_written'].max() > 0 else 1
+ max_reverts = df['reverts_triggered'].max() if df['reverts_triggered'].max() > 0 else 1
+
+ # Synthesize normalized values (0.0 to 1.0)
+ df['norm_prs'] = df['prs_merged'] / max_prs
+ df['norm_bugs'] = df['bug_fixes'] / max_bugs
+ df['norm_mult'] = df['multiplier_impact'] / max_mult
+ df['norm_actions'] = df['review_actions'] / max_actions
+ df['norm_words'] = df['review_words_written'] / max_words
+ df['norm_reverts'] = df['reverts_triggered'] / max_reverts
+
+ # Human Touch placeholder: every engineer gets the same mock rating (0.85) until manager input is wired in
+ df['human_touch_baseline'] = 0.85
+
+ # Calculate Internal Pillar Strengths
+ df['Execution_Pillar'] = (df['norm_prs'] * 0.4) + (df['norm_bugs'] * 0.3) + (df['norm_mult'] * 0.3)
+ df['Collaboration_Pillar'] = (df['norm_actions'] * 0.5) + (df['norm_words'] * 0.5)
+ df['Quality_Pillar'] = 1.0 - df['norm_reverts']
+ df['Human_Pillar'] = df['human_touch_baseline']
+
+ # Calculate final component contribution points
+ df['Exec_Contribution'] = df['Execution_Pillar'] * exec_w_norm * 100
+ df['Collab_Contribution'] = df['Collaboration_Pillar'] * collab_w_norm * 100
+ df['Quality_Contribution'] = df['Quality_Pillar'] * quality_w_norm * 100
+ df['Human_Contribution'] = df['Human_Pillar'] * human_w_norm * 100
+
+ # Calculate Final Aggregated Impact Score
+ df['Impact_Score'] = df['Exec_Contribution'] + df['Collab_Contribution'] + df['Quality_Contribution'] + df['Human_Contribution']
+
+ # Sort dataset by absolute overall impact
+ df = df.sort_values(by="Impact_Score", ascending=False).reset_index(drop=True)
+
+ # -------------------------------------------------------------
+ # MAIN DISPLAY: LEADERBOARD MATRIX WITH DIRECT ROW SELECTION
+ # -------------------------------------------------------------
+ st.title("🏛️ PostHog Engineering Impact Leaderboard")
+ st.markdown(f"**{date_string}**")
+ st.caption("💡 Click the checkbox on an engineer's row below to instantly update their deep-dive profile.")
+
+ # Dynamic row count limiter dropdown
+ view_option = st.selectbox(
+     "Set Leaderboard Depth Range:",
+     options=["Top 5", "Top 10", "Top 20", "Top 30", "View All Teams"],
+     index=0
+ )
+
+ if view_option == "Top 5":
+     limit = 5
+ elif view_option == "Top 10":
+     limit = 10
+ elif view_option == "Top 20":
+     limit = 20
+ elif view_option == "Top 30":
+     limit = 30
+ else:
+     limit = len(df)
+
+ # Prepare clean dataframe containing active slice data
+ leaderboard_slice = df.head(limit).copy()
+
+ # Dynamically calculate the maximum points possible per column based on weights
+ max_exec_possible = exec_w_norm * 100
+ max_collab_possible = collab_w_norm * 100
+ max_quality_possible = quality_w_norm * 100
+ max_human_possible = human_w_norm * 100
+
+ # Construct display dataframe with explicit Max Point indicators in headers
+ display_df = pd.DataFrame({
+     'Engineer Username': leaderboard_slice['engineer'],
+     '🏅 Total Impact Score (out of 100)': leaderboard_slice['Impact_Score'].round(1),
+     f'📦 Execution (Max {max_exec_possible:.1f} pts)': leaderboard_slice['Exec_Contribution'].round(1),
+     f'💬 Collaboration (Max {max_collab_possible:.1f} pts)': leaderboard_slice['Collab_Contribution'].round(1),
+     f'🛑 System Quality (Max {max_quality_possible:.1f} pts)': leaderboard_slice['Quality_Contribution'].round(1),
+     f'🤝 Human Touch (Max {max_human_possible:.1f} pts)': leaderboard_slice['Human_Contribution'].round(1)
+ })
+
+ # Dynamically calculate optimal table height to eliminate empty rows
+ row_height = 35
+ header_height = 40
+ calculated_height = min(header_height + (len(display_df) * row_height), 450)
+
+ # Render interactive table with selection tracking active
+ selection = st.dataframe(
+     display_df.style.format({
+         '🏅 Total Impact Score (out of 100)': '{:.1f}',
+         f'📦 Execution (Max {max_exec_possible:.1f} pts)': '{:.1f}',
+         f'💬 Collaboration (Max {max_collab_possible:.1f} pts)': '{:.1f}',
+         f'🛑 System Quality (Max {max_quality_possible:.1f} pts)': '{:.1f}',
+         f'🤝 Human Touch (Max {max_human_possible:.1f} pts)': '{:.1f}'
+     }),
+     use_container_width=True,
+     height=calculated_height,
+     hide_index=True,
+     on_select="rerun",
+     selection_mode="single-row-required"
+ )
+
+ # -------------------------------------------------------------
+ # MASTER-DETAIL VIEW: DYNAMIC METRICS AUDITOR
+ # -------------------------------------------------------------
+ st.markdown("<br>", unsafe_allow_html=True)
+ st.markdown("---")
+
+ # Extract chosen engineer row natively without checking box arrays
+ if selection and selection.get("selection", {}).get("rows"):
+     selected_row_idx = selection["selection"]["rows"][0]
+     eng_row = leaderboard_slice.iloc[selected_row_idx]
+ else:
+     # Safely fall back to the absolute top engineer on landing
+     eng_row = df.iloc[0]
+
+ selected_eng = eng_row['engineer']
+
+ # --- ADDED: DIRECT MATH PROOF OF THE MAIN MATRIX ACCURACY ---
+ st.info(
+     f"📊 **Formula Proof for {selected_eng}:** "
+     f"📦 Execution (`{eng_row['Exec_Contribution']:.1f}`) + "
+     f"💬 Collaboration (`{eng_row['Collab_Contribution']:.1f}`) + "
+     f"🛑 Quality (`{eng_row['Quality_Contribution']:.1f}`) + "
+     f"🤝 Human Touch (`{eng_row['Human_Contribution']:.1f}`) = "
+     f"**🏅 Total Impact Score of {eng_row['Impact_Score']:.1f} / 100**"
+ )
+
+ st.subheader(f"🔍 Deep-Dive Calculation Audit Engine: {selected_eng}")
+
+ col1, col2 = st.columns([1, 2], gap="large")
+
+ with col1:
+     st.metric("Overall Performance Rating", f"{eng_row['Impact_Score']:.1f} / 100")
+     st.markdown(f"""
+     **Active Weight Allocation Matrix:**
+     * 📦 **Execution Contribution:** `{eng_row['Exec_Contribution']:.1f}` pts
+     * 💬 **Collaboration Contribution:** `{eng_row['Collab_Contribution']:.1f}` pts
+     * 🛑 **System Quality Contribution:** `{eng_row['Quality_Contribution']:.1f}` pts
+     * 🤝 **Human Touch Contribution:** `{eng_row['Human_Contribution']:.1f}` pts
+     """)
+
+ with col2:
+     st.markdown("#### **Line-Item Pillar Math Breakdowns**")
+
+     # -------------------------------------------------------------
+     # PILLAR 1: EXECUTION DEEP DIVE
+     # -------------------------------------------------------------
+     with st.expander(f"📦 Execution Pillar Breakdown: {eng_row['Exec_Contribution']:.1f} pts", expanded=False):
+         st.markdown("**1. Cohort Normalization (Raw vs Peak Team Ceiling):**")
+         st.markdown(f"- Merged PRs: `{int(eng_row['prs_merged'])}` / `{int(max_prs)}` Max = **{eng_row['norm_prs']:.3f}** ratio")
+         st.markdown(f"- Bug Fixes: `{int(eng_row['bug_fixes'])}` / `{int(max_bugs)}` Max = **{eng_row['norm_bugs']:.3f}** ratio")
+         st.markdown(f"- **Impact Multipliers:** `{int(eng_row['multiplier_impact'])}` / `{int(max_mult)}` Max = **{eng_row['norm_mult']:.3f}** ratio")
+         st.markdown("""
+         > 💡 **What is an Impact Multiplier?** \n
+         > This tracks high-leverage architectural contributions. It scans pull request titles and labels for engineering foundations that multiply the velocity of other teams:
+         > * 🛠️ **Infrastructure & Shared Libraries** (`lib`, `infra`)
+         > * ⚡ **Core System & Architecture Work** (`core`, `architecture`)
+         > * 🔒 **High-Criticality Work** (`critical` in the title, `P0` / `P1` labels)
+         """)
+
+         st.markdown("**2. Composite Subsystem Weight Assembly Formula:**")
+         st.code(f"""
+         Execution Baseline Score = (Norm_PRs * 0.4) + (Norm_Bugs * 0.3) + (Norm_Multipliers * 0.3)
+         = ({eng_row['norm_prs']:.3f} * 0.4) + ({eng_row['norm_bugs']:.3f} * 0.3) + ({eng_row['norm_mult']:.3f} * 0.3)
+         = {eng_row['Execution_Pillar']:.3f}
+         """, language="text")
+
+         st.markdown("**3. Priority Control Scaling:**")
+         st.code(f"""
+         Final Points = Baseline Score * Strategy Weight * 100
+         = {eng_row['Execution_Pillar']:.3f} * {exec_w_norm:.2f} * 100
+         = {eng_row['Exec_Contribution']:.1f} pts
+         """, language="text")
+
+     # -------------------------------------------------------------
+     # PILLAR 2: COLLABORATION DEEP DIVE
+     # -------------------------------------------------------------
+     with st.expander(f"💬 Collaboration Pillar Breakdown: {eng_row['Collab_Contribution']:.1f} pts", expanded=False):
+         st.markdown("**1. Cohort Normalization (Raw vs Peak Team Ceiling):**")
+         st.markdown(f"- Review Actions Count: `{int(eng_row['review_actions'])}` / `{int(max_actions)}` Max = **{eng_row['norm_actions']:.3f}** ratio")
+         st.markdown(f"- Substantive Mentorship Words (>15w): `{int(eng_row['review_words_written'])}` / `{int(max_words)}` Max = **{eng_row['norm_words']:.3f}** ratio")
+
+         st.markdown("**2. Composite Subsystem Weight Assembly Formula:**")
+         st.code(f"""
+         Collaboration Baseline Score = (Norm_Actions * 0.5) + (Norm_Words * 0.5)
+         = ({eng_row['norm_actions']:.3f} * 0.5) + ({eng_row['norm_words']:.3f} * 0.5)
+         = {eng_row['Collaboration_Pillar']:.3f}
+         """, language="text")
+
+         st.markdown("**3. Priority Control Scaling:**")
+         st.code(f"""
+         Final Points = Baseline Score * Strategy Weight * 100
+         = {eng_row['Collaboration_Pillar']:.3f} * {collab_w_norm:.2f} * 100
+         = {eng_row['Collab_Contribution']:.1f} pts
+         """, language="text")
+
+     # -------------------------------------------------------------
+     # PILLAR 3: SYSTEM QUALITY DEEP DIVE
+     # -------------------------------------------------------------
+     with st.expander(f"🛑 System Quality Pillar Breakdown: {eng_row['Quality_Contribution']:.1f} pts", expanded=False):
+         st.markdown("**1. Cohort Normalization (Raw vs Peak Team Ceiling):**")
+         st.markdown(f"- Git Reverts Triggered: `{int(eng_row['reverts_triggered'])}` / `{int(max_reverts)}` Max = **{eng_row['norm_reverts']:.3f}** ratio")
+
+         st.markdown("**2. Composite Subsystem Weight Assembly Formula:**")
+         st.code(f"""
+         Quality Baseline Score = 1.0 - Norm_Reverts
+         = 1.0 - {eng_row['norm_reverts']:.3f}
+         = {eng_row['Quality_Pillar']:.3f}
+         """, language="text")
+
+         st.markdown("**3. Priority Control Scaling:**")
+         st.code(f"""
+         Final Points = Baseline Score * Strategy Weight * 100
+         = {eng_row['Quality_Pillar']:.3f} * {quality_w_norm:.2f} * 100
+         = {eng_row['Quality_Contribution']:.1f} pts
+         """, language="text")
+
+     # -------------------------------------------------------------
+     # PILLAR 4: HUMAN TOUCH DEEP DIVE
+     # -------------------------------------------------------------
+     with st.expander(f"🤝 Human Touch Pillar Breakdown: {eng_row['Human_Contribution']:.1f} pts", expanded=False):
+         st.markdown("**1. Qualitative Evaluation Criteria Score (Manager Inputs Matrix):**")
+         st.markdown(f"- Current Assigned Sync/Escalation Presence Rating = **{eng_row['human_touch_baseline']:.2f}** / 1.0")
+         st.markdown("""
+         > 💡 **What factors feed the Human Touch Rating?** \n
+         > This value tracks critical behaviors that telemetry cannot isolate from GitHub APIs alone:
+         > * 🧠 **Planning & Brainstorming** (Active, clarifying architectural contributions during syncs)
+         > * 🚨 **Incident Escalation Response** (Availability and speed in responding to critical production issues)
+         """)
+
+         st.markdown("**2. Composite Assembly Score Formula:**")
+         st.code(f"""
+         Human Touch Baseline Score = Manager Evaluation Score
+         = {eng_row['human_touch_baseline']:.2f}
+         """, language="text")
+
+         st.markdown("**3. Priority Control Scaling:**")
+         st.code(f"""
+         Final Points = Baseline Score * Strategy Weight * 100
+         = {eng_row['Human_Pillar']:.2f} * {human_w_norm:.2f} * 100
+         = {eng_row['Human_Contribution']:.1f} pts
+         """, language="text")
+
+ # -------------------------------------------------------------
+ # UNDER THE HOOD RAW TELEMETRY (COLLAPSED BY DEFAULT)
+ # -------------------------------------------------------------
+ st.markdown("<br>", unsafe_allow_html=True)
+ with st.expander("📊 View Underlying Raw GitHub Telemetry Metrics"):
+     st.markdown("This section details the raw activity counts gathered before weights or normalization filters were applied.")
+     st.dataframe(
+         df[['engineer', 'prs_merged', 'bug_fixes', 'multiplier_impact', 'review_actions', 'review_words_written', 'reverts_triggered']],
+         use_container_width=True,
+         hide_index=True
+     )
fetch_data.py ADDED
@@ -0,0 +1,178 @@
+ import os
+ import requests
+ import pandas as pd
+ from datetime import datetime, timedelta, timezone
+ from dotenv import load_dotenv
+
+ # Load variables from .env file
+ load_dotenv()
+
+ # Configuration
+ GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")
+ REPO = "PostHog/posthog"
+ HEADERS = {
+     "Accept": "application/vnd.github+json",
+     "X-GitHub-Api-Version": "2022-11-28"
+ }
+
+ if GITHUB_TOKEN:
+     # Clean up any accidental leading/trailing quotes or whitespace from terminal exports
+     token_clean = GITHUB_TOKEN.strip().strip('\"').strip("'")
+     HEADERS["Authorization"] = f"Bearer {token_clean}"
+ else:
+     print("⚠️ WARNING: GITHUB_TOKEN environment variable not found.")
+     print("Using unauthenticated requests. GitHub limits these to 60 requests per hour.")
+
+ engineers = {}
+
+ def get_or_init(user):
+     # Skip missing users and bot accounts so automated activity is not scored
+     if not user or user.endswith("[bot]"):
+         return None
+     if user not in engineers:
+         engineers[user] = {
+             "prs_merged": 0,
+             "bug_fixes": 0,
+             "reverts_triggered": 0,
+             "review_actions": 0,
+             "review_words_written": 0,
+             "multiplier_impact": 0
+         }
+     return engineers[user]
+
+ print("🏁 Extracting Advanced Impact Metrics matched to PostHog Topology...")
+ # GitHub timestamps are UTC, so the cutoff is built as a naive UTC datetime to match the parsed values
+ cutoff_date = datetime.now(timezone.utc).replace(tzinfo=None) - timedelta(days=90)
+
+ # -------------------------------------------------------------
+ # Phase 1: Scan PR Stream (Execution, Complexity, Reverts)
+ # -------------------------------------------------------------
+ print("\n📦 Phase 1: Fetching recent Pull Requests...")
+ pr_url = f"https://api.github.com/repos/{REPO}/pulls"
+ phase_1_success = False
+
+ for page in range(1, 11):
+     params = {
+         "state": "closed",
+         "sort": "updated",
+         "direction": "desc",
+         "per_page": 100,
+         "page": page
+     }
+     # A timeout keeps the script from hanging on a stalled connection
+     res = requests.get(pr_url, headers=HEADERS, params=params, timeout=30)
+
+     if res.status_code != 200:
+         print(f"❌ Phase 1 Error on page {page}: API returned {res.status_code} - {res.json().get('message')}")
+         break
+
+     prs = res.json()
+     if not prs:
+         break
+     phase_1_success = True
+
+     for pr in prs:
+         if not pr.get("merged_at"):
+             continue
+
+         merged_at = datetime.strptime(pr["merged_at"], "%Y-%m-%dT%H:%M:%SZ")
+         if merged_at < cutoff_date:
+             continue
+
+         # A deleted account leaves "user" as null, so read the login defensively
+         author = (pr.get("user") or {}).get("login")
+         eng = get_or_init(author)
+         if not eng:
+             continue
+
+         # Track raw baseline engineering velocity
+         eng["prs_merged"] += 1
+
+         # Extract textual fields for heuristics matching
+         title = (pr.get("title") or "").lower()
+
+         # Metric: System Quality (Avoidable Revert Tracking)
+         # Note: this credits the revert to whoever opened the revert PR
+         if "revert" in title:
+             eng["reverts_triggered"] += 1
+
+         # Extract native labels payload once for all downstream metric evaluations
+         labels = [label["name"].lower() for label in pr.get("labels", [])]
+
+         # Condition A: Structural Complexity Multiplier (Title Analysis)
+         # Note: plain substring matching, so "lib" also matches words such as "calibrate"
+         if any(x in title for x in ["lib", "core", "infra", "architecture", "critical"]):
+             eng["multiplier_impact"] += 1
+
+         # Condition B: High Severity Multiplier (Native Priority Label Analysis)
+         # Adds an extra point if the PR is explicitly flagged as a P0 or P1 incident/initiative
+         if any(p in labels for p in ["p0", "p1"]):
+             eng["multiplier_impact"] += 1
+
+         # Metric: Native Bug Tracking (any label containing "bug")
+         if any("bug" in label_name for label_name in labels):
+             eng["bug_fixes"] += 1
+
+ # -------------------------------------------------------------
+ # Phase 2: Scan Review Comments Stream (Citizenship & Depth)
+ # -------------------------------------------------------------
+ print("\n💬 Phase 2: Fetching repository-wide review comments...")
+ comments_url = f"https://api.github.com/repos/{REPO}/pulls/comments"
+ phase_2_success = False
+
+ for page in range(1, 11):
+     params = {
+         "sort": "created",
+         "direction": "desc",
+         "per_page": 100,
+         "page": page
+     }
+     # A timeout keeps the script from hanging on a stalled connection
+     res = requests.get(comments_url, headers=HEADERS, params=params, timeout=30)
+
+     if res.status_code != 200:
+         print(f"❌ Phase 2 Error on page {page}: API returned {res.status_code} - {res.json().get('message')}")
+         break
+
+     comments = res.json()
+     if not comments:
+         break
+     phase_2_success = True
+
+     for comment in comments:
+         created_at = datetime.strptime(comment["created_at"], "%Y-%m-%dT%H:%M:%SZ")
+         if created_at < cutoff_date:
+             continue
+
+         # A deleted account leaves "user" as null, so read the login defensively
+         reviewer = (comment.get("user") or {}).get("login")
+         eng = get_or_init(reviewer)
+         if not eng:
+             continue
+
+         # Track raw volume of code review interaction
+         eng["review_actions"] += 1
+
+         # Metric: Meaningful Review Depth (Filters out superficial "LGTM" comments)
+         body = comment.get("body") or ""
+         word_count = len(body.split())
+         if word_count > 15:
+             eng["review_words_written"] += word_count
154
+
155
+ # -------------------------------------------------------------
+ # Phase 3: Defensive Data Processing and Export
+ # -------------------------------------------------------------
+ print("\n📊 Phase 3: Processing and Exporting Data...")
+ if engineers and (phase_1_success or phase_2_success):
+     df = pd.DataFrame.from_dict(engineers, orient='index').reset_index().rename(columns={'index': 'engineer'})
+
+     # Defensive Schema Guard: Force-initialize expected columns to protect against downstream KeyErrors
+     expected_cols = ["prs_merged", "review_actions", "bug_fixes", "reverts_triggered", "multiplier_impact", "review_words_written"]
+     for expected_col in expected_cols:
+         if expected_col not in df.columns:
+             df[expected_col] = 0
+         df[expected_col] = df[expected_col].fillna(0)
+
+     # Prune inactive records to keep dataset compact
+     df = df[(df['prs_merged'] > 0) | (df['review_actions'] > 0)]
+
+     if not df.empty:
+         df.to_csv("posthog_impact_data.csv", index=False)
+         print("🚀 Advanced metrics pipeline successfully saved to posthog_impact_data.csv")
+     else:
+         print("⚠️ DataFrame filtered down to 0 rows. No matching active engineers found in this window.")
+ else:
+     print("❌ Critical Error: No data payload compiled. Please check the API error codes printed above.")
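The review-depth rule from Phase 2 can be pulled out into a pure function so the 15-word threshold is easy to test on its own. The following is a minimal sketch, not part of the committed script: the helper name `meaningful_word_count` and the sample payloads are hypothetical, and only the "more than 15 words" cutoff is taken from the code above.

```python
def meaningful_word_count(body, min_words=15):
    """Return the word count of a review comment, or 0 if it is too short.

    Mirrors the Phase 2 rule: only comments with more than `min_words`
    whitespace-separated words add to review_words_written.
    """
    words = (body or "").split()  # tolerate a missing or null body
    return len(words) if len(words) > min_words else 0


# Hypothetical payloads shaped like GitHub review comments.
sample_comments = [
    {"body": "LGTM"},                    # too short, contributes 0
    {"body": None},                      # null body, contributes 0
    {"body": " ".join(["word"] * 16)},   # 16 words, above the cutoff
]
total = sum(meaningful_word_count(c.get("body")) for c in sample_comments)
print(total)
```

Keeping the threshold as a parameter would make it straightforward to try other cutoffs without touching the fetch loop.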
posthog_impact_data.csv ADDED
@@ -0,0 +1,107 @@
+ engineer,prs_merged,bug_fixes,reverts_triggered,review_actions,review_words_written,multiplier_impact
+ sampennington,49,0,0,161,7318,2
+ cat-ph,12,0,0,4,75,0
+ VojtechBartos,11,0,1,13,70,0
+ georgemunyoro,3,0,0,0,0,0
+ Piccirello,9,0,0,8,974,0
+ rnegron,21,0,0,1,33,0
+ richardsolomou,5,0,0,13,442,0
+ developers-universe-1,1,0,0,0,0,0
+ skoob13,6,0,0,3,34,0
+ danielcarletti,15,0,0,9,307,0
+ andrewm4894,3,0,0,12,416,0
+ arnohillen,6,0,0,3,57,2
+ pl,6,0,0,8,274,0
+ Radu-Raicea,3,0,0,6,237,0
+ webjunkie,20,0,0,1,0,0
+ rafaeelaudibert,17,0,0,3,41,0
+ meikelmosby,6,0,0,3,162,0
+ pauldambra,51,0,0,19,1738,1
+ turnipdabeets,2,0,0,2,109,0
+ dmarchuk,4,0,1,3,44,0
+ sakce,7,0,0,6,152,0
+ fasyy612,5,0,0,0,0,0
+ vdekrijger,2,0,0,136,3485,0
+ gesh,15,0,0,4,54,0
+ jonmcwest,6,0,0,0,0,0
+ jurajmajerik,7,0,0,1,0,0
+ robbie-c,12,0,0,3,0,1
+ Gilbert09,13,0,0,8,396,0
+ leonposthog,1,0,0,0,0,0
+ Twixes,5,0,0,1,0,0
+ eleftheriatrivyzaki,2,0,0,0,0,0
+ joethreepwood,1,0,0,0,0,0
+ TueHaulund,9,0,0,0,0,1
+ darkopia,1,0,0,0,0,0
+ orian,7,0,0,0,0,0
+ charlescook-ph,1,0,0,0,0,0
+ jabahamondes,2,0,0,2,0,0
+ ksvat,4,0,0,0,0,0
+ DanielVisca,13,0,0,5,423,0
+ gantoine,6,0,1,0,0,0
+ nickbest-ph,14,0,0,4,138,0
+ haacked,6,0,0,13,1394,0
+ fercgomes,8,0,0,0,0,0
+ z0br0wn,8,0,0,7,298,0
+ matheus-vb,2,0,0,2,134,0
+ gustavohstrassburger,4,0,1,0,0,0
+ adboio,1,0,0,0,0,0
+ feliperalmeida,1,0,0,0,0,0
+ arthurdedeus,9,0,0,5,70,0
+ a-lider,12,0,1,11,314,0
+ eli-r-ph,11,0,0,5,172,0
+ kyleswank,1,0,0,0,0,0
+ jordanm-posthog,7,0,0,0,0,0
+ carlos-marchal-ph,3,0,0,1,0,0
+ rorylshanks,5,0,1,0,0,0
+ yasen-posthog,2,0,0,7,359,0
+ tomasfarias,6,0,0,6,26,0
+ estefaniarabadan,5,0,0,3,39,0
+ christiaan-ph,3,0,0,0,0,0
+ patricio-posthog,2,0,0,0,0,0
+ ablaszkiewicz,6,0,0,2,193,0
+ andyzzhao,10,0,0,0,0,0
+ nicowaltz,4,0,0,4,56,0
+ andehen,2,0,0,0,0,0
+ thmsobrmlr,11,0,0,0,0,0
+ abhischekt,4,0,1,5,22,0
+ shauryapednekar,1,0,0,0,0,0
+ oliverb123,1,0,0,0,0,0
+ andrewjmcgehee,2,0,0,0,0,0
+ lricoy,23,0,0,2,43,0
+ rodrigoi,7,0,0,0,0,0
+ MattBro,6,0,0,9,425,0
+ ryans-posthog,1,0,0,0,0,0
+ afsuyadi,1,0,0,0,0,0
+ clr182,4,0,0,0,0,0
+ slshults,2,0,0,0,0,0
+ nakshatra-nahar,1,0,0,0,0,0
+ mayteio,1,0,0,5,123,0
+ marandaneto,1,0,0,0,0,0
+ k11kirky,1,0,0,0,0,0
+ jose-sequeira,3,0,0,0,0,0
+ willwearing,1,0,0,0,0,0
+ sortafreel,4,0,0,0,0,0
+ MattPua,8,0,0,2,18,0
+ joshsny,18,0,0,3,121,0
+ ioannisj,1,0,0,0,0,0
+ pawel-cebula,1,0,0,5,243,0
+ mp-hog,5,0,0,2,245,0
+ MarconLP,1,0,0,0,0,0
+ ReeceJones,8,0,0,11,166,0
+ lucasheriques,5,0,0,2,116,0
+ okxint,2,0,0,0,0,0
+ adamleithp,5,0,0,0,0,0
+ dmarticus,6,0,0,0,0,0
+ erezrokah,1,0,0,0,0,0
+ benjackwhite,4,0,0,0,0,0
+ hpouillot,4,0,0,1,0,0
+ bigjohnn1,1,0,0,0,0,0
+ xljones,3,0,0,0,0,0
+ tatoalo,5,0,0,0,0,0
+ luke-belton,1,0,0,0,0,0
+ frankh,4,0,0,0,0,0
+ langesven,1,0,0,0,0,0
+ Copilot,0,0,0,41,2364,0
+ brandonleung,0,0,0,3,212,0
+ cvolzer3,0,0,0,5,95,0
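The exported columns can be combined into a rough "long-form words per review action" figure. The sketch below is an illustration only, assuming three rows copied from the file above and a hypothetical helper name `words_per_review`; it uses the standard-library `csv` module so it runs without pandas.

```python
import csv
import io

# Three rows copied from posthog_impact_data.csv, inlined so the sketch is self-contained.
SAMPLE = """engineer,prs_merged,bug_fixes,reverts_triggered,review_actions,review_words_written,multiplier_impact
sampennington,49,0,0,161,7318,2
vdekrijger,2,0,0,136,3485,0
georgemunyoro,3,0,0,0,0,0
"""


def words_per_review(row):
    """Long-form review words divided by review actions (0.0 when there are none)."""
    actions = int(row["review_actions"])
    if actions == 0:
        return 0.0  # avoid division by zero for engineers with no review activity
    return int(row["review_words_written"]) / actions


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    print(row["engineer"], round(words_per_review(row), 1))
```

Because `review_words_written` only accumulates comments longer than 15 words while `review_actions` counts every comment, this ratio is not an average comment length; it is closer to a density of substantive review text per interaction.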
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ streamlit
+ pandas
+ plotly
+ python-dotenv