RishiXD committed on
Commit
b23ff00
·
verified ·
1 Parent(s): 493f2cf

Upload 67 files

Browse files
This view is limited to 50 files because it contains too many changes.   See raw diff
Files changed (50) hide show
  1. backend/Dockerfile +28 -0
  2. backend/README.md +64 -0
  3. backend/backend_app/__init__.py +0 -0
  4. backend/backend_app/__pycache__/__init__.cpython-312.pyc +0 -0
  5. backend/backend_app/__pycache__/main.cpython-311.pyc +0 -0
  6. backend/backend_app/__pycache__/main.cpython-312.pyc +0 -0
  7. backend/backend_app/api/__pycache__/planning_routes.cpython-311.pyc +0 -0
  8. backend/backend_app/api/__pycache__/planning_routes.cpython-312.pyc +0 -0
  9. backend/backend_app/api/__pycache__/proxy_routes.cpython-312.pyc +0 -0
  10. backend/backend_app/api/__pycache__/risk_analysis.cpython-311.pyc +0 -0
  11. backend/backend_app/api/__pycache__/risk_analysis.cpython-312.pyc +0 -0
  12. backend/backend_app/api/__pycache__/routes.cpython-311.pyc +0 -0
  13. backend/backend_app/api/__pycache__/routes.cpython-312.pyc +0 -0
  14. backend/backend_app/api/__pycache__/strategic_routes.cpython-311.pyc +0 -0
  15. backend/backend_app/api/__pycache__/strategic_routes.cpython-312.pyc +0 -0
  16. backend/backend_app/api/planning_routes.py +48 -0
  17. backend/backend_app/api/risk_analysis.py +118 -0
  18. backend/backend_app/api/routes.py +147 -0
  19. backend/backend_app/api/strategic_routes.py +32 -0
  20. backend/backend_app/core/__pycache__/config.cpython-311.pyc +0 -0
  21. backend/backend_app/core/__pycache__/config.cpython-312.pyc +0 -0
  22. backend/backend_app/core/__pycache__/explain.cpython-311.pyc +0 -0
  23. backend/backend_app/core/__pycache__/explain.cpython-312.pyc +0 -0
  24. backend/backend_app/core/__pycache__/github_client.cpython-311.pyc +0 -0
  25. backend/backend_app/core/__pycache__/metrics.cpython-311.pyc +0 -0
  26. backend/backend_app/core/__pycache__/metrics.cpython-312.pyc +0 -0
  27. backend/backend_app/core/__pycache__/models.cpython-311.pyc +0 -0
  28. backend/backend_app/core/__pycache__/models.cpython-312.pyc +0 -0
  29. backend/backend_app/core/__pycache__/planning_engine.cpython-311.pyc +0 -0
  30. backend/backend_app/core/__pycache__/planning_engine.cpython-312.pyc +0 -0
  31. backend/backend_app/core/__pycache__/planning_loader.cpython-311.pyc +0 -0
  32. backend/backend_app/core/__pycache__/planning_loader.cpython-312.pyc +0 -0
  33. backend/backend_app/core/__pycache__/planning_models.cpython-311.pyc +0 -0
  34. backend/backend_app/core/__pycache__/planning_models.cpython-312.pyc +0 -0
  35. backend/backend_app/core/__pycache__/signals.cpython-311.pyc +0 -0
  36. backend/backend_app/core/__pycache__/signals.cpython-312.pyc +0 -0
  37. backend/backend_app/core/__pycache__/strategic_controller.cpython-311.pyc +0 -0
  38. backend/backend_app/core/__pycache__/strategic_controller.cpython-312.pyc +0 -0
  39. backend/backend_app/core/config.py +13 -0
  40. backend/backend_app/core/explain.py +40 -0
  41. backend/backend_app/core/github_client.py +148 -0
  42. backend/backend_app/core/metrics.py +140 -0
  43. backend/backend_app/core/models.py +67 -0
  44. backend/backend_app/core/planning_engine.py +312 -0
  45. backend/backend_app/core/planning_loader.py +19 -0
  46. backend/backend_app/core/planning_models.py +66 -0
  47. backend/backend_app/core/signals.py +146 -0
  48. backend/backend_app/core/strategic_controller.py +256 -0
  49. backend/backend_app/integrations/__pycache__/repo_api.cpython-311.pyc +0 -0
  50. backend/backend_app/integrations/__pycache__/repo_api.cpython-312.pyc +0 -0
backend/Dockerfile ADDED
@@ -0,0 +1,28 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
# Use an official lightweight Python image.
# https://hub.docker.com/_/python
FROM python:3.9-slim

# Don't write .pyc files; don't buffer stdout/stderr so logs appear immediately.
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

# Set work directory
WORKDIR /code

# Install dependencies first so this layer stays cached unless requirements change.
COPY requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

# Copy project including `backend_app` directory
COPY . /code/

# Expose the API port (Hugging Face Spaces defaults to 7860)
EXPOSE 7860

# Add both /code and /code/backend to PYTHONPATH to ensure backend_app can be imported
# regardless of whether the build context was root or the backend folder.
ENV PYTHONPATH="/code:/code/backend:$PYTHONPATH"

# Command to run the application using uvicorn.
# NOTE(review): the previous CMD ran `ls -R /code` before startup as a
# ModuleNotFoundError debugging aid; removed to keep startup fast and logs clean.
CMD ["uvicorn", "backend_app.main:app", "--host", "0.0.0.0", "--port", "7860"]
backend/README.md ADDED
@@ -0,0 +1,64 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Tribal Knowledge Risk Index & Auto-Correct Planning Engine
2
+
3
+ A FastAPI service to analyze knowledge concentration (bus factor) and auto-correct sprint plans based on reality gaps.
4
+
5
+ ## Setup
6
+
7
+ 1. **Install dependencies**:
8
+ ```bash
9
+ pip install -r requirements.txt
10
+ ```
11
+
12
+ 2. **Ensure Data is present**:
13
+ Place JSON files in `data/`.
14
+ - GitHub Dummy Data: `prs.json`, `reviews.json`, `commits.json`, `modules.json`
15
+ - Jira Dummy Data: `jira_sprints.json`, `jira_issues.json`, `jira_issue_events.json`
16
+
17
+ ## Running the Service
18
+
19
+ Start the server:
20
+ ```bash
21
+ python backend_app/main.py
22
+ ```
23
+ Or:
24
+ ```bash
25
+ uvicorn backend_app.main:app --reload
26
+ ```
27
+ API: `http://127.0.0.1:8000`
28
+
29
+ ## API Endpoints
30
+
31
+ ### 1. Source System Loading (Run First)
32
+ - `POST /load_data`: Load GitHub data.
33
+ - `POST /planning/load_jira_dummy`: Load Jira data.
34
+
35
+ ### 2. Computation
36
+ - `POST /compute`: Compute Tribal Knowledge Risks.
37
+ - `POST /planning/compute_autocorrect`: Compute Reality Gaps & Plan Corrections.
38
+
39
+ ### 3. Features
40
+
41
+ **Tribal Knowledge**:
42
+ - `GET /modules`: List modules by risk.
43
+ - `GET /modules/{id}`: Detailed knowledge metrics.
44
+
45
+ **Auto-Correct Planning**:
46
+ - `GET /planning/sprints`: List sprints with reality gaps and predictions.
47
+ - `GET /planning/sprints/{id}`: Detailed sprint metrics.
48
+ - `GET /planning/autocorrect/rules`: Learned historical correction rules.
49
+
50
+ ## Example Flow
51
+
52
+ ```bash
53
+ # 1. Load All Data
54
+ curl -X POST http://127.0.0.1:8000/load_data
55
+ curl -X POST http://127.0.0.1:8000/planning/load_jira_dummy
56
+
57
+ # 2. Compute Insights
58
+ curl -X POST http://127.0.0.1:8000/compute
59
+ curl -X POST http://127.0.0.1:8000/planning/compute_autocorrect
60
+
61
+ # 3. Check "Auto-Correct" Insights
62
+ # See the reality gap for the current sprint
63
+ curl http://127.0.0.1:8000/planning/sprints
64
+ ```
backend/backend_app/__init__.py ADDED
File without changes
backend/backend_app/__pycache__/__init__.cpython-312.pyc ADDED
Binary file (152 Bytes). View file
 
backend/backend_app/__pycache__/main.cpython-311.pyc ADDED
Binary file (1.72 kB). View file
 
backend/backend_app/__pycache__/main.cpython-312.pyc ADDED
Binary file (1.78 kB). View file
 
backend/backend_app/api/__pycache__/planning_routes.cpython-311.pyc ADDED
Binary file (3.58 kB). View file
 
backend/backend_app/api/__pycache__/planning_routes.cpython-312.pyc ADDED
Binary file (3.15 kB). View file
 
backend/backend_app/api/__pycache__/proxy_routes.cpython-312.pyc ADDED
Binary file (2.36 kB). View file
 
backend/backend_app/api/__pycache__/risk_analysis.cpython-311.pyc ADDED
Binary file (5.07 kB). View file
 
backend/backend_app/api/__pycache__/risk_analysis.cpython-312.pyc ADDED
Binary file (4.84 kB). View file
 
backend/backend_app/api/__pycache__/routes.cpython-311.pyc ADDED
Binary file (8.52 kB). View file
 
backend/backend_app/api/__pycache__/routes.cpython-312.pyc ADDED
Binary file (7.67 kB). View file
 
backend/backend_app/api/__pycache__/strategic_routes.cpython-311.pyc ADDED
Binary file (1.41 kB). View file
 
backend/backend_app/api/__pycache__/strategic_routes.cpython-312.pyc ADDED
Binary file (1.72 kB). View file
 
backend/backend_app/api/planning_routes.py ADDED
@@ -0,0 +1,48 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from fastapi import APIRouter, HTTPException, Path
from typing import List, Dict
from backend_app.state.store import store
from backend_app.core.planning_models import AutoCorrectHeadline, SprintMetrics, CorrectionRule

router = APIRouter(prefix="/planning", tags=["planning"])


@router.post("/load_jira_dummy")
def load_jira_dummy():
    """Load the dummy Jira dataset into the in-memory store and report counts."""
    try:
        loaded_counts = store.load_jira_data()
    except FileNotFoundError as exc:
        raise HTTPException(status_code=404, detail=str(exc))
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
    return {"status": "loaded", "counts": loaded_counts}


@router.post("/compute_autocorrect", response_model=AutoCorrectHeadline)
def compute_autocorrect():
    """Run the auto-correct planning computation and return its headline."""
    try:
        store.compute_planning()
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
    return AutoCorrectHeadline(headline=store.planning_headline)


@router.get("/sprints", response_model=List[SprintMetrics])
def list_sprints():
    """List all sprints with their computed metrics; requires Jira data loaded."""
    if not store.jira_loaded:
        raise HTTPException(status_code=400, detail="Jira data not loaded. Call /planning/load_jira_dummy first.")
    return store.get_sprints()


@router.get("/sprints/{sprint_id}", response_model=SprintMetrics)
def get_sprint(sprint_id: str):
    """Return metrics for one sprint, or 404 if unknown / not yet computed."""
    found = store.get_sprint(sprint_id)
    if found:
        return found
    # Distinguish "nothing loaded yet" from "loaded but this id is missing".
    if not store.jira_loaded:
        raise HTTPException(status_code=400, detail="Jira data not loaded.")
    raise HTTPException(status_code=404, detail="Sprint not found or metrics not computed.")


@router.get("/autocorrect/rules", response_model=List[CorrectionRule])
def list_rules():
    """Return the learned historical correction rules."""
    return store.get_corrections()
backend/backend_app/api/risk_analysis.py ADDED
@@ -0,0 +1,118 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from fastapi import APIRouter, HTTPException
from typing import Dict, Any, List
from pydantic import BaseModel
from backend_app.state.store import store

router = APIRouter()


class RiskAnalysisRequest(BaseModel):
    org: str
    repo: str


def _collect_user_stats() -> List[Dict[str, Any]]:
    """Aggregate per-user commit / PR / merge / review counts from the store.

    Returns a list of {"user", "commits", "prs_opened", "prs_merged", "reviews"}
    dicts, sorted by commit count descending.
    """
    stats: Dict[str, Dict[str, int]] = {}

    def bucket(user: str) -> Dict[str, int]:
        # Lazily create the per-user counter row.
        return stats.setdefault(user, {"commits": 0, "prs_opened": 0, "prs_merged": 0, "reviews": 0})

    for c in store.commits:
        bucket(c.author or "unknown")["commits"] += 1

    for p in store.prs:
        row = bucket(p.author or "unknown")
        row["prs_opened"] += 1
        if p.merged_at:
            row["prs_merged"] += 1

    for r in store.reviews:
        bucket(r.reviewer or "unknown")["reviews"] += 1

    detailed = [{"user": user, **data} for user, data in stats.items()]
    detailed.sort(key=lambda x: x["commits"], reverse=True)
    return detailed


@router.post("/analyze/risk", response_model=Dict[str, Any])
def analyze_risk(req: RiskAnalysisRequest):
    """
    One-shot API to:
    1. Load Live Data
    2. Compute Metrics
    3. Return 'Bus Factor' Risk Analysis (Feature #1)
    PLUS Detailed raw stats: commits, PRs, merges per user.
    """
    try:
        # 1. Load Data
        print(f"Loading data for {req.org}/{req.repo}...")
        store.load_live_data(req.org, req.repo)

        # 2. Compute
        print("Computing metrics...")
        store.compute()

        # 3. Collect detailed per-user stats (commits, PRs, merges, reviews).
        detailed_stats = _collect_user_stats()

        modules = store.get_modules()
        if not modules:
            return {
                "headline": "No activity found.",
                "overall_repo_risk": 0,
                "user_stats": detailed_stats,
                "modules_analysis": []
            }

        # get_modules() is assumed sorted by risk desc, so [0] is the top risk.
        top_risk_module = modules[0]

        results = []
        for mod in modules:
            if not mod.people:
                continue
            top_person = mod.people[0]
            share = top_person.share_pct * 100

            # Bus Factor Check
            bus_factor = mod.bus_factor
            insight = "Healthy distribution."
            if bus_factor == 1:
                insight = f"CRITICAL: {top_person.person_id} is a single point of failure (Bus Factor 1). If they leave, {share:.1f}% of module logic is orphaned."
            elif share > 50:
                insight = f"HIGH RISK: {top_person.person_id} dominates ({share:.1f}%)."

            results.append({
                "module": mod.module_id,
                "risk_score": mod.risk_index,
                "severity": mod.severity,
                "bus_factor": bus_factor,
                "key_person": top_person.person_id,
                "knowledge_share_pct": round(share, 1),
                "insight": insight,
                "evidence": mod.evidence
            })

        headline = f"Repo Analysis: {top_risk_module.module_id} is at {top_risk_module.severity} risk."
        if top_risk_module.bus_factor == 1:
            headline += f" {top_risk_module.people[0].person_id} is a Single Point of Failure."

        return {
            "headline": headline,
            "overall_repo_risk": top_risk_module.risk_index,
            "user_stats": detailed_stats,
            "modules_analysis": results
        }

    except Exception as e:
        import traceback
        traceback.print_exc()
        raise HTTPException(status_code=500, detail=str(e))
backend/backend_app/api/routes.py ADDED
@@ -0,0 +1,147 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from fastapi import APIRouter, HTTPException, Path, Body
from typing import List, Dict, Optional
from pydantic import BaseModel
from backend_app.state.store import store
from backend_app.core.models import LoadStatus, ComputeHeadline, ModuleMetric
from backend_app.integrations.supabase_client import supabase
import requests
import uuid
import os

router = APIRouter()


class LiveDataRequest(BaseModel):
    org: str
    repo: str


@router.get("/health")
def health_check():
    """Liveness probe."""
    return {"status": "ok"}


@router.get("/test-supabase")
def test_supabase_connection():
    """Run a lightweight query to verify the Supabase connection works."""
    if not supabase:
        return {"status": "error", "message": "Supabase client is None (failed to init)"}

    try:
        # Try a lightweight query
        print("Testing Supabase connection...")
        response = supabase.table("pull_requests").select("count", count="exact").limit(1).execute()
        print(f"Supabase Test Result: {response}")
        return {
            "status": "ok",
            "data": response.data,
            "message": "Connection successful"
        }
    except Exception as e:
        print(f"Supabase Test Failed: {e}")
        return {"status": "error", "message": str(e)}


@router.post("/load_data", response_model=Dict)
def load_data(req: Optional[LiveDataRequest] = None):
    """Load GitHub data: live data when org/repo are supplied, else local dummy files."""
    try:
        if req and req.org and req.repo:
            return store.load_live_data(req.org, req.repo)
        counts = store.load_data()
        return {
            "prs": counts.get("prs", 0),
            "reviews": counts.get("reviews", 0),
            "commits": counts.get("commits", 0),
            "modules": counts.get("modules", 0),
            "source": "Dummy Data"
        }
    except Exception as e:
        # Map known failure modes (by message) to specific status codes.
        msg = str(e)
        if "Integration failed" in msg:
            raise HTTPException(status_code=502, detail=msg)
        if "missing" in msg.lower():  # File missing
            raise HTTPException(status_code=404, detail=msg)
        raise HTTPException(status_code=500, detail=f"Error loading data: {msg}")


@router.post("/compute", response_model=ComputeHeadline)
def compute():
    """Compute tribal-knowledge metrics and return a headline for the riskiest module."""
    try:
        store.compute()
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

    # Generate headline from the highest risk module
    modules = store.get_modules()
    if not modules:
        return ComputeHeadline(headline="No modules found or computed.")

    # Pick top risk (get_modules() is assumed sorted by risk desc).
    top_mod = modules[0]
    risk_level = top_mod.severity

    # Extract top person
    top_person_name = "No one"
    if top_mod.people:
        top_person_name = top_mod.people[0].person_id

    headline = f"{top_mod.module_id} module is at {risk_level} risk because {top_person_name} owns most of the knowledge signals."

    return ComputeHeadline(headline=headline)


@router.get("/modules", response_model=List[ModuleMetric], response_model_exclude={"people", "evidence", "plain_explanation"})
def list_modules():
    """
    List modules sorted by risk_index desc.
    Detailed fields (people, evidence, plain_explanation) are hidden via
    response_model_exclude to keep the list view lightweight.
    If nothing has been loaded/computed yet this simply returns an empty list.
    """
    return store.get_modules()


@router.get("/modules/{module_id}", response_model=ModuleMetric)
def get_module(module_id: str = Path(..., description="The ID of the module")):
    """Return the full metric record for one module, or 404 if unknown."""
    metric = store.get_module(module_id)
    if not metric:
        raise HTTPException(status_code=404, detail=f"Module '{module_id}' not found. Ensure signals are computed.")
    return metric


@router.get("/commits")
def get_commits_list():
    """
    Returns the list of loaded commits, newest first.
    """
    total_count = len(store.commits)
    # Sort by timestamp desc
    sorted_commits = sorted(store.commits, key=lambda c: c.timestamp, reverse=True)

    return {
        "count": total_count,
        "commits": sorted_commits
    }


@router.post("/run-workflow")
def run_workflow_endpoint(input_text: str = Body(default="hello world!", embed=True)):
    """Forward input_text to the local Langflow workflow and return its raw output."""
    # SECURITY FIX: an API key was previously hard-coded here (a leaked secret).
    # It must now be provided via the LANGFLOW_API_KEY environment variable.
    api_key = os.environ.get("LANGFLOW_API_KEY")
    if not api_key:
        raise HTTPException(status_code=500, detail="Workflow Error: LANGFLOW_API_KEY is not configured")

    url = "http://localhost:7860/api/v1/run/7e37cb01-7c44-44df-be5e-9969091a5ffe"

    payload = {
        "output_type": "chat",
        "input_type": "text",
        "input_value": input_text
    }
    payload["session_id"] = str(uuid.uuid4())
    headers = {"x-api-key": api_key}

    try:
        # timeout added so a hung workflow cannot block this worker forever
        response = requests.post(url, json=payload, headers=headers, timeout=60)
        response.raise_for_status()
        return {"output": response.text}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Workflow Error: {str(e)}")
+
backend/backend_app/api/strategic_routes.py ADDED
@@ -0,0 +1,32 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from fastapi import APIRouter, HTTPException
from backend_app.state.store import store
from backend_app.core.strategic_controller import get_strategic_audit, analyze_jira_from_db
from pydantic import BaseModel


class StrategicAuditResponse(BaseModel):
    """Response wrapper holding the generated briefing text."""

    briefing: str


router = APIRouter(prefix="/strategic", tags=["strategic"])


@router.post("/audit", response_model=StrategicAuditResponse)
def compute_strategic_audit():
    """Produce a strategic audit briefing from the currently loaded GitHub data."""
    # GitHub data is a hard prerequisite. Jira data is optional: without it the
    # audit falls back to an empty plan, so we don't reject on it here.
    if not store.loaded:
        raise HTTPException(status_code=400, detail="GitHub data not loaded. Call /load_data first.")
    return StrategicAuditResponse(briefing=get_strategic_audit())


@router.post("/jira-audit", response_model=StrategicAuditResponse)
def compute_jira_audit():
    """
    Compares the latest Jira data from DB with active GitHub data.
    """
    if not store.loaded:
        raise HTTPException(status_code=400, detail="GitHub data not loaded. Call /load_data first.")
    return StrategicAuditResponse(briefing=analyze_jira_from_db())
backend/backend_app/core/__pycache__/config.cpython-311.pyc ADDED
Binary file (778 Bytes). View file
 
backend/backend_app/core/__pycache__/config.cpython-312.pyc ADDED
Binary file (773 Bytes). View file
 
backend/backend_app/core/__pycache__/explain.cpython-311.pyc ADDED
Binary file (1.79 kB). View file
 
backend/backend_app/core/__pycache__/explain.cpython-312.pyc ADDED
Binary file (1.84 kB). View file
 
backend/backend_app/core/__pycache__/github_client.cpython-311.pyc ADDED
Binary file (8.05 kB). View file
 
backend/backend_app/core/__pycache__/metrics.cpython-311.pyc ADDED
Binary file (5.29 kB). View file
 
backend/backend_app/core/__pycache__/metrics.cpython-312.pyc ADDED
Binary file (4.83 kB). View file
 
backend/backend_app/core/__pycache__/models.cpython-311.pyc ADDED
Binary file (3.84 kB). View file
 
backend/backend_app/core/__pycache__/models.cpython-312.pyc ADDED
Binary file (3.06 kB). View file
 
backend/backend_app/core/__pycache__/planning_engine.cpython-311.pyc ADDED
Binary file (11.3 kB). View file
 
backend/backend_app/core/__pycache__/planning_engine.cpython-312.pyc ADDED
Binary file (10.5 kB). View file
 
backend/backend_app/core/__pycache__/planning_loader.cpython-311.pyc ADDED
Binary file (2.6 kB). View file
 
backend/backend_app/core/__pycache__/planning_loader.cpython-312.pyc ADDED
Binary file (1.85 kB). View file
 
backend/backend_app/core/__pycache__/planning_models.cpython-311.pyc ADDED
Binary file (3.27 kB). View file
 
backend/backend_app/core/__pycache__/planning_models.cpython-312.pyc ADDED
Binary file (2.64 kB). View file
 
backend/backend_app/core/__pycache__/signals.cpython-311.pyc ADDED
Binary file (4.6 kB). View file
 
backend/backend_app/core/__pycache__/signals.cpython-312.pyc ADDED
Binary file (4.15 kB). View file
 
backend/backend_app/core/__pycache__/strategic_controller.cpython-311.pyc ADDED
Binary file (7.1 kB). View file
 
backend/backend_app/core/__pycache__/strategic_controller.cpython-312.pyc ADDED
Binary file (11.1 kB). View file
 
backend/backend_app/core/config.py ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from pathlib import Path

# Project root: three levels up from this file
# (backend_app/core/config.py -> backend/).
BASE_DIR = Path(__file__).resolve().parents[2]
DATA_DIR = BASE_DIR / "data"

# GitHub-style dummy data files.
PRS_FILE = DATA_DIR.joinpath("prs.json")
REVIEWS_FILE = DATA_DIR.joinpath("reviews.json")
COMMITS_FILE = DATA_DIR.joinpath("commits.json")
MODULES_FILE = DATA_DIR.joinpath("modules.json")

# Jira-style dummy data files.
JIRA_SPRINTS_FILE = DATA_DIR.joinpath("jira_sprints.json")
JIRA_ISSUES_FILE = DATA_DIR.joinpath("jira_issues.json")
JIRA_EVENTS_FILE = DATA_DIR.joinpath("jira_issue_events.json")
backend/backend_app/core/explain.py ADDED
@@ -0,0 +1,40 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from backend_app.core.models import ModuleMetric


def generate_explanation(metric: 'ModuleMetric') -> str:
    """
    Build a deterministic explanation for a module's risk metric that mentions:
    - risk score
    - top1 share %
    - bus factor interpretation
    - 1-2 evidence lines
    """
    # Headline
    text = f"Risk Score: {metric.risk_index} ({metric.severity}). "

    # Top Share. With no signals there is nothing further to explain, so we
    # return early *only* in that branch (the early return previously sat where
    # it could make the bus-factor and evidence sections unreachable).
    if not metric.people:
        text += "No knowledge signals recorded. "
        return text
    top_person = metric.people[0]
    text += f"Top contributor {top_person.person_id} holds {top_person.share_pct*100:.1f}% of the knowledge. "

    # Bus Factor interpretation.
    if metric.bus_factor == 0:
        text += "Bus factor is 0 (CRITICAL: No one has >10% share? Check data). "
    elif metric.bus_factor == 1:
        text += "Bus factor is 1 (Single point of failure). "
    elif metric.bus_factor < 3:
        text += f"Bus factor is {metric.bus_factor} (Low redundancy). "
    else:
        text += f"Bus factor is {metric.bus_factor} (Good redundancy). "

    # Evidence (1-2 lines)
    if metric.evidence:
        text += "Key evidence: "
        text += "; ".join(metric.evidence[:2]) + "."

    return text
backend/backend_app/core/github_client.py ADDED
@@ -0,0 +1,148 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from datetime import datetime, timezone
from typing import List, Dict, Any, Optional
import requests
import json
import random # Fallback for story points
from backend_app.core.models import RawCommit, RawPR, RawReview
from backend_app.core.planning_models import RawIssue, RawIssueEvent, RawSprint

# Base URL for the custom GitHub App/API
BASE_URL = "https://samyak000-github-app.hf.space/insights"


class GitHubClient:
    """Thin client for the custom GitHub insights API at BASE_URL.

    All fetch_* methods are deliberately best-effort: any network or parse
    failure yields an empty list (or skips the item) so callers can proceed
    with partial data instead of crashing.
    """

    def __init__(self, org: str, repo: str):
        self.org = org
        self.repo = repo

    def _parse_ts(self, ts_str: Optional[str]) -> datetime:
        """Parse an ISO-8601 timestamp (tolerating a trailing 'Z'); fall back to now (UTC)."""
        if not ts_str:
            return datetime.now(timezone.utc)
        try:
            return datetime.fromisoformat(ts_str.replace("Z", "+00:00"))
        except (ValueError, AttributeError):  # was a bare `except:` — narrowed
            return datetime.now(timezone.utc)

    def fetch_commits(self) -> List[RawCommit]:
        """Fetch commits for org/repo; returns [] on any failure."""
        url = f"{BASE_URL}/commits"
        payload = {"org": self.org, "repo": self.repo}
        try:
            resp = requests.post(url, json=payload, timeout=10)
            if resp.status_code != 200:
                return []
            data = resp.json()
            commits = []
            for item in data.get("commits", []):
                try:
                    c = item.get("commit", {})
                    author_info = c.get("author", {})
                    ts = self._parse_ts(author_info.get("date"))
                    # Prefer the GitHub login over the git author name when present.
                    author_name = author_info.get("name", "Unknown")
                    if item.get("author") and "login" in item["author"]:
                        author_name = item["author"]["login"]

                    files = []
                    if "files" in item:
                        files = [f.get("filename") for f in item["files"] if "filename" in f]

                    commits.append(RawCommit(
                        commit_id=item.get("sha", ""),
                        author=author_name,
                        timestamp=ts,
                        files_changed=files
                    ))
                except Exception:
                    # Skip malformed items; keep the rest.
                    continue
            return commits
        except Exception:
            return []

    def fetch_prs(self) -> List[RawPR]:
        """Fetch pull requests for org/repo; returns [] on any failure."""
        url = f"{BASE_URL}/pull-requests"
        payload = {"org": self.org, "repo": self.repo}
        try:
            resp = requests.post(url, json=payload, timeout=15)
            if resp.status_code != 200:
                return []
            data = resp.json()

            # Adjust based on actual key.
            # If endpoint is /pull-requests, maybe key is "pull_requests" or "prs"?
            # Check generic keys if the specific one fails.
            raw_list = data.get("pull_requests", data.get("prs", []))

            prs = []
            for item in raw_list:
                try:
                    # Generic structure mapping
                    pid = str(item.get("number", item.get("id", "unknown")))
                    user = item.get("user", {})
                    author = user.get("login", "unknown")
                    created = self._parse_ts(item.get("created_at"))
                    merged = self._parse_ts(item.get("merged_at")) if item.get("merged_at") else None

                    # Files are usually not present in the list view; default to empty.
                    files = []  # item.get("files", []) if we're lucky

                    prs.append(RawPR(
                        pr_id=pid,
                        author=author,
                        created_at=created,
                        merged_at=merged,
                        files_changed=files
                    ))
                except Exception:  # was a bare `except:` — narrowed
                    continue
            return prs
        except Exception:
            return []

    def fetch_issues(self) -> List[RawIssue]:
        """Fetch issues and map them to Jira-style RawIssue records; returns [] on failure."""
        url = f"{BASE_URL}/pull-issues"
        payload = {"org": self.org, "repo": self.repo}
        try:
            resp = requests.post(url, json=payload, timeout=15)
            if resp.status_code != 200:
                return []
            data = resp.json()
            raw_list = data.get("issues", [])

            issues = []
            for item in raw_list:
                try:
                    # Skip PRs if they come through this endpoint
                    if "pull_request" in item and item["pull_request"]:
                        continue

                    iid = f"GH-{item.get('number')}"
                    title = item.get("title", "")

                    # Map to Planning Model (Jira-style).
                    # Some fields are fabricated so the Planning Engine can run.
                    assignees = item.get("assignees", [])
                    assignee = assignees[0].get("login") if assignees else "unassigned"

                    # Module is derived from a "module:<name>" label convention.
                    labels = [l.get("name") for l in item.get("labels", [])]
                    module_id = "general"
                    for l in labels:
                        if "module:" in l:  # Convention?
                            module_id = l.replace("module:", "")
                            break

                    # Sprint is derived from the milestone title when present.
                    sprint_id = "SPR-LIVE"  # Default bucket
                    if item.get("milestone"):
                        sprint_id = f"SPR-{item['milestone'].get('title')}"

                    issues.append(RawIssue(
                        issue_id=iid,
                        sprint_id=sprint_id,
                        title=title,
                        issue_type="Story",  # Default
                        story_points=1,  # Default
                        assignee=assignee,
                        module_id=module_id,
                        created_at=self._parse_ts(item.get("created_at"))
                    ))
                except Exception:  # was a bare `except:` — narrowed
                    continue
            return issues
        except Exception:
            return []

    def fetch_activity(self) -> List[RawIssueEvent]:
        # Maps activity timeline to issue events (transitions).
        # Placeholder: reliably mapping a generic activity stream to "status
        # changes" needs more information about the upstream API.
        return []
backend/backend_app/core/metrics.py ADDED
@@ -0,0 +1,140 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import List, Dict
2
+ import math
3
+ from backend_app.core.models import Signal, ModuleMetric, PersonMetric
4
+ from backend_app.core.explain import generate_explanation
5
+
6
def compute_metrics(module_id: str, signals: List[Signal], max_total_score_global: float) -> ModuleMetric:
    """Aggregate per-person knowledge signals for one module into a ModuleMetric.

    Args:
        module_id: Identifier of the module being scored.
        signals: All knowledge signals (commits, reviews, ...) attributed to
            this module.
        max_total_score_global: Highest total signal weight across all modules;
            used to normalize this module's criticality to 0..1.

    Returns:
        ModuleMetric with risk index (0-100), severity bucket, concentration
        metrics (top-1/top-2 share, bus factor), per-person breakdown,
        human-readable evidence lines and a plain-language explanation.
    """
    # No signals means no data: short-circuit before doing any aggregation
    # (previously this check ran only after the per-person work).
    if not signals:
        return ModuleMetric(
            module_id=module_id,
            risk_index=0.0,
            severity="HEALTHY",
            top1_share_pct=0.0,
            top2_share_pct=0.0,
            bus_factor=0,
            total_knowledge_weight=0.0,
            signals_count=0,
            people=[],
            evidence=[],
            plain_explanation="No activity detected."
        )

    # 1. Aggregate weighted score and per-signal-type counts per person.
    person_scores: Dict[str, float] = {}
    person_signal_counts: Dict[str, Dict[str, int]] = {}  # person -> type -> count
    total_score = 0.0
    for s in signals:
        total_score += s.weight
        person_scores[s.person_id] = person_scores.get(s.person_id, 0.0) + s.weight
        counts = person_signal_counts.setdefault(s.person_id, {})
        counts[s.signal_type] = counts.get(s.signal_type, 0) + 1

    # 2. Person metrics, ordered by descending knowledge score.
    people_metrics: List[PersonMetric] = []
    for person_id, score in sorted(person_scores.items(), key=lambda x: x[1], reverse=True):
        share = score / total_score if total_score > 0 else 0.0
        people_metrics.append(PersonMetric(
            person_id=person_id,
            knowledge_score=score,
            share_pct=share,  # 0-1 fraction; formatted as a percentage only for display
            type_counts=person_signal_counts.get(person_id, {})
        ))

    # 3. Module-level concentration metrics.
    top1_share = people_metrics[0].share_pct if len(people_metrics) > 0 else 0.0
    top2_share = people_metrics[1].share_pct if len(people_metrics) > 1 else 0.0
    # Bus factor: number of people holding at least 10% of module knowledge.
    bus_factor = sum(1 for p in people_metrics if p.share_pct >= 0.10)

    # Risk Index Formula (weights: 0.6 silo, 0.25 bus, 0.15 criticality):
    #   silo        = max((top1_share - 0.4)/0.6, 0)      -> 0 below 40% share, 1 at 100%
    #   bus         = max((2 - bus_factor)/2, 0)          -> 1 with nobody, 0 with 2+ holders
    #   criticality = total_score / max_total_score_global (relative module weight)
    silo_factor = max((top1_share - 0.4) / 0.6, 0.0)
    bus_risk_factor = max((2 - bus_factor) / 2.0, 0.0)
    criticality_factor = max(
        total_score / max_total_score_global if max_total_score_global > 0 else 0.0,
        0.0,
    )

    risk_index_raw = 100.0 * (0.6 * silo_factor + 0.25 * bus_risk_factor + 0.15 * criticality_factor)

    # NOTE: small-sample dampening (scaling by len(signals)/10) was removed
    # deliberately because it suppressed real risk on small repos.
    risk_index = round(min(risk_index_raw, 100.0), 2)

    # Severity buckets (thresholds deliberately low to surface more risk).
    if risk_index >= 60:
        severity = "SEVERE"
    elif risk_index >= 30:
        severity = "MODERATE"
    else:
        severity = "HEALTHY"

    # 4. Evidence: one readable line per top contributor, e.g.
    #    "dev_a: share 84.0% | commits=2, approvals=2"
    # Commit count is always shown; review counts only when non-zero.
    evidence_lines = []
    for p in people_metrics[:5]:
        counts = p.type_counts
        parts = [f"commits={counts.get('commit', 0)}"]
        approvals = counts.get('review_approval', 0)
        if approvals > 0:
            parts.append(f"approvals={approvals}")
        comments = counts.get('review_comment', 0)
        if comments > 0:
            parts.append(f"comments={comments}")
        changes = counts.get('review_changes_requested', 0)
        if changes > 0:
            parts.append(f"changes_requested={changes}")
        evidence_lines.append(f"{p.person_id}: share {p.share_pct*100:.1f}% | {', '.join(parts)}")

    mod_metric = ModuleMetric(
        module_id=module_id,
        risk_index=risk_index,
        severity=severity,
        top1_share_pct=top1_share,
        top2_share_pct=top2_share,
        bus_factor=bus_factor,
        total_knowledge_weight=total_score,
        signals_count=len(signals),
        people=people_metrics,
        evidence=evidence_lines,
        plain_explanation=""
    )

    # Fill in the plain-language explanation from the finished metric.
    mod_metric.plain_explanation = generate_explanation(mod_metric)

    return mod_metric
backend/backend_app/core/models.py ADDED
@@ -0,0 +1,67 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pydantic import BaseModel, Field
2
+ from typing import List, Optional, Dict
3
+ from datetime import datetime
4
+
5
+ # --- Input Models ---
6
+
7
class RawPR(BaseModel):
    """A pull request as ingested from the data source, before signal processing."""
    pr_id: str
    author: str
    created_at: datetime
    merged_at: Optional[datetime] = None  # None while the PR is still open/unmerged
    files_changed: List[str]  # repo-relative paths touched by the PR
13
+
14
class RawReview(BaseModel):
    """A single review action on a pull request."""
    pr_id: str
    reviewer: str
    state: str  # APPROVED, CHANGES_REQUESTED, COMMENTED
    timestamp: datetime
19
+
20
class RawCommit(BaseModel):
    """A single commit as ingested from the data source."""
    commit_id: str
    author: str
    timestamp: datetime
    message: Optional[str] = ""
    files_changed: List[str]  # repo-relative paths touched by the commit
26
+
27
# Dictionary mapping module_id -> list of path prefixes owned by that module.
# An empty-string prefix matches every path (used for a catch-all "root" module).
ModulesConfig = Dict[str, List[str]]
29
+
30
+
31
+ # --- Output / Internal Models ---
32
+
33
class Signal(BaseModel):
    """A weighted knowledge signal: one person interacting with one module once."""
    person_id: str
    module_id: str
    signal_type: str  # e.g. "commit", "pr_created", "review_approval", ...
    weight: float  # contribution to the person's knowledge score
    timestamp: datetime
    source_id: str  # pr_id or commit_id the signal was derived from
40
+
41
class PersonMetric(BaseModel):
    """Per-person aggregate of knowledge signals within one module."""
    person_id: str
    knowledge_score: float  # sum of signal weights for this person in the module
    share_pct: float  # fraction (0-1) of the module's total knowledge weight
    type_counts: Dict[str, int] = Field(default_factory=dict)  # signal_type -> count
46
+
47
class ModuleMetric(BaseModel):
    """Knowledge-concentration risk assessment for a single module."""
    module_id: str
    risk_index: float  # 0-100 composite risk score
    severity: str  # SEVERE, MODERATE, HEALTHY
    top1_share_pct: float  # knowledge share (0-1) of the single biggest holder
    top2_share_pct: float  # knowledge share (0-1) of the second-biggest holder
    bus_factor: int  # number of people holding a meaningful share
    total_knowledge_weight: float  # sum of all signal weights in the module
    signals_count: int
    people: List[PersonMetric]  # ordered by descending knowledge score
    evidence: List[str]  # human-readable per-contributor summary lines
    plain_explanation: str  # generated plain-language explanation of the risk
59
+
60
class ComputeHeadline(BaseModel):
    """One-line summary of a risk computation, for dashboard display."""
    headline: str
62
+
63
class LoadStatus(BaseModel):
    """Counts of ingested records, returned after a data-load operation."""
    prs_count: int
    reviews_count: int
    commits_count: int
    modules_count: int
backend/backend_app/core/planning_engine.py ADDED
@@ -0,0 +1,312 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from datetime import datetime, timezone, timedelta
2
+ from typing import List, Dict, Optional, Tuple
3
+ import math
4
+
5
+ from backend_app.core.planning_models import (
6
+ RawSprint, RawIssue, RawIssueEvent,
7
+ SprintMetrics, CorrectionRule, AutoCorrectHeadline
8
+ )
9
+ from backend_app.core.models import Signal, RawPR, RawReview
10
+ # We need access to GitHub data (processed signals or raw)
11
+
12
# Heuristic Constants
DEFAULT_POINTS_PER_DAY_DEV = 1.0  # Fallback per-developer velocity when none can be derived
# NOTE(review): compute_autocorrect currently hard-codes 0.7/0.3 for the
# reality-gap blend instead of reading these two weights — confirm intent.
REALITY_GAP_WEIGHT_POINTS = 0.6
REALITY_GAP_WEIGHT_REVIEW = 0.4
16
+
17
def compute_autocorrect(
    sprints: List[RawSprint],
    issues: List[RawIssue],
    events: List[RawIssueEvent],
    github_prs: List[RawPR],
    github_reviews: List[RawReview],
    modules_config: Dict[str, List[str]]
) -> Tuple[List[SprintMetrics], List[CorrectionRule], str]:
    """Compare each sprint's plan against observed delivery and review reality.

    For every sprint this computes completed vs expected story points (linear
    burn), a 0-100 "reality gap" score blended from point slippage and PR
    review delays, a predicted slip based on the actual pace so far, and
    human-readable drivers/recommendations. Past completed sprints also feed
    _learn_correction_rules to derive per-(team, module, type) multipliers.

    NOTE(review): "now" is hard-coded to 2026-02-07 UTC for the demo dataset.
    NOTE(review): comparisons like `sprint.start_date <= NOW` assume all
    datetimes are timezone-aware UTC — a naive datetime here raises TypeError;
    confirm against the loaders.
    NOTE(review): `modules_config` is currently unused in this function.

    Returns:
        (per-sprint metrics, learned correction rules, headline string for the
        currently-active sprint).
    """

    # 1. Organize Data
    # Issues per sprint (issues referencing unknown sprints are dropped).
    issues_by_sprint = {s.sprint_id: [] for s in sprints}
    for i in issues:
        if i.sprint_id in issues_by_sprint:
            issues_by_sprint[i.sprint_id].append(i)

    # Events by issue (events for unknown issues are dropped).
    events_by_issue = {i.issue_id: [] for i in issues}
    for e in events:
        if e.issue_id in events_by_issue:
            events_by_issue[e.issue_id].append(e)

    # Sort each issue's events chronologically.
    for iid in events_by_issue:
        events_by_issue[iid].sort(key=lambda x: x.timestamp)

    # 2. Historical Analysis: learn correction multipliers from completed work.
    correction_rules = _learn_correction_rules(sprints, issues, events_by_issue)

    # 3. Compute metrics for every sprint (past, current and future).
    sprint_metrics_list = []

    # Simulated "current" time for the demo dataset (2026-02-07 UTC):
    # Sprint 1 (Jan 15-29) is done, Sprint 2 (Feb 1-14) is in progress.
    NOW = datetime(2026, 2, 7, 14, 0, 0, tzinfo=timezone.utc)

    headline = "No active sprint analysis."

    for sprint in sprints:
        # Classify the sprint relative to NOW.
        is_current = sprint.start_date <= NOW <= sprint.end_date
        is_past = sprint.end_date < NOW

        # Planned linear burn rate.
        total_points = sprint.planned_story_points
        days_duration = (sprint.end_date - sprint.start_date).days + 1
        # NOTE(review): points_per_day_planned is computed but never used below.
        points_per_day_planned = total_points / days_duration if days_duration > 0 else 0

        # Points completed within the sprint window (past sprints) or up to
        # NOW (current sprint).
        completed_points = 0

        sprint_issues = issues_by_sprint[sprint.sprint_id]

        # Per-module breakdown: mod_id -> {planned, completed} story points.
        mod_stats = {}

        for issue in sprint_issues:
            mid = issue.module_id
            if mid not in mod_stats: mod_stats[mid] = {"planned": 0, "completed": 0}
            mod_stats[mid]["planned"] += issue.story_points

            # An issue counts as done if its first transition to DONE falls
            # inside [sprint start, cutoff].
            cutoff = NOW if is_current else sprint.end_date

            done_time = None
            evt_list = events_by_issue.get(issue.issue_id, [])
            for evt in evt_list:
                if evt.to_status == "DONE":
                    done_time = evt.timestamp
                    break  # assume once done, stays done

            if done_time and done_time <= cutoff and done_time >= sprint.start_date:
                completed_points += issue.story_points
                mod_stats[mid]["completed"] += issue.story_points

        # --- Gap Analysis ---
        # Expected completion under a linear burn: 100% at sprint end for
        # past sprints, proportional to elapsed days for the current one.
        if is_past:
            time_progress_pct = 1.0
        else:
            days_passed = (NOW - sprint.start_date).days
            if days_passed < 0: days_passed = 0
            time_progress_pct = days_passed / days_duration

        expected_points = total_points * time_progress_pct
        points_gap = expected_points - completed_points  # may be negative if ahead

        # Review-delay signal from GitHub: PRs created inside the sprint window.
        sprint_prs = []
        for pr in github_prs:
            # NOTE(review): same aware-datetime assumption as above.
            if sprint.start_date <= pr.created_at <= sprint.end_date:
                sprint_prs.append(pr)

        # Days from PR creation to first approval; for the current sprint,
        # still-unapproved PRs waiting more than a day count at their current
        # wait time. Small dataset, so the linear scan is fine.
        review_delays = []
        for pr in sprint_prs:
            approval_ts = None
            for rev in github_reviews:
                if rev.pr_id == pr.pr_id and rev.state == "APPROVED":
                    approval_ts = rev.timestamp
                    break

            if approval_ts:
                delay = (approval_ts - pr.created_at).total_seconds() / 86400.0  # days
                review_delays.append(delay)
            elif is_current:
                current_wait = (NOW - pr.created_at).total_seconds() / 86400.0
                if current_wait > 1.0:  # only count waits longer than a day
                    review_delays.append(current_wait)

        # 0.5 days is the assumed-neutral default when there is no review data.
        avg_review_delay = sum(review_delays)/len(review_delays) if review_delays else 0.5

        # 0.6 days is treated as a healthy baseline review turnaround.
        review_gap = max(0, avg_review_delay - 0.6)

        # Reality Gap Score (0-100): 2x multiplier means being 50% behind the
        # linear burn maxes out the points component.
        pct_behind = points_gap / total_points if total_points > 0 else 0
        score_points = min(100, max(0, pct_behind * 100 * 2))

        score_review = min(100, review_gap * 20)  # 1 extra day late = 20 pts, 5 days = 100

        reality_gap_score = int(score_points * 0.7 + score_review * 0.3)

        # Prediction: extrapolate the observed pace to estimate slip.
        predicted_slip = 0
        predicted_finish = sprint.end_date

        if is_current and completed_points < total_points and time_progress_pct > 0.1:
            days_spent = (NOW - sprint.start_date).days
            if days_spent < 1: days_spent = 1
            avg_pace = completed_points / days_spent  # points per day, actual

            remaining = total_points - completed_points
            if avg_pace > 0:
                days_needed = remaining / avg_pace
                finish_date = NOW + timedelta(days=days_needed)
                slip = (finish_date - sprint.end_date).days
                if slip > 0:
                    predicted_slip = int(slip)
                    predicted_finish = finish_date
            else:
                # Zero completed points so far: treat as stalled.
                predicted_slip = 99
                predicted_finish = NOW + timedelta(days=30)

        # Explainability: modules completing less than 70% of the expected
        # pace are flagged as behind.
        top_drivers = []
        bad_modules = []
        for m, stats in mod_stats.items():
            if stats["planned"] > 0:
                p = stats["completed"] / stats["planned"]
                if p < (time_progress_pct * 0.7):  # 30% buffer
                    bad_modules.append(m)

        if bad_modules:
            top_drivers.append(f"Modules behind schedule: {', '.join(bad_modules)}")

        if review_gap > 1.0:
            top_drivers.append(f"High review delays (avg {avg_review_delay:.1f}d)")

        if points_gap > 5:
            top_drivers.append(f"Point completion gap: {points_gap} pts behind plan")

        # Recommendations (simple heuristic rules).
        actions = []
        if is_current and "payments" in bad_modules and review_gap > 1.0:
            actions.append("Payments module is bottlenecked by reviews. Assign 1 extra reviewer.")
        if predicted_slip > 2:
            actions.append(f"Predicted slip {predicted_slip} days. Reduce scope by {int(points_gap)} pts.")

        metric = SprintMetrics(
            sprint_id=sprint.sprint_id,
            name=sprint.name,
            start_date=sprint.start_date,
            end_date=sprint.end_date,
            planned_story_points=total_points,
            completed_story_points=completed_points,
            completion_pct=round(completed_points / total_points * 100, 1) if total_points else 0,
            reality_gap_score=reality_gap_score,
            points_completion_gap=round(points_gap, 1),
            predicted_slip_days=predicted_slip,
            predicted_finish_date=predicted_finish.strftime("%Y-%m-%d"),
            module_breakdown=mod_stats,
            top_drivers=top_drivers,
            recommended_actions=actions
        )
        sprint_metrics_list.append(metric)

        # The headline reflects the currently-active sprint only.
        if is_current:
            drivers_short = "; ".join(top_drivers[:1]) if top_drivers else "on track"
            headline = f"{sprint.name} is trending {predicted_slip} days late: {drivers_short}."

    return sprint_metrics_list, correction_rules, headline
244
+
245
+
246
def _learn_correction_rules(sprints: List[RawSprint], issues: List[RawIssue], events_by_issue: Dict[str, List[RawIssueEvent]]) -> List[CorrectionRule]:
    """Learn per-(team, module, issue-type) correction multipliers from history.

    For each issue that has both an IN_PROGRESS and a DONE transition we
    compare actual elapsed days against a planned duration derived from the
    sprint's average velocity:

        planned_days = story_points / (sprint planned points per day)
        ratio        = actual_days / planned_days

    Ratios are averaged per (team, module, issue_type) and clamped to
    [1.0, 2.5] (we only ever correct estimates upward).

    Args:
        sprints: All known sprints (used to derive velocities).
        issues: All issues; only fully-completed ones contribute.
        events_by_issue: Chronologically sorted status events per issue id.

    Returns:
        One CorrectionRule per observed (team, module, issue_type) group.
    """
    rules = []

    # (team, module, type) -> list of actual/planned ratios
    history: Dict[Tuple[str, str, str], List[float]] = {}

    # Pre-compute each sprint's planned velocity in points per day.
    sprint_velocities = {}  # sprint_id -> points/day
    for s in sprints:
        duration = (s.end_date - s.start_date).days + 1
        vel = s.planned_story_points / duration if duration > 0 else 1.0
        sprint_velocities[s.sprint_id] = vel

    for issue in issues:
        # Only fully completed issues carry a learnable signal. If an issue
        # bounced between states, the LAST IN_PROGRESS/DONE timestamps win.
        evts = events_by_issue.get(issue.issue_id, [])
        start_ts = None
        end_ts = None
        for e in evts:
            if e.to_status == "IN_PROGRESS": start_ts = e.timestamp
            if e.to_status == "DONE": end_ts = e.timestamp

        if start_ts and end_ts:
            actual_days = (end_ts - start_ts).total_seconds() / 86400.0
            if actual_days < 0.1: actual_days = 0.1  # floor sub-2.4h completions

            # Planned duration from the sprint's velocity. Guard against a
            # zero velocity (sprint planned with 0 points), which previously
            # caused a ZeroDivisionError here.
            vel = sprint_velocities.get(issue.sprint_id, 1.0)
            if vel <= 0:
                vel = 1.0
            planned_days = issue.story_points / vel
            if planned_days <= 0:
                # Zero-point issues have no plan to compare against (and would
                # previously crash with a ZeroDivisionError); skip them.
                continue

            ratio = actual_days / planned_days

            # The dummy dataset has a single team; keep the historical key
            # shape so real team ids can slot in later.
            key = ("team_alpha", issue.module_id, issue.issue_type)
            if key not in history: history[key] = []
            history[key].append(ratio)

    # Compile one rule per group from the averaged ratios.
    for key, ratios in history.items():
        team, mod, itype = key
        avg_ratio = sum(ratios) / len(ratios)
        # Never correct downward; cap the upward correction at 2.5x.
        multiplier = max(1.0, min(avg_ratio, 2.5))

        expl = f"Historically {mod}/{itype} tasks take {multiplier:.1f}x longer than planned."

        rules.append(CorrectionRule(
            team_id=team,
            module_id=mod,
            issue_type=itype,
            multiplier=round(multiplier, 2),
            samples_count=len(ratios),
            explanation=expl
        ))

    return rules
backend/backend_app/core/planning_loader.py ADDED
@@ -0,0 +1,19 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import json
2
+ from typing import List, Dict, Tuple
3
+ from backend_app.core.config import JIRA_SPRINTS_FILE, JIRA_ISSUES_FILE, JIRA_EVENTS_FILE
4
+ from backend_app.core.planning_models import RawSprint, RawIssue, RawIssueEvent
5
+
6
def load_jira_files() -> Tuple[List[RawSprint], List[RawIssue], List[RawIssueEvent]]:
    """Load and validate the three Jira dump files configured in core.config.

    Returns:
        (sprints, issues, events) parsed into their pydantic models.

    Raises:
        FileNotFoundError: if any of the three files is missing; the message
            names the missing path(s) instead of a generic complaint.
        pydantic.ValidationError: if a record does not match its model.
    """
    missing = [p for p in (JIRA_SPRINTS_FILE, JIRA_ISSUES_FILE, JIRA_EVENTS_FILE) if not p.exists()]
    if missing:
        names = ", ".join(str(p) for p in missing)
        raise FileNotFoundError(f"One or more Jira data files are missing: {names}")

    def _load(path, model):
        # Explicit encoding: JSON dumps are UTF-8 regardless of the platform default.
        with open(path, 'r', encoding='utf-8') as f:
            return [model(**item) for item in json.load(f)]

    sprints = _load(JIRA_SPRINTS_FILE, RawSprint)
    issues = _load(JIRA_ISSUES_FILE, RawIssue)
    events = _load(JIRA_EVENTS_FILE, RawIssueEvent)

    return sprints, issues, events
backend/backend_app/core/planning_models.py ADDED
@@ -0,0 +1,66 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pydantic import BaseModel, Field
2
+ from typing import List, Optional, Dict
3
+ from datetime import datetime
4
+
5
+ # --- Jira Data Models ---
6
+
7
class RawSprint(BaseModel):
    """A sprint as loaded from the Jira-style data dump."""
    sprint_id: str
    name: str
    start_date: datetime
    end_date: datetime
    team_id: str
    planned_story_points: int  # total points committed at sprint planning
14
+
15
class RawIssue(BaseModel):
    """A Jira-style issue assigned to a sprint."""
    issue_id: str
    sprint_id: str
    title: str
    issue_type: str  # Story|Bug|Task
    story_points: int
    assignee: str
    module_id: str  # module the work belongs to (used for breakdowns)
    created_at: datetime
24
+
25
class RawIssueEvent(BaseModel):
    """A single status transition on an issue (e.g. TODO -> IN_PROGRESS)."""
    issue_id: str
    timestamp: datetime
    from_status: str
    to_status: str
30
+
31
+ # --- Planning Output Models ---
32
+
33
class SprintMetrics(BaseModel):
    """Computed plan-vs-reality assessment for one sprint."""
    sprint_id: str
    name: str
    start_date: datetime
    end_date: datetime
    planned_story_points: int
    completed_story_points: int
    completion_pct: float  # 0-100

    # Gap Metrics
    reality_gap_score: int  # 0-100 composite of point slippage and review delays
    points_completion_gap: float  # expected-minus-completed points (negative = ahead)

    # Prediction
    predicted_slip_days: int
    predicted_finish_date: str  # "YYYY-MM-DD"; kept as a string for display simplicity

    # Breakdown by module for detailed views: mod -> {planned, completed}
    module_breakdown: Dict[str, Dict[str, float]] = Field(default_factory=dict)

    # Evidence & Recommendations (human-readable)
    top_drivers: List[str]
    recommended_actions: List[str]
56
+
57
class CorrectionRule(BaseModel):
    """A learned estimate-correction multiplier for a (team, module, type) group."""
    team_id: str
    module_id: str
    issue_type: str
    multiplier: float  # actual/planned duration ratio, clamped to [1.0, 2.5]
    samples_count: int  # number of completed issues the rule was learned from
    explanation: str  # human-readable justification
64
+
65
class AutoCorrectHeadline(BaseModel):
    """One-line summary of the auto-correct analysis, for dashboard display."""
    headline: str
backend/backend_app/core/signals.py ADDED
@@ -0,0 +1,146 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import List, Dict, Set
2
+ from datetime import datetime
3
+ from backend_app.core.models import RawPR, RawReview, RawCommit, Signal
4
+
5
# Signal weights: reviewing code counts as stronger evidence of module
# knowledge than authoring a single commit.
WEIGHT_COMMIT = 1.0
WEIGHT_REVIEW_APPROVED = 3.0
WEIGHT_REVIEW_COMMENTED = 2.0
WEIGHT_REVIEW_CHANGES_REQUESTED = 2.5
9
+
10
def get_modules_for_paths(paths: List[str], modules_config: Dict[str, List[str]]) -> Set[str]:
    """Resolve the set of module ids that own any of the given file paths.

    A path belongs to a module when it starts with one of the module's
    configured prefixes; an empty-string prefix matches every path. Paths
    matching no module fall back to "root" when a root module is configured.
    An empty path list maps to {"root"} if available, otherwise to nothing.
    """
    if not paths:
        # No changed-path info (e.g. API limitation): fall back to root if configured.
        return {"root"} if "root" in modules_config else set()

    owners: Set[str] = set()
    has_root = "root" in modules_config

    for raw_path in paths:
        path_text = str(raw_path)
        hits = {
            module
            for module, prefixes in modules_config.items()
            if any(pfx == "" or path_text.startswith(pfx) for pfx in prefixes)
        }
        if hits:
            owners |= hits
        elif has_root:
            # Unmapped path: attribute it to the catch-all root module.
            owners.add("root")

    return owners
39
+
40
def process_signals(
    prs: List[RawPR],
    reviews: List[RawReview],
    commits: List[RawCommit],
    modules_config: Dict[str, List[str]]
) -> Dict[str, List[Signal]]:
    """Convert raw GitHub events into knowledge signals grouped by module_id.

    Every configured module gets an entry (possibly empty) so downstream
    metrics can report on zero-activity modules. Three signal sources:
    commits (author), PR creation (author — kept so risk data exists even
    when no reviews were recorded), and reviews (reviewer, weighted by state).

    Args:
        prs: Raw pull requests.
        reviews: Raw review actions; reviews referencing unknown PRs are skipped.
        commits: Raw commits.
        modules_config: module_id -> path prefixes, as used by get_modules_for_paths.

    Returns:
        Mapping of module_id -> list of Signal objects.
    """
    # Seed every configured module with an empty list.
    signals_by_module: Dict[str, List[Signal]] = {mod_id: [] for mod_id in modules_config}

    def add_signal(mod_id: str, sig: Signal):
        # setdefault covers modules discovered outside the configured set.
        signals_by_module.setdefault(mod_id, []).append(sig)

    # 1. Commits -> "commit" signal for the author in each affected module.
    for commit in commits:
        for mod_id in get_modules_for_paths(commit.files_changed, modules_config):
            add_signal(mod_id, Signal(
                person_id=commit.author,
                module_id=mod_id,
                signal_type="commit",
                weight=WEIGHT_COMMIT,  # was a hard-coded 1.0 duplicating this constant
                timestamp=commit.timestamp,
                source_id=commit.commit_id
            ))

    # 2. PR authorship -> "pr_created" signal. If reviews are missing in the
    # data source, this still yields risk data for the author.
    for pr in prs:
        for mod_id in get_modules_for_paths(pr.files_changed or [], modules_config):
            add_signal(mod_id, Signal(
                person_id=pr.author,
                module_id=mod_id,
                signal_type="pr_created",
                weight=1.5,
                timestamp=pr.created_at,
                source_id=pr.pr_id
            ))

    # 3. Reviews -> weighted signals for the reviewer, keyed off the PR's files.
    pr_map = {pr.pr_id: pr for pr in prs}
    state_weights = {
        "APPROVED": (WEIGHT_REVIEW_APPROVED, "review_approval"),
        "COMMENTED": (WEIGHT_REVIEW_COMMENTED, "review_comment"),
        "CHANGES_REQUESTED": (WEIGHT_REVIEW_CHANGES_REQUESTED, "review_changes_requested"),
    }

    for review in reviews:
        pr = pr_map.get(review.pr_id)
        if pr is None:
            continue  # review for an unknown PR

        mapped = state_weights.get(review.state)
        if mapped is None:
            continue  # unknown review state

        w, s_type = mapped
        for mod_id in get_modules_for_paths(pr.files_changed, modules_config):
            add_signal(mod_id, Signal(
                person_id=review.reviewer,
                module_id=mod_id,
                signal_type=s_type,
                weight=w,
                timestamp=review.timestamp,
                source_id=review.pr_id
            ))

    return signals_by_module
backend/backend_app/core/strategic_controller.py ADDED
@@ -0,0 +1,256 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
import json
import os
from datetime import datetime, timezone

from openai import OpenAI

from backend_app.core.planning_engine import compute_autocorrect  # Re-use metrics if needed
from backend_app.integrations.supabase_client import supabase
from backend_app.state.store import store
7
+
8
# --- Configuration ---
# SECURITY: this API key was previously hardcoded and committed to the
# repository; it must be treated as leaked and rotated. Read it from the
# environment instead of source control.
FEATHERLESS_API_KEY = os.getenv("FEATHERLESS_API_KEY", "")
FEATHERLESS_BASE_URL = "https://api.featherless.ai/v1"
MODEL_NAME = "Qwen/Qwen2.5-7B-Instruct"
12
+
13
def get_strategic_audit():
    """
    Build an executive briefing reconciling the Jira plan with GitHub activity.

    Finds the sprint active at a fixed demo "now", collects its issues plus the
    GitHub commits/PRs inside the sprint window, and asks the LLM for a
    "Strategic Drift" analysis.

    Returns:
        str: The LLM-generated briefing, or an error message on failure.
    """

    def _as_utc(dt):
        # Store timestamps may be naive; treat naive values as UTC so that
        # comparisons against aware datetimes never raise TypeError.
        # (Replaces five copy-pasted inline ternaries from the original.)
        return dt.replace(tzinfo=timezone.utc) if dt.tzinfo is None else dt

    # Fixed demo clock matching the dummy-data timeline (2026-02-07), not wall
    # time — keeps sprint selection deterministic for the demo dataset.
    NOW = datetime(2026, 2, 7, 15, 0, 0, tzinfo=timezone.utc)

    # 1. Jira: find the sprint whose window contains NOW; fall back to the
    # last known sprint when none is currently active.
    current_sprint = None
    if store.sprints:
        for s in store.sprints:
            if _as_utc(s.start_date) <= NOW <= _as_utc(s.end_date):
                current_sprint = s
                break
        if not current_sprint:
            current_sprint = store.sprints[-1]

    jira_summary = {
        "sprint": current_sprint.name if current_sprint else "Unknown",
        "planned_points": current_sprint.planned_story_points if current_sprint else 0,
        "team": current_sprint.team_id if current_sprint else "Unknown",
        "active_issues": [],
    }

    # Filter issues for this sprint (plain string match on sprint_id).
    issues_list = []
    if current_sprint:
        issues_list = [i for i in store.issues if i.sprint_id == current_sprint.sprint_id]

    for i in issues_list:
        jira_summary["active_issues"].append({
            "id": i.issue_id,
            "title": i.title,
            "points": i.story_points,
            "module": i.module_id,
            "assignee": i.assignee,
            "type": i.issue_type,  # Assuming stored as issue_type
        })

    # GitHub activity window = the sprint window (or NOW when no sprint data).
    sprint_start = _as_utc(current_sprint.start_date) if current_sprint else NOW
    sprint_end = _as_utc(current_sprint.end_date) if current_sprint else NOW

    github_summary = {
        "recent_commits_count": 0,
        "recent_prs": [],
        "active_contributors": set(),
    }

    # Scan commits that fall inside the sprint window.
    for c in store.commits:
        if sprint_start <= _as_utc(c.timestamp) <= sprint_end:
            github_summary["recent_commits_count"] += 1
            github_summary["active_contributors"].add(c.author)

    # Scan PRs: relevant if created OR merged inside the sprint window.
    for p in store.prs:
        relevant = sprint_start <= _as_utc(p.created_at) <= sprint_end
        if p.merged_at and sprint_start <= _as_utc(p.merged_at) <= sprint_end:
            relevant = True

        if relevant:
            github_summary["recent_prs"].append({
                "id": p.pr_id,
                "author": p.author,
                "files": p.files_changed[:2],  # truncate to keep the prompt small
                "merged": bool(p.merged_at),
            })
            github_summary["active_contributors"].add(p.author)

    # Sets are not JSON-serializable; convert before json.dumps below.
    github_summary["active_contributors"] = list(github_summary["active_contributors"])

    # 2. Construct Prompt for LLM
    system_prompt = (
        "You are a 'Strategic Engineering Controller.' Your job is to reconcile two conflicting data sources: "
        "Jira (The Plan) and GitHub (The Technical Reality). "
        "You must identify 'Strategic Drift'—the gap between what the company thinks it's doing and what is actually happening. "
        "Output your analysis in a concise, high-impact 'Executive Briefing' format."
    )

    user_prompt = f"""
    DATA INPUTS:

    Jira Sprint Data:
    {json.dumps(jira_summary, indent=2, default=str)}

    GitHub Activity:
    {json.dumps(github_summary, indent=2, default=str)}

    TASK: Analyze these inputs and provide:

    1. The Reality Score: A percentage (0-100%) of how "on track" the project truly is compared to the Jira board.
    2. The Shadow Work Audit: Identify what percentage of time is being spent on tasks NOT in Jira (e.g., maintenance, mentoring, or technical debt) based on GitHub activity vs Jira tickets.
    3. The Tribal Knowledge Hero: Identify the developer who is providing the most "unseen" value through mentoring and code reviews (infer from PRs/commits).
    4. Financial Risk Alert: Estimate the dollar cost of current delays (assume $100/hr avg cost) and suggest one specific resource reallocation to fix it.
    5. Executive Summary: A 3-sentence briefing for the CEO.

    Format the output clearly with headers. Be direct and concise.
    """

    # 3. Call LLM (Featherless exposes an OpenAI-compatible endpoint).
    try:
        client = OpenAI(
            base_url=FEATHERLESS_BASE_URL,
            api_key=FEATHERLESS_API_KEY,
        )

        response = client.chat.completions.create(
            model=MODEL_NAME,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.7,
            max_tokens=600,
        )

        return response.choices[0].message.content

    except Exception as e:
        # Best-effort endpoint: surface the failure as text rather than raising.
        return f"Error generating strategic audit: {str(e)}"
149
+
150
def get_latest_jira_payload():
    """
    Return the most recent 'jira_payload' from the Supabase 'jira_data' table.

    Rows are ordered by 'synced_at' descending and only the newest one is
    fetched. Returns None when the client is unavailable, the table is empty,
    or the query fails.
    """
    if not supabase:
        print("Warning: Supabase client not initialized.")
        return None

    try:
        result = (
            supabase.table("jira_data")
            .select("jira_payload")
            .order("synced_at", desc=True)
            .limit(1)
            .execute()
        )
        # Empty result set -> no payload yet.
        return result.data[0]['jira_payload'] if result.data else None
    except Exception as exc:
        print(f"Error fetching Jira payload: {exc}")
        return None
167
+
168
def analyze_jira_from_db():
    """
    Compare the latest Jira payload stored in Supabase with GitHub reality.

    Fetches the newest 'jira_payload' row, summarizes the in-memory GitHub
    store (commit sample plus open/recently-merged PRs), and asks the
    Featherless-hosted LLM for an alignment analysis.

    Returns:
        str: The LLM analysis, or a human-readable error string on failure.
    """
    # 1. The Plan: newest Jira payload from the database.
    jira_payload = get_latest_jira_payload()
    if not jira_payload:
        return "No Jira data found in database."

    # 2. The Reality: summarize store data (populated via /load_data or
    # /load_live_data). The raw Jira payload is passed straight to the LLM to
    # interpret, alongside a structured GitHub summary.
    now_utc = datetime.now(timezone.utc)

    def _as_utc(dt):
        # Store timestamps may be naive; assume UTC so the aware-vs-naive
        # subtraction below cannot raise TypeError. (Bug fix: the original
        # subtracted p.merged_at from an aware datetime without normalizing.)
        return dt.replace(tzinfo=timezone.utc) if dt.tzinfo is None else dt

    def _is_active_pr(p):
        # "Active" = still open, or merged within the last 14 days.
        if not p.merged_at:
            return True
        return (now_utc - _as_utc(p.merged_at)).days < 14

    github_summary = {
        "total_commits": len(store.commits),
        # First 20 commits as stored — presumably the most recent, but the
        # ordering depends on how the store was loaded (TODO confirm).
        "recent_commits": [
            {
                "author": c.author,
                "message": c.message if hasattr(c, 'message') else "",
                "timestamp": str(c.timestamp),
            }
            for c in store.commits[:20]
        ],
        "active_prs": [
            {
                "id": str(p.pr_id),
                "author": p.author,
                "status": "merged" if p.merged_at else "open",
                "created_at": str(p.created_at),
            }
            for p in store.prs
            if _is_active_pr(p)
        ],
    }

    # 3. Construct Prompt
    system_prompt = (
        "You are an expert Engineering Analyst. Your goal is to compare the planned work (Jira) "
        "against the actual engineering activity (GitHub) to identify discrepancies, risks, and "
        "undocumented work."
    )

    user_prompt = f"""
    JIRA DATA (The Plan):
    {json.dumps(jira_payload, indent=2, default=str)}

    GITHUB DATA (The Reality):
    {json.dumps(github_summary, indent=2, default=str)}

    TASK:
    Analyze the alignment between the Jira plan and GitHub activity.
    1. Identify any work in GitHub that is not tracked in Jira (Shadow Work).
    2. Identify any Jira items that show no corresponding GitHub activity (Stalled Work).
    3. Provide a 'Reality Score' (0-100%) indicating how well the plan matches reality.
    4. Highlight top risks.

    Output in a clear, executive summary format.
    """

    # 4. Call Featherless API (OpenAI-compatible endpoint).
    try:
        client = OpenAI(
            base_url=FEATHERLESS_BASE_URL,
            api_key=FEATHERLESS_API_KEY,
        )

        response = client.chat.completions.create(
            model=MODEL_NAME,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            temperature=0.7,
            max_tokens=800,
        )

        return response.choices[0].message.content

    except Exception as e:
        # Best-effort endpoint: surface the failure as text rather than raising.
        return f"Error generating analysis: {str(e)}"
backend/backend_app/integrations/__pycache__/repo_api.cpython-311.pyc ADDED
Binary file (6.34 kB). View file
 
backend/backend_app/integrations/__pycache__/repo_api.cpython-312.pyc ADDED
Binary file (5.56 kB). View file