Commit 7616a39 by Hammad712 (0 parents): Initial commit
.gitignore ADDED
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Virtual environment
venv/
env/
.myenv/
# In case the environment is named differently (gitignore has no inline
# comments, so this note must live on its own line):
.myenv*/
.env
.env.*

# VS Code
.vscode/

# Pytest
.pytest_cache/

# Jupyter Notebook
.ipynb_checkpoints/

# FastAPI / Uvicorn logs
*.log

# Cache
.cache/
*.sqlite3

# Streamlit specific
.streamlit/config.toml
.streamlit/secrets.toml

# FAISS vector store
*.faiss*
*.pkl
*.index

# OS-specific
.DS_Store
Thumbs.db

# Docker artifacts
*.tar
*.pid
*.sock
*.db

# Python egg metadata
*.egg-info/
*.egg

# Build artifacts
build/
dist/

# Coverage reports
htmlcov/
.coverage
.tox/

# Test artifacts
tests/__pycache__/
*.cover

# IDEs
.idea/
*.iml

# Node.js modules
node_modules/

# Custom virtual environments
myenv/
Dockerfile ADDED
# Use the official Python 3.11 slim image as a base
FROM python:3.11-slim

# Prevent Python from buffering stdout/stderr (so logs appear immediately)
ENV PYTHONUNBUFFERED=1

# Install system dependencies needed by certain Python packages (e.g., FAISS, PyTorch CPU wheels)
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential \
        git \
        libgomp1 \
    && rm -rf /var/lib/apt/lists/*

# Set the working directory inside the container
WORKDIR /app

# Copy only requirements first to leverage Docker layer caching
COPY requirements.txt .

# Upgrade pip, install all Python dependencies, then install PyTorch CPU wheels
RUN pip install --upgrade pip && \
    pip install -r requirements.txt && \
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

# Copy the rest of the application code
COPY . .

# Expose port 8000 for the FastAPI app
EXPOSE 8000

# By default, run uvicorn to serve the FastAPI app
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
README.md ADDED
# MAAS API (Metrics & AI-Assisted Suggestions)

A professional FastAPI application that offers two core services:

1. **PageSpeed Performance Reports** – Using Google's PageSpeed Insights and Gemini AI for analysis and recommendations.
2. **RAG-Powered Chat System** – Retrieval-Augmented Generation (RAG) chat sessions with document ingestion, vectorstore indexing (FAISS), and persistent chat history (MongoDB).

## ✨ Features

* 🔍 PageSpeed Insights integration for web performance metrics
* 🤖 Gemini AI–powered optimization report generation
* 📚 Document ingestion and chunked embedding with FAISS
* 💬 RAG-based conversational system per user and chat session
* 📄 Clean modular FastAPI architecture
* 🛠️ Configuration via environment variables
* 🔐 Secure, with input validation and API key protection
* 📈 Built-in health check, detailed logging, and auto-generated API docs

---

## 🗂 Project Structure

```
MAAS/
├── app/
│   ├── rag/                  # RAG module for document ingestion and chat
│   │   ├── db.py
│   │   ├── embeddings.py
│   │   ├── logging_config.py
│   │   ├── routes.py         # RAG API endpoints
│   │   ├── schemas.py
│   │   └── utils.py
│   ├── config.py             # Environment & settings
│   ├── main.py               # FastAPI app instance & routers
│   ├── models.py             # Pydantic models
│   ├── run_server.py         # Server runner
│   └── services.py           # PageSpeed + Gemini logic
├── Dockerfile                # Optional containerization
├── requirements.txt          # Dependencies
└── README.md                 # You're reading it
```

---

## 🚀 Getting Started

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Create a `.env` file

```env
PAGESPEED_API_KEY=your_pagespeed_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
MONGO_URI=mongodb://localhost:27017
HOST=0.0.0.0
PORT=8000
DEBUG=True
```

### 3. Run the Application

```bash
# Option 1: Using the script
python run_server.py

# Option 2: Directly with uvicorn
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```

---

## 📘 API Overview

### 🔗 General

| Method | Endpoint  | Description                    |
| ------ | --------- | ------------------------------ |
| GET    | `/`       | Welcome + links to docs/health |
| GET    | `/health` | Health check and uptime        |

---

### 🧠 PageSpeed + Gemini Endpoints

| Method | Endpoint               | Description                       |
| ------ | ---------------------- | --------------------------------- |
| POST   | `/pagespeed`           | Fetch raw PageSpeed Insights JSON |
| POST   | `/generate-report`     | Generate AI optimization report   |
| POST   | `/generate-priorities` | Rank optimizations by priority    |

---

### 📚 RAG Chat System Endpoints

| Method | Endpoint                        | Description                                |
| ------ | ------------------------------- | ------------------------------------------ |
| POST   | `/rag/ingest/{user_id}`         | Ingest documents and store FAISS index     |
| POST   | `/rag/chat/create/{user_id}`    | Start a new chat session (returns chat ID) |
| POST   | `/rag/chat/{user_id}/{chat_id}` | Ask a question in an existing chat session |

---

## 📎 RAG Workflow

1. **Ingest Documents**

   * POST `/rag/ingest/{user_id}`
   * Body: `{"documents": ["doc 1 text", "doc 2 text", ...]}`

2. **Create Chat**

   * POST `/rag/chat/create/{user_id}`
   * Response: `chat_id`

3. **Ask Questions**

   * POST `/rag/chat/{user_id}/{chat_id}`
   * Body: `{"question": "What does the document say about X?"}`

---

## 🛠 Example Usage (Python)

```python
import requests

# Ingest docs
requests.post("http://localhost:8000/rag/ingest/user123", json={
    "documents": ["The capital of France is Paris.", "Python is a programming language."]
})

# Create chat
res = requests.post("http://localhost:8000/rag/chat/create/user123")
chat_id = res.json()["chat_id"]

# Chat
requests.post(f"http://localhost:8000/rag/chat/user123/{chat_id}", json={
    "question": "What is the capital of France?"
})
```

---

## 📄 API Docs

Once the app is running:

* Swagger UI: [http://localhost:8000/docs](http://localhost:8000/docs)
* ReDoc: [http://localhost:8000/redoc](http://localhost:8000/redoc)

---

## 🛡️ Error Handling

* `400 Bad Request`: Invalid input
* `404 Not Found`: Unknown endpoint or missing user/chat/doc
* `500 Internal Server Error`: API or service errors

---

## 🧪 Development Tips

* Use `DEBUG=True` in `.env` for auto-reload and verbose logs
* Modify the `CORS` policy in `main.py` before production
* Use `logger` calls to trace errors or logic flows

---

## 🌍 API Key Setup

### PageSpeed Insights

1. [Google Cloud Console](https://console.cloud.google.com/)
2. Enable the API, generate a key

### Gemini AI

1. [Google AI Studio](https://makersuite.google.com/)
2. Create API Key

Add both to your `.env`.

---

## 📦 Docker Support

A basic Dockerfile is included. To build and run:

```bash
docker build -t maas-api .
docker run -p 8000:8000 --env-file .env maas-api
```

---

## 🤝 Contributing

1. Follow the existing modular structure
2. Document all new endpoints clearly
3. Test edge cases (e.g., malformed docs or bad chat IDs)
4. Use logging for traceability
5. Create clear, typed Pydantic schemas

---

## 📜 License

Licensed under the MIT License.

---

## 🔗 Repository

[https://github.com/Hammadwakeel/MAAS](https://github.com/Hammadwakeel/MAAS)

---
app/__init__.py ADDED
"""
PageSpeed Insights Report Generator API

A professional FastAPI application for generating detailed PageSpeed Insights
reports using Google's APIs and Gemini AI for advanced analysis.
"""

__version__ = "1.0.0"
__author__ = "Hammad Wakeel"
__email__ = "hammadshah71200@gmail.com"
app/config.py ADDED
import os
from dotenv import load_dotenv
from pydantic_settings import BaseSettings, SettingsConfigDict

# Load environment variables from .env
load_dotenv()

class Settings(BaseSettings):
    """Application settings loaded from environment variables."""

    # ───────────────────────────────────────────────────────────────────────────
    # Google API Keys
    # ───────────────────────────────────────────────────────────────────────────
    pagespeed_api_key: str = os.getenv("PAGESPEED_API_KEY", "")
    gemini_api_key: str = os.getenv("GEMINI_API_KEY", "")

    # ───────────────────────────────────────────────────────────────────────────
    # Chat & RAG Configuration
    # ───────────────────────────────────────────────────────────────────────────
    groq_api_key: str = os.getenv("GROQ_API_KEY", "")
    vectorstore_base_path: str = os.getenv("VECTORSTORE_BASE_PATH", "./vectorstores")

    # ───────────────────────────────────────────────────────────────────────────
    # MongoDB Configuration (Local)
    # ───────────────────────────────────────────────────────────────────────────
    mongo_uri: str = os.getenv("MONGO_URI", "mongodb://localhost:27017")
    mongo_chat_db: str = os.getenv("MONGO_CHAT_DB", "Education_chatbot")
    mongo_chat_collection: str = os.getenv("MONGO_CHAT_COLLECTION", "chat_histories")

    # ───────────────────────────────────────────────────────────────────────────
    # FastAPI Server Configuration
    # ───────────────────────────────────────────────────────────────────────────
    host: str = os.getenv("HOST", "0.0.0.0")
    port: int = int(os.getenv("PORT", "8000"))
    debug: bool = os.getenv("DEBUG", "False").lower() == "true"

    # ───────────────────────────────────────────────────────────────────────────
    # App Metadata
    # ───────────────────────────────────────────────────────────────────────────
    app_name: str = "PageSpeed Insights Report Generator"
    app_version: str = "1.0.0"
    app_description: str = (
        "Professional API for generating PageSpeed Insights reports "
        "using Google's APIs and Gemini AI"
    )

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8"
    )

# Instantiate settings
settings = Settings()
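
The `debug` flag above is parsed by comparing the lowercased string to `"true"` rather than calling `bool()` on it, which avoids the common pitfall that `bool("False")` is truthy. A minimal stdlib sketch of the same parsing (`env_flag` is a hypothetical helper, not part of the app):

```python
import os

def env_flag(name: str, default: str = "False") -> bool:
    # Only the literal string "true" (case-insensitive) counts as truthy,
    # mirroring the DEBUG parsing in app/config.py.
    return os.getenv(name, default).lower() == "true"

os.environ["DEBUG"] = "True"
print(env_flag("DEBUG"))   # True
os.environ["DEBUG"] = "0"
print(env_flag("DEBUG"))   # False — "0" is not the string "true"
```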
app/main.py ADDED
"""
Main FastAPI application module.
"""
import time
import logging
from datetime import datetime
from contextlib import asynccontextmanager

from fastapi import FastAPI, Depends
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse

from app.config import settings
from app.models import (
    PageSpeedRequest,
    PageSpeedDataResponse,
    ReportRequest,
    ReportResponse,
    HealthResponse,
    PriorityRequest,
    PriorityResponse
)
from app.services import PageSpeedService
from app.rag.routes import router as rag_router

# ------------------------
# Configure root logger
# ------------------------
logger = logging.getLogger("app")
logger.setLevel(logging.INFO)

handler = logging.StreamHandler()
formatter = logging.Formatter(
    "%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"
)
handler.setFormatter(formatter)
logger.addHandler(handler)

# Global variable to track startup time
startup_time = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    """Application lifespan manager."""
    global startup_time
    startup_time = time.time()
    logger.info("🚀 Starting %s v%s", settings.app_name, settings.app_version)
    logger.info("📊 Server running on %s:%s", settings.host, settings.port)
    yield
    logger.info("📊 Shutting down %s", settings.app_name)

# Create FastAPI app instance
app = FastAPI(
    title=settings.app_name,
    description=settings.app_description,
    version=settings.app_version,
    lifespan=lifespan,
    docs_url="/docs",
    redoc_url="/redoc"
)

# Mount RAG router
app.include_router(rag_router)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, specify exact origins
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Dependency to get PageSpeed service
def get_pagespeed_service() -> PageSpeedService:
    """Dependency to get a new PageSpeedService instance."""
    return PageSpeedService()


@app.get("/", response_model=dict)
async def root():
    """Root endpoint with API information."""
    return {
        "message": f"Welcome to {settings.app_name}",
        "version": settings.app_version,
        "description": settings.app_description,
        "docs": "/docs",
        "health": "/health"
    }


@app.get("/health", response_model=HealthResponse)
async def health_check():
    """Health check endpoint."""
    if startup_time:
        uptime_str = f"{time.time() - startup_time:.2f} seconds"
    else:
        uptime_str = "Unknown"

    return HealthResponse(
        status="healthy",
        version=settings.app_version,
        uptime=uptime_str
    )


@app.post("/pagespeed", response_model=PageSpeedDataResponse)
async def fetch_pagespeed(
    request: PageSpeedRequest,
    service: PageSpeedService = Depends(get_pagespeed_service)
):
    """
    Fetch raw PageSpeed Insights data for a given URL.

    Request body:
        {"url": "https://www.example.com"}

    Returns:
        {
            "success": true,
            "url": "https://www.example.com",
            "pagespeed_data": { ... },
            "error": null
        }
    """
    url_str = str(request.url)
    logger.info("Received POST /pagespeed for URL: %s", url_str)

    try:
        pagespeed_data = service.get_pagespeed_data(url_str)
        logger.info("Returning PageSpeed data for %s", url_str)
        return PageSpeedDataResponse(
            success=True,
            url=url_str,
            pagespeed_data=pagespeed_data,
            error=None
        )
    except Exception as e:
        logger.error("Error in /pagespeed endpoint for URL %s: %s", url_str, e, exc_info=True)
        return PageSpeedDataResponse(
            success=False,
            url=url_str,
            pagespeed_data=None,
            error=str(e)
        )


@app.post("/generate-report", response_model=ReportResponse)
async def generate_report(
    body: ReportRequest,
    service: PageSpeedService = Depends(get_pagespeed_service)
):
    """
    Generate a Gemini-based optimization report from previously fetched PageSpeed JSON.

    Request body:
        {"pagespeed_data": { ...full PageSpeed JSON... }}

    Returns:
        {"success": true, "report": "Gemini-generated analysis...", "error": null}
    """
    logger.info("Received POST /generate-report")

    try:
        pagespeed_data = body.pagespeed_data
        logger.debug("PageSpeed JSON payload size: %d bytes", len(str(pagespeed_data)))

        report_text = service.generate_report_with_gemini(pagespeed_data)
        logger.info("Returning Gemini report.")
        return ReportResponse(
            success=True,
            report=report_text,
            error=None
        )
    except Exception as e:
        logger.error("Error in /generate-report endpoint: %s", e, exc_info=True)
        return ReportResponse(
            success=False,
            report=None,
            error=str(e)
        )


@app.post("/generate-priorities", response_model=PriorityResponse)
async def generate_priorities(
    request: PriorityRequest,
    service: PageSpeedService = Depends(get_pagespeed_service)
):
    """
    Generate a prioritized list of performance improvements from a Gemini report.

    Request body:
        {"report": "Full Gemini-generated performance report..."}

    Returns:
        {
            "success": true,
            "priorities": {
                "High": ["Optimize TBT by reducing JS execution", ...],
                "Medium": [...],
                "Low": [...]
            },
            "error": null
        }
    """
    logger.info("Received POST /generate-priorities")
    try:
        priorities = service.generate_priority(request.report)
        return PriorityResponse(success=True, priorities=priorities)
    except Exception as e:
        logger.error("Error in /generate-priorities: %s", e, exc_info=True)
        return PriorityResponse(success=False, priorities=None, error=str(e))


@app.exception_handler(404)
async def not_found_handler(request, exc):
    """Custom 404 handler."""
    logger.warning("404 Not Found: %s %s", request.method, request.url.path)
    return JSONResponse(
        status_code=404,
        content={
            "error": "Not Found",
            "message": "The requested endpoint was not found",
            "docs": "/docs"
        }
    )


@app.exception_handler(500)
async def internal_error_handler(request, exc):
    """Custom 500 handler."""
    logger.error("500 Internal Server Error: %s %s -> %s", request.method, request.url.path, exc, exc_info=True)
    return JSONResponse(
        status_code=500,
        content={
            "error": "Internal Server Error",
            "message": "An unexpected error occurred",
            "timestamp": datetime.now().isoformat()
        }
    )


if __name__ == "__main__":
    import uvicorn
    # When running directly, uvicorn prints its own logs; we just start it here.
    uvicorn.run(
        "app.main:app",
        host=settings.host,
        port=settings.port,
        reload=settings.debug
    )
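
The `/health` endpoint above reports uptime as a raw `"{:.2f} seconds"` string. A small stdlib sketch (a hypothetical alternative formatter, not part of the app) shows how the same `time.time() - startup_time` delta could be rendered more readably for longer uptimes:

```python
def format_uptime(seconds: float) -> str:
    # Below one minute, mirror the endpoint's "{:.2f} seconds" formatting;
    # above that, fold into minutes and hours for readability.
    if seconds < 60:
        return f"{seconds:.2f} seconds"
    minutes, secs = divmod(seconds, 60)
    if minutes < 60:
        return f"{int(minutes)}m {secs:.0f}s"
    hours, minutes = divmod(minutes, 60)
    return f"{int(hours)}h {int(minutes)}m"

print(format_uptime(12.5))   # 12.50 seconds
print(format_uptime(90))     # 1m 30s
print(format_uptime(3700))   # 1h 1m
```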
app/models.py ADDED
# app/models.py

"""
Pydantic models for request/response validation.
"""
from pydantic import BaseModel, HttpUrl, Field
from typing import Optional, Dict, Any, List

class PageSpeedRequest(BaseModel):
    """Request model for fetching PageSpeed data."""
    url: HttpUrl = Field(
        ...,
        description="The URL to analyze for PageSpeed insights",
        example="https://www.example.com"
    )

    class Config:
        json_schema_extra = {
            "example": {
                "url": "https://www.ocoya.com/"
            }
        }

class PageSpeedDataResponse(BaseModel):
    """Response model that returns only the raw PageSpeed data."""
    success: bool = Field(
        ...,
        description="Whether the PageSpeed fetch was successful"
    )
    url: str = Field(
        ...,
        description="The analyzed URL"
    )
    pagespeed_data: Optional[Dict[Any, Any]] = Field(
        None,
        description="Raw PageSpeed Insights data"
    )
    error: Optional[str] = Field(
        None,
        description="Error message if fetching failed"
    )

class ReportRequest(BaseModel):
    """
    Request model for generating a Gemini report.
    Expects the entire raw PageSpeed JSON payload in the body.
    """
    pagespeed_data: Dict[Any, Any] = Field(
        ...,
        description="Raw PageSpeed Insights data (JSON) previously fetched",
    )

    class Config:
        # Pydantic v2 name (schema_extra was the v1 spelling)
        json_schema_extra = {
            "example": {
                "pagespeed_data": {
                    # (Truncated example; in practice this would be
                    # the full runPagespeed v5 JSON structure)
                    "lighthouseResult": {
                        "audits": {
                            "first-contentful-paint": {"numericValue": 1234},
                            "largest-contentful-paint": {"numericValue": 2345}
                        }
                    },
                    "loadingExperience": {
                        "metrics": {
                            "FIRST_CONTENTFUL_PAINT_MS": {"percentile": 1200, "category": "FAST"}
                        }
                    }
                    # ...etc.
                }
            }
        }

class ReportResponse(BaseModel):
    """Response model that returns only the Gemini-generated report."""
    success: bool = Field(
        ...,
        description="Whether report generation was successful"
    )
    report: Optional[str] = Field(
        None,
        description="Gemini-generated performance optimization report"
    )
    error: Optional[str] = Field(
        None,
        description="Error message if report generation failed"
    )

class HealthResponse(BaseModel):
    """Health check response model."""
    status: str = Field(
        ...,
        description="Health status of the API"
    )
    version: str = Field(
        ...,
        description="API version"
    )
    uptime: str = Field(
        ...,
        description="API uptime"
    )

class PriorityRequest(BaseModel):
    report: str


class PriorityResponse(BaseModel):
    success: bool
    priorities: Optional[Dict[str, List[str]]] = None
    error: Optional[str] = None
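
Every response model in this module follows the same success/payload/error envelope: on success the payload field is populated and `error` is `None`, on failure the payload is `None` and `error` carries the message. A plain-dict sketch of that convention as used by `ReportResponse` (`report_envelope` is a hypothetical illustration, not app code):

```python
from typing import Optional

def report_envelope(success: bool, report: Optional[str] = None,
                    error: Optional[str] = None) -> dict:
    # Mirrors ReportResponse: exactly one of report/error is populated
    # depending on whether the operation succeeded.
    return {"success": success, "report": report, "error": error}

ok = report_envelope(True, report="Gemini-generated analysis...")
failed = report_envelope(False, error="upstream service error")
print(ok["success"], ok["error"])        # True None
print(failed["success"], failed["report"])  # False None
```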
app/rag/__init__.py ADDED
File without changes
app/rag/db.py ADDED
from pymongo import MongoClient
from app.config import settings

# ──────────────────────────────────────────────────────────────────────────────
# MongoDB Initialization
# ──────────────────────────────────────────────────────────────────────────────

# Connect to MongoDB using the URI from app/config.py
mongo_client = MongoClient(settings.mongo_uri)
mongo_db = mongo_client[settings.mongo_chat_db]

# Collection to store metadata that maps user_id → vectorstore_path
vectorstore_meta_coll = mongo_db["vectorstore_metadata"]

# Name of the collection that MongoDBChatMessageHistory will write to
chat_collection_name = settings.mongo_chat_collection
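
The metadata collection above holds at most one document per user, mapping `user_id` to the on-disk FAISS path. The actual upsert lives in `app/rag/utils.py` (not shown in this commit chunk); a dict-based stand-in sketches the upsert semantics the RAG routes rely on (names here are illustrative):

```python
def upsert_vectorstore_metadata(coll: dict, user_id: str, path: str) -> None:
    # In-memory stand-in for a Mongo upsert against vectorstore_metadata:
    # insert if the user has no document, otherwise overwrite the path.
    coll[user_id] = {"user_id": user_id, "vectorstore_path": path}

meta: dict = {}
upsert_vectorstore_metadata(meta, "user123", "./vectorstores/user123/faiss_index")
upsert_vectorstore_metadata(meta, "user123", "./vectorstores/user123/faiss_index")  # idempotent
print(len(meta))  # 1 — still one document per user
```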
app/rag/embeddings.py ADDED
import os

from huggingface_hub import login
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.prompts import ChatPromptTemplate

def get_llm():
    """
    Returns a ChatGroq LLM instance (Llama 4 Scout 17B) using the GROQ API key
    stored in the environment.
    """
    from langchain_groq import ChatGroq

    llm = ChatGroq(
        model="meta-llama/llama-4-scout-17b-16e-instruct",
        temperature=0,
        max_tokens=1024,
        api_key=os.getenv("GROQ_API_KEY", "")  # Put your actual GROQ key in .env as GROQ_API_KEY
    )
    return llm

# ──────────────────────────────────────────────────────────────────────────────
# 1. Text Splitter (512 characters per chunk, 100 characters of overlap)
# ──────────────────────────────────────────────────────────────────────────────
text_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=100)

# ──────────────────────────────────────────────────────────────────────────────
# 2. Embeddings Model (HuggingFace BGE) on CPU
# ──────────────────────────────────────────────────────────────────────────────
HF_TOKEN = os.getenv("HUGGINGFACEHUB_API_TOKEN")
if HF_TOKEN:
    login(HF_TOKEN)

model_name = "BAAI/bge-small-en-v1.5"
model_kwargs = {"device": "cpu"}
encode_kwargs = {"normalize_embeddings": True}
embeddings = HuggingFaceBgeEmbeddings(
    model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs
)

# ──────────────────────────────────────────────────────────────────────────────
# 3. Prompt Template for RAG Assistant
# ──────────────────────────────────────────────────────────────────────────────
prompt_template = """
You are an assistant specialized in analyzing and improving website performance. Your goal is to provide accurate, practical, and performance-driven answers.
Use the following retrieved context (such as PageSpeed Insights data or audit results) to answer the user's question.
If the context lacks sufficient information, respond with "I don't know." Do not make up answers or provide unverified information.

Guidelines:
1. Extract relevant performance insights from the context to form a helpful and actionable response.
2. Maintain a clear, professional, and user-focused tone.
3. If the question is unclear or needs more detail, ask for clarification politely.
4. Prioritize recommendations that follow web performance best practices (e.g., optimizing load times, reducing blocking resources, improving visual stability).

Retrieved context:
{context}

User's question:
{question}

Your response:
"""

user_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", prompt_template),
        ("human", "{question}"),
    ]
)
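
The splitter configured above windows the concatenated documents into 512-character chunks with 100 characters of overlap, so neighboring chunks share context at their boundaries. A simplified stdlib sketch of that windowing (fixed-width slices only, ignoring the recursive separator logic the real `RecursiveCharacterTextSplitter` adds):

```python
def split_with_overlap(text: str, chunk_size: int = 512, overlap: int = 100) -> list:
    # Fixed-width windows that step forward by (chunk_size - overlap),
    # so each chunk repeats the last `overlap` characters of the previous one.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = split_with_overlap("a" * 1000, chunk_size=512, overlap=100)
print(len(chunks))      # 3 — windows start at 0, 412, 824
print(len(chunks[0]))   # 512
```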
app/rag/logging_config.py ADDED
import logging

# Configure a module-level logger for RAG components
logger = logging.getLogger("app.rag")
logger.setLevel(logging.INFO)

handler = logging.StreamHandler()
formatter = logging.Formatter(
    "%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"
)
handler.setFormatter(formatter)
logger.addHandler(handler)
app/rag/routes.py ADDED
@@ -0,0 +1,140 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ import uuid
3
+ from fastapi import APIRouter, HTTPException
4
+ from typing import Optional
5
+
6
+ from .schemas import (
7
+ IngestRequest,
8
+ IngestResponse,
9
+ CreateChatResponse,
10
+ ChatRequest,
11
+ ChatResponse
12
+ )
13
+ from .utils import (
14
+ text_splitter,
15
+ embeddings,
16
+ get_vectorstore_path,
17
+ save_vectorstore_to_disk,
18
+ upsert_vectorstore_metadata,
19
+ build_or_load_vectorstore,
20
+ build_rag_chain,
21
+ initialize_chat_history
22
+ )
23
+ from .logging_config import logger
24
+
25
+ router = APIRouter(prefix="/rag", tags=["rag"])
26
+
27
+ @router.post("/ingest/{user_id}", response_model=IngestResponse)
28
+ async def ingest_documents(user_id: str, body: IngestRequest):
29
+ """
30
+ Ingest a list of text documents into a FAISS vectorstore for this user.
31
+ Steps:
32
+ 1. Concatenate all documents into one string.
33
+ 2. Split into chunks using RecursiveCharacterTextSplitter.
34
+ 3. Create a FAISS vectorstore from those chunks.
35
+ 4. Save the vectorstore to disk under ./vectorstores/{user_id}/faiss_index.
36
+ 5. Upsert metadata in Mongo (user_id -> vectorstore_path).
37
+ """
38
+ logger.info("Ingestion requested for user_id=%s. Number of docs=%d", user_id, len(body.documents))
39
+ try:
40
+ # 1. Join all provided documents
41
+ all_text = "\n\n".join(body.documents)
+
+ # 2. Split into chunks
+ text_chunks = text_splitter.split_text(all_text)
+ logger.info("Split into %d chunks", len(text_chunks))
+
+ # 3. Build FAISS vectorstore
+ from langchain_community.vectorstores import FAISS as _FAISS
+ vs = _FAISS.from_texts(texts=text_chunks, embedding=embeddings)
+
+ # 4. Save to disk
+ faiss_path = save_vectorstore_to_disk(vs, user_id)
+ logger.info("Saved FAISS index to %s", faiss_path)
+
+ # 5. Upsert metadata
+ upsert_vectorstore_metadata(user_id, faiss_path)
+ logger.info("Upserted vectorstore metadata for user_id=%s", user_id)
+
+ return IngestResponse(
+ success=True,
+ message="Vectorstore created successfully.",
+ user_id=user_id,
+ vectorstore_path=faiss_path
+ )
+ except Exception as e:
+ logger.error("Error during ingestion for user_id=%s: %s", user_id, e, exc_info=True)
+ raise HTTPException(status_code=500, detail=f"Ingestion failed: {e}")
+
+ @router.post("/chat/create/{user_id}", response_model=CreateChatResponse)
+ async def create_chat_session(user_id: str):
+ """
+ Create a new chat session for this user:
+ - Generate a chat_id (UUID).
+ - Initialize an empty MongoDBChatMessageHistory for that chat_id.
+ - Return the chat_id so the client can use it in subsequent calls.
+ """
+ logger.info("Creating new chat session for user_id=%s", user_id)
+ try:
+ chat_id = str(uuid.uuid4())
+
+ # Initialize chat history (this writes an empty session to Mongo)
+ _ = initialize_chat_history(chat_id)
+ logger.info("Created chat history in Mongo for chat_id=%s", chat_id)
+
+ return CreateChatResponse(
+ success=True,
+ message="Chat session created.",
+ user_id=user_id,
+ chat_id=chat_id
+ )
+ except Exception as e:
+ logger.error("Error creating chat for user_id=%s: %s", user_id, e, exc_info=True)
+ raise HTTPException(status_code=500, detail=f"Failed to create chat session: {e}")
+
+ @router.post("/chat/{user_id}/{chat_id}", response_model=ChatResponse)
+ async def chat_with_user(user_id: str, chat_id: str, body: ChatRequest):
+ """
+ Send a user question to the RAG chain and return the LLM answer.
+ - Loads the FAISS index for user_id (404 if not found).
+ - Retrieves (or initializes) the MongoDBChatMessageHistory for chat_id.
+ - Runs the ConversationalRetrievalChain to get an answer.
+ - Returns the answer; chat history is persisted to Mongo automatically.
+ """
+ question = body.question
+ logger.info("Received chat request: user_id=%s, chat_id=%s, question='%s'", user_id, chat_id, question)
+
+ try:
+ # 1. Build the RAG chain (or 404 if no vectorstore)
+ chain = build_rag_chain(user_id, chat_id)
+
+ # 2. Call the chain
+ result = chain.invoke({"question": question})
+ # Some chains return the answer under "answer", others under "output_text"
+ answer = result.get("answer") or result.get("output_text")
+
+ if answer is None:
+ logger.error("Chain returned no 'answer' or 'output_text': %s", result)
+ raise Exception("Failed to retrieve answer from chain.")
+
+ logger.info("Chain answered for chat_id=%s: %s", chat_id, answer)
+
+ return ChatResponse(
+ success=True,
+ answer=answer,
+ error=None,
+ chat_id=chat_id,
+ user_id=user_id
+ )
+ except HTTPException:
+ # Re-raise known HTTPExceptions (e.g. 404 from build_rag_chain)
+ raise
+ except Exception as e:
+ logger.error("Error in chat endpoint for user_id=%s, chat_id=%s: %s", user_id, chat_id, e, exc_info=True)
+ return ChatResponse(
+ success=False,
+ answer=None,
+ error=str(e),
+ chat_id=chat_id,
+ user_id=user_id
+ )
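Different LangChain chain types expose the final text under different output keys, which is why the chat endpoint above falls back from `answer` to `output_text`. That fallback, in isolation, looks like the following sketch (the helper name `extract_answer` is mine, not part of this codebase):

```python
def extract_answer(result: dict) -> str:
    """Return the chain's answer, trying the common output keys in order."""
    answer = result.get("answer") or result.get("output_text")
    if answer is None:
        # Mirrors the endpoint: a missing/empty answer is treated as a hard failure
        raise ValueError(f"Chain returned no 'answer' or 'output_text': {result!r}")
    return answer

print(extract_answer({"answer": "Paris"}))                        # → Paris
print(extract_answer({"answer": None, "output_text": "Berlin"}))  # → Berlin
```

Note that because `or` is used rather than explicit `None` checks, an empty-string answer also falls through to the next key, which is usually the desired behavior here.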
app/rag/schemas.py ADDED
@@ -0,0 +1,45 @@
+ from pydantic import BaseModel, Field
+ from typing import List, Optional
+
+ class IngestRequest(BaseModel):
+ """
+ Request body for ingesting documents into a user's FAISS vector store.
+ """
+ documents: List[str] = Field(
+ ...,
+ description="A list of text documents (strings) to ingest into the vector store."
+ )
+
+ class IngestResponse(BaseModel):
+ """
+ Response after ingesting documents for a user.
+ """
+ success: bool
+ message: str
+ user_id: str
+ vectorstore_path: Optional[str] = None
+
+ class CreateChatResponse(BaseModel):
+ """
+ Response after creating a new chat session for a user.
+ """
+ success: bool
+ message: str
+ user_id: str
+ chat_id: Optional[str] = None
+
+ class ChatRequest(BaseModel):
+ """
+ Body for sending a user message to an existing chat session.
+ """
+ question: str = Field(..., description="The user's question or message.")
+
+ class ChatResponse(BaseModel):
+ """
+ Response from the RAG chatbot endpoint.
+ """
+ success: bool
+ answer: Optional[str] = None
+ error: Optional[str] = None
+ chat_id: str
+ user_id: str
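For reference, the JSON bodies these Pydantic models (de)serialize have the shapes below. This standard-library sketch uses illustrative field values, not real IDs:

```python
import json

# Illustrative /rag/ingest request body matching IngestRequest
ingest_request = {"documents": ["First document text.", "Second document text."]}

# Illustrative ChatResponse payload as the chat endpoint would return it;
# note error is null on success and answer is null on failure.
chat_response = json.loads(
    '{"success": true, "answer": "Hello!", "error": null, '
    '"chat_id": "1234", "user_id": "u1"}'
)

print(sorted(chat_response))  # the five ChatResponse field names
```

The `Optional[...] = None` fields mean clients should treat `answer`, `error`, `chat_id` (in `CreateChatResponse`), and `vectorstore_path` as possibly absent or null.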
app/rag/utils.py ADDED
@@ -0,0 +1,136 @@
+ import os
+ from typing import Optional, Dict, Any
+ from fastapi import HTTPException
+
+ from langchain_community.vectorstores import FAISS
+ from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory
+ from langchain.memory import ConversationBufferMemory
+ from langchain.chains import ConversationalRetrievalChain
+
+ from app.config import settings
+ from .db import vectorstore_meta_coll, chat_collection_name
+ from .embeddings import embeddings, text_splitter, user_prompt, get_llm
+ from .logging_config import logger
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # 1. Helper: Path to Store (or Load) a User's FAISS Vectorstore on Disk
+ # ──────────────────────────────────────────────────────────────────────────────
+ def get_vectorstore_path(user_id: str) -> str:
+ """
+ Ensure a local directory exists for this user's vectorstore.
+ Returns a path like './vectorstores/{user_id}'.
+ """
+ base_dir = settings.vectorstore_base_path
+ user_dir = os.path.join(base_dir, user_id)
+ os.makedirs(user_dir, exist_ok=True)
+ return user_dir
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # 2. Build or Load an Existing FAISS Index for a User
+ # ──────────────────────────────────────────────────────────────────────────────
+ def build_or_load_vectorstore(user_id: str) -> FAISS:
+ """
+ Attempt to load an existing FAISS index for this user.
+ If not found on disk, raise a FileNotFoundError.
+ """
+ user_dir = get_vectorstore_path(user_id)
+ faiss_index_path = os.path.join(user_dir, "faiss_index")
+
+ if not os.path.isdir(faiss_index_path):
+ raise FileNotFoundError(f"No vectorstore found at {faiss_index_path}")
+
+ # load_local unpickles the docstore, so deserialization must be explicitly
+ # allowed; this is safe only for indexes this app created itself.
+ return FAISS.load_local(
+ folder_path=faiss_index_path,
+ embeddings=embeddings,
+ allow_dangerous_deserialization=True
+ )
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # 3. Save a FAISS Vectorstore to Disk for a User
+ # ──────────────────────────────────────────────────────────────────────────────
+ def save_vectorstore_to_disk(vectorstore: FAISS, user_id: str) -> str:
+ """
+ Save the FAISS vectorstore under './vectorstores/{user_id}/faiss_index'.
+ Returns the path to that saved folder.
+ """
+ user_dir = get_vectorstore_path(user_id)
+ faiss_index_path = os.path.join(user_dir, "faiss_index")
+ os.makedirs(faiss_index_path, exist_ok=True)
+ vectorstore.save_local(folder_path=faiss_index_path)
+ return faiss_index_path
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # 4. Upsert or Fetch Vectorstore Metadata in MongoDB
+ # ──────────────────────────────────────────────────────────────────────────────
+ def upsert_vectorstore_metadata(user_id: str, vectorstore_path: str) -> None:
+ """
+ Insert or update a document mapping user_id → vectorstore_path in MongoDB.
+ """
+ vectorstore_meta_coll.update_one(
+ {"user_id": user_id},
+ {"$set": {"vectorstore_path": vectorstore_path}},
+ upsert=True
+ )
+
+ def get_vectorstore_metadata(user_id: str) -> Optional[Dict[str, Any]]:
+ """
+ Retrieve the metadata doc (if any) for this user_id.
+ """
+ return vectorstore_meta_coll.find_one({"user_id": user_id})
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # 5. Initialize (or Return) a MongoDBChatMessageHistory for chat_id
+ # ──────────────────────────────────────────────────────────────────────────────
+ def initialize_chat_history(chat_id: str) -> MongoDBChatMessageHistory:
+ """
+ Create and return a MongoDBChatMessageHistory for the given chat_id.
+ """
+ return MongoDBChatMessageHistory(
+ session_id=chat_id,
+ connection_string=settings.mongo_uri,
+ database_name=settings.mongo_chat_db,
+ collection_name=chat_collection_name,
+ )
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # 6. Build a ConversationalRetrievalChain (RAG Chain) for user_id + chat_id
+ # ──────────────────────────────────────────────────────────────────────────────
+ def build_rag_chain(user_id: str, chat_id: str) -> ConversationalRetrievalChain:
+ """
+ - Loads the FAISS index for user_id.
+ - Creates a retriever (k=5).
+ - Wraps MongoDBChatMessageHistory in a ConversationBufferMemory.
+ - Attaches the ChatGroq LLM + user_prompt.
+ """
+ # 1. Load FAISS index (or 404 if not found)
+ try:
+ faiss_vs = build_or_load_vectorstore(user_id)
+ except FileNotFoundError:
+ raise HTTPException(status_code=404, detail="Vectorstore not found for this user. Call /rag/ingest first.")
+
+ retriever = faiss_vs.as_retriever(search_kwargs={"k": 5})
+
+ # 2. Instantiate a MongoDB-based chat history
+ chat_history = initialize_chat_history(chat_id)
+
+ # 3. Wrap that history in a ConversationBufferMemory so the chain gets a valid Memory object
+ memory = ConversationBufferMemory(
+ memory_key="chat_history",  # key under which the chain reads the stored messages
+ chat_memory=chat_history,   # back the buffer with the MongoDB-persisted history
+ return_messages=True
+ )
+
+ # 4. Get the LLM
+ llm = get_llm()
+
+ # 5. Build the ConversationalRetrievalChain with the wrapped memory
+ chain = ConversationalRetrievalChain.from_llm(
+ llm=llm,
+ retriever=retriever,
+ memory=memory,
+ return_source_documents=False,
+ chain_type="stuff",
+ combine_docs_chain_kwargs={"prompt": user_prompt},
+ verbose=False,
+ )
+ return chain
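The on-disk layout produced by `get_vectorstore_path` and `save_vectorstore_to_disk` is one folder per user, each holding a `faiss_index` subfolder. A standard-library sketch of that path logic, substituting a temp directory for `settings.vectorstore_base_path`:

```python
import os
import tempfile

def vectorstore_index_path(base_dir: str, user_id: str) -> str:
    """Mirror get_vectorstore_path plus the 'faiss_index' subfolder used above."""
    path = os.path.join(base_dir, user_id, "faiss_index")
    os.makedirs(path, exist_ok=True)  # idempotent, like the helpers above
    return path

base = tempfile.mkdtemp()
p = vectorstore_index_path(base, "user-42")
print(os.path.isdir(p))                              # → True
print(vectorstore_index_path(base, "user-42") == p)  # repeat calls are safe → True
```

Because `exist_ok=True` is passed, ingesting twice for the same user simply overwrites the saved index rather than failing on the existing directory.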
app/run_server.py ADDED
@@ -0,0 +1,20 @@
+ """
+ Server runner script for the PageSpeed Insights API.
+ """
+ import uvicorn
+ from app.config import settings
+
+ if __name__ == "__main__":
+ print(f"🚀 Starting {settings.app_name}")
+ print(f"📍 Server: {settings.host}:{settings.port}")
+ print(f"🔧 Debug Mode: {settings.debug}")
+ print(f"📚 API Documentation: http://{settings.host}:{settings.port}/docs")
+ print(f"📋 Alternative Docs: http://{settings.host}:{settings.port}/redoc")
+
+ uvicorn.run(
+ "app.main:app",
+ host=settings.host,
+ port=settings.port,
+ reload=settings.debug,
+ log_level="debug" if settings.debug else "info"
+ )
app/services.py ADDED
@@ -0,0 +1,306 @@
+ """
+ Business logic services for PageSpeed analysis.
+ """
+ import json
+ import requests
+ import logging
+ import google.generativeai as genai
+ from typing import Dict, Any
+ from app.config import settings
+
+ # Create a module-level logger
+ logger = logging.getLogger(__name__)
+
+
+ class PageSpeedService:
+ """Service class for PageSpeed Insights operations."""
+
+ def __init__(self):
+ self.pagespeed_api_key = settings.pagespeed_api_key
+ self.gemini_api_key = settings.gemini_api_key
+
+ if self.gemini_api_key:
+ logger.info("Configuring Gemini AI with provided API key.")
+ genai.configure(api_key=self.gemini_api_key)
+ else:
+ logger.warning("No Gemini API key found. Gemini reporting will fail if called.")
+
+ def get_pagespeed_data(self, target_url: str) -> Dict[Any, Any]:
+ """
+ Fetch data from the PageSpeed Insights API for the given URL.
+
+ Args:
+ target_url (str): The URL to analyze
+
+ Returns:
+ Dict[Any, Any]: PageSpeed Insights data
+
+ Raises:
+ Exception: If API request fails
+ """
+ logger.info("Starting PageSpeed fetch for URL: %s", target_url)
+ if not self.pagespeed_api_key:
+ msg = "PageSpeed API key not configured"
+ logger.error(msg)
+ raise Exception(msg)
+
+ endpoint = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
+ params = {
+ "url": target_url,
+ "key": self.pagespeed_api_key
+ }
+
+ try:
+ response = requests.get(endpoint, params=params, timeout=60)
+ response.raise_for_status()
+ logger.info("Successfully fetched PageSpeed data for %s (status %s)", target_url, response.status_code)
+ return response.json()
+ except requests.exceptions.HTTPError as http_err:
+ msg = f"HTTP error fetching PageSpeed data: {http_err}"
+ logger.error(msg, exc_info=True)
+ raise Exception(msg)
+ except requests.exceptions.RequestException as req_err:
+ msg = f"Request exception fetching PageSpeed data: {req_err}"
+ logger.error(msg, exc_info=True)
+ raise Exception(msg)
+ except Exception as e:
+ msg = f"Unexpected error in get_pagespeed_data: {e}"
+ logger.error(msg, exc_info=True)
+ raise Exception(msg)
+
+
+ def generate_report_with_gemini(self, pagespeed_data: Dict[Any, Any]) -> str:
+ """
+ Uses the Gemini model to generate a detailed report based on the PageSpeed Insights data,
+ employing an advanced prompt for specialized analysis and recommendations.
+
+ Args:
+ pagespeed_data (Dict[Any, Any]): PageSpeed Insights data
+
+ Returns:
+ str: Generated performance optimization report
+
+ Raises:
+ Exception: If report generation fails
+ """
+ logger.info("Starting Gemini report generation.")
+ if not self.gemini_api_key:
+ msg = "Gemini API key not configured"
+ logger.error(msg)
+ raise Exception(msg)
+
+ try:
+ # Select a Gemini model
+ model = genai.GenerativeModel("gemini-2.0-flash")
+ prompt = self._create_analysis_prompt(pagespeed_data)
+ logger.debug("Generated Gemini prompt: %s", prompt[:200] + "…")
+
+ response = model.generate_content(prompt)
+
+ if response and hasattr(response, "text") and response.text:
+ logger.info("Gemini report generated successfully.")
+ return response.text
+ elif response and response.candidates and response.candidates[0].finish_reason.name == "SAFETY":
+ # finish_reason is an enum, so compare its .name rather than the enum itself
+ msg = "Report generation was blocked due to safety settings"
+ logger.error(msg)
+ raise Exception(msg)
+ else:
+ msg = "No report could be generated or the response was empty"
+ logger.error(msg)
+ raise Exception(msg)
+
+ except Exception as e:
+ msg = f"Error generating report with Gemini: {e}"
+ logger.error(msg, exc_info=True)
+ raise Exception(msg)
+
+ def _create_analysis_prompt(self, pagespeed_data: Dict[Any, Any]) -> str:
+ """
+ Create the specialized prompt for Gemini analysis.
+
+ Args:
+ pagespeed_data (Dict[Any, Any]): PageSpeed Insights data
+
+ Returns:
+ str: Formatted prompt for Gemini
+ """
+ # We do not log the full JSON here to avoid huge payloads in the logs,
+ # but we do log that prompt construction is happening.
+ logger.debug("Building Gemini analysis prompt from PageSpeed data.")
+ return (
+ "**Role:** You are an **Expert Web Performance Optimization Analyst and Senior Full-Stack Engineer** "
+ "with deep expertise in interpreting Google PageSpeed Insights data, diagnosing frontend and "
+ "backend bottlenecks, and devising actionable, high-impact optimization strategies.\n\n"
+ "**Objective:**\n"
+ "Analyze the provided Google PageSpeed Insights JSON data for the analyzed website. "
+ "Your primary goal is to generate a comprehensive, prioritized, and actionable set of strategies "
+ "to significantly improve its performance. These strategies must directly address the specific "
+ "metrics and audit findings within the report, aiming to elevate both Core Web Vitals "
+ "(LCP, INP, CLS) and other key performance indicators (FCP, TTFB, TBT), and ultimately "
+ "improve the `overall_category` to 'FAST' where possible.\n\n"
+ "**Input Data:**\n"
+ "The following JSON object contains the complete PageSpeed Insights report:\n"
+ f"```json\n{json.dumps(pagespeed_data, indent=2)}\n```\n\n"
+ "**Analysis and Strategy Formulation - Instructions:**\n\n"
+ "1. **Executive Performance Summary:**\n"
+ " * Begin with a concise overview of the website's current performance status based on the provided data.\n"
+ " * Highlight the `overall_category` for both `loadingExperience` (specific URL) and `originLoadingExperience` (entire origin).\n"
+ " * Pinpoint the current values and `category` (e.g., FAST, AVERAGE, SLOW) for each key metric:\n"
+ " * `CUMULATIVE_LAYOUT_SHIFT_SCORE` (CLS)\n"
+ " * `EXPERIMENTAL_TIME_TO_FIRST_BYTE` (TTFB)\n"
+ " * `FIRST_CONTENTFUL_PAINT_MS` (FCP)\n"
+ " * `INTERACTION_TO_NEXT_PAINT` (INP)\n"
+ " * `LARGEST_CONTENTFUL_PAINT_MS` (LCP)\n"
+ " * `total-blocking-time` (TBT) from Lighthouse.\n"
+ " * Identify any significant `metricSavings` opportunities highlighted in the Lighthouse `audits`.\n\n"
+ "2. **Deep-Dive into Bottlenecks & Audit Failures:**\n"
+ " * Systematically go through the `loadingExperience`, `originLoadingExperience`, and `lighthouseResult` (especially the `audits` section).\n"
+ " * For each underperforming metric or failed/suboptimal audit (e.g., Lighthouse scores less than 1, or `notApplicable` audits with clear improvement paths like `lcp-lazy-loaded`, `critical-request-chains`, `dom-size`, `non-composited-animations`), extract the relevant details, display values, and numeric values.\n\n"
+ "3. **Develop Prioritized, Actionable Optimization Strategies:**\n"
+ " For *each* identified performance issue or opportunity, provide the following:\n"
+ " * **A. Issue & Evidence:** Clearly state the problem (e.g., \"High Total Blocking Time,\" \"Suboptimal Largest Contentful Paint due to unoptimized image,\" \"Excessive DOM Size,\" \"Render-blocking resources in critical request chain\"). Refer directly to the JSON data points and audit IDs that support this finding (e.g., `audits['total-blocking-time'].numericValue`, `audits['critical-request-chains'].details.longestChain`).\n"
+ " * **B. Root Cause Analysis (Inferred):** Briefly explain the likely technical reasons behind the issue based on the data.\n"
+ " * **C. Specific, Technical Recommendation(s):** Provide detailed, actionable steps a development team can take. Be specific.\n"
+ " * **D. Targeted Metric Improvement:** Specify which primary and secondary metrics this strategy will positively impact (e.g., \"This will directly reduce LCP and improve FCP,\" or \"This will significantly lower TBT and improve INP.\").\n"
+ " * **E. Priority Level:** Assign a priority (High, Medium, Low) based on:\n"
+ " * Impact on Core Web Vitals.\n"
+ " * Potential for overall score improvement (consider `metricSavings`).\n"
+ " * Severity of the issue (e.g., 'SLOW' or 'AVERAGE' categories).\n"
+ " * Estimated implementation effort (favor high-impact, low/medium-effort tasks for higher priority).\n"
+ " * **F. Justification for Priority:** Briefly explain why this priority was assigned.\n\n"
+ "4. **Strategic Grouping (Optional but Recommended):**\n"
+ " If applicable, group recommendations by area (e.g., Asset Optimization, JavaScript Optimization, Server-Side Improvements, Rendering Path Optimization, CSS Enhancements).\n\n"
+ "5. **Anticipated Overall Impact:**\n"
+ " Conclude with a statement on the anticipated overall improvement in performance and user experience if the high and medium-priority recommendations are implemented.\n\n"
+ "**Output Format:**\n"
+ "Please structure your response clearly. Use headings, subheadings, and bullet points to enhance readability and actionability. For example:\n\n"
+ "---\n"
+ "## Executive Performance Summary\n"
+ "* **Overall URL Loading Experience Category:** [e.g., AVERAGE]\n"
+ "* **Overall Origin Loading Experience Category:** [e.g., AVERAGE]\n"
+ "* **Key Metrics:**\n"
+ " * LCP: [Value] ms ([Category])\n"
+ " * INP: [Value] ms ([Category])\n"
+ " * ...etc.\n\n"
+ "---\n"
+ "## Prioritized Optimization Strategies\n\n"
+ "### High Priority\n"
+ "**1. Issue & Evidence:** [e.g., High Total Blocking Time (TBT) of 1200 ms - `audits['total-blocking-time'].numericValue`]\n"
+ " * **Root Cause Analysis:** [e.g., Long JavaScript tasks on the main thread during page load, likely from unoptimized third-party scripts or complex component rendering.]\n"
+ " * **Specific, Technical Recommendation(s):**\n"
+ " * [Action 1]\n"
+ " * [Action 2]\n"
+ " * **Targeted Metric Improvement:** [e.g., TBT, INP, FCP]\n"
+ " * **Justification for Priority:** [e.g., Directly impacts interactivity (INP) and is a significant contributor to a poor lab score.]\n\n"
+ "**(Continue with other High, Medium, and Low priority items)**\n"
+ "---\n\n"
+ "**Ensure your analysis is based *solely* on the provided JSON data and your expert interpretation of it. "
+ "Avoid generic advice; all recommendations must be tied to specific findings within the report. "
+ "Do not add anything irrelevant in the report. Do not write text in the starting of the report**"
+ )
+
+ def analyze_url(self, url: str) -> Dict[str, Any]:
+ """
+ Perform complete PageSpeed analysis for a given URL.
+
+ Args:
+ url (str): The URL to analyze
+
+ Returns:
+ Dict[str, Any]: Complete analysis results
+ """
+ try:
+ # Fetch PageSpeed data
+ pagespeed_data = self.get_pagespeed_data(url)
+
+ # Generate report with Gemini
+ report = self.generate_report_with_gemini(pagespeed_data)
+
+ return {
+ "success": True,
+ "url": url,
+ "report": report,
+ "pagespeed_data": pagespeed_data,
+ "error": None
+ }
+
+ except Exception as e:
+ logger.error("Failed full analyze_url flow: %s", e, exc_info=True)
+ return {
+ "success": False,
+ "url": url,
+ "report": None,
+ "pagespeed_data": None,
+ "error": str(e)
+ }
+
+ def generate_priority(self, report: str) -> Dict[str, Any]:
+ """
+ Generate a dictionary of prioritized performance recommendations based on the Gemini-generated report.
+
+ Args:
+ report (str): The Gemini-generated performance report
+
+ Returns:
+ Dict[str, Any]: Dictionary mapping priority levels to optimization suggestions
+
+ Raises:
+ Exception: If the priority generation fails
+ """
+ logger.info("Generating prioritized suggestions from the Gemini report.")
+
+ if not self.gemini_api_key:
+ msg = "Gemini API key not configured"
+ logger.error(msg)
+ raise Exception(msg)
+
+ try:
+ model = genai.GenerativeModel("gemini-2.0-flash")
+
+ prompt = (
+ "You are an expert web performance analyst.\n"
+ "Extract and organize the optimization recommendations from the following performance report\n"
+ "into a JSON object with exactly these keys: \"high\", \"medium\", \"low\", and \"unknown\".\n"
+ "Each key's value should be a list of suggestion strings.\n\n"
+ "Important:\n"
+ "- Respond with *only* a valid JSON object.\n"
+ "- Do NOT include any commentary or explanation outside the JSON.\n\n"
+ "Performance Report:\n"
+ "```\n"
+ + report +
+ "\n```"
+ )
+
+ response = model.generate_content(prompt)
+ raw = (response.text or "").strip()
+ logger.debug("Raw priority response: %s", raw[:500] + ("…" if len(raw) > 500 else ""))
+
+ # Locate the JSON portion by finding the first '{' and the last '}'
+ start = raw.find('{')
+ end = raw.rfind('}')
+ if start == -1 or end == -1 or end <= start:
+ raise ValueError("No JSON object found in Gemini response")
+
+ json_str = raw[start:end+1]
+ logger.debug("Extracted JSON string: %s", json_str)
+
+ suggestions = json.loads(json_str)
+ if not isinstance(suggestions, dict):
+ raise ValueError("Parsed JSON is not a dictionary")
+
+ # Ensure all expected keys exist
+ for key in ("high", "medium", "low", "unknown"):
+ suggestions.setdefault(key, [])
+
+ logger.info("Priority suggestions generated successfully.")
+ return suggestions
+
+ except json.JSONDecodeError as je:
+ msg = f"Failed to parse JSON from Gemini response: {je}"
+ logger.error(msg, exc_info=True)
+ raise Exception(msg)
+ except Exception as e:
+ msg = f"Error generating priority suggestions: {e}"
+ logger.error(msg, exc_info=True)
+ raise
+
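`generate_priority` defends against Gemini wrapping its JSON in prose or code fences by slicing from the first `{` to the last `}` before parsing. That parsing step in isolation looks like the following sketch (the helper name `parse_priority_json` is mine):

```python
import json

def parse_priority_json(raw: str) -> dict:
    """Extract the JSON object embedded in an LLM reply and normalize its keys."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in response")
    suggestions = json.loads(raw[start:end + 1])
    if not isinstance(suggestions, dict):
        raise ValueError("Parsed JSON is not a dictionary")
    for key in ("high", "medium", "low", "unknown"):
        suggestions.setdefault(key, [])  # guarantee all priority buckets exist
    return suggestions

reply = 'Here you go:\n```json\n{"high": ["Compress hero image"]}\n```'
print(parse_priority_json(reply)["high"])  # → ['Compress hero image']
print(parse_priority_json(reply)["low"])   # → []
```

This first-brace/last-brace heuristic assumes the reply contains exactly one top-level JSON object; nested objects are fine, but any stray `{` or `}` in surrounding prose would break it.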
requirements.txt ADDED
@@ -0,0 +1,12 @@
+ fastapi==0.104.1
+ uvicorn==0.24.0
+ python-dotenv==1.0.0
+ requests==2.31.0
+ google-generativeai==0.3.2
+ pydantic==2.5.0
+ pydantic_settings
+ langchain
+ langchain_groq
+ langchain_community
+ faiss-cpu
+ pymongo
+ langchain-mongodb