Space: niddijoris/ChatWithData

niddijoris committed on Feb 5

Commit 790e0e9 (1 parent: 456b98f)

Upload Streamlit app

Files changed (23)

.gitattributes +1 -0
README.md +84 -14
requirements.txt +6 -3
src/.env.example +16 -0
src/QUICKSTART.md +70 -0
src/README.md +89 -0
src/agent/__init__.py +5 -0
src/agent/ai_agent.py +184 -0
src/agent/tools.py +269 -0
src/config.py +48 -0
src/data/car_prices.csv +3 -0
src/database/__init__.py +5 -0
src/database/db_manager.py +240 -0
src/database/safety_validator.py +72 -0
src/streamlit_app.py +364 -35
src/support/__init__.py +4 -0
src/support/github_integration.py +168 -0
src/tests/__init__.py +1 -0
src/tests/test_safety_validator.py +113 -0
src/ui/__init__.py +15 -0
src/ui/charts.py +216 -0
src/utils/__init__.py +4 -0
src/utils/logger.py +93 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+src/data/car_prices.csv filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,19 +1,89 @@
 ---
-title: ChatWithData
-emoji: 🚀
-colorFrom: red
-colorTo: red
-sdk: docker
-app_port: 8501
-tags:
-- streamlit
-pinned: false
-short_description: Streamlit template space
 ---
-# Welcome to Streamlit!
-Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
-If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
-forums](https://discuss.streamlit.io).

+# 🚗 Data Insights App
+An AI-powered data analysis platform that allows users to explore a massive car auction dataset (558,000+ records) using natural language. The app combines **Streamlit** for the interface, **SQLite** for data management, and **OpenAI's GPT-4** for intelligent analysis and dynamic chart generation.
+---
+## [Hugging Face](https://huggingface.co/spaces/niddijoris/ChatWithData)
+## Application Gallery
+| Dashboard Overview | AI Chat & Analytics |
+| :---: | :---: |
+| ![Dashboard Overview](/screenshots/1.png) | ![AI Chat & Analytics](/screenshots/2.png) |
+| **Real-time Statistics & Insights** | **Intelligent Querying & Data Analysis** |
+| Dynamic Chart Generation | Safety & Logs |
+| :---: | :---: |
+| ![Chart Generation](/screenshots/3.png) | ![Console Logs](/screenshots/4.png) |
+| **AI-driven Visualizations** | **Security Guardrails & Activity Monitoring** |
 ---
+## 🌟 Key Features
+### 🤖 Intelligent AI Agent
+- **Natural Language Querying**: Ask questions like "What is the average price of a BMW?" or "Compare prices between California and Florida".
+- **Dynamic Chart Generation**: Ask for visualizations (bar, line, pie, scatter) and the AI will generate them instantly.
+- **Context-Aware Support**: If the agent can't help, it offers to create a GitHub support ticket with the chat history.
+### 🛡️ Secure Data Management
+- **Read-Only Safety**: Strict SQL validation ensures only `SELECT` queries are executed. Dangerous operations (`DELETE`, `DROP`, `UPDATE`) are automatically blocked.
+- **Privacy Guardrails**: The agent never communicates the full dataset, only relevant snippets (limited to 100 rows).
+### 📊 Business Intelligence
+- **Real-time Stats**: Instantly see total inventory, average prices, and price/year ranges in the sidebar.
+- **Automated Insights**: Interactive top-make comparisons and condition distribution charts.
+- **Console Monitoring**: A live developer console in the sidebar shows every action the AI and database are taking.
 ---
+## 🚀 Quick Start
+### 1. Prerequisites
+- Python 3.9+
+- OpenAI API Key
+### 2. Setup
+```bash
+# Clone the repository and enter directory
+cd "Capstone folder"
+# Create and activate virtual environment
+python3 -m venv .venv
+source .venv/bin/activate
+# Install dependencies
+pip install -r requirements.txt
+```
+### 3. Configuration
+Copy `.env.example` to `.env` and fill in your keys:
+```bash
+cp .env.example .env
+```
+**Required**: `OPENAI_API_KEY`
+**Optional**: `GITHUB_TOKEN` and `GITHUB_REPO` (for support tickets)
+### 4. Run the Application
+Use the automated run script to ensure the correct environment is used:
+```bash
+chmod +x run.sh
+./run.sh
+```
+---
+## Project Architecture
+- **`streamlit_app.py`**: Main Streamlit interface.
+- **`agent/`**: AI logic and tool definitions.
+- **`database/`**: Safe SQL execution and CSV-to-SQLite ingestion.
+- **`support/`**: GitHub API integration for support tickets.
+- **`ui/`**: Chart generation (Plotly) and styling.
+- **`utils/`**: Custom Streamlit-integrated logger.
+---
+## 🛡️ Security Policy
+This application is designed with safety as a priority. The `SafetyValidator` provides a robust whitelist of allowed SQL operations, specifically protecting against SQL injection and unauthorized data modification.
+🛡️ **Active Protections**: Only SELECT | All dangerous keywords blocked | Data remains secure

requirements.txt CHANGED

@@ -1,3 +1,6 @@
-altair
-pandas
-streamlit

+streamlit==1.31.0
+openai>=2.17.0
+pandas==2.2.0
+plotly==5.18.0
+PyGithub==2.1.1
+python-dotenv==1.0.1

src/.env.example ADDED

	@@ -0,0 +1,16 @@

+# Copy this file to .env and fill in your actual API keys
+# OpenAI API Configuration (REQUIRED)
+OPENAI_API_KEY=sk-your-openai-api-key-here
+# GitHub Integration (OPTIONAL - for support tickets)
+# Get a token from: https://github.com/settings/tokens
+# Required scopes: repo
+GITHUB_TOKEN=
+GITHUB_REPO=
+# Database Configuration (OPTIONAL - defaults work fine)
+DATABASE_PATH=./database/car_prices.db
+# Application Settings (OPTIONAL)
+LOG_LEVEL=INFO

src/QUICKSTART.md ADDED

	@@ -0,0 +1,70 @@

+# Quick Start Guide
+## Setup Steps
+1. **Create and activate virtual environment** (if not already done):
+```bash
+python3 -m venv .venv
+source .venv/bin/activate  # On macOS/Linux
+```
+2. **Install dependencies**:
+```bash
+pip install -r requirements.txt
+```
+3. **Configure environment variables**:
+```bash
+cp .env.example .env
+# Edit .env and add your OPENAI_API_KEY
+```
+4. **Run the application**:
+```bash
+# Option 1: Use the run script
+./run.sh
+# Option 2: Run directly with .venv
+.venv/bin/streamlit run streamlit_app.py
+```
+**IMPORTANT**: Always use `.venv/bin/streamlit` or the `run.sh` script to ensure you're using the virtual environment's packages, not system-wide packages.
+## Troubleshooting
+### TypeError: `__init__() got an unexpected keyword argument 'proxies'`
+If you encounter this error, reinstall OpenAI with compatible dependencies:
+```bash
+.venv/bin/pip uninstall -y openai httpx httpcore
+.venv/bin/pip install openai==1.54.0
+```
+### GitHub Integration Warning
+If you see "GitHub initialization failed: 401 Bad credentials", this is normal if you haven't configured GitHub support. The app will use mock mode for support tickets. To enable real GitHub integration:
+1. Create a GitHub Personal Access Token at https://github.com/settings/tokens
+2. Add to your `.env` file:
+```
+GITHUB_TOKEN=your_token_here
+GITHUB_REPO=username/repo-name
+```
+## First Run
+On first run, the app will:
+1. Create SQLite database from `data/car_prices.csv` (takes ~10-20 seconds)
+2. Load 558,837 car records
+3. Create indexes for faster queries
+4. Launch web interface at http://localhost:8501
+## Sample Queries to Try
+- "What's the average price of BMW cars?"
+- "Show me the top 5 most expensive models"
+- "How many cars were sold in California?"
+- "What's the price difference between automatic and manual transmission?"
+Enjoy exploring your car data! 🚗
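
Note on the First Run steps in QUICKSTART.md: the CSV-to-SQLite ingestion and index creation can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the commit's `DatabaseManager` (which reads the CSV with pandas). The three-row `SAMPLE_CSV`, the reduced column set, and the `build_database` helper are hypothetical and exist only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical three-row sample standing in for data/car_prices.csv.
SAMPLE_CSV = """year,make,model,state,sellingprice
2014,BMW,3 Series,ca,30000
2015,Kia,Sorento,ca,21500
2013,BMW,5 Series,fl,27750
"""


def build_database(csv_text: str) -> sqlite3.Connection:
    """Load CSV text into an in-memory `cars` table and index it."""
    conn = sqlite3.connect(":memory:")
    # Declared column types give SQLite numeric affinity for year and price,
    # so the text values read from the CSV are stored as numbers.
    conn.execute(
        "CREATE TABLE cars ("
        "year INTEGER, make TEXT, model TEXT, state TEXT, sellingprice REAL)"
    )
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.executemany(
        "INSERT INTO cars VALUES (:year, :make, :model, :state, :sellingprice)",
        rows,
    )
    # Indexes on the columns that the sample queries filter on.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_make ON cars(make)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_year ON cars(year)")
    conn.commit()
    return conn


conn = build_database(SAMPLE_CSV)
average = conn.execute(
    "SELECT AVG(sellingprice) FROM cars WHERE make = ?", ("BMW",)
).fetchone()[0]
print(f"Average BMW price in the sample: {average}")
```

The query at the end passes the make as a bound parameter rather than formatting it into the SQL string, which is the usual way to keep user-supplied values out of the statement text.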

src/README.md ADDED

	@@ -0,0 +1,89 @@

+# 🚗 Data Insights App
+An AI-powered data analysis platform that allows users to explore a massive car auction dataset (558,000+ records) using natural language. The app combines **Streamlit** for the interface, **SQLite** for data management, and **OpenAI's GPT-4** for intelligent analysis and dynamic chart generation.
+---
+## [Hugging Face](https://huggingface.co/spaces/niddijoris/ChatWithData)
+## Application Gallery
+| Dashboard Overview | AI Chat & Analytics |
+| :---: | :---: |
+| ![Dashboard Overview](/screenshots/1.png) | ![AI Chat & Analytics](/screenshots/2.png) |
+| **Real-time Statistics & Insights** | **Intelligent Querying & Data Analysis** |
+| Dynamic Chart Generation | Safety & Logs |
+| :---: | :---: |
+| ![Chart Generation](/screenshots/3.png) | ![Console Logs](/screenshots/4.png) |
+| **AI-driven Visualizations** | **Security Guardrails & Activity Monitoring** |
+---
+## 🌟 Key Features
+### 🤖 Intelligent AI Agent
+- **Natural Language Querying**: Ask questions like "What is the average price of a BMW?" or "Compare prices between California and Florida".
+- **Dynamic Chart Generation**: Ask for visualizations (bar, line, pie, scatter) and the AI will generate them instantly.
+- **Context-Aware Support**: If the agent can't help, it offers to create a GitHub support ticket with the chat history.
+### 🛡️ Secure Data Management
+- **Read-Only Safety**: Strict SQL validation ensures only `SELECT` queries are executed. Dangerous operations (`DELETE`, `DROP`, `UPDATE`) are automatically blocked.
+- **Privacy Guardrails**: The agent never communicates the full dataset, only relevant snippets (limited to 100 rows).
+### 📊 Business Intelligence
+- **Real-time Stats**: Instantly see total inventory, average prices, and price/year ranges in the sidebar.
+- **Automated Insights**: Interactive top-make comparisons and condition distribution charts.
+- **Console Monitoring**: A live developer console in the sidebar shows every action the AI and database are taking.
+---
+## 🚀 Quick Start
+### 1. Prerequisites
+- Python 3.9+
+- OpenAI API Key
+### 2. Setup
+```bash
+# Clone the repository and enter directory
+cd "Capstone folder"
+# Create and activate virtual environment
+python3 -m venv .venv
+source .venv/bin/activate
+# Install dependencies
+pip install -r requirements.txt
+```
+### 3. Configuration
+Copy `.env.example` to `.env` and fill in your keys:
+```bash
+cp .env.example .env
+```
+**Required**: `OPENAI_API_KEY`
+**Optional**: `GITHUB_TOKEN` and `GITHUB_REPO` (for support tickets)
+### 4. Run the Application
+Use the automated run script to ensure the correct environment is used:
+```bash
+chmod +x run.sh
+./run.sh
+```
+---
+## Project Architecture
+- **`streamlit_app.py`**: Main Streamlit interface.
+- **`agent/`**: AI logic and tool definitions.
+- **`database/`**: Safe SQL execution and CSV-to-SQLite ingestion.
+- **`support/`**: GitHub API integration for support tickets.
+- **`ui/`**: Chart generation (Plotly) and styling.
+- **`utils/`**: Custom Streamlit-integrated logger.
+---
+## 🛡️ Security Policy
+This application is designed with safety as a priority. The `SafetyValidator` provides a robust whitelist of allowed SQL operations, specifically protecting against SQL injection and unauthorized data modification.
+🛡️ **Active Protections**: Only SELECT | All dangerous keywords blocked | Data remains secure

src/agent/__init__.py ADDED

	@@ -0,0 +1,5 @@

+"""Agent package initialization"""
+from agent.ai_agent import AIAgent
+from agent.tools import AgentTools
+__all__ = ['AIAgent', 'AgentTools']

src/agent/ai_agent.py ADDED

	@@ -0,0 +1,184 @@

+"""
+AI Agent - OpenAI-powered assistant with function calling
+"""
+import json
+from typing import List, Dict, Any, Optional
+import logging
+from openai import OpenAI
+from config import OPENAI_API_KEY, OPENAI_MODEL
+from agent.tools import AgentTools
+class AIAgent:
+    """AI Agent powered by OpenAI with function calling capabilities"""
+    SYSTEM_PROMPT = """You are a helpful data analyst assistant for a car auction/pricing database.
+Your role is to help users understand and query car pricing data.
+IMPORTANT GUIDELINES:
+1. **Data Privacy**: Never pass the entire dataset to your responses. Only use the tools to query specific data.
+2. **Safety**: You can only execute SELECT queries. Any attempt to modify data (DELETE, UPDATE, INSERT, DROP) will be blocked.
+3. **Tool Usage**:
+   - Use `query_database` for specific data queries
+   - Use `get_database_statistics` for general overviews and statistics
+   - Use `generate_chart` when the user asks for a chart, visualization, or trend analysis. Choose the most appropriate chart type (bar, column, line, pie, scatter).
+   - Use `create_support_ticket` when you cannot help or user requests human assistance
+4. **Support Escalation**: If you cannot answer a question or the user seems frustrated, proactively suggest creating a support ticket.
+5. **Clear Communication**: Explain your findings clearly with relevant numbers and insights.
+DATABASE SCHEMA:
+- Table: cars
+- Columns: year, make, model, trim, body, transmission, vin, state, condition, odometer, color, interior, seller, mmr, sellingprice, saledate
+Be concise, helpful, and data-driven in your responses."""
+    def __init__(self, tools: AgentTools):
+        self.tools = tools
+        self.client = OpenAI(api_key=OPENAI_API_KEY)
+        self.model = OPENAI_MODEL
+        self.logger = logging.getLogger(__name__)
+        self.conversation_history: List[Dict[str, Any]] = []
+        # Initialize with system prompt
+        self.conversation_history.append({
+            "role": "system",
+            "content": self.SYSTEM_PROMPT
+        })
+    def chat(self, user_message: str) -> Dict[str, Any]:
+        """
+        Process a user message and return AI response with metadata
+        Args:
+            user_message: User's question or request
+        Returns:
+            Dictionary with 'content' (str) and optional 'chart' (dict)
+        """
+        try:
+            # Add user message to history
+            self.conversation_history.append({
+                "role": "user",
+                "content": user_message
+            })
+            # Get AI response with function calling
+            return self._get_ai_response()
+        except Exception as e:
+            error_msg = f"Error processing message: {str(e)}"
+            self.logger.error(error_msg)
+            return {
+                "content": f"❌ {error_msg}",
+                "chart": None
+            }
+    def _get_ai_response(self, max_iterations: int = 5) -> Dict[str, Any]:
+        """
+        Get AI response with function calling loop
+        Args:
+            max_iterations: Maximum number of function calling iterations
+        Returns:
+            Dictionary with 'content' and optional 'chart'
+        """
+        iteration = 0
+        last_chart = None
+        while iteration < max_iterations:
+            iteration += 1
+            # Call OpenAI API
+            response = self.client.chat.completions.create(
+                model=self.model,
+                messages=self.conversation_history,
+                tools=AgentTools.get_tool_definitions(),
+                tool_choice="auto"
+            )
+            message = response.choices[0].message
+            # Check if AI wants to call a function
+            if message.tool_calls:
+                # Add assistant message to history
+                self.conversation_history.append({
+                    "role": "assistant",
+                    "content": message.content,
+                    "tool_calls": [
+                        {
+                            "id": tc.id,
+                            "type": tc.type,
+                            "function": {
+                                "name": tc.function.name,
+                                "arguments": tc.function.arguments
+                            }
+                        }
+                        for tc in message.tool_calls
+                    ]
+                })
+                # Execute each tool call
+                for tool_call in message.tool_calls:
+                    function_name = tool_call.function.name
+                    function_args = json.loads(tool_call.function.arguments)
+                    self.logger.info(f"AI calling function: {function_name}")
+                    # Execute the tool
+                    result = self.tools.execute_tool(function_name, function_args)
+                    # Capture chart result if it's a chart
+                    if result.get('is_chart'):
+                        last_chart = result.get('chart_config')
+                    # Add function result to history
+                    self.conversation_history.append({
+                        "role": "tool",
+                        "tool_call_id": tool_call.id,
+                        "content": json.dumps(result)
+                    })
+                # Continue loop to get final response
+                continue
+            else:
+                # No more function calls, return final response
+                final_response = message.content or "I apologize, but I couldn't generate a response."
+                # Add to history
+                self.conversation_history.append({
+                    "role": "assistant",
+                    "content": final_response
+                })
+                return {
+                    "content": final_response,
+                    "chart": last_chart
+                }
+        # Max iterations reached
+        return {
+            "content": "I apologize, but I'm having trouble processing your request. Would you like me to create a support ticket for human assistance?",
+            "chart": None
+        }
+    def reset_conversation(self):
+        """Reset conversation history"""
+        self.conversation_history = [{
+            "role": "system",
+            "content": self.SYSTEM_PROMPT
+        }]
+        self.logger.info("Conversation history reset")
+    def get_conversation_context(self) -> str:
+        """Get conversation history as formatted string for support tickets"""
+        context = []
+        for msg in self.conversation_history:
+            if msg["role"] == "user":
+                context.append(f"User: {msg['content']}")
+            elif msg["role"] == "assistant" and msg.get("content"):
+                context.append(f"Assistant: {msg['content']}")
+        return "\n\n".join(context)
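
Note on `_get_ai_response` in ai_agent.py: the bounded function-calling loop can be exercised without network access by swapping the chat completion call for a plain callable. The sketch below is an illustration under stated assumptions, not the commit's code. `run_tool_loop`, `fake_model`, `fake_tool`, and the flattened tool-call shape (`name` and `arguments` at the top level of each call) are hypothetical simplifications of the OpenAI message format. It also guards `json.loads`, because malformed arguments would otherwise leave an assistant tool call in the history with no matching `tool` message.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
LIMIT_MESSAGE = "Iteration limit reached"


def run_tool_loop(
    ask_model: Callable[[List[Message]], Message],
    run_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    history: List[Message],
    max_iterations: int = 5,
) -> str:
    """Alternate between model and tools until the model answers in plain text."""
    for _ in range(max_iterations):
        reply = ask_model(history)
        history.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            # No tool requested: this is the final answer.
            return reply.get("content") or ""
        for call in calls:
            try:
                args = json.loads(call["arguments"])
            except json.JSONDecodeError as exc:
                # Report the problem to the model instead of raising, so every
                # tool call still receives a matching tool message.
                result = {"success": False, "error": f"Invalid arguments: {exc}"}
            else:
                result = run_tool(call["name"], args)
            history.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    return LIMIT_MESSAGE


def fake_model(history: List[Message]) -> Message:
    """Hypothetical stand-in for the model: request one tool, then answer."""
    if not any(m["role"] == "tool" for m in history):
        return {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1",
                "name": "query_database",
                "arguments": '{"sql_query": "SELECT 1"}',
            }],
        }
    return {"role": "assistant", "content": "Done."}


def fake_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical tool executor that echoes what it was asked to run."""
    return {"success": True, "tool": name, "args": args}


history: List[Message] = [{"role": "user", "content": "How many cars?"}]
print(run_tool_loop(fake_model, fake_tool, history))
```

Passing the model and the tool executor in as callables keeps the loop itself free of API keys and makes the iteration cap easy to test.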

src/agent/tools.py ADDED

	@@ -0,0 +1,269 @@

+"""
+AI Agent Tools - Function definitions for OpenAI function calling
+"""
+import json
+from typing import Dict, Any, Optional
+import logging
+from database.db_manager import DatabaseManager
+from support.github_integration import GitHubSupport
+class AgentTools:
+    """Tools available to the AI agent via function calling"""
+    def __init__(self, db_manager: DatabaseManager, github_support: Optional[GitHubSupport] = None):
+        self.db_manager = db_manager
+        self.github_support = github_support
+        self.logger = logging.getLogger(__name__)
+    @staticmethod
+    def get_tool_definitions() -> list:
+        """
+        Get OpenAI function definitions for all available tools
+        Returns:
+            List of tool definitions in OpenAI format
+        """
+        return [
+            {
+                "type": "function",
+                "function": {
+                    "name": "query_database",
+                    "description": "Execute a SQL SELECT query on the car prices database. Use this to retrieve specific data based on user questions. Only SELECT queries are allowed for safety. The database contains car auction data with columns: year, make, model, trim, body, transmission, vin, state, condition, odometer, color, interior, seller, mmr, sellingprice, saledate.",
+                    "parameters": {
+                        "type": "object",
+                        "properties": {
+                            "sql_query": {
+                                "type": "string",
+                                "description": "The SQL SELECT query to execute. Must be a valid SELECT statement. Example: 'SELECT AVG(sellingprice) FROM cars WHERE make = \"BMW\"'"
+                            }
+                        },
+                        "required": ["sql_query"]
+                    }
+                }
+            },
+            {
+                "type": "function",
+                "function": {
+                    "name": "get_database_statistics",
+                    "description": "Get comprehensive statistics and aggregated information about the car prices database. Use this when user asks for general information, overview, or statistics about the data. Returns total records, price statistics, top makes/models, condition distribution, and year range.",
+                    "parameters": {
+                        "type": "object",
+                        "properties": {},
+                        "required": []
+                    }
+                }
+            },
+            {
+                "type": "function",
+                "function": {
+                    "name": "create_support_ticket",
+                    "description": "Create a support ticket to reach a human for help. Use this when the user explicitly asks for human support, or when you cannot answer their question adequately. The ticket will be created as a GitHub issue.",
+                    "parameters": {
+                        "type": "object",
+                        "properties": {
+                            "title": {
+                                "type": "string",
+                                "description": "Brief title summarizing the support request"
+                            },
+                            "description": {
+                                "type": "string",
+                                "description": "Detailed description of the issue or question, including conversation context"
+                            },
+                            "priority": {
+                                "type": "string",
+                                "enum": ["low", "medium", "high"],
+                                "description": "Priority level of the support request"
+                            }
+                        },
+                        "required": ["title", "description"]
+                    }
+                }
+            },
+            {
+                "type": "function",
+                "function": {
+                    "name": "generate_chart",
+                    "description": "Generate a dynamic chart based on a SQL query. Use this when the user asks for a chart, visualization, or comparison that would look better as a graph. You must provide a valid SQL SELECT query and chart configurations.",
+                    "parameters": {
+                        "type": "object",
+                        "properties": {
+                            "sql_query": {
+                                "type": "string",
+                                "description": "SQL SELECT query to get data for the chart. Example: 'SELECT make, AVG(sellingprice) FROM cars GROUP BY make'"
+                            },
+                            "chart_type": {
+                                "type": "string",
+                                "enum": ["bar", "column", "line", "pie", "scatter"],
+                                "description": "Type of chart to generate"
+                            },
+                            "title": {
+                                "type": "string",
+                                "description": "Title of the chart"
+                            },
+                            "x_label": {
+                                "type": "string",
+                                "description": "Label for the X-axis (column name from query)"
+                            },
+                            "y_label": {
+                                "type": "string",
+                                "description": "Label for the Y-axis (column name from query)"
+                            }
+                        },
+                        "required": ["sql_query", "chart_type", "title"]
+                    }
+                }
+            }
+        ]
+    def execute_tool(self, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
+        """
+        Execute a tool based on function call from OpenAI
+        Args:
+            tool_name: Name of the tool to execute
+            arguments: Arguments for the tool
+        Returns:
+            Result dictionary from the tool execution
+        """
+        self.logger.info(f"Executing tool: {tool_name} with args: {arguments}")
+        if tool_name == "query_database":
+            return self._query_database(arguments.get("sql_query", ""))
+        elif tool_name == "get_database_statistics":
+            return self._get_database_statistics()
+        elif tool_name == "create_support_ticket":
+            return self._create_support_ticket(
+                title=arguments.get("title", ""),
+                description=arguments.get("description", ""),
+                priority=arguments.get("priority", "medium")
+            )
+        elif tool_name == "generate_chart":
+            return self._generate_chart(
+                sql_query=arguments.get("sql_query", ""),
+                chart_type=arguments.get("chart_type", "bar"),
+                title=arguments.get("title", ""),
+                x_label=arguments.get("x_label"),
+                y_label=arguments.get("y_label")
+            )
+        else:
+            return {
+                "success": False,
+                "error": f"Unknown tool: {tool_name}"
+            }
+    def _query_database(self, sql_query: str) -> Dict[str, Any]:
+        """Execute a database query"""
+        self.logger.info(f"Executing query: {sql_query}")
+        result = self.db_manager.execute_query(sql_query)
+        # Format result for AI consumption
+        if result['success']:
+            # Limit data sent to AI to avoid token limits
+            data = result['data']
+            if len(data) > 100:
+                return {
+                    "success": True,
+                    "message": f"Query returned {len(data)} rows (showing first 100)",
+                    "data": data[:100],
+                    "row_count": len(data),
+                    "truncated": True
+                }
+            else:
+                return {
+                    "success": True,
+                    "message": f"Query returned {len(data)} rows",
+                    "data": data,
+                    "row_count": len(data),
+                    "truncated": False
+                }
+        else:
+            return {
+                "success": False,
+                "error": result['error']
+            }
+    def _get_database_statistics(self) -> Dict[str, Any]:
+        """Get database statistics"""
+        self.logger.info("Retrieving database statistics")
+        stats = self.db_manager.get_statistics()
+        if stats:
+            return {
+                "success": True,
+                "statistics": stats
+            }
+        else:
+            return {
+                "success": False,
+                "error": "Failed to retrieve statistics"
+            }
+    def _create_support_ticket(self, title: str, description: str, priority: str = "medium") -> Dict[str, Any]:
+        """Create a support ticket"""
+        self.logger.info(f"Creating support ticket: {title}")
+        if self.github_support:
+            result = self.github_support.create_issue(
+                title=title,
+                body=description,
+                labels=["support", f"priority-{priority}"]
+            )
+            return result
+        else:
+            # Mock support ticket if GitHub not configured
+            return {
+                "success": True,
+                "message": "Support ticket created (mock mode - GitHub not configured)",
+                "ticket_id": "MOCK-001",
+                "title": title,
+                "priority": priority
+            }
+    def _generate_chart(
+        self,
+        sql_query: str,
+        chart_type: str,
+        title: str,
+        x_label: Optional[str] = None,
+        y_label: Optional[str] = None
+    ) -> Dict[str, Any]:
+        """Execute query and return chart configuration"""
+        self.logger.info(f"Generating chart: {chart_type} - {title}")
+        # Execute query first
+        query_result = self._query_database(sql_query)
+        if query_result['success']:
+            data = query_result['data']
+            if not data:
+                return {
+                    "success": False,
+                    "error": "Query returned no data for the chart."
+                }
+            # Use provided labels or infer from data
+            cols = list(data[0].keys())
+            x_axis = x_label if x_label in cols else cols[0]
+            y_axis = y_label if y_label in cols else (cols[1] if len(cols) > 1 else cols[0])
+            return {
+                "success": True,
+                "is_chart": True,
+                "chart_config": {
+                    "type": chart_type,
+                    "title": title,
+                    "x_label": x_axis,
+                    "y_label": y_axis,
+                    "data": data
+                },
+                "message": f"Successfully generated {chart_type} chart: {title}"
+            }
+        else:
+            return query_result
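
Note on `_generate_chart` in tools.py: the axis selection falls back to the first and second result columns whenever the supplied label is not an actual column name. The standalone sketch below isolates that rule so the fallback is easy to see. `infer_axes` is a hypothetical helper name and the sample row is invented; the selection logic mirrors the two lines in the diff.

```python
from typing import Any, Dict, List, Optional, Tuple


def infer_axes(
    rows: List[Dict[str, Any]],
    x_label: Optional[str] = None,
    y_label: Optional[str] = None,
) -> Tuple[str, str]:
    """Pick the x and y columns for a chart from query result rows."""
    if not rows:
        raise ValueError("Query returned no data for the chart.")
    cols = list(rows[0].keys())
    # A label is used only if it matches a column exactly; otherwise fall
    # back to the first column for x and the second (if any) for y.
    x_axis = x_label if x_label in cols else cols[0]
    y_axis = y_label if y_label in cols else (cols[1] if len(cols) > 1 else cols[0])
    return x_axis, y_axis


rows = [{"make": "BMW", "avg_price": 28875.0}]
print(infer_axes(rows))                            # labels omitted
print(infer_axes(rows, "Make", "Average Price"))   # display text, not column names
```

Both calls resolve to the same columns, because a human-readable label such as "Average Price" does not match any key in the rows. One consequence is that display labels passed by the model are silently discarded; aliasing columns in the SQL query (`AVG(sellingprice) AS avg_price`) is one way to keep the axis names readable.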

src/config.py ADDED

	@@ -0,0 +1,48 @@

+"""
+Configuration management for Data Insights App
+"""
+import os
+from pathlib import Path
+from dotenv import load_dotenv
+# Load environment variables from .env file
+load_dotenv()
+# Base paths
+BASE_DIR = Path(__file__).parent
+DATA_DIR = BASE_DIR / "data"
+DATABASE_DIR = BASE_DIR / "database"
+# API Configuration
+OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")
+OPENAI_MODEL = "gpt-4-turbo-preview"  # Model with function calling support
+# GitHub Configuration (Optional)
+GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", "")
+GITHUB_REPO = os.getenv("GITHUB_REPO", "")
+GITHUB_FOLDER = os.getenv("GITHUB_FOLDER", "")  # Optional folder/project prefix for issues
+# Database Configuration
+DATABASE_PATH = os.getenv("DATABASE_PATH", str(DATABASE_DIR / "car_prices.db"))
+CSV_DATA_PATH = DATA_DIR / "car_prices.csv"
+# Application Settings
+LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
+MAX_LOG_ENTRIES = 100  # Maximum number of log entries to keep in sidebar
+# Sample queries for user guidance
+SAMPLE_QUERIES = [
+    "What's the average selling price of BMW cars?",
+    "Show me the top 5 most expensive car models",
+    "How many cars were sold in California?",
+    "What's the price difference between automatic and manual transmission?",
+    "Show statistics about cars in excellent condition",
+    "Which seller has the most cars in the database?",
+]
+# Safety settings
+ALLOWED_SQL_OPERATIONS = ["SELECT"]
+DANGEROUS_SQL_KEYWORDS = [
+    "DELETE", "DROP", "TRUNCATE", "ALTER",
+    "UPDATE", "INSERT", "CREATE", "REPLACE"
+]
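
Note on the safety settings in config.py: a SELECT-only check built from these two lists could look roughly like the sketch below. This is not the commit's `SafetyValidator` (its source is not shown in this part of the diff); `is_query_allowed` is a hypothetical helper, and the two lists are copied here only so the example is self-contained.

```python
import re
from typing import Tuple

# Copied from config.py so the sketch runs on its own.
ALLOWED_SQL_OPERATIONS = ["SELECT"]
DANGEROUS_SQL_KEYWORDS = [
    "DELETE", "DROP", "TRUNCATE", "ALTER",
    "UPDATE", "INSERT", "CREATE", "REPLACE",
]


def is_query_allowed(sql: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a single SQL statement."""
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        return False, "Empty query"
    # A remaining semicolon suggests stacked statements.
    if ";" in statement:
        return False, "Multiple statements are not allowed"
    match = re.match(r"[A-Za-z]+", statement)
    first_word = match.group(0).upper() if match else ""
    if first_word not in ALLOWED_SQL_OPERATIONS:
        return False, f"Only {', '.join(ALLOWED_SQL_OPERATIONS)} queries are allowed"
    for keyword in DANGEROUS_SQL_KEYWORDS:
        # Whole-word match, so a column such as updated_at is not flagged.
        if re.search(rf"\b{keyword}\b", statement, re.IGNORECASE):
            return False, f"Blocked keyword: {keyword}"
    return True, "OK"


for query in (
    "SELECT AVG(sellingprice) FROM cars WHERE make = 'BMW'",
    "DROP TABLE cars",
    "SELECT 1; DELETE FROM cars",
):
    print(query, "->", is_query_allowed(query))
```

Keyword matching of this kind is deliberately conservative and has known false positives: it rejects the SQLite `REPLACE()` string function, string literals that contain a blocked word or a semicolon, and queries that start with `WITH`. Opening the database read-only, for example `sqlite3.connect("file:car_prices.db?mode=ro", uri=True)`, would be a complementary guard that does not depend on inspecting the SQL text.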

src/data/car_prices.csv ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:32ba3ce51664e6a12c0c927ed193b41e3c4743fdf18bc0317389892aed27f556
+size 88047552

src/database/__init__.py ADDED

	@@ -0,0 +1,5 @@

+"""Database package initialization"""
+from database.db_manager import DatabaseManager
+from database.safety_validator import SafetyValidator
+__all__ = ['DatabaseManager', 'SafetyValidator']

src/database/db_manager.py ADDED

	@@ -0,0 +1,240 @@

+"""
+Database Manager - Handles SQLite database operations and CSV data ingestion
+"""
+import sqlite3
+import pandas as pd
+from pathlib import Path
+from typing import List, Dict, Any, Optional
+import logging
+from config import DATABASE_PATH, CSV_DATA_PATH
+from database.safety_validator import SafetyValidator
+class DatabaseManager:
+    """Manages database connections and operations"""
+    def __init__(self, db_path: str = DATABASE_PATH):
+        self.db_path = db_path
+        self.validator = SafetyValidator()
+        self.logger = logging.getLogger(__name__)
+        # Ensure database directory exists
+        Path(db_path).parent.mkdir(parents=True, exist_ok=True)
+        # Initialize database
+        self._initialize_database()
+    def _initialize_database(self):
+        """Initialize database and load data from CSV if needed"""
+        db_exists = Path(self.db_path).exists()
+        if not db_exists:
+            self.logger.info("Database not found. Creating new database from CSV...")
+            self._load_csv_to_database()
+        else:
+            self.logger.info(f"Database found at {self.db_path}")
+    def _load_csv_to_database(self):
+        """Load car_prices.csv into SQLite database"""
+        try:
+            # Check if CSV exists
+            if not CSV_DATA_PATH.exists():
+                raise FileNotFoundError(f"CSV file not found: {CSV_DATA_PATH}")
+            self.logger.info(f"Loading data from {CSV_DATA_PATH}...")
+            # Read CSV with pandas
+            df = pd.read_csv(CSV_DATA_PATH)
+            # Clean column names (remove spaces, lowercase)
+            df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
+            # Connect to database
+            conn = sqlite3.connect(self.db_path)
+            # Write to SQLite
+            df.to_sql('cars', conn, if_exists='replace', index=False)
+            # Create indexes for common queries
+            cursor = conn.cursor()
+            cursor.execute("CREATE INDEX IF NOT EXISTS idx_make ON cars(make)")
+            cursor.execute("CREATE INDEX IF NOT EXISTS idx_model ON cars(model)")
+            cursor.execute("CREATE INDEX IF NOT EXISTS idx_year ON cars(year)")
+            cursor.execute("CREATE INDEX IF NOT EXISTS idx_state ON cars(state)")
+            conn.commit()
+            conn.close()
+            self.logger.info(f"Successfully loaded {len(df)} records into database")
+        except Exception as e:
+            self.logger.error(f"Error loading CSV to database: {e}")
+            raise
+    def execute_query(self, query: str, params: Optional[tuple] = None) -> Dict[str, Any]:
+        """
+        Execute a SQL query with safety validation
+        Args:
+            query: SQL query to execute
+            params: Optional parameters for parameterized queries
+        Returns:
+            Dictionary with 'success', 'data', 'error', and 'row_count' keys
+        """
+        # Validate query safety
+        is_valid, error_msg = self.validator.validate_query(query)
+        if not is_valid:
+            self.logger.warning(f"Blocked unsafe query: {query}")
+            return {
+                'success': False,
+                'data': None,
+                'error': error_msg,
+                'row_count': 0
+            }
+        try:
+            conn = sqlite3.connect(self.db_path)
+            conn.row_factory = sqlite3.Row  # Enable column access by name
+            cursor = conn.cursor()
+            # Execute query
+            if params:
+                cursor.execute(query, params)
+            else:
+                cursor.execute(query)
+            # Fetch results
+            rows = cursor.fetchall()
+            # Convert to list of dictionaries
+            data = [dict(row) for row in rows]
+            conn.close()
+            self.logger.info(f"Query executed successfully. Returned {len(data)} rows.")
+            return {
+                'success': True,
+                'data': data,
+                'error': None,
+                'row_count': len(data)
+            }
+        except Exception as e:
+            error_msg = f"Database error: {str(e)}"
+            self.logger.error(error_msg)
+            return {
+                'success': False,
+                'data': None,
+                'error': error_msg,
+                'row_count': 0
+            }
+    def get_statistics(self) -> Dict[str, Any]:
+        """Get aggregated statistics about the database"""
+        try:
+            stats = {}
+            conn = sqlite3.connect(self.db_path)
+            cursor = conn.cursor()
+            # Total records
+            cursor.execute("SELECT COUNT(*) FROM cars")
+            stats['total_records'] = cursor.fetchone()[0]
+            # Price statistics
+            cursor.execute("""
+                SELECT
+                    AVG(sellingprice) as avg_price,
+                    MIN(sellingprice) as min_price,
+                    MAX(sellingprice) as max_price
+                FROM cars
+                WHERE sellingprice IS NOT NULL AND sellingprice > 0
+            """)
+            price_stats = cursor.fetchone()
+            stats['avg_price'] = round(price_stats[0], 2) if price_stats[0] else 0
+            stats['min_price'] = price_stats[1] if price_stats[1] else 0
+            stats['max_price'] = price_stats[2] if price_stats[2] else 0
+            # Top 5 makes by count
+            cursor.execute("""
+                SELECT make, COUNT(*) as count
+                FROM cars
+                GROUP BY make
+                ORDER BY count DESC
+                LIMIT 5
+            """)
+            stats['top_makes'] = [
+                {'make': row[0], 'count': row[1]}
+                for row in cursor.fetchall()
+            ]
+            # Top 5 models by count
+            cursor.execute("""
+                SELECT model, COUNT(*) as count
+                FROM cars
+                GROUP BY model
+                ORDER BY count DESC
+                LIMIT 5
+            """)
+            stats['top_models'] = [
+                {'model': row[0], 'count': row[1]}
+                for row in cursor.fetchall()
+            ]
+            # Condition distribution
+            cursor.execute("""
+                SELECT condition, COUNT(*) as count
+                FROM cars
+                WHERE condition IS NOT NULL
+                GROUP BY condition
+                ORDER BY count DESC
+            """)
+            stats['condition_distribution'] = [
+                {'condition': row[0], 'count': row[1]}
+                for row in cursor.fetchall()
+            ]
+            # Year range
+            cursor.execute("SELECT MIN(year), MAX(year) FROM cars")
+            year_range = cursor.fetchone()
+            stats['year_range'] = {
+                'min': year_range[0],
+                'max': year_range[1]
+            }
+            conn.close()
+            self.logger.info("Statistics retrieved successfully")
+            return stats
+        except Exception as e:
+            self.logger.error(f"Error getting statistics: {e}")
+            return {}
+    def get_table_info(self) -> Dict[str, Any]:
+        """Get information about the database schema"""
+        try:
+            conn = sqlite3.connect(self.db_path)
+            cursor = conn.cursor()
+            # Get column information
+            cursor.execute("PRAGMA table_info(cars)")
+            columns = [
+                {'name': row[1], 'type': row[2]}
+                for row in cursor.fetchall()
+            ]
+            conn.close()
+            return {
+                'table_name': 'cars',
+                'columns': columns
+            }
+        except Exception as e:
+            self.logger.error(f"Error getting table info: {e}")
+            return {}
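
`execute_query` relies on `SafetyValidator` alone to keep the database read-only. As defense in depth, SQLite can also refuse writes at the connection level. The following is a minimal sketch, assuming the standard-library `sqlite3` URI syntax; `open_read_only` and the demo table are hypothetical names that do not appear in this commit.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite file in read-only mode (hypothetical helper)."""
    # "mode=ro" makes SQLite reject every write on this connection.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


# Demo against a throwaway database, not the app's car_prices.db.
tmp_dir = tempfile.mkdtemp()
demo_path = str(Path(tmp_dir) / "demo.db")
with closing(sqlite3.connect(demo_path)) as rw:
    rw.execute("CREATE TABLE cars (make TEXT)")
    rw.execute("INSERT INTO cars VALUES ('BMW')")
    rw.commit()

with closing(open_read_only(demo_path)) as ro:
    rows = ro.execute("SELECT make FROM cars").fetchall()
    try:
        ro.execute("DELETE FROM cars")
        write_blocked = False
    except sqlite3.OperationalError:
        # SQLite reports an attempt to write a read-only database.
        write_blocked = True
```

Wrapping the connection in `contextlib.closing` also closes it when a query raises, which the `try`/`except` blocks above do not do, since `conn.close()` is skipped on the error path.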

src/database/safety_validator.py ADDED

	@@ -0,0 +1,72 @@

+"""
+SQL Safety Validator - Prevents dangerous database operations
+"""
+import re
+from typing import Tuple
+from config import ALLOWED_SQL_OPERATIONS, DANGEROUS_SQL_KEYWORDS
+class SafetyValidator:
+    """Validates SQL queries to prevent dangerous operations"""
+    @staticmethod
+    def validate_query(query: str) -> Tuple[bool, str]:
+        """
+        Validate if a SQL query is safe to execute
+        Args:
+            query: SQL query string to validate
+        Returns:
+            Tuple of (is_valid, error_message)
+            - is_valid: True if query is safe, False otherwise
+            - error_message: Empty string if valid, error description if invalid
+        """
+        if not query or not query.strip():
+            return False, "Empty query provided"
+        # Normalize query for checking
+        normalized_query = query.strip().upper()
+        # Check for dangerous keywords
+        for keyword in DANGEROUS_SQL_KEYWORDS:
+            # Use word boundaries to avoid false positives
+            pattern = r'\b' + keyword + r'\b'
+            if re.search(pattern, normalized_query):
+                return False, (
+                    f"🚫 BLOCKED: Query contains dangerous operation '{keyword}'. "
+                    f"Only SELECT queries are allowed for safety reasons."
+                )
+        # Ensure query starts with SELECT
+        if not normalized_query.startswith('SELECT'):
+            return False, (
+                "🚫 BLOCKED: Only SELECT queries are allowed. "
+                "This application is read-only to prevent accidental data modification."
+            )
+        # Additional checks for SQL injection patterns
+        suspicious_patterns = [
+            r';.*?(DELETE|DROP|UPDATE|INSERT)',  # Multiple statements
+            r'--',  # SQL comments (potential injection)
+            r'/\*.*?\*/',  # Block comments
+        ]
+        for pattern in suspicious_patterns:
+            if re.search(pattern, normalized_query, re.IGNORECASE | re.DOTALL):
+                return False, (
+                    "🚫 BLOCKED: Query contains suspicious patterns that may indicate "
+                    "SQL injection or multiple statements. Please use simple SELECT queries."
+                )
+        return True, ""
+    @staticmethod
+    def get_safety_message() -> str:
+        """Get a message explaining safety restrictions"""
+        return (
+            "🛡️ **Safety Features Active**\n\n"
+            f"✅ Allowed operations: {', '.join(ALLOWED_SQL_OPERATIONS)}\n"
+            f"❌ Blocked operations: {', '.join(DANGEROUS_SQL_KEYWORDS)}\n\n"
+            "This ensures your data remains safe from accidental modifications."
+        )
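
The keyword scan in `validate_query` runs over the whole query text, so a blocked keyword inside a quoted value is rejected too, for example a filter on a trim level called `'Drop Top'` (a made-up value used only for illustration). One possible refinement is to blank out single-quoted string literals before scanning. This is a sketch under that assumption, not the validator's actual behavior; `find_dangerous_keyword` is a hypothetical helper.

```python
import re
from typing import Optional

# Mirrors DANGEROUS_SQL_KEYWORDS from config.py.
DANGEROUS_SQL_KEYWORDS = [
    "DELETE", "DROP", "TRUNCATE", "ALTER",
    "UPDATE", "INSERT", "CREATE", "REPLACE",
]

# A single-quoted SQL string; '' inside the quotes is an escaped quote.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_KEYWORD = re.compile(
    r"\b(?:" + "|".join(DANGEROUS_SQL_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def find_dangerous_keyword(query: str) -> Optional[str]:
    """Return the first blocked keyword outside string literals, or None."""
    # Replace each literal with an empty one so its contents are not scanned.
    stripped = _STRING_LITERAL.sub("''", query)
    match = _KEYWORD.search(stripped)
    return match.group(0).upper() if match else None


literal_hit = find_dangerous_keyword(
    "SELECT * FROM cars WHERE trim = 'Drop Top'"
)
stacked_hit = find_dangerous_keyword("select * from cars; drop table cars")
```

An unterminated quote is left in place by the regular expression, so the rest of the query is still scanned. Double-quoted identifiers are not handled in this sketch.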

src/streamlit_app.py CHANGED

@@ -1,40 +1,369 @@
-import altair as alt
-import numpy as np
-import pandas as pd
 import streamlit as st
-"""
-# Welcome to Streamlit!
-Edit `/streamlit_app.py` to customize this app to your heart's desire :heart:.
-If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
-forums](https://discuss.streamlit.io).
-In the meantime, below is an example of what you can do with just a few lines of code:
-"""
-num_points = st.slider("Number of points in spiral", 1, 10000, 1100)
-num_turns = st.slider("Number of turns in spiral", 1, 300, 31)
-indices = np.linspace(0, 1, num_points)
-theta = 2 * np.pi * num_turns * indices
-radius = indices
-x = radius * np.cos(theta)
-y = radius * np.sin(theta)
-df = pd.DataFrame({
-    "x": x,
-    "y": y,
-    "idx": indices,
-    "rand": np.random.randn(num_points),
-})
-st.altair_chart(alt.Chart(df, height=700, width=700)
-    .mark_point(filled=True)
-    .encode(
-        x=alt.X("x", axis=None),
-        y=alt.Y("y", axis=None),
-        color=alt.Color("idx", legend=None, scale=alt.Scale()),
-        size=alt.Size("rand", legend=None, scale=alt.Scale(range=[1, 150])),
-    ))

+"""
+Data Insights App - Main Streamlit Application
+"""
 import streamlit as st
+import pandas as pd
+from datetime import datetime
+from config import OPENAI_API_KEY, SAMPLE_QUERIES, MAX_LOG_ENTRIES
+from database import DatabaseManager
+from agent import AIAgent, AgentTools
+from support import GitHubSupport
+from utils import setup_logging, get_logs, clear_logs
+from ui import (
+    create_price_distribution_chart,
+    create_top_makes_chart,
+    create_condition_pie_chart,
+    create_price_by_make_chart,
+    create_dynamic_chart
+)
+# Page configuration
+st.set_page_config(
+    page_title="Data Insights App",
+    page_icon="🚗",
+    layout="wide",
+    initial_sidebar_state="expanded"
+)
+# Custom CSS for better styling
+st.markdown("""
+<style>
+    .main-header {
+        font-size: 2.5rem;
+        font-weight: bold;
+        color: #1f77b4;
+        margin-bottom: 0.5rem;
+    }
+    .sub-header {
+        font-size: 1.2rem;
+        color: #666;
+        margin-bottom: 2rem;
+    }
+    .stat-card {
+        background-color: #f0f2f6;
+        padding: 1rem;
+        border-radius: 0.5rem;
+        margin-bottom: 1rem;
+    }
+    .stat-value {
+        font-size: 1.8rem;
+        font-weight: bold;
+        color: #1f77b4;
+    }
+    .stat-label {
+        font-size: 0.9rem;
+        color: #666;
+    }
+    .log-entry {
+        font-family: monospace;
+        font-size: 0.85rem;
+        padding: 0.3rem;
+        margin: 0.2rem 0;
+        border-left: 3px solid #ddd;
+        padding-left: 0.5rem;
+    }
+    .log-info {
+        border-left-color: #2ca02c;
+    }
+    .log-warning {
+        border-left-color: #ff7f0e;
+    }
+    .log-error {
+        border-left-color: #d62728;
+    }
+</style>
+""", unsafe_allow_html=True)
+def initialize_session_state():
+    """Initialize Streamlit session state variables"""
+    if 'initialized' not in st.session_state:
+        # Set up logging
+        setup_logging(level="INFO", max_entries=MAX_LOG_ENTRIES)
+        # Initialize database
+        st.session_state.db_manager = DatabaseManager()
+        # Initialize GitHub support
+        st.session_state.github_support = GitHubSupport()
+        # Initialize agent tools and AI agent
+        st.session_state.tools = AgentTools(
+            db_manager=st.session_state.db_manager,
+            github_support=st.session_state.github_support
+        )
+        st.session_state.agent = AIAgent(tools=st.session_state.tools)
+        # Chat history
+        st.session_state.messages = []
+        # Statistics cache
+        st.session_state.stats = None
+        st.session_state.stats_loaded = False
+        st.session_state.initialized = True
+def load_statistics():
+    """Load database statistics (cached)"""
+    if not st.session_state.stats_loaded:
+        st.session_state.stats = st.session_state.db_manager.get_statistics()
+        st.session_state.stats_loaded = True
+    return st.session_state.stats
+def render_sidebar():
+    """Render sidebar with logs, stats, and charts"""
+    with st.sidebar:
+        st.markdown("### 🎛️ Control Panel")
+        # API Key check
+        if not OPENAI_API_KEY:
+            st.error("⚠️ OPENAI_API_KEY not set! Please configure your .env file.")
+            st.stop()
+        else:
+            st.success("✅ OpenAI API Connected")
+        st.divider()
+        # Database Statistics
+        st.markdown("### 📊 Database Overview")
+        stats = load_statistics()
+        if stats:
+            col1, col2 = st.columns(2)
+            with col1:
+                st.markdown(f"""
+                <div class="stat-card">
+                    <div class="stat-value">{stats.get('total_records', 0):,}</div>
+                    <div class="stat-label">Total Cars</div>
+                </div>
+                """, unsafe_allow_html=True)
+            with col2:
+                avg_price = stats.get('avg_price', 0)
+                st.markdown(f"""
+                <div class="stat-card">
+                    <div class="stat-value">${avg_price:,.0f}</div>
+                    <div class="stat-label">Avg Price</div>
+                </div>
+                """, unsafe_allow_html=True)
+            # Price range
+            min_price = stats.get('min_price', 0)
+            max_price = stats.get('max_price', 0)
+            st.markdown(f"**Price Range:** ${min_price:,} - ${max_price:,}")
+            # Year range
+            year_range = stats.get('year_range', {})
+            st.markdown(f"**Year Range:** {year_range.get('min', 'N/A')} - {year_range.get('max', 'N/A')}")
+        st.divider()
+        # Charts
+        st.markdown("### 📈 Insights")
+        if stats:
+            # Top makes chart
+            with st.expander("🏆 Top Makes", expanded=False):
+                fig = create_top_makes_chart(stats)
+                st.plotly_chart(fig, use_container_width=True)
+            # Condition distribution
+            with st.expander("🔍 Condition Distribution", expanded=False):
+                fig = create_condition_pie_chart(stats)
+                st.plotly_chart(fig, use_container_width=True)
+            # Average price by make
+            with st.expander("💰 Avg Price by Make", expanded=False):
+                fig = create_price_by_make_chart(st.session_state.db_manager)
+                st.plotly_chart(fig, use_container_width=True)
+        st.divider()
+        # Sample Queries
+        st.markdown("### 💡 Sample Queries")
+        for i, query in enumerate(SAMPLE_QUERIES[:4]):
+            if st.button(f"📝 {query[:40]}...", key=f"sample_{i}", use_container_width=True):
+                st.session_state.sample_query = query
+                st.rerun()
+        st.divider()
+        # Console Logs
+        st.markdown("### 🖥️ Console Logs")
+        col1, col2 = st.columns([3, 1])
+        with col2:
+            if st.button("🗑️ Clear", use_container_width=True):
+                clear_logs()
+                st.rerun()
+        # Display logs
+        logs = get_logs()
+        if logs:
+            log_container = st.container(height=300)
+            with log_container:
+                for log in reversed(logs[-50:]):  # Show last 50 logs
+                    level = log['level'].lower()
+                    css_class = f"log-{level}"
+                    st.markdown(f"""
+                    <div class="log-entry {css_class}">
+                        <strong>[{log['timestamp']}]</strong> {log['level']}: {log['message']}
+                    </div>
+                    """, unsafe_allow_html=True)
+        else:
+            st.info("No logs yet. Start chatting to see activity!")
+def render_chat_interface():
+    """Render main chat interface"""
+    # Header
+    st.markdown('<div class="main-header">🚗 Car Data Insights Assistant</div>', unsafe_allow_html=True)
+    st.markdown('<div class="sub-header">Ask questions about car auction data powered by AI</div>', unsafe_allow_html=True)
+    # Display chat messages
+    for message in st.session_state.messages:
+        with st.chat_message(message["role"]):
+            st.markdown(message["content"])
+            if message.get("chart"):
+                chart_config = message["chart"]
+                fig = create_dynamic_chart(
+                    data=chart_config['data'],
+                    chart_type=chart_config['type'],
+                    title=chart_config['title'],
+                    x_label=chart_config['x_label'],
+                    y_label=chart_config['y_label']
+                )
+                st.plotly_chart(fig, use_container_width=True)
+    # Handle sample query selection
+    if 'sample_query' in st.session_state:
+        user_input = st.session_state.sample_query
+        del st.session_state.sample_query
+    else:
+        user_input = st.chat_input("Ask me anything about the car data...")
+    # Process user input
+    if user_input:
+        # Add user message to chat
+        st.session_state.messages.append({"role": "user", "content": user_input})
+        with st.chat_message("user"):
+            st.markdown(user_input)
+        # Get AI response
+        with st.chat_message("assistant"):
+            with st.spinner("Thinking..."):
+                response_data = st.session_state.agent.chat(user_input)
+                content = response_data["content"]
+                chart = response_data.get("chart")
+                st.markdown(content)
+                if chart:
+                    fig = create_dynamic_chart(
+                        data=chart['data'],
+                        chart_type=chart['type'],
+                        title=chart['title'],
+                        x_label=chart['x_label'],
+                        y_label=chart['y_label']
+                    )
+                    st.plotly_chart(fig, use_container_width=True)
+        # Add assistant response to chat
+        st.session_state.messages.append({
+            "role": "assistant",
+            "content": content,
+            "chart": chart
+        })
+        st.rerun()
+def render_support_section():
+    """Render support ticket creation section"""
+    st.divider()
+    with st.expander("🎫 Need Human Support?", expanded=False):
+        st.markdown("""
+        If the AI assistant can't help you, create a support ticket to reach a human expert.
+        Your conversation history will be included automatically.
+        """)
+        col1, col2 = st.columns([3, 1])
+        with col1:
+            ticket_title = st.text_input(
+                "Issue Summary",
+                placeholder="Brief description of your issue..."
+            )
+        with col2:
+            priority = st.selectbox("Priority", ["low", "medium", "high"])
+        ticket_description = st.text_area(
+            "Details",
+            placeholder="Provide more details about your issue...",
+            height=100
+        )
+        if st.button("📤 Create Support Ticket", type="primary"):
+            if not ticket_title:
+                st.error("Please provide a ticket title")
+            else:
+                # Get conversation context
+                context = st.session_state.agent.get_conversation_context()
+                # Create full description with context
+                full_description = f"{ticket_description}\n\n---\n\n**Conversation History:**\n\n{context}"
+                # Create ticket
+                result = st.session_state.tools.execute_tool(
+                    "create_support_ticket",
+                    {
+                        "title": ticket_title,
+                        "description": full_description,
+                        "priority": priority
+                    }
+                )
+                if result.get('success'):
+                    st.success(f"✅ {result.get('message')}")
+                    if 'issue_url' in result:
+                        st.markdown(f"**Issue URL:** {result['issue_url']}")
+                    elif 'ticket_id' in result:
+                        st.markdown(f"**Ticket ID:** {result['ticket_id']}")
+                else:
+                    st.error(f"❌ {result.get('error')}")
+def main():
+    """Main application entry point"""
+    # Initialize
+    initialize_session_state()
+    # Render sidebar
+    render_sidebar()
+    # Render main chat interface
+    render_chat_interface()
+    # Render support section
+    render_support_section()
+    # Footer
+    st.divider()
+    st.markdown("""
+    <div style="text-align: center; color: #666; font-size: 0.9rem;">
+        🛡️ <strong>Safety Features Active:</strong> Only SELECT queries allowed |
+        All dangerous operations blocked |
+        Data remains secure
+    </div>
+    """, unsafe_allow_html=True)
+if __name__ == "__main__":
+    main()
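
The sidebar renders each log entry with `unsafe_allow_html=True`, and `db_manager.py` logs the text of blocked queries verbatim, so user-influenced text can end up inside that HTML. A minimal sketch of escaping the dynamic fields first, assuming log entries are dicts with the `timestamp`, `level`, and `message` keys used in `render_sidebar`; `render_log_entry` is a hypothetical helper, not part of this commit.

```python
import html


def render_log_entry(log: dict) -> str:
    """Build the log-entry HTML with every dynamic field escaped."""
    level = html.escape(str(log["level"]))
    timestamp = html.escape(str(log["timestamp"]))
    message = html.escape(str(log["message"]))
    return (
        f'<div class="log-entry log-{level.lower()}">'
        f"<strong>[{timestamp}]</strong> {level}: {message}"
        "</div>"
    )


entry = render_log_entry({
    "timestamp": "12:00:00",
    "level": "WARNING",
    "message": "Blocked unsafe query: SELECT '<b>x</b>'; DROP TABLE cars",
})
```

The returned string could then be passed to `st.markdown(..., unsafe_allow_html=True)` in place of the inline f-string.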

src/support/__init__.py ADDED

	@@ -0,0 +1,4 @@

+"""Support package initialization"""
+from support.github_integration import GitHubSupport
+__all__ = ['GitHubSupport']

src/support/github_integration.py ADDED

	@@ -0,0 +1,168 @@

+"""
+GitHub Integration for Support Tickets
+"""
+import os
+import logging
+from typing import Dict, Any, Optional, List
+from github import Github, GithubException, Auth
+from config import GITHUB_TOKEN, GITHUB_REPO
+class GitHubSupport:
+    """Handles GitHub issue creation for support tickets"""
+    def __init__(self, token: Optional[str] = None, repo_name: Optional[str] = None, folder: Optional[str] = None):
+        self.token = token or GITHUB_TOKEN
+        self.repo_name = repo_name or GITHUB_REPO
+        self.folder = folder or os.getenv("GITHUB_FOLDER", "")
+        self.logger = logging.getLogger(__name__)
+        self.github = None
+        self.repo = None
+        if self.token and self.repo_name:
+            try:
+                # Use Auth.Token for authentication
+                auth = Auth.Token(self.token)
+                self.github = Github(auth=auth)
+                self.repo = self.github.get_repo(self.repo_name)
+                self.logger.info(f"GitHub integration initialized for repo: {self.repo_name}")
+            except Exception as e:
+                self.logger.warning(f"GitHub initialization failed: {e}")
+    def is_configured(self) -> bool:
+        """Check if GitHub integration is properly configured"""
+        return self.github is not None and self.repo is not None
+    def _ensure_label_exists(self, label_name: str, color: str = "0075ca"):
+        """Ensures a label exists in the repository, creates it if it doesn't"""
+        if not self.repo:
+            return
+        try:
+            self.repo.get_label(label_name)
+        except GithubException:
+            try:
+                self.repo.create_label(name=label_name, color=color)
+                self.logger.info(f"Created new label: {label_name}")
+            except Exception as e:
+                self.logger.warning(f"Could not create label {label_name}: {e}")
+    def create_issue(
+        self,
+        title: str,
+        body: str,
+        labels: Optional[List[str]] = None
+    ) -> Dict[str, Any]:
+        """
+        Create a GitHub issue as a support ticket
+        Args:
+            title: Issue title
+            body: Issue description
+            labels: Optional list of labels to add
+        Returns:
+            Dictionary with success status and issue details
+        """
+        if not self.is_configured():
+            self.logger.warning("GitHub not configured, using mock mode")
+            return self._create_mock_ticket(title, body, labels)
+        try:
+            issue_labels = []
+            # 1. Handle folder label and title decoration
+            if self.folder:
+                folder_label = self.folder.lower().replace("/", "-").replace(" ", "-")
+                self._ensure_label_exists(folder_label, "0075ca") # Blue
+                issue_labels.append(folder_label)
+                # Decorate title
+                prefixed_title = f"[{self.folder}] {title}"
+                # Add details to body
+                full_body = f"**Project:** {self.folder}\n\n**Description:**\n{body}"
+            else:
+                prefixed_title = title
+                full_body = body
+            # 2. Add customer-support label (ensure it exists)
+            self._ensure_label_exists("customer-support", "d73a4a") # Reddish
+            issue_labels.append("customer-support")
+            # 3. Add any additional labels provided
+            if labels:
+                for label in labels:
+                    if label not in issue_labels:
+                        # Ensure these labels exist too
+                        self._ensure_label_exists(label, "e6e6e6") # Light gray
+                        issue_labels.append(label)
+            # Create the issue
+            issue = self.repo.create_issue(
+                title=prefixed_title,
+                body=full_body,
+                labels=issue_labels
+            )
+            self.logger.info(f"Created GitHub issue #{issue.number}: {prefixed_title}")
+            return {
+                "success": True,
+                "message": f"Ticket created successfully! Ticket ID: #{issue.number}",
+                "issue_number": issue.number,
+                "issue_url": issue.html_url,
+                "title": prefixed_title
+            }
+        except GithubException as e:
+            # Handle Validation Failed with more detail
+            error_data = getattr(e, 'data', {})
+            error_msg = error_data.get('message', str(e))
+            errors = error_data.get('errors', [])
+            full_error = f"GitHub API error: {error_msg}"
+            if errors:
+                full_error += f" - Details: {str(errors)}"
+            self.logger.error(full_error)
+            return {
+                "success": False,
+                "error": full_error
+            }
+        except Exception as e:
+            error_msg = f"Error creating issue: {str(e)}"
+            self.logger.error(error_msg)
+            return {
+                "success": False,
+                "error": error_msg
+            }
+    def _create_mock_ticket(
+        self,
+        title: str,
+        body: str,
+        labels: Optional[List[str]] = None
+    ) -> Dict[str, Any]:
+        """Create a mock support ticket when GitHub is not configured"""
+        import random
+        mock_id = f"MOCK-{random.randint(1000, 9999)}"
+        self.logger.info(f"Created mock support ticket: {mock_id}")
+        return {
+            "success": True,
+            "message": "Support ticket created (Mock Mode - GitHub not configured)",
+            "ticket_id": mock_id,
+            "title": title,
+            "labels": labels or [],
+            "note": "To enable real GitHub integration, set GITHUB_TOKEN and GITHUB_REPO in your .env file"
+        }
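
When `GITHUB_FOLDER` is set, `create_issue` derives a label from it and decorates the title and body. The transformation can be sketched in isolation as follows. The two function names are hypothetical; the string operations mirror the ones in `create_issue`.

```python
from typing import Tuple


def folder_to_label(folder: str) -> str:
    """Turn a folder/project name into a label slug, as create_issue does."""
    return folder.lower().replace("/", "-").replace(" ", "-")


def decorate_issue(folder: str, title: str, body: str) -> Tuple[str, str]:
    """Return (title, body) with the folder prefix applied when a folder is set."""
    if not folder:
        return title, body
    prefixed_title = f"[{folder}] {title}"
    full_body = f"**Project:** {folder}\n\n**Description:**\n{body}"
    return prefixed_title, full_body


label = folder_to_label("Data Insights/App")
title, body = decorate_issue("Data Insights/App", "Chart is empty", "No bars shown")
```

With an empty folder, the title and body pass through unchanged, matching the `else` branch in `create_issue`.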

src/tests/__init__.py ADDED

	@@ -0,0 +1 @@


+"""Tests package initialization"""

src/tests/test_safety_validator.py ADDED

	@@ -0,0 +1,113 @@

+"""
+Tests for Safety Validator
+"""
+import pytest
+from database.safety_validator import SafetyValidator
+class TestSafetyValidator:
+    """Test cases for SQL safety validation"""
+    def setup_method(self):
+        """Set up test fixtures"""
+        self.validator = SafetyValidator()
+    def test_valid_select_query(self):
+        """Test that valid SELECT queries pass validation"""
+        query = "SELECT * FROM cars WHERE make = 'BMW'"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is True
+        assert error == ""
+    def test_delete_query_blocked(self):
+        """Test that DELETE queries are blocked"""
+        query = "DELETE FROM cars WHERE id = 1"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "DELETE" in error
+    def test_drop_query_blocked(self):
+        """Test that DROP queries are blocked"""
+        query = "DROP TABLE cars"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "DROP" in error
+    def test_update_query_blocked(self):
+        """Test that UPDATE queries are blocked"""
+        query = "UPDATE cars SET price = 0"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "UPDATE" in error
+    def test_insert_query_blocked(self):
+        """Test that INSERT queries are blocked"""
+        query = "INSERT INTO cars VALUES (1, 'test')"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "INSERT" in error
+    def test_truncate_query_blocked(self):
+        """Test that TRUNCATE queries are blocked"""
+        query = "TRUNCATE TABLE cars"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "TRUNCATE" in error
+    def test_alter_query_blocked(self):
+        """Test that ALTER queries are blocked"""
+        query = "ALTER TABLE cars ADD COLUMN test VARCHAR(50)"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "ALTER" in error
+    def test_empty_query(self):
+        """Test that empty queries are rejected"""
+        query = ""
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "Empty query" in error
+    def test_non_select_query(self):
+        """Test that non-SELECT queries are rejected"""
+        query = "SHOW TABLES"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "Only SELECT" in error
+    def test_sql_injection_attempt(self):
+        """Test that SQL injection patterns are detected"""
+        query = "SELECT * FROM cars; DELETE FROM cars"
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+    def test_complex_select_query(self):
+        """Test that complex SELECT queries pass"""
+        query = """
+            SELECT make, model, AVG(sellingprice) as avg_price
+            FROM cars
+            WHERE year > 2010
+            GROUP BY make, model
+            ORDER BY avg_price DESC
+            LIMIT 10
+        """
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is True
+        assert error == ""
+    @pytest.mark.parametrize("query", [
+        "delete from cars",
+        "DELETE FROM cars",
+        "DeLeTe FrOm cars",
+    ])
+    def test_case_insensitive_blocking(self, query):
+        """Test that dangerous keywords are blocked regardless of case"""
+        is_valid, error = self.validator.validate_query(query)
+        assert is_valid is False
+        assert "DELETE" in error.upper()
+if __name__ == "__main__":
+    pytest.main([__file__, "-v"])
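
The tests above pin down a contract for `validate_query`: it returns an `(is_valid, error)` tuple, rejects empty input with an "Empty query" message, names the blocked keyword in the error, and reports "Only SELECT" for any other statement type. A minimal sketch of a function that would satisfy that contract is shown below. It is a hypothetical stand-in for illustration only, not the code in `src/database/safety_validator.py`; the names `validate_query_sketch` and `BLOCKED_KEYWORDS`, and the exact error wording, are assumptions.

```python
import re
from typing import Tuple

# Assumed keyword list, taken from the statements the tests expect to be blocked.
BLOCKED_KEYWORDS = ("DELETE", "DROP", "UPDATE", "INSERT", "TRUNCATE", "ALTER")


def validate_query_sketch(query: str) -> Tuple[bool, str]:
    """Hypothetical validator: return (is_valid, error_message)."""
    text = query.strip()
    if not text:
        return False, "Empty query"

    upper = text.upper()
    # Whole-word match, so a column such as "updated_at" is not flagged.
    for keyword in BLOCKED_KEYWORDS:
        if re.search(rf"\b{keyword}\b", upper):
            return False, f"{keyword} operations are not allowed"

    if not upper.startswith("SELECT"):
        return False, "Only SELECT queries are allowed"

    return True, ""
```

Checking blocked keywords anywhere in the text, rather than only at the start, is what makes the stacked statement in `test_sql_injection_attempt` fail validation in this sketch.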

src/ui/__init__.py ADDED Viewed

	@@ -0,0 +1,15 @@

+"""UI package initialization"""
+from ui.charts import (
+    create_price_distribution_chart,
+    create_top_makes_chart,
+    create_condition_pie_chart,
+    create_price_by_make_chart,
+    create_dynamic_chart
+)
+__all__ = [
+    'create_price_distribution_chart',
+    'create_top_makes_chart',
+    'create_condition_pie_chart',
+    'create_price_by_make_chart',
+    'create_dynamic_chart'
+]

src/ui/charts.py ADDED Viewed

	@@ -0,0 +1,216 @@

+"""
+Chart generation for business insights visualization
+"""
+import plotly.express as px
+import plotly.graph_objects as go
+import pandas as pd
+from typing import Dict, Any
+def create_price_distribution_chart(data: pd.DataFrame) -> go.Figure:
+    """
+    Create a histogram showing price distribution
+    Args:
+        data: DataFrame with sellingprice column
+    Returns:
+        Plotly figure
+    """
+    fig = px.histogram(
+        data,
+        x='sellingprice',
+        nbins=50,
+        title='Car Price Distribution',
+        labels={'sellingprice': 'Selling Price ($)', 'count': 'Number of Cars'},
+        color_discrete_sequence=['#1f77b4']
+    )
+    fig.update_layout(
+        showlegend=False,
+        height=300,
+        margin=dict(l=20, r=20, t=40, b=20)
+    )
+    return fig
+def create_top_makes_chart(stats: Dict[str, Any]) -> go.Figure:
+    """
+    Create a bar chart showing top car makes
+    Args:
+        stats: Statistics dictionary with top_makes data
+    Returns:
+        Plotly figure
+    """
+    top_makes = stats.get('top_makes', [])
+    if not top_makes:
+        return go.Figure()
+    makes = [item['make'] for item in top_makes]
+    counts = [item['count'] for item in top_makes]
+    fig = go.Figure(data=[
+        go.Bar(
+            x=makes,
+            y=counts,
+            marker_color='#2ca02c',
+            text=counts,
+            textposition='auto'
+        )
+    ])
+    fig.update_layout(
+        title=f'Top {len(makes)} Car Makes',
+        xaxis_title='Make',
+        yaxis_title='Number of Cars',
+        height=300,
+        margin=dict(l=20, r=20, t=40, b=20)
+    )
+    return fig
+def create_condition_pie_chart(stats: Dict[str, Any]) -> go.Figure:
+    """
+    Create a pie chart showing condition distribution
+    Args:
+        stats: Statistics dictionary with condition_distribution data
+    Returns:
+        Plotly figure
+    """
+    condition_dist = stats.get('condition_distribution', [])
+    if not condition_dist:
+        return go.Figure()
+    # Take top 10 conditions
+    condition_dist = condition_dist[:10]
+    conditions = [str(item['condition']) for item in condition_dist]
+    counts = [item['count'] for item in condition_dist]
+    fig = go.Figure(data=[
+        go.Pie(
+            labels=conditions,
+            values=counts,
+            hole=0.3
+        )
+    ])
+    fig.update_layout(
+        title='Car Condition Distribution',
+        height=300,
+        margin=dict(l=20, r=20, t=40, b=20)
+    )
+    return fig
+def create_price_by_make_chart(db_manager) -> go.Figure:
+    """Create a bar chart of average price by make (top 10)"""
+    query = """
+        SELECT make, AVG(sellingprice) as avg_price
+        FROM cars
+        GROUP BY make
+        ORDER BY avg_price DESC
+        LIMIT 10
+    """
+    result = db_manager.execute_query(query)
+    if result['success'] and result['data']:
+        df = pd.DataFrame(result['data'])
+        fig = px.bar(
+            df,
+            x='make',
+            y='avg_price',
+            title='Top 10 Average Prices by Make',
+            labels={'make': 'Make', 'avg_price': 'Average Price ($)'},
+            template='plotly_white',
+            color='avg_price',
+            color_continuous_scale='Blues'
+        )
+        return fig
+    else:
+        # Return empty figure if data fails
+        return go.Figure()
+def create_dynamic_chart(data: list, chart_type: str, title: str, x_label: str, y_label: str) -> go.Figure:
+    """
+    Create a dynamic chart based on data and configuration provided by the AI agent.
+    Args:
+        data: List of dictionaries containing the data
+        chart_type: Type of chart ('bar', 'column', 'line', 'pie', 'scatter')
+        title: Chart title
+        x_label: Name of the column for X axis
+        y_label: Name of the column for Y axis (or value for pie)
+    Returns:
+        Plotly Figure object
+    """
+    if not data:
+        return go.Figure()
+    df = pd.DataFrame(data)
+    # Ensure labels exist in dataframe, if not, use first columns
+    if x_label not in df.columns:
+        x_label = df.columns[0]
+    if y_label not in df.columns and len(df.columns) > 1:
+        y_label = df.columns[1]
+    elif y_label not in df.columns:
+        y_label = x_label
+    if chart_type.lower() in ['bar', 'column']:
+        fig = px.bar(
+            df,
+            x=x_label,
+            y=y_label,
+            title=title,
+            template='plotly_white',
+            color=y_label if y_label != x_label else None
+        )
+    elif chart_type.lower() == 'line':
+        fig = px.line(
+            df,
+            x=x_label,
+            y=y_label,
+            title=title,
+            template='plotly_white',
+            markers=True
+        )
+    elif chart_type.lower() == 'pie':
+        fig = px.pie(
+            df,
+            names=x_label,
+            values=y_label,
+            title=title,
+            template='plotly_white'
+        )
+    elif chart_type.lower() == 'scatter':
+        fig = px.scatter(
+            df,
+            x=x_label,
+            y=y_label,
+            title=title,
+            template='plotly_white',
+            color=y_label if y_label != x_label else None
+        )
+    else:
+        # Fallback to bar chart
+        fig = px.bar(df, x=x_label, y=y_label, title=title, template='plotly_white')
+    fig.update_layout(
+        margin=dict(l=20, r=20, t=40, b=20),
+        xaxis_title=x_label,
+        yaxis_title=y_label if chart_type.lower() != 'pie' else ""
+    )
+    return fig
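
The column fallback in `create_dynamic_chart` (use the first column when `x_label` is missing, the second column when `y_label` is missing, and reuse `x_label` when only one column exists) can be looked at in isolation. The sketch below restates that rule over a plain list of column names so it can be exercised without pandas or Plotly. The helper name `resolve_axis_columns` is hypothetical and is not part of `charts.py`.

```python
from typing import List, Tuple


def resolve_axis_columns(columns: List[str], x_label: str, y_label: str) -> Tuple[str, str]:
    """Hypothetical helper mirroring the label fallback in create_dynamic_chart."""
    if not columns:
        raise ValueError("at least one column is required")

    # Unknown x label: fall back to the first column.
    if x_label not in columns:
        x_label = columns[0]

    # Unknown y label: prefer the second column, else reuse the x column.
    if y_label not in columns:
        y_label = columns[1] if len(columns) > 1 else x_label

    return x_label, y_label


# The agent asked for columns that are not in the result set.
print(resolve_axis_columns(["make", "avg_price"], "brand", "price"))
```

One consequence worth noting: with a single-column result, both axes resolve to the same column, which is why the original code skips the `color` argument when `y_label == x_label`.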

src/utils/__init__.py ADDED Viewed

	@@ -0,0 +1,4 @@

+"""Utils package initialization"""
+from utils.logger import setup_logging, get_logs, clear_logs, add_log
+__all__ = ['setup_logging', 'get_logs', 'clear_logs', 'add_log']

src/utils/logger.py ADDED Viewed

	@@ -0,0 +1,93 @@

+"""
+Utility functions for logging in Streamlit
+"""
+import logging
+from datetime import datetime
+from typing import List, Dict
+import streamlit as st
+class StreamlitLogHandler(logging.Handler):
+    """Custom logging handler that stores logs in Streamlit session state"""
+    def __init__(self, max_entries: int = 100):
+        super().__init__()
+        self.max_entries = max_entries
+        # Initialize session state for logs if not exists
+        if 'console_logs' not in st.session_state:
+            st.session_state.console_logs = []
+    def emit(self, record: logging.LogRecord):
+        """Emit a log record to session state"""
+        try:
+            log_entry = {
+                'timestamp': datetime.now().strftime('%H:%M:%S'),
+                'level': record.levelname,
+                'message': self.format(record),
+                'logger': record.name
+            }
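+            # Add to session state; guard against a session whose
+            # console_logs key has not been initialised yet
+            if 'console_logs' not in st.session_state:
+                st.session_state.console_logs = []
+            st.session_state.console_logs.append(log_entry)
+            # Keep only last N entries
+            if len(st.session_state.console_logs) > self.max_entries:
+                st.session_state.console_logs = st.session_state.console_logs[-self.max_entries:]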
+        except Exception:
+            self.handleError(record)
+def setup_logging(level: str = "INFO", max_entries: int = 100):
+    """
+    Set up logging with Streamlit handler
+    Args:
+        level: Logging level (DEBUG, INFO, WARNING, ERROR)
+        max_entries: Maximum number of log entries to keep
+    """
+    # Resolve the level name case-insensitively; fall back to INFO if unknown
+    log_level = getattr(logging, level.upper(), logging.INFO)
+    # Create streamlit handler
+    streamlit_handler = StreamlitLogHandler(max_entries=max_entries)
+    streamlit_handler.setLevel(log_level)
+    # Create formatter
+    formatter = logging.Formatter('%(name)s - %(message)s')
+    streamlit_handler.setFormatter(formatter)
+    # Configure root logger
+    root_logger = logging.getLogger()
+    root_logger.setLevel(log_level)
+    # Remove existing handlers and add streamlit handler
+    root_logger.handlers = []
+    root_logger.addHandler(streamlit_handler)
+    # Also add console handler for debugging
+    console_handler = logging.StreamHandler()
+    console_handler.setFormatter(formatter)
+    root_logger.addHandler(console_handler)
+def get_logs() -> List[Dict]:
+    """Get all logs from session state"""
+    return st.session_state.get('console_logs', [])
+def clear_logs():
+    """Clear all logs from session state"""
+    st.session_state.console_logs = []
+def add_log(level: str, message: str, logger_name: str = "app"):
+    """
+    Manually add a log entry
+    Args:
+        level: Log level (INFO, WARNING, ERROR, DEBUG)
+        message: Log message
+        logger_name: Name of the logger
+    """
+    logger = logging.getLogger(logger_name)
+    log_method = getattr(logger, level.lower(), logger.info)
+    log_method(message)
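
The capped buffer in `StreamlitLogHandler.emit` (append, then keep only the last `max_entries` items) can be tried outside Streamlit by swapping `st.session_state` for a plain list. The sketch below does that using only the standard library. `ListLogHandler` and the logger name `capped_buffer_sketch` are hypothetical names introduced for this example.

```python
import logging
from datetime import datetime
from typing import Dict, List


class ListLogHandler(logging.Handler):
    """Hypothetical handler: stores formatted records in a plain list."""

    def __init__(self, store: List[Dict[str, str]], max_entries: int = 100):
        super().__init__()
        self.store = store
        self.max_entries = max_entries

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self.store.append({
                "timestamp": datetime.now().strftime("%H:%M:%S"),
                "level": record.levelname,
                "message": self.format(record),
                "logger": record.name,
            })
            # Drop the oldest entries once the cap is exceeded.
            overflow = len(self.store) - self.max_entries
            if overflow > 0:
                del self.store[:overflow]
        except Exception:
            self.handleError(record)


store: List[Dict[str, str]] = []
logger = logging.getLogger("capped_buffer_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo isolated from the root logger
logger.addHandler(ListLogHandler(store, max_entries=3))

for i in range(5):
    logger.info("message %d", i)

# Only the three most recent entries remain.
print([entry["message"] for entry in store])
```

Deleting from the front of the list in place, rather than rebinding a new slice as the original does, keeps any external reference to the list valid; either approach gives the same contents.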