Spaces:
Sleeping
Sleeping
| # π± Data Insights App with AI Agent | |
| An intelligent data insights application that helps users query and analyze mobile phone data using an AI agent. The application features function calling, safety mechanisms, Jira integration for support tickets, and a modern Streamlit UI. | |
| ## π Features | |
| - **AI-Powered Assistant**: Uses OpenAI GPT-4o with function calling to intelligently query data | |
| - **Business Dashboard**: Visual metrics and charts showing key business insights | |
| - **Safety Features**: Blocks all write operations (INSERT, UPDATE, DELETE, DROP, TRUNCATE, ALTER) | |
| - **Support Ticket Integration**: Jira integration for creating support tickets | |
| - **Comprehensive Logging**: Detailed console logging for all operations | |
| - **Data Privacy**: Only limited data chunks are sent to LLM, not the entire dataset | |
| - **Interactive UI**: Built with Streamlit for a modern, responsive interface | |
| ## π Requirements | |
| - Python 3.12+ | |
| - OpenAI API key (GPT-5-nano access) | |
| - Jira account and API token (for support ticket feature) | |
| - CSV data file: `Global Mobile Prices 2025.csv` | |
| ## π Installation | |
| ### Step 1: Clone or Download the Project | |
| ```bash | |
| cd /path/to/project/GenerativeAI | |
| ``` | |
| ### Step 2: Install Dependencies | |
| ```bash | |
| uv venv python 3.12 | |
| uv sync | |
| ``` | |
| The `pyproject.toml` includes: | |
| - `streamlit` - Web UI framework | |
| - `openai` - OpenAI API client | |
| - `pandas` - Data manipulation | |
| - `plotly` - Interactive charts | |
| - `python-dotenv` - Environment variable management | |
| - `jira` - Jira API integration | |
| - `numpy` - Numerical operations | |
| ### Step 3: Configure Environment Variables | |
| Create a `.env` file in the project root: | |
| ```bash | |
| cp .env.example .env | |
| ``` | |
| Edit `.env` and add your credentials: | |
| ```env | |
| # OpenAI API Configuration | |
| OPENAI_API_KEY=sk-your-openai-api-key-here | |
| # Jira Configuration | |
| JIRA_SERVER=https://your-domain.atlassian.net | |
| JIRA_EMAIL=your_email@example.com | |
| JIRA_API_TOKEN=your_jira_api_token_here | |
| JIRA_PROJECT_KEY=SUP | |
| # Application Configuration | |
| CSV_FILE_PATH=Global Mobile Prices 2025.csv | |
| MAX_RESULTS_TO_LLM=20 | |
| ``` | |
| #### Getting API Credentials: | |
| **OpenAI API Key:** | |
| 1. Go to https://platform.openai.com/api-keys | |
| 2. Create a new API key | |
| 3. Copy and paste into `.env` | |
| **Jira API Token:** | |
| 1. Go to https://id.atlassian.com/manage-profile/security/api-tokens | |
| 2. Create a new API token | |
| 3. Copy the token to `.env` | |
| 4. Set your Jira server URL and email | |
| ### Step 4: Prepare Data File | |
| Ensure `Global Mobile Prices 2025.csv` is in the project root directory. The CSV should have the following columns: | |
| - brand, model, price_usd, ram_gb, storage_gb, camera_mp, battery_mah | |
| - display_size_inch, charging_watt, 5g_support, os, processor | |
| - rating, release_month, year | |
| ## π― Usage | |
| ### Running the Application | |
| ```bash | |
| streamlit run app.py | |
| ``` | |
| The application will open in your default browser at `http://localhost:8501` | |
| ### Console Logs | |
| The application logs all operations to the console. You'll see: | |
| - Function calls made by the AI agent | |
| - Query execution and results | |
| - Safety checks and blocked operations | |
| - Error messages and warnings | |
| Example log output: | |
| ``` | |
| 2025-11-16 10:30:15 - database - INFO - DataManager initialized with 500 rows | |
| 2025-11-16 10:30:20 - agent - INFO - User message: What are the top 5 most expensive phones? | |
| 2025-11-16 10:30:21 - tools - INFO - Function called: search_phones_by_criteria | |
| 2025-11-16 10:30:21 - tools - INFO - Parameters: {'limit': 5} | |
| 2025-11-16 10:30:21 - database - INFO - Filter returned 5 results (limit: 5) | |
| ``` | |
| ## π Workflow Example | |
| ### 1. Application Startup | |
| When you launch the app, you'll see: | |
| - **Business Overview Dashboard**: Shows total phones, average price, price range, and number of brands | |
| - **Data Visualizations**: Charts showing price distribution, top brands, OS distribution, and 5G support | |
| - **Sample Queries**: Pre-defined example queries to help you get started | |
|  | |
| ### 2. Using Sample Queries | |
| Click on any sample query button (e.g., "What are the top 5 most expensive phones?"): | |
|  | |
| ### 3. AI Agent Processing | |
| The agent processes your query using function calling: | |
| - Agent analyzes the user's intent | |
| - Selects appropriate tool(s) to use | |
| - Executes function calls with parameters | |
| - Returns formatted results | |
|  | |
| **Console Output:** | |
| ``` | |
| 2025-11-16 10:35:42 - agent - INFO - User message: What are the top 5 most expensive phones? | |
| 2025-11-16 10:35:43 - agent - INFO - Agent making 1 tool call(s) | |
| 2025-11-16 10:35:43 - tools - INFO - Function called: search_phones_by_criteria | |
| 2025-11-16 10:35:43 - tools - INFO - Parameters: {'limit': 5} | |
| 2025-11-16 10:35:43 - database - INFO - Filtering data with criteria: {} | |
| 2025-11-16 10:35:43 - database - INFO - Filter returned 5 results (limit: 5) | |
| 2025-11-16 10:35:44 - agent - INFO - Agent final response: Here are the top 5 most expensive phones... | |
| ``` | |
| ### 4. Custom Queries | |
| Type your own questions in the chat input: | |
| - "Show me Apple phones under $1000" | |
| - "What's the average price by brand?" | |
| - "Find phones with 5G support and 12GB RAM" | |
| - "Compare average prices between Android and iOS" | |
|  | |
| ### 5. Viewing Function Calls | |
| Expand the "View Function Calls" section to see technical details: | |
|  | |
| ### 6. Creating Support Tickets | |
| When the agent suggests human help or you explicitly request it: | |
| 1. The agent detects keywords like "help", "support", "bug", "issue" | |
| 2. A suggestion to create a ticket appears | |
| 3. Click "Create Support Ticket" | |
| 4. Fill in the form with your issue details | |
| 5. Submit to create a Jira ticket | |
|  | |
| **Console Output:** | |
| ``` | |
| 2025-11-16 10:40:12 - support_ticket - INFO - Creating support ticket: Unable to find specific data | |
| 2025-11-16 10:40:13 - support_ticket - INFO - Successfully connected to Jira: https://your-domain.atlassian.net | |
| 2025-11-16 10:40:14 - support_ticket - INFO - Successfully created ticket: SUP-123 | |
| ``` | |
|  | |
| ### 7. Safety Features in Action | |
| If someone tries dangerous operations (simulation): | |
| ```python | |
| # Example blocked operation | |
| query = "DELETE FROM phones WHERE price > 1000" | |
| ``` | |
| **Console Output:** | |
| ``` | |
| 2025-11-16 10:45:30 - utils - WARNING - BLOCKED OPERATION: DELETE FROM phones WHERE price > 1000 | |
| 2025-11-16 10:45:30 - utils - WARNING - Reason: Contains dangerous keyword: DELETE | |
| ``` | |
| ## π‘οΈ Safety Features | |
| The application implements multiple safety layers: | |
| ### 1. Read-Only Operations | |
| - Only SELECT/read operations are allowed | |
| - All write operations are blocked | |
| ### 2. Blocked SQL Keywords | |
| The following operations are prevented: | |
| - `INSERT` - Cannot add new records | |
| - `UPDATE` - Cannot modify existing records | |
| - `DELETE` - Cannot delete records | |
| - `DROP` - Cannot drop tables | |
| - `TRUNCATE` - Cannot truncate tables | |
| - `ALTER` - Cannot alter table structure | |
| - `CREATE` - Cannot create new tables | |
| - `REPLACE` - Cannot replace data | |
| - `MERGE` - Cannot merge data | |
| - `EXEC/EXECUTE` - Cannot execute arbitrary code | |
| ### 3. Data Chunking | |
| - Maximum 20 results sent to LLM by default (configurable) | |
| - Prevents overwhelming the LLM context | |
| - Protects data privacy | |
| ### 4. Logging | |
| - All operations logged to console | |
| - Blocked attempts are logged with warnings | |
| - Full audit trail for security review | |
| ## π§ Architecture | |
| ### Components | |
| ``` | |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| β Streamlit UI (app.py) β | |
| β - Business Dashboard β | |
| β - Chat Interface β | |
| β - Support Ticket Form β | |
| ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ | |
| β | |
| βΌ | |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| β AI Agent (agent.py) β | |
| β - OpenAI GPT-4o Integration β | |
| β - Function Calling Logic β | |
| β - Conversation Management β | |
| ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ | |
| β | |
| βΌ | |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| β Agent Tools (tools.py) β | |
| β - search_phones_by_criteria() β | |
| β - get_aggregated_statistics() β | |
| β - get_price_analysis() β | |
| β - get_available_brands() β | |
| β - get_dataset_overview() β | |
| ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ | |
| β | |
| βΌ | |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| β Data Manager (database.py) β | |
| β - Pandas DataFrame Operations β | |
| β - Safety Checks β | |
| β - Query Filtering β | |
| ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ | |
| β | |
| βΌ | |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| β CSV Data Source β | |
| β Global Mobile Prices 2025.csv β | |
| βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | |
| Support Ticket Manager (support_ticket.py) | |
| β | |
| βΌ | |
| Jira API | |
| ``` | |
| ### File Structure | |
| ``` | |
| GenerativeAI/ | |
| βββ app.py # Main Streamlit application | |
| βββ agent.py # AI agent with function calling | |
| βββ database.py # Data management with safety checks | |
| βββ tools.py # Function tools for agent | |
| βββ support_ticket.py # Jira ticket creation | |
| βββ utils.py # Utility functions and logging | |
| βββ config.py # Configuration management | |
| βββ requirements.txt # Python dependencies | |
| βββ .env # Environment variables (create from .env.example) | |
| βββ .env.example # Example environment variables | |
| βββ Global Mobile Prices 2025.csv # Data source | |
| βββ README.md # This file | |
| ``` | |
| ## π¨ Available Tools (Function Calling) | |
| The AI agent has access to 5 tools: | |
| ### 1. search_phones_by_criteria | |
| Search for phones matching specific criteria. | |
| **Parameters:** | |
| - `brand` (string): Phone brand | |
| - `min_price`, `max_price` (number): Price range | |
| - `ram_gb` (integer): RAM size | |
| - `storage_gb` (integer): Storage size | |
| - `os` (string): Operating system | |
| - `has_5g` (boolean): 5G support | |
| - `min_rating` (number): Minimum rating | |
| - `limit` (integer): Max results | |
| **Example Usage:** | |
| ``` | |
| User: "Find me Samsung phones under $800 with 5G" | |
| Agent calls: search_phones_by_criteria(brand="Samsung", max_price=800, has_5g=True) | |
| ``` | |
| ### 2. get_aggregated_statistics | |
| Get aggregated statistics by grouping. | |
| **Parameters:** | |
| - `group_by` (string): Column to group by | |
| - `metric` (string): Column to calculate | |
| - `aggregation` (string): Type (mean, sum, count, min, max) | |
| **Example Usage:** | |
| ``` | |
| User: "What's the average price by brand?" | |
| Agent calls: get_aggregated_statistics(group_by="brand", metric="price_usd", aggregation="mean") | |
| ``` | |
| ### 3. get_price_analysis | |
| Get detailed price analysis. | |
| **Parameters:** | |
| - `brand` (string, optional): Filter by brand | |
| - `category` (string, optional): Group by category | |
| **Example Usage:** | |
| ``` | |
| User: "Analyze Apple phone prices" | |
| Agent calls: get_price_analysis(brand="Apple") | |
| ``` | |
| ### 4. get_available_brands | |
| Get list of all brands. | |
| **Example Usage:** | |
| ``` | |
| User: "What brands are available?" | |
| Agent calls: get_available_brands() | |
| ``` | |
| ### 5. get_dataset_overview | |
| Get overview statistics. | |
| **Example Usage:** | |
| ``` | |
| User: "Tell me about the dataset" | |
| Agent calls: get_dataset_overview() | |
| ``` | |
| ## π Troubleshooting | |
| ### Issue: "Failed to load CSV" | |
| **Solution:** Ensure `Global Mobile Prices 2025.csv` exists in the project root directory. | |
| ### Issue: "OpenAI API Error" | |
| **Solution:** | |
| - Check your API key is correct in `.env` or sidebar | |
| - Ensure you have GPT-4o access | |
| - Check your OpenAI account has credits | |
| ### Issue: "Failed to connect to Jira" | |
| **Solution:** | |
| - Verify Jira server URL is correct (https://your-domain.atlassian.net) | |
| - Check API token is valid | |
| - Ensure project key exists in your Jira instance | |
| ### Issue: No data showing in dashboard | |
| **Solution:** Check console logs for errors during CSV loading | |
| ### Issue: Agent not responding | |
| **Solution:** | |
| - Check console for error messages | |
| - Verify API key is entered | |
| - Check internet connection | |
| ## π Data Schema | |
| The CSV file should contain the following columns: | |
| | Column | Type | Description | | |
| |--------|------|-------------| | |
| | brand | string | Phone manufacturer | | |
| | model | string | Phone model name | | |
| | price_usd | float | Price in USD | | |
| | ram_gb | integer | RAM size in GB | | |
| | storage_gb | integer | Storage size in GB | | |
| | camera_mp | integer | Camera megapixels | | |
| | battery_mah | integer | Battery capacity | | |
| | display_size_inch | float | Screen size | | |
| | charging_watt | integer | Charging power | | |
| | 5g_support | string | "Yes" or "No" | | |
| | os | string | Operating system | | |
| | processor | string | Processor model | | |
| | rating | float | User rating (0-5) | | |
| | release_month | string | Release month | | |
| | year | integer | Release year | | |
| ## π€ Contributing | |
| To extend the application: | |
| 1. **Add new tools**: Add methods to `tools.py` and update `get_tool_definitions()` | |
| 2. **Enhance UI**: Modify `app.py` Streamlit components | |
| 3. **Add data sources**: Extend `database.py` to support multiple CSVs | |
| 4. **Custom safety rules**: Update `DANGEROUS_SQL_KEYWORDS` in `config.py` | |
| ## π License | |
| This project is for educational and demonstration purposes. | |
| ## π Acknowledgments | |
| - Built with [Streamlit](https://streamlit.io/) | |
| - Charts by [Plotly](https://plotly.com/) | |
| - Issue tracking by [Jira](https://www.atlassian.com/software/jira) | |
| ## π§ Support | |
| For issues or questions: | |
| 1. Use the in-app support ticket feature | |
| 2. Check console logs for detailed error messages | |
| 3. Review this README for troubleshooting steps | |
| --- | |