Spaces:
Sleeping
π± Data Insights App with AI Agent
An intelligent data insights application that helps users query and analyze mobile phone data using an AI agent. The application features function calling, safety mechanisms, Jira integration for support tickets, and a modern Streamlit UI.
π Features
- AI-Powered Assistant: Uses OpenAI GPT-4o with function calling to intelligently query data
- Business Dashboard: Visual metrics and charts showing key business insights
- Safety Features: Blocks all write operations (INSERT, UPDATE, DELETE, DROP, TRUNCATE, ALTER)
- Support Ticket Integration: Jira integration for creating support tickets
- Comprehensive Logging: Detailed console logging for all operations
- Data Privacy: Only limited data chunks are sent to LLM, not the entire dataset
- Interactive UI: Built with Streamlit for a modern, responsive interface
π Requirements
- Python 3.12+
- OpenAI API key (GPT-5-nano access)
- Jira account and API token (for support ticket feature)
- CSV data file:
Global Mobile Prices 2025.csv
π Installation
Step 1: Clone or Download the Project
cd /path/to/project/GenerativeAI
Step 2: Install Dependencies
uv venv python 3.12
uv sync
The pyproject.toml includes:
streamlit- Web UI frameworkopenai- OpenAI API clientpandas- Data manipulationplotly- Interactive chartspython-dotenv- Environment variable managementjira- Jira API integrationnumpy- Numerical operations
Step 3: Configure Environment Variables
Create a .env file in the project root:
cp .env.example .env
Edit .env and add your credentials:
# OpenAI API Configuration
OPENAI_API_KEY=sk-your-openai-api-key-here
# Jira Configuration
JIRA_SERVER=https://your-domain.atlassian.net
JIRA_EMAIL=your_email@example.com
JIRA_API_TOKEN=your_jira_api_token_here
JIRA_PROJECT_KEY=SUP
# Application Configuration
CSV_FILE_PATH=Global Mobile Prices 2025.csv
MAX_RESULTS_TO_LLM=20
Getting API Credentials:
OpenAI API Key:
- Go to https://platform.openai.com/api-keys
- Create a new API key
- Copy and paste into
.env
Jira API Token:
- Go to https://id.atlassian.com/manage-profile/security/api-tokens
- Create a new API token
- Copy the token to
.env - Set your Jira server URL and email
Step 4: Prepare Data File
Ensure Global Mobile Prices 2025.csv is in the project root directory. The CSV should have the following columns:
- brand, model, price_usd, ram_gb, storage_gb, camera_mp, battery_mah
- display_size_inch, charging_watt, 5g_support, os, processor
- rating, release_month, year
π― Usage
Running the Application
streamlit run app.py
The application will open in your default browser at http://localhost:8501
Console Logs
The application logs all operations to the console. You'll see:
- Function calls made by the AI agent
- Query execution and results
- Safety checks and blocked operations
- Error messages and warnings
Example log output:
2025-11-16 10:30:15 - database - INFO - DataManager initialized with 500 rows
2025-11-16 10:30:20 - agent - INFO - User message: What are the top 5 most expensive phones?
2025-11-16 10:30:21 - tools - INFO - Function called: search_phones_by_criteria
2025-11-16 10:30:21 - tools - INFO - Parameters: {'limit': 5}
2025-11-16 10:30:21 - database - INFO - Filter returned 5 results (limit: 5)
π Workflow Example
1. Application Startup
When you launch the app, you'll see:
- Business Overview Dashboard: Shows total phones, average price, price range, and number of brands
- Data Visualizations: Charts showing price distribution, top brands, OS distribution, and 5G support
- Sample Queries: Pre-defined example queries to help you get started
2. Using Sample Queries
Click on any sample query button (e.g., "What are the top 5 most expensive phones?"):
3. AI Agent Processing
The agent processes your query using function calling:
- Agent analyzes the user's intent
- Selects appropriate tool(s) to use
- Executes function calls with parameters
- Returns formatted results
Console Output:
2025-11-16 10:35:42 - agent - INFO - User message: What are the top 5 most expensive phones?
2025-11-16 10:35:43 - agent - INFO - Agent making 1 tool call(s)
2025-11-16 10:35:43 - tools - INFO - Function called: search_phones_by_criteria
2025-11-16 10:35:43 - tools - INFO - Parameters: {'limit': 5}
2025-11-16 10:35:43 - database - INFO - Filtering data with criteria: {}
2025-11-16 10:35:43 - database - INFO - Filter returned 5 results (limit: 5)
2025-11-16 10:35:44 - agent - INFO - Agent final response: Here are the top 5 most expensive phones...
4. Custom Queries
Type your own questions in the chat input:
- "Show me Apple phones under $1000"
- "What's the average price by brand?"
- "Find phones with 5G support and 12GB RAM"
- "Compare average prices between Android and iOS"
5. Viewing Function Calls
Expand the "View Function Calls" section to see technical details:
6. Creating Support Tickets
When the agent suggests human help or you explicitly request it:
- The agent detects keywords like "help", "support", "bug", "issue"
- A suggestion to create a ticket appears
- Click "Create Support Ticket"
- Fill in the form with your issue details
- Submit to create a Jira ticket
Console Output:
2025-11-16 10:40:12 - support_ticket - INFO - Creating support ticket: Unable to find specific data
2025-11-16 10:40:13 - support_ticket - INFO - Successfully connected to Jira: https://your-domain.atlassian.net
2025-11-16 10:40:14 - support_ticket - INFO - Successfully created ticket: SUP-123
7. Safety Features in Action
If someone tries dangerous operations (simulation):
# Example blocked operation
query = "DELETE FROM phones WHERE price > 1000"
Console Output:
2025-11-16 10:45:30 - utils - WARNING - BLOCKED OPERATION: DELETE FROM phones WHERE price > 1000
2025-11-16 10:45:30 - utils - WARNING - Reason: Contains dangerous keyword: DELETE
π‘οΈ Safety Features
The application implements multiple safety layers:
1. Read-Only Operations
- Only SELECT/read operations are allowed
- All write operations are blocked
2. Blocked SQL Keywords
The following operations are prevented:
INSERT- Cannot add new recordsUPDATE- Cannot modify existing recordsDELETE- Cannot delete recordsDROP- Cannot drop tablesTRUNCATE- Cannot truncate tablesALTER- Cannot alter table structureCREATE- Cannot create new tablesREPLACE- Cannot replace dataMERGE- Cannot merge dataEXEC/EXECUTE- Cannot execute arbitrary code
3. Data Chunking
- Maximum 20 results sent to LLM by default (configurable)
- Prevents overwhelming the LLM context
- Protects data privacy
4. Logging
- All operations logged to console
- Blocked attempts are logged with warnings
- Full audit trail for security review
π§ Architecture
Components
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Streamlit UI (app.py) β
β - Business Dashboard β
β - Chat Interface β
β - Support Ticket Form β
ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β AI Agent (agent.py) β
β - OpenAI GPT-4o Integration β
β - Function Calling Logic β
β - Conversation Management β
ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Agent Tools (tools.py) β
β - search_phones_by_criteria() β
β - get_aggregated_statistics() β
β - get_price_analysis() β
β - get_available_brands() β
β - get_dataset_overview() β
ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Data Manager (database.py) β
β - Pandas DataFrame Operations β
β - Safety Checks β
β - Query Filtering β
ββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β CSV Data Source β
β Global Mobile Prices 2025.csv β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Support Ticket Manager (support_ticket.py)
β
βΌ
Jira API
File Structure
GenerativeAI/
βββ app.py # Main Streamlit application
βββ agent.py # AI agent with function calling
βββ database.py # Data management with safety checks
βββ tools.py # Function tools for agent
βββ support_ticket.py # Jira ticket creation
βββ utils.py # Utility functions and logging
βββ config.py # Configuration management
βββ requirements.txt # Python dependencies
βββ .env # Environment variables (create from .env.example)
βββ .env.example # Example environment variables
βββ Global Mobile Prices 2025.csv # Data source
βββ README.md # This file
π¨ Available Tools (Function Calling)
The AI agent has access to 5 tools:
1. search_phones_by_criteria
Search for phones matching specific criteria.
Parameters:
brand(string): Phone brandmin_price,max_price(number): Price rangeram_gb(integer): RAM sizestorage_gb(integer): Storage sizeos(string): Operating systemhas_5g(boolean): 5G supportmin_rating(number): Minimum ratinglimit(integer): Max results
Example Usage:
User: "Find me Samsung phones under $800 with 5G"
Agent calls: search_phones_by_criteria(brand="Samsung", max_price=800, has_5g=True)
2. get_aggregated_statistics
Get aggregated statistics by grouping.
Parameters:
group_by(string): Column to group bymetric(string): Column to calculateaggregation(string): Type (mean, sum, count, min, max)
Example Usage:
User: "What's the average price by brand?"
Agent calls: get_aggregated_statistics(group_by="brand", metric="price_usd", aggregation="mean")
3. get_price_analysis
Get detailed price analysis.
Parameters:
brand(string, optional): Filter by brandcategory(string, optional): Group by category
Example Usage:
User: "Analyze Apple phone prices"
Agent calls: get_price_analysis(brand="Apple")
4. get_available_brands
Get list of all brands.
Example Usage:
User: "What brands are available?"
Agent calls: get_available_brands()
5. get_dataset_overview
Get overview statistics.
Example Usage:
User: "Tell me about the dataset"
Agent calls: get_dataset_overview()
π Troubleshooting
Issue: "Failed to load CSV"
Solution: Ensure Global Mobile Prices 2025.csv exists in the project root directory.
Issue: "OpenAI API Error"
Solution:
- Check your API key is correct in
.envor sidebar - Ensure you have GPT-4o access
- Check your OpenAI account has credits
Issue: "Failed to connect to Jira"
Solution:
- Verify Jira server URL is correct (https://your-domain.atlassian.net)
- Check API token is valid
- Ensure project key exists in your Jira instance
Issue: No data showing in dashboard
Solution: Check console logs for errors during CSV loading
Issue: Agent not responding
Solution:
- Check console for error messages
- Verify API key is entered
- Check internet connection
π Data Schema
The CSV file should contain the following columns:
| Column | Type | Description |
|---|---|---|
| brand | string | Phone manufacturer |
| model | string | Phone model name |
| price_usd | float | Price in USD |
| ram_gb | integer | RAM size in GB |
| storage_gb | integer | Storage size in GB |
| camera_mp | integer | Camera megapixels |
| battery_mah | integer | Battery capacity |
| display_size_inch | float | Screen size |
| charging_watt | integer | Charging power |
| 5g_support | string | "Yes" or "No" |
| os | string | Operating system |
| processor | string | Processor model |
| rating | float | User rating (0-5) |
| release_month | string | Release month |
| year | integer | Release year |
π€ Contributing
To extend the application:
- Add new tools: Add methods to
tools.pyand updateget_tool_definitions() - Enhance UI: Modify
app.pyStreamlit components - Add data sources: Extend
database.pyto support multiple CSVs - Custom safety rules: Update
DANGEROUS_SQL_KEYWORDSinconfig.py
π License
This project is for educational and demonstration purposes.
π Acknowledgments
π§ Support
For issues or questions:
- Use the in-app support ticket feature
- Check console logs for detailed error messages
- Review this README for troubleshooting steps






