genAI-demo / src /README.md
Nazim Tairov
initial commit
b821944
# πŸ“± Data Insights App with AI Agent
An intelligent data insights application that helps users query and analyze mobile phone data using an AI agent. The application features function calling, safety mechanisms, Jira integration for support tickets, and a modern Streamlit UI.
## 🌟 Features
- **AI-Powered Assistant**: Uses OpenAI GPT-4o with function calling to intelligently query data
- **Business Dashboard**: Visual metrics and charts showing key business insights
- **Safety Features**: Blocks all write operations (INSERT, UPDATE, DELETE, DROP, TRUNCATE, ALTER)
- **Support Ticket Integration**: Jira integration for creating support tickets
- **Comprehensive Logging**: Detailed console logging for all operations
- **Data Privacy**: Only limited data chunks are sent to LLM, not the entire dataset
- **Interactive UI**: Built with Streamlit for a modern, responsive interface
## πŸ“‹ Requirements
- Python 3.12+
- OpenAI API key (GPT-5-nano access)
- Jira account and API token (for support ticket feature)
- CSV data file: `Global Mobile Prices 2025.csv`
## πŸš€ Installation
### Step 1: Clone or Download the Project
```bash
cd /path/to/project/GenerativeAI
```
### Step 2: Install Dependencies
```bash
uv venv python 3.12
uv sync
```
The `pyproject.toml` includes:
- `streamlit` - Web UI framework
- `openai` - OpenAI API client
- `pandas` - Data manipulation
- `plotly` - Interactive charts
- `python-dotenv` - Environment variable management
- `jira` - Jira API integration
- `numpy` - Numerical operations
### Step 3: Configure Environment Variables
Create a `.env` file in the project root:
```bash
cp .env.example .env
```
Edit `.env` and add your credentials:
```env
# OpenAI API Configuration
OPENAI_API_KEY=sk-your-openai-api-key-here
# Jira Configuration
JIRA_SERVER=https://your-domain.atlassian.net
JIRA_EMAIL=your_email@example.com
JIRA_API_TOKEN=your_jira_api_token_here
JIRA_PROJECT_KEY=SUP
# Application Configuration
CSV_FILE_PATH=Global Mobile Prices 2025.csv
MAX_RESULTS_TO_LLM=20
```
#### Getting API Credentials:
**OpenAI API Key:**
1. Go to https://platform.openai.com/api-keys
2. Create a new API key
3. Copy and paste into `.env`
**Jira API Token:**
1. Go to https://id.atlassian.com/manage-profile/security/api-tokens
2. Create a new API token
3. Copy the token to `.env`
4. Set your Jira server URL and email
### Step 4: Prepare Data File
Ensure `Global Mobile Prices 2025.csv` is in the project root directory. The CSV should have the following columns:
- brand, model, price_usd, ram_gb, storage_gb, camera_mp, battery_mah
- display_size_inch, charging_watt, 5g_support, os, processor
- rating, release_month, year
## 🎯 Usage
### Running the Application
```bash
streamlit run app.py
```
The application will open in your default browser at `http://localhost:8501`
### Console Logs
The application logs all operations to the console. You'll see:
- Function calls made by the AI agent
- Query execution and results
- Safety checks and blocked operations
- Error messages and warnings
Example log output:
```
2025-11-16 10:30:15 - database - INFO - DataManager initialized with 500 rows
2025-11-16 10:30:20 - agent - INFO - User message: What are the top 5 most expensive phones?
2025-11-16 10:30:21 - tools - INFO - Function called: search_phones_by_criteria
2025-11-16 10:30:21 - tools - INFO - Parameters: {'limit': 5}
2025-11-16 10:30:21 - database - INFO - Filter returned 5 results (limit: 5)
```
## πŸ“– Workflow Example
### 1. Application Startup
When you launch the app, you'll see:
- **Business Overview Dashboard**: Shows total phones, average price, price range, and number of brands
- **Data Visualizations**: Charts showing price distribution, top brands, OS distribution, and 5G support
- **Sample Queries**: Pre-defined example queries to help you get started
![App Startup - Dashboard with metrics and charts](screenshots/UI_screenshot_1.png)
### 2. Using Sample Queries
Click on any sample query button (e.g., "What are the top 5 most expensive phones?"):
![Sample Query Click](screenshots/UI_screenshot_2.png)
### 3. AI Agent Processing
The agent processes your query using function calling:
- Agent analyzes the user's intent
- Selects appropriate tool(s) to use
- Executes function calls with parameters
- Returns formatted results
![Agent Response with Results](screenshots/UI_screenshot_3.png)
**Console Output:**
```
2025-11-16 10:35:42 - agent - INFO - User message: What are the top 5 most expensive phones?
2025-11-16 10:35:43 - agent - INFO - Agent making 1 tool call(s)
2025-11-16 10:35:43 - tools - INFO - Function called: search_phones_by_criteria
2025-11-16 10:35:43 - tools - INFO - Parameters: {'limit': 5}
2025-11-16 10:35:43 - database - INFO - Filtering data with criteria: {}
2025-11-16 10:35:43 - database - INFO - Filter returned 5 results (limit: 5)
2025-11-16 10:35:44 - agent - INFO - Agent final response: Here are the top 5 most expensive phones...
```
### 4. Custom Queries
Type your own questions in the chat input:
- "Show me Apple phones under $1000"
- "What's the average price by brand?"
- "Find phones with 5G support and 12GB RAM"
- "Compare average prices between Android and iOS"
![Custom Query Example](screenshots/most_expensive_phones_function_call.png)
### 5. Viewing Function Calls
Expand the "View Function Calls" section to see technical details:
![Function Call Details](screenshots/terminal_log_1.png)
### 6. Creating Support Tickets
When the agent suggests human help or you explicitly request it:
1. The agent detects keywords like "help", "support", "bug", "issue"
2. A suggestion to create a ticket appears
3. Click "Create Support Ticket"
4. Fill in the form with your issue details
5. Submit to create a Jira ticket
![Support Ticket Creation](screenshots/support_ticket_creation_flow.png)
**Console Output:**
```
2025-11-16 10:40:12 - support_ticket - INFO - Creating support ticket: Unable to find specific data
2025-11-16 10:40:13 - support_ticket - INFO - Successfully connected to Jira: https://your-domain.atlassian.net
2025-11-16 10:40:14 - support_ticket - INFO - Successfully created ticket: SUP-123
```
![Ticket Success Message](screenshots/jira_ticket.png)
### 7. Safety Features in Action
If someone tries dangerous operations (simulation):
```python
# Example blocked operation
query = "DELETE FROM phones WHERE price > 1000"
```
**Console Output:**
```
2025-11-16 10:45:30 - utils - WARNING - BLOCKED OPERATION: DELETE FROM phones WHERE price > 1000
2025-11-16 10:45:30 - utils - WARNING - Reason: Contains dangerous keyword: DELETE
```
## πŸ›‘οΈ Safety Features
The application implements multiple safety layers:
### 1. Read-Only Operations
- Only SELECT/read operations are allowed
- All write operations are blocked
### 2. Blocked SQL Keywords
The following operations are prevented:
- `INSERT` - Cannot add new records
- `UPDATE` - Cannot modify existing records
- `DELETE` - Cannot delete records
- `DROP` - Cannot drop tables
- `TRUNCATE` - Cannot truncate tables
- `ALTER` - Cannot alter table structure
- `CREATE` - Cannot create new tables
- `REPLACE` - Cannot replace data
- `MERGE` - Cannot merge data
- `EXEC/EXECUTE` - Cannot execute arbitrary code
### 3. Data Chunking
- Maximum 20 results sent to LLM by default (configurable)
- Prevents overwhelming the LLM context
- Protects data privacy
### 4. Logging
- All operations logged to console
- Blocked attempts are logged with warnings
- Full audit trail for security review
## πŸ”§ Architecture
### Components
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Streamlit UI (app.py) β”‚
β”‚ - Business Dashboard β”‚
β”‚ - Chat Interface β”‚
β”‚ - Support Ticket Form β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ AI Agent (agent.py) β”‚
β”‚ - OpenAI GPT-4o Integration β”‚
β”‚ - Function Calling Logic β”‚
β”‚ - Conversation Management β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Agent Tools (tools.py) β”‚
β”‚ - search_phones_by_criteria() β”‚
β”‚ - get_aggregated_statistics() β”‚
β”‚ - get_price_analysis() β”‚
β”‚ - get_available_brands() β”‚
β”‚ - get_dataset_overview() β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Data Manager (database.py) β”‚
β”‚ - Pandas DataFrame Operations β”‚
β”‚ - Safety Checks β”‚
β”‚ - Query Filtering β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ CSV Data Source β”‚
β”‚ Global Mobile Prices 2025.csv β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
Support Ticket Manager (support_ticket.py)
β”‚
β–Ό
Jira API
```
### File Structure
```
GenerativeAI/
β”œβ”€β”€ app.py # Main Streamlit application
β”œβ”€β”€ agent.py # AI agent with function calling
β”œβ”€β”€ database.py # Data management with safety checks
β”œβ”€β”€ tools.py # Function tools for agent
β”œβ”€β”€ support_ticket.py # Jira ticket creation
β”œβ”€β”€ utils.py # Utility functions and logging
β”œβ”€β”€ config.py # Configuration management
β”œβ”€β”€ requirements.txt # Python dependencies
β”œβ”€β”€ .env # Environment variables (create from .env.example)
β”œβ”€β”€ .env.example # Example environment variables
β”œβ”€β”€ Global Mobile Prices 2025.csv # Data source
└── README.md # This file
```
## 🎨 Available Tools (Function Calling)
The AI agent has access to 5 tools:
### 1. search_phones_by_criteria
Search for phones matching specific criteria.
**Parameters:**
- `brand` (string): Phone brand
- `min_price`, `max_price` (number): Price range
- `ram_gb` (integer): RAM size
- `storage_gb` (integer): Storage size
- `os` (string): Operating system
- `has_5g` (boolean): 5G support
- `min_rating` (number): Minimum rating
- `limit` (integer): Max results
**Example Usage:**
```
User: "Find me Samsung phones under $800 with 5G"
Agent calls: search_phones_by_criteria(brand="Samsung", max_price=800, has_5g=True)
```
### 2. get_aggregated_statistics
Get aggregated statistics by grouping.
**Parameters:**
- `group_by` (string): Column to group by
- `metric` (string): Column to calculate
- `aggregation` (string): Type (mean, sum, count, min, max)
**Example Usage:**
```
User: "What's the average price by brand?"
Agent calls: get_aggregated_statistics(group_by="brand", metric="price_usd", aggregation="mean")
```
### 3. get_price_analysis
Get detailed price analysis.
**Parameters:**
- `brand` (string, optional): Filter by brand
- `category` (string, optional): Group by category
**Example Usage:**
```
User: "Analyze Apple phone prices"
Agent calls: get_price_analysis(brand="Apple")
```
### 4. get_available_brands
Get list of all brands.
**Example Usage:**
```
User: "What brands are available?"
Agent calls: get_available_brands()
```
### 5. get_dataset_overview
Get overview statistics.
**Example Usage:**
```
User: "Tell me about the dataset"
Agent calls: get_dataset_overview()
```
## πŸ› Troubleshooting
### Issue: "Failed to load CSV"
**Solution:** Ensure `Global Mobile Prices 2025.csv` exists in the project root directory.
### Issue: "OpenAI API Error"
**Solution:**
- Check your API key is correct in `.env` or sidebar
- Ensure you have GPT-4o access
- Check your OpenAI account has credits
### Issue: "Failed to connect to Jira"
**Solution:**
- Verify Jira server URL is correct (https://your-domain.atlassian.net)
- Check API token is valid
- Ensure project key exists in your Jira instance
### Issue: No data showing in dashboard
**Solution:** Check console logs for errors during CSV loading
### Issue: Agent not responding
**Solution:**
- Check console for error messages
- Verify API key is entered
- Check internet connection
## πŸ“Š Data Schema
The CSV file should contain the following columns:
| Column | Type | Description |
|--------|------|-------------|
| brand | string | Phone manufacturer |
| model | string | Phone model name |
| price_usd | float | Price in USD |
| ram_gb | integer | RAM size in GB |
| storage_gb | integer | Storage size in GB |
| camera_mp | integer | Camera megapixels |
| battery_mah | integer | Battery capacity |
| display_size_inch | float | Screen size |
| charging_watt | integer | Charging power |
| 5g_support | string | "Yes" or "No" |
| os | string | Operating system |
| processor | string | Processor model |
| rating | float | User rating (0-5) |
| release_month | string | Release month |
| year | integer | Release year |
## 🀝 Contributing
To extend the application:
1. **Add new tools**: Add methods to `tools.py` and update `get_tool_definitions()`
2. **Enhance UI**: Modify `app.py` Streamlit components
3. **Add data sources**: Extend `database.py` to support multiple CSVs
4. **Custom safety rules**: Update `DANGEROUS_SQL_KEYWORDS` in `config.py`
## πŸ“ License
This project is for educational and demonstration purposes.
## πŸ™ Acknowledgments
- Built with [Streamlit](https://streamlit.io/)
- Charts by [Plotly](https://plotly.com/)
- Issue tracking by [Jira](https://www.atlassian.com/software/jira)
## πŸ“§ Support
For issues or questions:
1. Use the in-app support ticket feature
2. Check console logs for detailed error messages
3. Review this README for troubleshooting steps
---