genAI-demo / src /README.md
Nazim Tairov
initial commit
b821944

πŸ“± Data Insights App with AI Agent

An intelligent data insights application that helps users query and analyze mobile phone data using an AI agent. The application features function calling, safety mechanisms, Jira integration for support tickets, and a modern Streamlit UI.

🌟 Features

  • AI-Powered Assistant: Uses OpenAI GPT-4o with function calling to intelligently query data
  • Business Dashboard: Visual metrics and charts showing key business insights
  • Safety Features: Blocks all write operations (INSERT, UPDATE, DELETE, DROP, TRUNCATE, ALTER)
  • Support Ticket Integration: Jira integration for creating support tickets
  • Comprehensive Logging: Detailed console logging for all operations
  • Data Privacy: Only limited data chunks are sent to LLM, not the entire dataset
  • Interactive UI: Built with Streamlit for a modern, responsive interface

πŸ“‹ Requirements

  • Python 3.12+
  • OpenAI API key (GPT-5-nano access)
  • Jira account and API token (for support ticket feature)
  • CSV data file: Global Mobile Prices 2025.csv

πŸš€ Installation

Step 1: Clone or Download the Project

cd /path/to/project/GenerativeAI

Step 2: Install Dependencies

uv venv python 3.12
uv sync

The pyproject.toml includes:

  • streamlit - Web UI framework
  • openai - OpenAI API client
  • pandas - Data manipulation
  • plotly - Interactive charts
  • python-dotenv - Environment variable management
  • jira - Jira API integration
  • numpy - Numerical operations

Step 3: Configure Environment Variables

Create a .env file in the project root:

cp .env.example .env

Edit .env and add your credentials:

# OpenAI API Configuration
OPENAI_API_KEY=sk-your-openai-api-key-here

# Jira Configuration
JIRA_SERVER=https://your-domain.atlassian.net
JIRA_EMAIL=your_email@example.com
JIRA_API_TOKEN=your_jira_api_token_here
JIRA_PROJECT_KEY=SUP

# Application Configuration
CSV_FILE_PATH=Global Mobile Prices 2025.csv
MAX_RESULTS_TO_LLM=20

Getting API Credentials:

OpenAI API Key:

  1. Go to https://platform.openai.com/api-keys
  2. Create a new API key
  3. Copy and paste into .env

Jira API Token:

  1. Go to https://id.atlassian.com/manage-profile/security/api-tokens
  2. Create a new API token
  3. Copy the token to .env
  4. Set your Jira server URL and email

Step 4: Prepare Data File

Ensure Global Mobile Prices 2025.csv is in the project root directory. The CSV should have the following columns:

  • brand, model, price_usd, ram_gb, storage_gb, camera_mp, battery_mah
  • display_size_inch, charging_watt, 5g_support, os, processor
  • rating, release_month, year

🎯 Usage

Running the Application

streamlit run app.py

The application will open in your default browser at http://localhost:8501

Console Logs

The application logs all operations to the console. You'll see:

  • Function calls made by the AI agent
  • Query execution and results
  • Safety checks and blocked operations
  • Error messages and warnings

Example log output:

2025-11-16 10:30:15 - database - INFO - DataManager initialized with 500 rows
2025-11-16 10:30:20 - agent - INFO - User message: What are the top 5 most expensive phones?
2025-11-16 10:30:21 - tools - INFO - Function called: search_phones_by_criteria
2025-11-16 10:30:21 - tools - INFO - Parameters: {'limit': 5}
2025-11-16 10:30:21 - database - INFO - Filter returned 5 results (limit: 5)

πŸ“– Workflow Example

1. Application Startup

When you launch the app, you'll see:

  • Business Overview Dashboard: Shows total phones, average price, price range, and number of brands
  • Data Visualizations: Charts showing price distribution, top brands, OS distribution, and 5G support
  • Sample Queries: Pre-defined example queries to help you get started

App Startup - Dashboard with metrics and charts

2. Using Sample Queries

Click on any sample query button (e.g., "What are the top 5 most expensive phones?"):

Sample Query Click

3. AI Agent Processing

The agent processes your query using function calling:

  • Agent analyzes the user's intent
  • Selects appropriate tool(s) to use
  • Executes function calls with parameters
  • Returns formatted results

Agent Response with Results

Console Output:

2025-11-16 10:35:42 - agent - INFO - User message: What are the top 5 most expensive phones?
2025-11-16 10:35:43 - agent - INFO - Agent making 1 tool call(s)
2025-11-16 10:35:43 - tools - INFO - Function called: search_phones_by_criteria
2025-11-16 10:35:43 - tools - INFO - Parameters: {'limit': 5}
2025-11-16 10:35:43 - database - INFO - Filtering data with criteria: {}
2025-11-16 10:35:43 - database - INFO - Filter returned 5 results (limit: 5)
2025-11-16 10:35:44 - agent - INFO - Agent final response: Here are the top 5 most expensive phones...

4. Custom Queries

Type your own questions in the chat input:

  • "Show me Apple phones under $1000"
  • "What's the average price by brand?"
  • "Find phones with 5G support and 12GB RAM"
  • "Compare average prices between Android and iOS"

Custom Query Example

5. Viewing Function Calls

Expand the "View Function Calls" section to see technical details:

Function Call Details

6. Creating Support Tickets

When the agent suggests human help or you explicitly request it:

  1. The agent detects keywords like "help", "support", "bug", "issue"
  2. A suggestion to create a ticket appears
  3. Click "Create Support Ticket"
  4. Fill in the form with your issue details
  5. Submit to create a Jira ticket

Support Ticket Creation

Console Output:

2025-11-16 10:40:12 - support_ticket - INFO - Creating support ticket: Unable to find specific data
2025-11-16 10:40:13 - support_ticket - INFO - Successfully connected to Jira: https://your-domain.atlassian.net
2025-11-16 10:40:14 - support_ticket - INFO - Successfully created ticket: SUP-123

Ticket Success Message

7. Safety Features in Action

If someone tries dangerous operations (simulation):

# Example blocked operation
query = "DELETE FROM phones WHERE price > 1000"

Console Output:

2025-11-16 10:45:30 - utils - WARNING - BLOCKED OPERATION: DELETE FROM phones WHERE price > 1000
2025-11-16 10:45:30 - utils - WARNING - Reason: Contains dangerous keyword: DELETE

πŸ›‘οΈ Safety Features

The application implements multiple safety layers:

1. Read-Only Operations

  • Only SELECT/read operations are allowed
  • All write operations are blocked

2. Blocked SQL Keywords

The following operations are prevented:

  • INSERT - Cannot add new records
  • UPDATE - Cannot modify existing records
  • DELETE - Cannot delete records
  • DROP - Cannot drop tables
  • TRUNCATE - Cannot truncate tables
  • ALTER - Cannot alter table structure
  • CREATE - Cannot create new tables
  • REPLACE - Cannot replace data
  • MERGE - Cannot merge data
  • EXEC/EXECUTE - Cannot execute arbitrary code

3. Data Chunking

  • Maximum 20 results sent to LLM by default (configurable)
  • Prevents overwhelming the LLM context
  • Protects data privacy

4. Logging

  • All operations logged to console
  • Blocked attempts are logged with warnings
  • Full audit trail for security review

πŸ”§ Architecture

Components

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Streamlit UI (app.py)                 β”‚
β”‚  - Business Dashboard                                    β”‚
β”‚  - Chat Interface                                        β”‚
β”‚  - Support Ticket Form                                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚
                 β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              AI Agent (agent.py)                         β”‚
β”‚  - OpenAI GPT-4o Integration                            β”‚
β”‚  - Function Calling Logic                               β”‚
β”‚  - Conversation Management                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚
                 β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Agent Tools (tools.py)                      β”‚
β”‚  - search_phones_by_criteria()                          β”‚
β”‚  - get_aggregated_statistics()                          β”‚
β”‚  - get_price_analysis()                                 β”‚
β”‚  - get_available_brands()                               β”‚
β”‚  - get_dataset_overview()                               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚
                 β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚          Data Manager (database.py)                      β”‚
β”‚  - Pandas DataFrame Operations                          β”‚
β”‚  - Safety Checks                                        β”‚
β”‚  - Query Filtering                                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β”‚
                 β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              CSV Data Source                             β”‚
β”‚  Global Mobile Prices 2025.csv                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

        Support Ticket Manager (support_ticket.py)
                    β”‚
                    β–Ό
              Jira API

File Structure

GenerativeAI/
β”œβ”€β”€ app.py                    # Main Streamlit application
β”œβ”€β”€ agent.py                  # AI agent with function calling
β”œβ”€β”€ database.py               # Data management with safety checks
β”œβ”€β”€ tools.py                  # Function tools for agent
β”œβ”€β”€ support_ticket.py         # Jira ticket creation
β”œβ”€β”€ utils.py                  # Utility functions and logging
β”œβ”€β”€ config.py                 # Configuration management
β”œβ”€β”€ requirements.txt          # Python dependencies
β”œβ”€β”€ .env                      # Environment variables (create from .env.example)
β”œβ”€β”€ .env.example              # Example environment variables
β”œβ”€β”€ Global Mobile Prices 2025.csv  # Data source
└── README.md                 # This file

🎨 Available Tools (Function Calling)

The AI agent has access to 5 tools:

1. search_phones_by_criteria

Search for phones matching specific criteria.

Parameters:

  • brand (string): Phone brand
  • min_price, max_price (number): Price range
  • ram_gb (integer): RAM size
  • storage_gb (integer): Storage size
  • os (string): Operating system
  • has_5g (boolean): 5G support
  • min_rating (number): Minimum rating
  • limit (integer): Max results

Example Usage:

User: "Find me Samsung phones under $800 with 5G"
Agent calls: search_phones_by_criteria(brand="Samsung", max_price=800, has_5g=True)

2. get_aggregated_statistics

Get aggregated statistics by grouping.

Parameters:

  • group_by (string): Column to group by
  • metric (string): Column to calculate
  • aggregation (string): Type (mean, sum, count, min, max)

Example Usage:

User: "What's the average price by brand?"
Agent calls: get_aggregated_statistics(group_by="brand", metric="price_usd", aggregation="mean")

3. get_price_analysis

Get detailed price analysis.

Parameters:

  • brand (string, optional): Filter by brand
  • category (string, optional): Group by category

Example Usage:

User: "Analyze Apple phone prices"
Agent calls: get_price_analysis(brand="Apple")

4. get_available_brands

Get list of all brands.

Example Usage:

User: "What brands are available?"
Agent calls: get_available_brands()

5. get_dataset_overview

Get overview statistics.

Example Usage:

User: "Tell me about the dataset"
Agent calls: get_dataset_overview()

πŸ› Troubleshooting

Issue: "Failed to load CSV"

Solution: Ensure Global Mobile Prices 2025.csv exists in the project root directory.

Issue: "OpenAI API Error"

Solution:

  • Check your API key is correct in .env or sidebar
  • Ensure you have GPT-4o access
  • Check your OpenAI account has credits

Issue: "Failed to connect to Jira"

Solution:

Issue: No data showing in dashboard

Solution: Check console logs for errors during CSV loading

Issue: Agent not responding

Solution:

  • Check console for error messages
  • Verify API key is entered
  • Check internet connection

πŸ“Š Data Schema

The CSV file should contain the following columns:

Column Type Description
brand string Phone manufacturer
model string Phone model name
price_usd float Price in USD
ram_gb integer RAM size in GB
storage_gb integer Storage size in GB
camera_mp integer Camera megapixels
battery_mah integer Battery capacity
display_size_inch float Screen size
charging_watt integer Charging power
5g_support string "Yes" or "No"
os string Operating system
processor string Processor model
rating float User rating (0-5)
release_month string Release month
year integer Release year

🀝 Contributing

To extend the application:

  1. Add new tools: Add methods to tools.py and update get_tool_definitions()
  2. Enhance UI: Modify app.py Streamlit components
  3. Add data sources: Extend database.py to support multiple CSVs
  4. Custom safety rules: Update DANGEROUS_SQL_KEYWORDS in config.py

πŸ“ License

This project is for educational and demonstration purposes.

πŸ™ Acknowledgments

πŸ“§ Support

For issues or questions:

  1. Use the in-app support ticket feature
  2. Check console logs for detailed error messages
  3. Review this README for troubleshooting steps