Phase 2.2: Advanced Analytics Dashboard
Overview
Phase 2.2 transforms the application into an analytics-driven platform with comprehensive insights into system usage, clinical outcomes, compliance, and knowledge gaps.
Prerequisite: Phase 2.1 (PostgreSQL Database) must be completed
Features
1. π Analytics Overview
- Real-time key metrics
- Active users count
- Active sessions
- Total messages processed
- 24-hour event summary
2. π Usage Analytics
- Daily active user trends
- Top users by message count
- Usage patterns over time
- Feature adoption metrics
3. π Compliance Reporting
- Login/logout audit trails
- Failed login tracking
- Data access audit logs
- HIPAA/GDPR compliance ready
- 90-day historical reporting
4. π Knowledge Gap Analysis
- Question distribution by topic
- Unanswered query tracking
- Low-confidence answer detection
- Content recommendations
5. π₯ Clinical Outcomes
- Care plan duration metrics
- Goal achievement rates
- Patient satisfaction scores
- Clinical outcome trends
6. π₯ User Activity Report
- Per-user engagement metrics
- Last login tracking
- Message and session counts
- Role-based usage analysis
7. βοΈ System Health
- Database status
- API response times
- Vector database readiness
- System uptime tracking
Setup Instructions
Prerequisites
Phase 2.1 Completed
- PostgreSQL database running
- Chat history in database
- Audit logs populated
Python Packages
pip install pandas plotly streamlit(Already in requirements.txt)
Configuration
# Update .env.production
export APP_ENV=production
export USE_DATABASE=true
export DB_HOST=localhost
export DB_PORT=5432
export DB_NAME=nursing_validator
export DB_USER=nursing_admin
export DB_PASSWORD=your_password
Running Analytics
# Create analytics-only dashboard
streamlit run analytics_dashboard.py
# Or add analytics tab to main app
streamlit run app_phase2.py
# Then navigate to Analytics tab (admin only)
Architecture
ββββββββββββββββββββββββββββββββββββββββ
β Streamlit Analytics Frontend β
β (analytics_dashboard.py) β
ββββββββββββββββββββββββββββββββββββββββ€
β OverviewβUsageβComplianceβGaps... β
ββββββββββββββββββββββββββββββββββββββββ€
β AnalyticsDashboard Class β
β - display_overview() β
β - display_usage_dashboard() β
β - display_compliance_report() β
β - display_knowledge_gaps() β
β - display_clinical_outcomes() β
β - display_user_activity() β
β - display_system_health() β
ββββββββββββββββββββββββββββββββββββββββ€
β Data Processing Layer β
β (pandas + plotly visualization) β
ββββββββββββββββββββββββββββββββββββββββ€
β Database Query Layer β
β (database.py - via SQL) β
ββββββββββββββββββββββββββββββββββββββββ€
β PostgreSQL Database β
β ββ chat_history β
β ββ audit_logs β
β ββ analytics_events β
β ββ users β
β ββ sessions β
ββββββββββββββββββββββββββββββββββββββββ
API Reference
AnalyticsDashboard Class
from analytics_dashboard import AnalyticsDashboard
# Initialize
dashboard = AnalyticsDashboard()
# Display sections
dashboard.display_overview() # Key metrics
dashboard.display_usage_dashboard() # Usage trends
dashboard.display_compliance_report() # Audit logs
dashboard.display_knowledge_gaps() # Content analysis
dashboard.display_clinical_outcomes() # Patient metrics
dashboard.display_user_activity() # Per-user stats
dashboard.display_system_health() # System metrics
dashboard.display_export_options() # Data export
Integration with Main App
# In app_phase2.py, add analytics tab
with tab_analytics:
from analytics_dashboard import display_analytics_dashboard
# Only for admin users
if st.session_state.role == "admin":
display_analytics_dashboard()
else:
st.warning("Analytics available for admins only")
Reports
1. Overview Report
Metrics:
- Active Users: Count of users with is_active = TRUE
- Active Sessions: Count where expires_at > NOW
- Total Messages: COUNT(*) from chat_history
- Events (24h): Audit logs in last 24 hours
SELECT COUNT(*) FROM users WHERE is_active = TRUE;
SELECT COUNT(*) FROM sessions WHERE expires_at > CURRENT_TIMESTAMP;
SELECT COUNT(*) FROM chat_history;
SELECT COUNT(*) FROM audit_logs
WHERE created_at > CURRENT_TIMESTAMP - INTERVAL '24 hours';
2. Usage Report
Data:
- Daily active users (line chart)
- Top 10 users by message count (bar chart)
- Time-range filtering (default: 30 days)
-- Daily active users
SELECT DATE(created_at), COUNT(DISTINCT user_id)
FROM chat_history
GROUP BY DATE(created_at);
-- Top users
SELECT u.username, COUNT(*)
FROM chat_history ch
JOIN users u ON ch.user_id = u.id
GROUP BY u.username
ORDER BY COUNT(*) DESC LIMIT 10;
3. Compliance Report
Audit Events:
- Login/logout tracking
- Failed login counts
- Data access by resource
- 90-day historical logs
- HIPAA compliance ready
-- Authentication events
SELECT action, COUNT(*), COUNT(DISTINCT user_id)
FROM audit_logs
WHERE action IN ('login', 'logout', 'failed_login')
GROUP BY action;
-- Data access
SELECT resource_type, COUNT(*), COUNT(DISTINCT user_id)
FROM audit_logs
WHERE resource_type IS NOT NULL
GROUP BY resource_type;
4. Knowledge Gap Analysis
Analysis:
- Question distribution by topic
- Topics: Care, Assessment, Interventions, Goals, Medications
- Pie chart of question types
- Identifies training needs
SELECT
CASE
WHEN content ILIKE '%care%' THEN 'Care Planning'
WHEN content ILIKE '%assessment%' THEN 'Assessment'
WHEN content ILIKE '%intervention%' THEN 'Interventions'
WHEN content ILIKE '%goal%' THEN 'Goals'
WHEN content ILIKE '%medication%' THEN 'Medications'
ELSE 'Other'
END as topic,
COUNT(*)
FROM chat_history
WHERE role = 'user'
GROUP BY topic;
5. User Activity Report
Metrics:
- Username, role, last login
- Message count per user
- Session count
- Active in last 7 days count
- Average messages per user
SELECT
u.username, u.role, u.last_login,
COUNT(DISTINCT ch.id) as messages,
COUNT(DISTINCT s.id) as sessions
FROM users u
LEFT JOIN chat_history ch ON u.id = ch.user_id
LEFT JOIN sessions s ON u.id = s.user_id
WHERE u.is_active = TRUE
GROUP BY u.id, u.username, u.role, u.last_login;
Visualizations
Line Chart: Daily Active Users
- X-axis: Date
- Y-axis: Number of unique users
- Time Range: Selectable (default: 30 days)
- Type: Line chart with markers
Bar Chart: Top Users
- X-axis: Username
- Y-axis: Message count
- Color: Blue gradient (by message count)
- Limit: Top 10 users
Pie Chart: Question Topics
- Segments: By topic (Care, Assessment, etc.)
- Size: Proportion of questions
- Interactive: Hover for details
DataFrames: Audit Logs
- Columns: Timestamp, User, Action, Resource, IP
- Sorting: Reverse chronological
- Limit: 50 most recent
- Filterable: By date range
Compliance Features
HIPAA Compliance Ready
- β Audit trails for all data access
- β User authentication logging
- β Encryption ready (TLS)
- β Data retention policies configurable
- β IP address logging for accountability
GDPR Compliance Ready
- β User activity tracking
- β Data access audit logs
- β Right to be forgotten support (can delete user)
- β Data export capabilities
- β Consent management ready
Audit Trail
All user actions logged with:
- Timestamp (UTC)
- User ID
- Action type
- Resource type/ID
- IP address
- Changes (JSONB format)
Data Export
Export Formats (Ready for Phase 2.3)
CSV Export
df.to_csv('analytics_report.csv', index=False)
PDF Export
# Uses plotly for static image export
fig.write_image("report.pdf")
Excel Export
df.to_excel('analytics_report.xlsx', sheet_name='Analytics')
Sample Export Query
import pandas as pd
from database import get_connection
# Export user activity
with get_connection() as conn:
df = pd.read_sql_query("""
SELECT u.username, u.role, COUNT(*) as messages
FROM chat_history ch
JOIN users u ON ch.user_id = u.id
GROUP BY u.id, u.username, u.role
""", conn)
df.to_excel('user_activity.xlsx')
df.to_csv('user_activity.csv')
Troubleshooting
"Database required for analytics"
Error: Database module not available
Solution: Install psycopg2-binary
$ pip install psycopg2-binary
No data showing in charts
Possible causes:
1. No chat history yet (new database)
2. Time range filters with no data
3. Database connection issue
Debug:
SELECT COUNT(*) FROM chat_history;
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM audit_logs;
Slow dashboard loading
Solutions:
1. Add indexes (already done in Phase 2.1)
2. Limit time range (default 30 days)
3. Increase database pool size
4. Cache results for 5 minutes
Connection timeouts
Fix timeout:
DB_HOST=localhost
DB_PORT=5432
# Verify PostgreSQL running:
sudo service postgresql status
sudo service postgresql start
Performance Optimization
Database Indexes
Already created in Phase 2.1:
CREATE INDEX idx_chat_history_created_at ON chat_history(created_at);
CREATE INDEX idx_analytics_user_id ON analytics_events(user_id);
CREATE INDEX idx_audit_logs_user_id ON audit_logs(user_id);
Query Caching
@st.cache_data(ttl=300) # Cache for 5 minutes
def get_dashboard_data():
# Expensive query here
pass
Pagination
# Limit audit logs to 50 most recent
LIMIT 50 OFFSET (page - 1) * 50
Next Steps
- Deploy Phase 2.2 to staging
- Test analytics queries with sample data
- Validate compliance reports meet requirements
- Proceed to Phase 2.3 - EHR/FHIR Integration
Files Created/Modified
New Files:
analytics_dashboard.py(400+ lines)
Integration Points:
- Update
app_phase2.pyto add analytics tab - Add analytics logging to chat module
Documentation:
PHASE2_ANALYTICS.md(this file)
Phase 2.2 Implementation - November 29, 2025