Open-Nursing-Validator / PHASE2_ANALYTICS.md
NurseCitizenDeveloper's picture
Deploy Open Nursing Validator (Docker)
6d12932 verified

Phase 2.2: Advanced Analytics Dashboard

Overview

Phase 2.2 transforms the application into an analytics-driven platform with comprehensive insights into system usage, clinical outcomes, compliance, and knowledge gaps.

Prerequisite: Phase 2.1 (PostgreSQL Database) must be completed

Features

1. πŸ“Š Analytics Overview

  • Real-time key metrics
  • Active users count
  • Active sessions
  • Total messages processed
  • 24-hour event summary

2. πŸ“ˆ Usage Analytics

  • Daily active user trends
  • Top users by message count
  • Usage patterns over time
  • Feature adoption metrics

3. πŸ“‹ Compliance Reporting

  • Login/logout audit trails
  • Failed login tracking
  • Data access audit logs
  • HIPAA/GDPR compliance ready
  • 90-day historical reporting

4. πŸ” Knowledge Gap Analysis

  • Question distribution by topic
  • Unanswered query tracking
  • Low-confidence answer detection
  • Content recommendations

5. πŸ₯ Clinical Outcomes

  • Care plan duration metrics
  • Goal achievement rates
  • Patient satisfaction scores
  • Clinical outcome trends

6. πŸ‘₯ User Activity Report

  • Per-user engagement metrics
  • Last login tracking
  • Message and session counts
  • Role-based usage analysis

7. βš™οΈ System Health

  • Database status
  • API response times
  • Vector database readiness
  • System uptime tracking

Setup Instructions

Prerequisites

  1. Phase 2.1 Completed

    • PostgreSQL database running
    • Chat history in database
    • Audit logs populated
  2. Python Packages

    pip install pandas plotly streamlit
    

    (Already in requirements.txt)

Configuration

# Update .env.production
export APP_ENV=production
export USE_DATABASE=true
export DB_HOST=localhost
export DB_PORT=5432
export DB_NAME=nursing_validator
export DB_USER=nursing_admin
export DB_PASSWORD=your_password

Running Analytics

# Create analytics-only dashboard
streamlit run analytics_dashboard.py

# Or add analytics tab to main app
streamlit run app_phase2.py
# Then navigate to Analytics tab (admin only)

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Streamlit Analytics Frontend       β”‚
β”‚   (analytics_dashboard.py)           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Overviewβ”‚Usageβ”‚Complianceβ”‚Gaps...  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚      AnalyticsDashboard Class        β”‚
β”‚   - display_overview()               β”‚
β”‚   - display_usage_dashboard()        β”‚
β”‚   - display_compliance_report()      β”‚
β”‚   - display_knowledge_gaps()         β”‚
β”‚   - display_clinical_outcomes()      β”‚
β”‚   - display_user_activity()          β”‚
β”‚   - display_system_health()          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚        Data Processing Layer         β”‚
β”‚   (pandas + plotly visualization)    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚      Database Query Layer            β”‚
β”‚   (database.py - via SQL)            β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚      PostgreSQL Database             β”‚
β”‚   β”œβ”€ chat_history                   β”‚
β”‚   β”œβ”€ audit_logs                     β”‚
β”‚   β”œβ”€ analytics_events               β”‚
β”‚   β”œβ”€ users                          β”‚
β”‚   └─ sessions                       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

API Reference

AnalyticsDashboard Class

from analytics_dashboard import AnalyticsDashboard

# Initialize
dashboard = AnalyticsDashboard()

# Display sections
dashboard.display_overview()           # Key metrics
dashboard.display_usage_dashboard()    # Usage trends
dashboard.display_compliance_report()  # Audit logs
dashboard.display_knowledge_gaps()     # Content analysis
dashboard.display_clinical_outcomes()  # Patient metrics
dashboard.display_user_activity()      # Per-user stats
dashboard.display_system_health()      # System metrics
dashboard.display_export_options()     # Data export

Integration with Main App

# In app_phase2.py, add analytics tab

with tab_analytics:
    from analytics_dashboard import display_analytics_dashboard
    
    # Only for admin users
    if st.session_state.role == "admin":
        display_analytics_dashboard()
    else:
        st.warning("Analytics available for admins only")

Reports

1. Overview Report

Metrics:

  • Active Users: Count of users with is_active = TRUE
  • Active Sessions: Count where expires_at > NOW
  • Total Messages: COUNT(*) from chat_history
  • Events (24h): Audit logs in last 24 hours
SELECT COUNT(*) FROM users WHERE is_active = TRUE;
SELECT COUNT(*) FROM sessions WHERE expires_at > CURRENT_TIMESTAMP;
SELECT COUNT(*) FROM chat_history;
SELECT COUNT(*) FROM audit_logs 
WHERE created_at > CURRENT_TIMESTAMP - INTERVAL '24 hours';

2. Usage Report

Data:

  • Daily active users (line chart)
  • Top 10 users by message count (bar chart)
  • Time-range filtering (default: 30 days)
-- Daily active users
SELECT DATE(created_at), COUNT(DISTINCT user_id)
FROM chat_history
GROUP BY DATE(created_at);

-- Top users
SELECT u.username, COUNT(*)
FROM chat_history ch
JOIN users u ON ch.user_id = u.id
GROUP BY u.username
ORDER BY COUNT(*) DESC LIMIT 10;

3. Compliance Report

Audit Events:

  • Login/logout tracking
  • Failed login counts
  • Data access by resource
  • 90-day historical logs
  • HIPAA compliance ready
-- Authentication events
SELECT action, COUNT(*), COUNT(DISTINCT user_id)
FROM audit_logs
WHERE action IN ('login', 'logout', 'failed_login')
GROUP BY action;

-- Data access
SELECT resource_type, COUNT(*), COUNT(DISTINCT user_id)
FROM audit_logs
WHERE resource_type IS NOT NULL
GROUP BY resource_type;

4. Knowledge Gap Analysis

Analysis:

  • Question distribution by topic
  • Topics: Care, Assessment, Interventions, Goals, Medications
  • Pie chart of question types
  • Identifies training needs
SELECT
    CASE
        WHEN content ILIKE '%care%' THEN 'Care Planning'
        WHEN content ILIKE '%assessment%' THEN 'Assessment'
        WHEN content ILIKE '%intervention%' THEN 'Interventions'
        WHEN content ILIKE '%goal%' THEN 'Goals'
        WHEN content ILIKE '%medication%' THEN 'Medications'
        ELSE 'Other'
    END as topic,
    COUNT(*)
FROM chat_history
WHERE role = 'user'
GROUP BY topic;

5. User Activity Report

Metrics:

  • Username, role, last login
  • Message count per user
  • Session count
  • Active in last 7 days count
  • Average messages per user
SELECT
    u.username, u.role, u.last_login,
    COUNT(DISTINCT ch.id) as messages,
    COUNT(DISTINCT s.id) as sessions
FROM users u
LEFT JOIN chat_history ch ON u.id = ch.user_id
LEFT JOIN sessions s ON u.id = s.user_id
WHERE u.is_active = TRUE
GROUP BY u.id, u.username, u.role, u.last_login;

Visualizations

Line Chart: Daily Active Users

  • X-axis: Date
  • Y-axis: Number of unique users
  • Time Range: Selectable (default: 30 days)
  • Type: Line chart with markers

Bar Chart: Top Users

  • X-axis: Username
  • Y-axis: Message count
  • Color: Blue gradient (by message count)
  • Limit: Top 10 users

Pie Chart: Question Topics

  • Segments: By topic (Care, Assessment, etc.)
  • Size: Proportion of questions
  • Interactive: Hover for details

DataFrames: Audit Logs

  • Columns: Timestamp, User, Action, Resource, IP
  • Sorting: Reverse chronological
  • Limit: 50 most recent
  • Filterable: By date range

Compliance Features

HIPAA Compliance Ready

  • βœ… Audit trails for all data access
  • βœ… User authentication logging
  • βœ… Encryption ready (TLS)
  • βœ… Data retention policies configurable
  • βœ… IP address logging for accountability

GDPR Compliance Ready

  • βœ… User activity tracking
  • βœ… Data access audit logs
  • βœ… Right to be forgotten support (can delete user)
  • βœ… Data export capabilities
  • βœ… Consent management ready

Audit Trail

All user actions logged with:

  • Timestamp (UTC)
  • User ID
  • Action type
  • Resource type/ID
  • IP address
  • Changes (JSONB format)

Data Export

Export Formats (Ready for Phase 2.3)

CSV Export

df.to_csv('analytics_report.csv', index=False)

PDF Export

# Uses plotly for static image export
fig.write_image("report.pdf")

Excel Export

df.to_excel('analytics_report.xlsx', sheet_name='Analytics')

Sample Export Query

import pandas as pd
from database import get_connection

# Export user activity
with get_connection() as conn:
    df = pd.read_sql_query("""
        SELECT u.username, u.role, COUNT(*) as messages
        FROM chat_history ch
        JOIN users u ON ch.user_id = u.id
        GROUP BY u.id, u.username, u.role
    """, conn)
    
    df.to_excel('user_activity.xlsx')
    df.to_csv('user_activity.csv')

Troubleshooting

"Database required for analytics"

Error: Database module not available
Solution: Install psycopg2-binary
$ pip install psycopg2-binary

No data showing in charts

Possible causes:
1. No chat history yet (new database)
2. Time range filters with no data
3. Database connection issue

Debug:
SELECT COUNT(*) FROM chat_history;
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM audit_logs;

Slow dashboard loading

Solutions:
1. Add indexes (already done in Phase 2.1)
2. Limit time range (default 30 days)
3. Increase database pool size
4. Cache results for 5 minutes

Connection timeouts

Fix timeout:
DB_HOST=localhost
DB_PORT=5432
# Verify PostgreSQL running:
sudo service postgresql status
sudo service postgresql start

Performance Optimization

Database Indexes

Already created in Phase 2.1:

CREATE INDEX idx_chat_history_created_at ON chat_history(created_at);
CREATE INDEX idx_analytics_user_id ON analytics_events(user_id);
CREATE INDEX idx_audit_logs_user_id ON audit_logs(user_id);

Query Caching

@st.cache_data(ttl=300)  # Cache for 5 minutes
def get_dashboard_data():
    # Expensive query here
    pass

Pagination

# Limit audit logs to 50 most recent
LIMIT 50 OFFSET (page - 1) * 50

Next Steps

  1. Deploy Phase 2.2 to staging
  2. Test analytics queries with sample data
  3. Validate compliance reports meet requirements
  4. Proceed to Phase 2.3 - EHR/FHIR Integration

Files Created/Modified

New Files:

  • analytics_dashboard.py (400+ lines)

Integration Points:

  • Update app_phase2.py to add analytics tab
  • Add analytics logging to chat module

Documentation:

  • PHASE2_ANALYTICS.md (this file)

Phase 2.2 Implementation - November 29, 2025