# Visualization Agents

## Overview
This folder contains AI-powered agents that enhance the sentiment analysis dashboard with intelligent, context-aware insights and analysis capabilities.

## Architecture

### Base Agent Pattern
All agents inherit from `BaseVisualizationAgent`, which provides:
- Common interface (`process()`, `validate_input()`)
- Error handling
- Logging functionality
- Consistent configuration
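
As a rough illustration of this pattern, the sketch below shows what such a base class could look like. It is not the project's actual `BaseVisualizationAgent`: the class name `BaseAgentSketch`, the toy `EchoAgent`, and the signatures of `log_processing()` and `handle_error()` are assumptions made for the example.

```python
import logging
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseAgentSketch(ABC):
    """Hypothetical stand-in for BaseVisualizationAgent (not the real class)."""

    def __init__(self, name: str, model: str = "gpt-5-nano", temperature: float = 0.7):
        self.name = name
        self.model = model
        self.temperature = temperature
        self.logger = logging.getLogger(f"agents.{name}")

    @abstractmethod
    def validate_input(self, input_data: Dict[str, Any]) -> bool:
        """Return True when input_data has everything process() needs."""

    @abstractmethod
    def process(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        """Run the agent and return a result dict with a 'success' flag."""

    def log_processing(self, message: str) -> None:
        # Assumed signature: prefix every message with the agent name.
        self.logger.info("[%s] %s", self.name, message)

    def handle_error(self, error: Exception, context: str = "") -> Dict[str, Any]:
        # Assumed signature: log the failure and convert it to an error dict.
        self.logger.error("[%s] %s failed: %s", self.name, context or "processing", error)
        return {"success": False, "error": str(error)}


class EchoAgent(BaseAgentSketch):
    """Toy subclass used only to exercise the shared interface."""

    def validate_input(self, input_data):
        return "text" in input_data

    def process(self, input_data):
        if not self.validate_input(input_data):
            return {"success": False, "error": "Invalid input data"}
        try:
            self.log_processing("echoing input")
            return {"success": True, "result": input_data["text"].upper()}
        except Exception as exc:  # e.g. 'text' is not a string
            return self.handle_error(exc, context="echo")
```

Keeping validation, logging, and error conversion in the base class means every agent returns the same `{'success': ..., 'error': ...}` shape, which the UI can check without knowing which agent ran.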

### LLM Helper
`utils/llm_helper.py` provides:
- OpenAI API integration
- Retry logic with exponential backoff
- JSON mode support
- Token usage tracking
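
The retry behaviour can be pictured with a small, library-independent helper. This is a minimal sketch, not the code in `utils/llm_helper.py`: the function name `call_with_backoff`, the parameter names, and the reading of "up to 3" as three attempts in total are all assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 3,          # assumed: 3 attempts in total
    base_delay: float = 1.0,        # assumed: first wait is 1 second
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), waiting base_delay * 2**attempt seconds between failed attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            # Do not sleep after the final attempt; just fall through and raise.
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    assert last_error is not None
    raise last_error
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production the default `time.sleep` applies.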

## Available Agents

### 1. ContentSummaryAgent

**Purpose**: Analyze and summarize comments for content pieces

**Location**: `agents/content_summary_agent.py`

**Input**:
```python
{
    'content_sk': str,              # Content identifier
    'content_description': str,      # Content title/description
    'comments': DataFrame or list    # Comments data
}
```

**Output**:
```python
{
    'success': bool,
    'content_sk': str,
    'summary': {
        'executive_summary': str,              # 2-3 sentence overview
        'main_themes': [                       # Top themes discussed
            {
                'theme': str,
                'sentiment': str,  # positive/negative/mixed
                'description': str
            }
        ],
        'praise_points': [str],                # What users love
        'key_complaints': [str],               # Main concerns
        'frequently_asked_questions': [str],   # Common questions
        'unexpected_insights': [str],          # Surprising patterns
        'action_recommendations': [            # Suggested actions
            {
                'priority': str,   # high/medium/low
                'action': str
            }
        ]
    },
    'metadata': {
        'total_comments_analyzed': int,
        'model_used': str,
        'tokens_used': int
    }
}
```

**Configuration**:
- Model: `gpt-5-nano` (configurable)
- Temperature: 0.3 (lower for focused summaries)
- Sampling: When a content piece has more than 100 comments, all negative comments are kept and positive/neutral comments are sampled down to at most 50

**Features**:
- **Smart sampling**: Prioritizes negative comments, samples others
- **Context preservation**: Includes sentiment and intent metadata
- **Token optimization**: Truncates long comments to 300 chars
- **Structured output**: JSON format with guaranteed fields
- **Error handling**: Graceful failures with retry capability
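
The sampling and truncation rules can be sketched as follows. This is an illustration under stated assumptions, not the agent's real code: it works on a plain list of dicts, and the keys `text` and `sentiment`, the label `"negative"`, and the function name `sample_comments` are hypothetical.

```python
import random
from typing import Any, Dict, List


def sample_comments(
    comments: List[Dict[str, Any]],
    threshold: int = 100,   # sample only when there are more comments than this
    max_other: int = 50,    # cap on positive/neutral comments
    max_chars: int = 300,   # truncation length per comment
    seed: int = 0,          # fixed seed so repeated runs pick the same sample
) -> List[Dict[str, Any]]:
    """Keep every negative comment, sample the rest, and truncate long text."""
    if len(comments) > threshold:
        negative = [c for c in comments if c.get("sentiment") == "negative"]
        other = [c for c in comments if c.get("sentiment") != "negative"]
        if len(other) > max_other:
            other = random.Random(seed).sample(other, max_other)
        selected = negative + other
    else:
        selected = list(comments)

    # Copy each comment so the caller's data is not modified.
    return [{**c, "text": str(c.get("text", ""))[:max_chars]} for c in selected]
```

For example, 30 negative and 90 positive comments would be reduced to 80 comments (30 negative plus 50 sampled positive), each at most 300 characters long.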

## UI Integration

### Poor Sentiment Contents Page

**Location**: `components/poor_sentiment_contents.py`

**User Flow**:
1. User views content cards on Poor Sentiment Contents page
2. Clicks "🔍 Generate AI Analysis" button
3. Agent processes comments (with spinner indicator)
4. Summary displays in expandable section
5. Result cached in session state

**Display Sections**:
- **Executive Summary**: High-level overview (info box)
- **Main Themes**: Key topics with sentiment indicators
- **Praise Points** ✅ & **Key Complaints** ⚠️ (side-by-side)
- **FAQs** ❓ & **Unexpected Insights** 💡 (side-by-side)
- **Recommended Actions** 🎯 (priority-coded)
- **Analysis Metadata** ℹ️ (expandable details)

**Session Caching**:
- Summaries stored in `st.session_state.content_summaries`
- Key: `content_sk`
- Persists during session, cleared on page reload
- Prevents redundant API calls
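
A minimal sketch of this caching pattern is shown below. The helper name `get_or_create_summary` and its `force` flag are assumptions for illustration; `state` can be any dict-like object, so `st.session_state` can be passed in directly inside a Streamlit component.

```python
from typing import Any, Callable, Dict, MutableMapping


def get_or_create_summary(
    state: MutableMapping[str, Any],
    content_sk: str,
    generate: Callable[[str], Dict[str, Any]],
    force: bool = False,
) -> Dict[str, Any]:
    """Return the cached summary for content_sk, generating it only when needed."""
    # Initialise the cache on first use (see the troubleshooting note on initialization).
    if "content_summaries" not in state:
        state["content_summaries"] = {}
    cache = state["content_summaries"]

    # force=True mirrors a retry: drop the cached entry and regenerate.
    if force:
        cache.pop(content_sk, None)

    if content_sk not in cache:
        cache[content_sk] = generate(content_sk)
    return cache[content_sk]
```

In a component, a call could look like `get_or_create_summary(st.session_state, content_sk, run_agent)`, where `run_agent` is a hypothetical function that builds the input dict and calls `agent.process()`.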

## Usage Example

```python
from agents.content_summary_agent import ContentSummaryAgent
import pandas as pd

# Initialize agent
agent = ContentSummaryAgent(model="gpt-5-nano", temperature=0.3)

# Prepare input
input_data = {
    'content_sk': '12345',
    'content_description': 'Advanced Drum Fills Tutorial',
    'comments': comments_df  # DataFrame with comments
}

# Generate summary
result = agent.process(input_data)

if result['success']:
    summary = result['summary']
    print(summary['executive_summary'])

    for theme in summary['main_themes']:
        print(f"Theme: {theme['theme']} ({theme['sentiment']})")
        print(f"  {theme['description']}")
else:
    print(f"Error: {result['error']}")
```

## Environment Setup

### Required Environment Variables
Add to the `.env` file in the parent directory:
```bash
OPENAI_API_KEY=your_openai_api_key_here
```
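
To fail early with a clear message when the key is missing, a check along these lines can run at startup. It is a sketch only: the function name `get_openai_api_key` is hypothetical, and it assumes the `.env` file has already been loaded into the process environment (for example with `python-dotenv`, which is listed in the dependencies).

```python
import os
from typing import Mapping


def get_openai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenAI API key from env, or raise a descriptive error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY not found; add it to the .env file in the parent directory"
        )
    return key
```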

### Dependencies
All dependencies are already listed in `visualization/requirements.txt`:
- `streamlit>=1.28.0`
- `pandas>=2.0.0`
- `python-dotenv>=1.0.0`
- OpenAI library (inherited from parent project)

## Error Handling

### Agent-Level Errors
- **Invalid input**: Returns `{'success': False, 'error': 'Invalid input data'}`
- **LLM API failure**: Retries up to 3 times with exponential backoff
- **JSON parsing error**: Returns error with raw content
- **Unexpected exception**: Caught, logged, and returned as an error dict
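
The JSON parsing case can be illustrated with a small helper that never raises and keeps the raw model output for debugging. This is a sketch under assumptions: the function name `parse_llm_json` and the `raw_content` key are hypothetical and may differ from what the agent actually returns.

```python
import json
from typing import Any, Dict


def parse_llm_json(raw_content: str) -> Dict[str, Any]:
    """Parse the model's reply as a JSON object, returning an error dict on failure."""
    try:
        parsed = json.loads(raw_content)
    except json.JSONDecodeError as exc:
        return {
            "success": False,
            "error": f"JSON parsing error: {exc}",
            "raw_content": raw_content,  # kept so the bad reply can be inspected
        }

    # Valid JSON that is not an object (e.g. a list) is also treated as a failure.
    if not isinstance(parsed, dict):
        return {
            "success": False,
            "error": "JSON parsing error: expected a JSON object",
            "raw_content": raw_content,
        }
    return {"success": True, "summary": parsed}
```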

### UI-Level Errors
- Displays error message in red box
- Provides "🔄 Retry Analysis" button
- Clears cache and regenerates on retry
- Logs errors to agent logger

## Performance Considerations

### API Costs
- Model: `gpt-5-nano` (cost-effective)
- Sampling strategy: Reduces tokens by up to 50% for large comment sets
- Comment truncation: Max 300 chars per comment
- Session caching: Eliminates duplicate API calls

### Response Time
- Average: 5-10 seconds for 50-100 comments
- Depends on: Comment count, OpenAI API latency
- User feedback: Spinner shows "Analyzing comments with AI..."

### Scalability
- Handles up to 100 comments per analysis (after sampling)
- Parallel requests: Each content analyzed independently
- Session state: Memory usage scales with number of analyzed contents

## Extending Agents

### Adding New Agents

1. **Create agent file**:
```python
# agents/new_agent.py
from agents.base_agent import BaseVisualizationAgent
from utils.llm_helper import LLMHelper

class NewAgent(BaseVisualizationAgent):
    def __init__(self, model="gpt-5-nano", temperature=0.7):
        super().__init__(name="NewAgent", model=model, temperature=temperature)
        self.llm_helper = LLMHelper(model=model, temperature=temperature)

    def validate_input(self, input_data):
        # Validation logic
        return True

    def process(self, input_data):
        # Processing logic
        pass
```

2. **Update `__init__.py`**:
```python
from .new_agent import NewAgent

__all__ = ['ContentSummaryAgent', 'NewAgent']
```

3. **Integrate in UI**:
- Import agent in component file
- Add UI controls (buttons, inputs)
- Display results
- Handle caching if needed

### Best Practices

1. **Input Validation**: Always validate required fields
2. **Error Handling**: Use `handle_error()` method
3. **Logging**: Use `log_processing()` for debugging
4. **Structured Output**: Return consistent dict format
5. **Caching**: Use session state for expensive operations
6. **Token Optimization**: Sample/truncate data for large inputs
7. **User Feedback**: Show spinners for long-running operations
8. **Graceful Degradation**: Provide fallbacks for failures

## Testing

### Manual Testing
1. Start dashboard: `streamlit run app.py`
2. Navigate to "⚠️ Poor Sentiment Contents" page
3. Click "🔍 Generate AI Analysis" for any content
4. Verify summary displays correctly
5. Check session caching (click button again)
6. Test error handling (disconnect network)

### Unit Testing
```python
# tests/test_content_summary_agent.py
import pytest
from agents.content_summary_agent import ContentSummaryAgent

def test_validate_input():
    agent = ContentSummaryAgent()

    # Valid input
    valid_input = {
        'content_sk': '123',
        'content_description': 'Test',
        'comments': []
    }
    assert agent.validate_input(valid_input)

    # Missing required fields
    invalid_input = {'content_sk': '123'}
    assert not agent.validate_input(invalid_input)
```

## Future Enhancements

### Planned Features
1. **Batch Analysis**: Analyze multiple contents at once
2. **Trend Detection**: Compare with historical summaries
3. **Export Summaries**: Download as PDF/CSV
4. **Custom Prompts**: User-defined analysis focus
5. **Multi-language Support**: Summaries in user's language

### Additional Agents (Roadmap)
- **InsightsSummaryAgent**: Overall dataset insights
- **InteractiveChatbotAgent**: Conversational analysis
- **ComparativeContentAgent**: Content comparison
- **ReplySuggestionAgent**: Generate reply suggestions
- **TrendForecastingAgent**: Predict sentiment trends

## Troubleshooting

### Common Issues

**Issue**: `OPENAI_API_KEY not found`
- **Solution**: Add key to `.env` file in parent directory

**Issue**: Import error for `agents` module
- **Solution**: Ensure `__init__.py` exists in `visualization/agents/`

**Issue**: LLM timeout errors
- **Solution**: Reduce comment count or increase retry limit

**Issue**: JSON parsing errors
- **Solution**: Check LLM prompt format, ensure JSON mode enabled

**Issue**: Cached summaries not showing
- **Solution**: Check `st.session_state.content_summaries` initialization

## Support

For issues or questions:
1. Check this README
2. Review agent logs in console
3. Inspect session state in Streamlit
4. Verify environment variables
5. Check OpenAI API status

## Version History

### v1.0.0 (Current)
- Initial release
- ContentSummaryAgent implementation
- Poor Sentiment Contents page integration
- Session-based caching
- Error handling and retry logic
- Comprehensive UI display