# Visualization Agents
## Overview
This folder contains AI-powered agents that enhance the sentiment analysis dashboard with intelligent, context-aware insights and analysis capabilities.
## Architecture
### Base Agent Pattern
All agents inherit from `BaseVisualizationAgent` which provides:
- Common interface (`process()`, `validate_input()`)
- Error handling
- Logging functionality
- Consistent configuration
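The base class itself is not reproduced here, but a minimal sketch of the pattern might look like the following (the `handle_error` signature, logger setup, and constructor defaults are illustrative assumptions; only the class name, `process()`, `validate_input()`, and `handle_error()` come from this document):

```python
import logging
from abc import ABC, abstractmethod

class BaseVisualizationAgent(ABC):
    """Shared interface, logging, and error handling for all agents."""

    def __init__(self, name, model="gpt-5-nano", temperature=0.3):
        self.name = name
        self.model = model
        self.temperature = temperature
        self.logger = logging.getLogger(name)

    @abstractmethod
    def validate_input(self, input_data):
        """Return True if input_data contains all required fields."""

    @abstractmethod
    def process(self, input_data):
        """Run the agent and return a result dict with a 'success' flag."""

    def handle_error(self, error):
        """Log the error and return a consistent failure dict."""
        self.logger.error("%s failed: %s", self.name, error)
        return {'success': False, 'error': str(error)}
```

Concrete agents implement `validate_input()` and `process()` and delegate failures to `handle_error()`, so every agent returns the same result shape.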
### LLM Helper
`utils/llm_helper.py` provides:
- OpenAI API integration
- Retry logic with exponential backoff
- JSON mode support
- Token usage tracking
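The retry behavior can be illustrated with a standalone sketch (the real `LLMHelper` wraps OpenAI API calls; `call_with_retry` and its parameters are illustrative, not the actual helper API):

```python
import time
import random

def call_with_retry(fn, max_retries=3, base_delay=1.0):
    """Call fn(), retrying on failure with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            # 1s, 2s, 4s, ... plus a little jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Transient API errors (rate limits, timeouts) usually succeed on a later attempt; persistent errors are re-raised after the final retry.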
## Available Agents
### 1. ContentSummaryAgent
**Purpose**: Analyze and summarize comments for content pieces
**Location**: `agents/content_summary_agent.py`
**Input**:
```python
{
    'content_sk': str,             # Content identifier
    'content_description': str,    # Content title/description
    'comments': DataFrame or list  # Comments data
}
**Output**:
```python
{
    'success': bool,
    'content_sk': str,
    'summary': {
        'executive_summary': str,             # 2-3 sentence overview
        'main_themes': [                      # Top themes discussed
            {
                'theme': str,
                'sentiment': str,             # positive/negative/mixed
                'description': str
            }
        ],
        'praise_points': [str],               # What users love
        'key_complaints': [str],              # Main concerns
        'frequently_asked_questions': [str],  # Common questions
        'unexpected_insights': [str],         # Surprising patterns
        'action_recommendations': [           # Suggested actions
            {
                'priority': str,              # high/medium/low
                'action': str
            }
        ]
    },
    'metadata': {
        'total_comments_analyzed': int,
        'model_used': str,
        'tokens_used': int
    }
}
```
**Configuration**:
- Model: `gpt-5-nano` (configurable)
- Temperature: 0.3 (lower for focused summaries)
- Sampling: All negative comments + up to 50 positive/neutral (if >100 total)
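The sampling rule above (keep every negative comment, cap the rest at 50 when there are more than 100 in total, truncate each comment to 300 characters) might be sketched like this; the `sentiment` and `text` field names are assumptions for illustration:

```python
def sample_comments(comments, max_other=50, threshold=100, max_chars=300):
    """Keep all negative comments; cap positive/neutral comments when the
    set is large. Truncate long comments to bound token usage."""
    negative = [c for c in comments if c['sentiment'] == 'negative']
    other = [c for c in comments if c['sentiment'] != 'negative']
    if len(comments) > threshold:
        other = other[:max_other]
    sampled = negative + other
    for c in sampled:
        c['text'] = c['text'][:max_chars]
    return sampled
```

Because the dashboard focuses on poor-sentiment content, negative comments are never dropped; only the positive/neutral remainder is sampled.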
**Features**:
- **Smart sampling**: Prioritizes negative comments, samples others
- **Context preservation**: Includes sentiment and intent metadata
- **Token optimization**: Truncates long comments to 300 chars
- **Structured output**: JSON format with guaranteed fields
- **Error handling**: Graceful failures with retry capability
## UI Integration
### Poor Sentiment Contents Page
**Location**: `components/poor_sentiment_contents.py`
**User Flow**:
1. User views content cards on Poor Sentiment Contents page
2. Clicks "🔍 Generate AI Analysis" button
3. Agent processes comments (with spinner indicator)
4. Summary displays in expandable section
5. Result cached in session state
**Display Sections**:
- **Executive Summary**: High-level overview (info box)
- **Main Themes**: Key topics with sentiment indicators
- **Praise Points** ✅ & **Key Complaints** ⚠️ (side-by-side)
- **FAQs** ❓ & **Unexpected Insights** 💡 (side-by-side)
- **Recommended Actions** 🎯 (priority-coded)
- **Analysis Metadata** ℹ️ (expandable details)
**Session Caching**:
- Summaries stored in `st.session_state.content_summaries`
- Key: `content_sk`
- Persists during session, cleared on page reload
- Prevents redundant API calls
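The caching pattern reduces to a single lookup-or-generate step. In the sketch below, a plain dict stands in for `st.session_state.content_summaries` and `generate` wraps the `agent.process()` call (the helper name is illustrative):

```python
def get_or_generate_summary(cache, content_sk, generate):
    """Return the cached summary for content_sk, calling generate()
    (and caching the result) only on the first request."""
    if content_sk not in cache:
        cache[content_sk] = generate()
    return cache[content_sk]
```

Repeated button clicks for the same content therefore hit the cache instead of the OpenAI API until the page is reloaded.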
## Usage Example
```python
from agents.content_summary_agent import ContentSummaryAgent
import pandas as pd

# Initialize agent
agent = ContentSummaryAgent(model="gpt-5-nano", temperature=0.3)

# Prepare input
input_data = {
    'content_sk': '12345',
    'content_description': 'Advanced Drum Fills Tutorial',
    'comments': comments_df  # DataFrame with comments
}

# Generate summary
result = agent.process(input_data)

if result['success']:
    summary = result['summary']
    print(summary['executive_summary'])
    for theme in summary['main_themes']:
        print(f"Theme: {theme['theme']} ({theme['sentiment']})")
        print(f"  {theme['description']}")
else:
    print(f"Error: {result['error']}")
```
## Environment Setup
### Required Environment Variables
Add to `.env` file (parent directory):
```bash
OPENAI_API_KEY=your_openai_api_key_here
```
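A fail-fast check at startup surfaces a missing key immediately rather than at the first API call. The `require_api_key` helper below is illustrative, not part of the codebase:

```python
import os

def require_api_key():
    """Fail fast with a clear message if OPENAI_API_KEY is not set."""
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY not found; add it to the .env file "
            "in the parent directory"
        )
    return key
```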
### Dependencies
All dependencies are already listed in `visualization/requirements.txt`:
- `streamlit>=1.28.0`
- `pandas>=2.0.0`
- `python-dotenv>=1.0.0`
- OpenAI library (inherited from parent project)
## Error Handling
### Agent-Level Errors
- **Invalid input**: Returns `{'success': False, 'error': 'Invalid input data'}`
- **LLM API failure**: Retries up to 3 times with exponential backoff
- **JSON parsing error**: Returns error with raw content
- **Exception**: Catches all exceptions, logs, returns error dict
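The JSON-parsing case, which preserves the raw LLM output for debugging, might be handled like this (`parse_llm_json` is an illustrative helper, not the agent's actual method):

```python
import json

def parse_llm_json(raw):
    """Parse the LLM response as JSON; on failure, return an error dict
    that keeps the raw content for debugging."""
    try:
        return {'success': True, 'data': json.loads(raw)}
    except json.JSONDecodeError as e:
        return {
            'success': False,
            'error': f'JSON parsing error: {e}',
            'raw_content': raw,
        }
```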
### UI-Level Errors
- Displays error message in red box
- Provides "🔄 Retry Analysis" button
- Clears cache and regenerates on retry
- Logs errors to agent logger
## Performance Considerations
### API Costs
- Model: `gpt-5-nano` (cost-effective)
- Sampling strategy: Reduces tokens by up to 50% for large comment sets
- Comment truncation: Max 300 chars per comment
- Session caching: Eliminates duplicate API calls
### Response Time
- Average: 5-10 seconds for 50-100 comments
- Depends on: Comment count, OpenAI API latency
- User feedback: Spinner shows "Analyzing comments with AI..."
### Scalability
- Handles up to 100 comments per analysis (after sampling)
- Parallel requests: Each content analyzed independently
- Session state: Memory usage scales with number of analyzed contents
## Extending Agents
### Adding New Agents
1. **Create agent file**:
```python
# agents/new_agent.py
from agents.base_agent import BaseVisualizationAgent
from utils.llm_helper import LLMHelper

class NewAgent(BaseVisualizationAgent):
    def __init__(self, model="gpt-5-nano", temperature=0.7):
        super().__init__(name="NewAgent", model=model, temperature=temperature)
        self.llm_helper = LLMHelper(model=model, temperature=temperature)

    def validate_input(self, input_data):
        # Validation logic
        return True

    def process(self, input_data):
        # Processing logic
        pass
```
2. **Update `__init__.py`**:
```python
from .new_agent import NewAgent
__all__ = ['ContentSummaryAgent', 'NewAgent']
```
3. **Integrate in UI**:
- Import agent in component file
- Add UI controls (buttons, inputs)
- Display results
- Handle caching if needed
### Best Practices
1. **Input Validation**: Always validate required fields
2. **Error Handling**: Use `handle_error()` method
3. **Logging**: Use `log_processing()` for debugging
4. **Structured Output**: Return consistent dict format
5. **Caching**: Use session state for expensive operations
6. **Token Optimization**: Sample/truncate data for large inputs
7. **User Feedback**: Show spinners for async operations
8. **Graceful Degradation**: Provide fallbacks for failures
## Testing
### Manual Testing
1. Start dashboard: `streamlit run app.py`
2. Navigate to "⚠️ Poor Sentiment Contents" page
3. Click "🔍 Generate AI Analysis" for any content
4. Verify summary displays correctly
5. Check session caching (click button again)
6. Test error handling (disconnect network)
### Unit Testing
```python
# tests/test_content_summary_agent.py
import pytest
from agents.content_summary_agent import ContentSummaryAgent

def test_validate_input():
    agent = ContentSummaryAgent()

    # Valid input
    valid_input = {
        'content_sk': '123',
        'content_description': 'Test',
        'comments': []
    }
    assert agent.validate_input(valid_input) is True

    # Missing field
    invalid_input = {'content_sk': '123'}
    assert agent.validate_input(invalid_input) is False
```
## Future Enhancements
### Planned Features
1. **Batch Analysis**: Analyze multiple contents at once
2. **Trend Detection**: Compare with historical summaries
3. **Export Summaries**: Download as PDF/CSV
4. **Custom Prompts**: User-defined analysis focus
5. **Multi-language Support**: Summaries in user's language
### Additional Agents (Roadmap)
- **InsightsSummaryAgent**: Overall dataset insights
- **InteractiveChatbotAgent**: Conversational analysis
- **ComparativeContentAgent**: Content comparison
- **ReplySuggestionAgent**: Generate reply suggestions
- **TrendForecastingAgent**: Predict sentiment trends
## Troubleshooting
### Common Issues
**Issue**: `OPENAI_API_KEY not found`
- **Solution**: Add key to `.env` file in parent directory
**Issue**: Import error for `agents` module
- **Solution**: Ensure `__init__.py` exists in `visualization/agents/`
**Issue**: LLM timeout errors
- **Solution**: Reduce comment count or increase retry limit
**Issue**: JSON parsing errors
- **Solution**: Check LLM prompt format, ensure JSON mode enabled
**Issue**: Cached summaries not showing
- **Solution**: Check `st.session_state.content_summaries` initialization
## Support
For issues or questions:
1. Check this README
2. Review agent logs in console
3. Inspect session state in Streamlit
4. Verify environment variables
5. Check OpenAI API status
## Version History
### v1.0.0 (Current)
- Initial release
- ContentSummaryAgent implementation
- Poor Sentiment Contents page integration
- Session-based caching
- Error handling and retry logic
- Comprehensive UI display