"""
Task definitions for NBA data analysis workflow.
"""
from crewai import Task
from config import NBA_DATA_PATH


def create_data_engineering_task(engineer_agent, csv_path: str = None) -> Task:
    """
    Create the data engineering task that inspects the dataset and confirms
    it is ready for analysis.

    Args:
        engineer_agent: The Engineer Agent to assign this task to
        csv_path: Path to CSV file (defaults to NBA_DATA_PATH from config)

    Returns:
        Task: Configured data engineering task
    """
    # Fall back to the configured default when no explicit path is given.
    data_path = csv_path or NBA_DATA_PATH

    return Task(
        description=f"""
        Quickly examine the dataset located at {data_path}.

        Your tasks (BE EFFICIENT - use tools only once):
        1. Get a brief summary of the dataset structure (use get_nba_data_summary ONCE)
        2. Note the key columns available
        3. Verify the data is ready for analysis

        IMPORTANT:
        - Use get_nba_data_summary ONCE only - it provides all needed info
        - Do NOT call read_nba_data or analyze_nba_data multiple times
        - Keep your report concise (2-3 sentences)
        - The data is already clean and ready for analysis

        Provide a brief confirmation that the dataset is loaded and ready for analysis.
        """,
        agent=engineer_agent,
        expected_output="A brief confirmation (2-3 sentences) that the dataset is loaded and ready for analysis, including key column names.",
    )


def create_data_analysis_task(analyst_agent, data_engineering_task: Task) -> Task:
    """
    Create the data analysis task for extracting insights from NBA data.

    Args:
        analyst_agent: The Analyst Agent to assign this task to
        data_engineering_task: The data engineering task for context

    Returns:
        Task: Configured data analysis task
    """
    # Plain string: the description has no placeholders, so no f-prefix is needed.
    return Task(
        description="""
        Using the cleaned NBA 2024-25 dataset, perform a comprehensive analysis.

        Your tasks:
        1. Analyze player performance metrics:
           - Top performers by points, assists, rebounds
           - Shooting efficiency analysis (FG%, 3P%, FT%)
           - Player efficiency ratings

        2. Team performance analysis:
           - Win/loss records by team
           - Team offensive and defensive statistics
           - Team performance trends

        3. Game insights:
           - High-scoring games
           - Close games vs blowouts
           - Performance by date/period

        4. Identify key patterns and trends:
           - Best performing players
           - Most efficient teams
           - Statistical outliers

        5. Provide actionable insights and recommendations

        Create a comprehensive analysis report with key findings and insights.
        """,
        agent=analyst_agent,
        expected_output="A detailed analysis report with key insights, statistical findings, top performers, team analysis, and actionable recommendations based on the NBA 2024-25 data.",
        context=[data_engineering_task],
    )


def create_custom_analysis_task(
    analyst_agent,
    user_query: str,
    data_engineering_task: Task = None,
    csv_path: str = None,
) -> Task:
    """
    Create a custom data analysis task based on user input.

    Args:
        analyst_agent: The Analyst Agent to assign this task to
        user_query: The user's custom analysis query/task
        data_engineering_task: The data engineering task for context (optional for parallel execution)
        csv_path: Path to CSV file (for reference in description)

    Returns:
        Task: Configured custom analysis task
    """
    data_path = csv_path or NBA_DATA_PATH
    # Only pass context when an upstream task was supplied.
    context = [data_engineering_task] if data_engineering_task else []

    return Task(
        description=f"""
        Using the dataset located at {data_path}, perform the following analysis as requested by the user:

        {user_query}

        IMPORTANT INSTRUCTIONS:
        1. For queries requiring aggregations (sum, count, average, top N, etc.), you MUST use the 'analyze_nba_data' tool.
        2. The 'analyze_nba_data' tool allows you to execute pandas code for grouping, aggregating, sorting, and filtering.
        3. Examples of when to use 'analyze_nba_data':
           - Finding top players by statistics (e.g., "top 5 three-point shooters")
           - Calculating totals or averages per player/team
           - Grouping and aggregating data
           - Statistical analysis requiring groupby operations
        4. Use 'semantic_search_nba_data' only for finding specific game records or examples, NOT for aggregations.
        5. Plan your analysis: First understand what data you need, then use the appropriate tool to get aggregated results.

        Steps to follow:
        1. If the query asks for "top N" or aggregations, use analyze_nba_data with pandas groupby operations
        2. For "top 5 three-point shooters": group by Player, sum the '3P' column, sort descending, take top 5
        3. Present the results clearly with player names and their statistics

        Provide a clear, comprehensive answer with relevant statistics, insights, and any supporting data from the dataset.
        """,
        agent=analyst_agent,
        expected_output="A detailed analysis report addressing the user's query with relevant insights, statistics, and findings from the data.",
        context=context,
    )


def create_storyteller_task(storyteller_agent, analysis_task: Task) -> Task:
    """
    Create a storyteller task that creates headlines and storylines from the analysis results.

    Args:
        storyteller_agent: The Storyteller Agent to assign this task to
        analysis_task: The analysis task whose output will be used to create headlines and content

    Returns:
        Task: Configured storyteller task
    """
    return Task(
        description="""
        Review the data analysis results and create engaging headlines and compelling storylines that bring the data to life.

        Your tasks:
        1. Read and understand the analysis results thoroughly
        2. Identify the most important and interesting findings
        3. Create 3-5 compelling headlines that:
           - Are catchy and attention-grabbing
           - Accurately reflect the key insights
           - Use engaging sports journalism language
           - Are suitable for display to users
        4. Write engaging storylines/content for each headline that:
           - Tells a story about the findings
           - Provides context and narrative around the statistics
           - Makes the data come alive with compelling prose
           - Explains why these insights matter
           - Uses vivid language and storytelling techniques
           - Is 2-3 paragraphs per storyline (enough to be engaging but concise)
        5. Format your output as follows:

        HEADLINES:
        [List of 3-5 headlines, one per line]

        STORYLINES:
        [For each headline, write 2-3 paragraphs of engaging content that tells the story behind the data]

        Make both the headlines and storylines exciting, memorable, and true to the data insights.
        Write like a sports journalist who knows how to make statistics compelling and human.
        """,
        agent=storyteller_agent,
        expected_output="A formatted output with 3-5 engaging headlines followed by detailed storylines (2-3 paragraphs each) that bring the data analysis to life with compelling narrative and context.",
        context=[analysis_task],
    )
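

# ---------------------------------------------------------------------------
# Illustrative sketch only; not part of the original task definitions.
#
# The custom-analysis prompt above describes the "top 5 three-point shooters"
# recipe in words: group by Player, sum the '3P' column, sort descending, and
# take the top 5. The helper below is one minimal way that recipe could be
# written in pandas. The function name and its DataFrame argument are
# hypothetical; only the 'Player' and '3P' column names come from the prompt
# text, and the real dataset's schema is assumed to match them.
# ---------------------------------------------------------------------------
import pandas as pd


def top_three_point_shooters(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """Return the n players with the most made three-pointers (assumed '3P' column)."""
    # Sum made threes per player across all of that player's game rows.
    totals = df.groupby("Player")["3P"].sum()
    # Highest totals first, then keep only the top n players.
    return totals.sort_values(ascending=False).head(n)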