---
title: NBA Analysis
emoji: 🔥
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
---

๐Ÿ€ NBA Data Analysis with CrewAI

An intelligent NBA data analysis application powered by CrewAI multi-agent framework. Upload your NBA CSV data and get comprehensive analysis with insights, statistics, and engaging storylines generated by AI agents.

## ✨ Features

- 🤖 **Multi-Agent AI System**: Three specialized agents (Engineer, Analyst, Storyteller) work together
- 📊 **Data Engineering**: Automatic data cleaning and preparation
- 🔍 **Intelligent Analysis**: AI-powered insights and pattern detection
- 📈 **Statistical Analysis**: Top performers, trends, and key metrics
- 🔎 **Semantic Search**: Natural language queries on your data using vector embeddings
- 📝 **Storytelling**: Engaging headlines and narratives from data
- 🎯 **Parallel Processing**: Tasks run in parallel for faster results
- 🌐 **Web Interface**: Easy-to-use Gradio web app
- 🆓 **Free & Open Source**: Uses free-tier open-source LLM models

๐Ÿ—๏ธ Architecture

The application uses a multi-agent system with the following components:

  • Data Engineer Agent: Processes and validates data
  • Data Analyst Agent: Performs statistical analysis and extracts insights
  • Storyteller Agent: Creates engaging narratives from analysis results

### Tech Stack

- **CrewAI**: Multi-agent AI framework
- **Gradio**: Web interface
- **Pandas**: Data analysis
- **ChromaDB**: Vector database for semantic search
- **Sentence Transformers**: Embeddings for semantic search
- **Hugging Face / Ollama**: Open-source LLM providers

## 📋 Prerequisites

- Python 3.11 or 3.12
- pip or uv package manager
- (Optional) Ollama for local testing

## 🚀 Installation

### 1. Clone the Repository

```bash
git clone <your-repo-url>
cd NBA_Analysis
```

### 2. Install Dependencies

Using uv (recommended):

```bash
uv sync
```

Using pip:

```bash
pip install -r requirements.txt
```

### 3. Prepare Your Data

Place your NBA CSV file in the project directory, or upload it through the web interface.

โš™๏ธ Configuration

LLM Provider Setup

The application supports multiple LLM providers. Configure via environment variables:

Option 1: Hugging Face (Recommended for Deployment)

  1. Get a free API token from Hugging Face
  2. Set environment variables:
    export LLM_PROVIDER=huggingface
    export HF_API_KEY=your-hf-token
    export HF_MODEL=meta-llama/Llama-3.1-8B-Instruct  # or any HF model
    

Available Models:

  • meta-llama/Llama-3.1-8B-Instruct (default, best quality)
  • mistralai/Mistral-7B-Instruct-v0.2 (excellent quality)
  • Qwen/Qwen2.5-7B-Instruct (multilingual, great quality)
  • meta-llama/Llama-3.2-3B-Instruct (faster, smaller)

#### Option 2: Ollama (For Local Testing)

1. Install Ollama: https://ollama.ai
2. Start the Ollama service:

   ```bash
   ollama serve
   ```

3. Download a model:

   ```bash
   ollama pull mistral  # or llama3.2, qwen2.5:7b, etc.
   ```

4. Set environment variables:

   ```bash
   export LLM_PROVIDER=ollama
   export OLLAMA_MODEL=mistral
   export OLLAMA_BASE_URL=http://localhost:11434/v1
   ```

#### Option 3: OpenRouter (Alternative Free Option)

1. Get a free API key from OpenRouter
2. Set environment variables:

   ```bash
   export LLM_PROVIDER=openrouter
   export OPENROUTER_API_KEY=your-key
   export OPENROUTER_MODEL=google/gemma-2-2b-it:free
   ```

### Default Configuration

The application defaults to Hugging Face with the Llama 3.1 8B Instruct model, so no further configuration is needed once `HF_API_KEY` is set.
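
The provider selection above boils down to reading a few environment variables with sensible fallbacks. A sketch of how `config.py` might resolve them (the function name and returned dict shape are illustrative; the variable names and defaults match the table below):

```python
# Sketch of env-var-driven provider resolution. Variable names and defaults
# match this README; the function name and dict layout are illustrative.
import os

def resolve_llm_config() -> dict:
    provider = os.getenv("LLM_PROVIDER", "huggingface")
    if provider == "huggingface":
        return {
            "provider": provider,
            "model": os.getenv("HF_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
            "api_key": os.getenv("HF_API_KEY"),  # required for this provider
        }
    if provider == "ollama":
        return {
            "provider": provider,
            "model": os.getenv("OLLAMA_MODEL", "mistral"),
            "base_url": os.getenv("OLLAMA_BASE_URL", "http://localhost:11434/v1"),
        }
    if provider == "openrouter":
        return {
            "provider": provider,
            "model": os.getenv("OPENROUTER_MODEL", "google/gemma-2-2b-it:free"),
            "api_key": os.getenv("OPENROUTER_API_KEY"),  # required for this provider
        }
    raise ValueError(f"Unknown LLM_PROVIDER: {provider}")
```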

## 🎮 Usage

### Web Interface (Recommended)

```bash
python app.py
```

Then open your browser to the URL shown (usually http://localhost:7860).

Features:

- Upload a CSV file
- Enter an analysis query (or leave it blank for a comprehensive analysis)
- Click "Analyze Dataset" for a full analysis
- Click "Analyze with Question" for quick queries

### Command Line

```bash
python main.py
```

## 📖 Example Queries

- "Who are the top 5 three-point shooters?"
- "Show me the best scoring games this season"
- "Which players have the highest field goal percentage?"
- "Analyze team performance trends"
- "Find games with triple doubles"
- "What are the most efficient shooters?"

๐Ÿ› ๏ธ Project Structure

NBA_Analysis/
โ”œโ”€โ”€ app.py                 # Gradio web interface
โ”œโ”€โ”€ main.py                # Command-line entry point
โ”œโ”€โ”€ config.py              # LLM and configuration settings
โ”œโ”€โ”€ agents.py              # AI agent definitions
โ”œโ”€โ”€ crew.py                # CrewAI crew orchestration
โ”œโ”€โ”€ tasks.py               # Task definitions
โ”œโ”€โ”€ tools.py               # Data access tools for agents
โ”œโ”€โ”€ vector_db.py           # Vector database for semantic search
โ”œโ”€โ”€ requirements.txt       # Python dependencies
โ”œโ”€โ”€ pyproject.toml        # Project configuration
โ”œโ”€โ”€ test_local.sh          # Script for local testing with Ollama
โ”œโ”€โ”€ EXECUTION_FLOW.md      # Detailed execution flow documentation
โ””โ”€โ”€ README.md              # This file

## 🔧 Available Tools

The agents have access to five data tools:

1. `read_nba_data`: Read sample rows to understand the structure
2. `search_nba_data`: Filter and search the CSV data
3. `get_nba_data_summary`: Get a comprehensive dataset overview
4. `semantic_search_nba_data`: Natural language semantic search
5. `analyze_nba_data`: Execute pandas operations for advanced analysis
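
At their core, tools like `get_nba_data_summary` are thin pandas wrappers that return compact, agent-readable overviews. A hedged sketch of the idea (the function body, DataFrame, and column names here are invented for illustration, not the app's actual `tools.py`):

```python
# Toy sketch of a summary tool built on pandas. The DataFrame and column
# names are invented for illustration; the real tool reads the uploaded CSV.
import pandas as pd

def get_data_summary(df: pd.DataFrame) -> dict:
    """Return a compact overview an agent can reason over."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "numeric_means": df.select_dtypes("number").mean().round(2).to_dict(),
        "missing_values": int(df.isna().sum().sum()),
    }

games = pd.DataFrame({
    "player": ["A", "B", "C"],
    "points": [30, 22, 18],
    "assists": [5, 11, 7],
})
summary = get_data_summary(games)
```

Returning a small dict rather than the whole DataFrame keeps the tool output short enough to fit comfortably in the LLM's context.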

## 🚀 Deployment

### Hugging Face Spaces (Free)

1. **Get API keys**: Create a free Hugging Face account and generate an access token.
2. **Create a Space**: Create a new Gradio Space on Hugging Face.
3. **Set secrets**: In Space Settings → Repository secrets, add:
   - `HF_API_KEY` = your Hugging Face token
   - (Optional) `LLM_PROVIDER` = `huggingface`
   - (Optional) `HF_MODEL` = your preferred model
4. **Deploy**:

   ```bash
   git remote add hf https://huggingface.co/spaces/yourusername/nba-analysis
   git push hf main
   ```

See EXECUTION_FLOW.md for detailed deployment instructions.

## 🧪 Local Testing

### Quick Test with Ollama

```bash
# Make sure Ollama is running
ollama serve

# Run the test script
./test_local.sh
```

Or manually:

```bash
export LLM_PROVIDER=ollama
export OLLAMA_MODEL=mistral
export OLLAMA_BASE_URL=http://localhost:11434/v1
python app.py
```

## 📊 How It Works

1. **User input**: Upload a CSV and enter a query
2. **Crew creation**: Three agents are initialized with their roles
3. **Parallel execution**:
   - The Engineer validates the data
   - The Analyst performs the analysis (runs in parallel)
   - The Storyteller creates the narrative (waits for the Analyst)
4. **Tool execution**: Agents use tools to access and analyze the data
5. **LLM processing**: The AI generates insights and responses
6. **Result aggregation**: All outputs are combined and formatted
7. **Display**: Results are shown to the user

See EXECUTION_FLOW.md for detailed flow documentation.

## 🎯 Key Features Explained

### Semantic Search

Uses vector embeddings to find semantically similar records. The first run indexes the CSV; subsequent runs use the cached embeddings.
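
The app does this with Sentence Transformer embeddings stored in ChromaDB. As a dependency-free illustration of the underlying shape — embed everything, then rank by cosine similarity — here is a toy stand-in that uses bag-of-words vectors instead of learned embeddings (real embeddings capture meaning far beyond literal word overlap):

```python
# Toy stand-in for embedding-based search: bag-of-words vectors ranked by
# cosine similarity. The app itself uses Sentence Transformers + ChromaDB;
# this stdlib-only version just shows the "embed, then rank" structure.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Crude 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query: str, records: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(records, key=lambda r: cosine(q, embed(r)), reverse=True)
    return ranked[:k]

records = [
    "Curry made 9 three pointers against the Kings",
    "Jokic posted a triple double with 12 assists",
    "Thompson shot 7 of 11 from three point range",
]
top = semantic_search("best three point shooters", records)
```

Swapping `embed` for a real model and the sorted list for a ChromaDB collection query gives the production version, with the bonus that ChromaDB persists the index so only the first run pays the embedding cost.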

### Parallel Processing

The Engineer and Analyst tasks run simultaneously for faster results, while the Storyteller waits for the Analyst to complete.
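
The scheduling shape described here can be sketched with the standard library: two tasks submitted concurrently, with a third blocking only on the one it depends on. (The app expresses this with CrewAI task dependencies rather than futures; the task bodies below are placeholders.)

```python
# Stdlib sketch of the dependency structure: Engineer and Analyst run
# concurrently, Storyteller starts only once the Analyst finishes.
# Task bodies are placeholders; the app uses CrewAI tasks, not futures.
from concurrent.futures import ThreadPoolExecutor

def engineer_task() -> str:
    return "data validated"

def analyst_task() -> str:
    return "insights extracted"

def storyteller_task(analysis: str) -> str:
    return f"headline written from: {analysis}"

with ThreadPoolExecutor(max_workers=2) as pool:
    eng_future = pool.submit(engineer_task)   # runs in parallel...
    ana_future = pool.submit(analyst_task)    # ...with this one
    story = storyteller_task(ana_future.result())  # blocks on Analyst only
    validation = eng_future.result()
```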

### Multi-Agent Collaboration

Each agent has a specialized role:

- **Engineer**: Data quality and structure
- **Analyst**: Statistical analysis and insights
- **Storyteller**: Narrative and presentation

## 🔒 Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `LLM_PROVIDER` | LLM provider (`huggingface`, `ollama`, `openrouter`) | `huggingface` |
| `HF_API_KEY` | Hugging Face API token | Required if using HF |
| `HF_MODEL` | Hugging Face model name | `meta-llama/Llama-3.1-8B-Instruct` |
| `OLLAMA_MODEL` | Ollama model name | `mistral` |
| `OLLAMA_BASE_URL` | Ollama server URL | `http://localhost:11434/v1` |
| `OPENROUTER_API_KEY` | OpenRouter API key | Required if using OpenRouter |
| `OPENROUTER_MODEL` | OpenRouter model name | `google/gemma-2-2b-it:free` |

๐Ÿ› Troubleshooting

"ModuleNotFoundError: No module named 'crewai'"

  • Install dependencies: pip install -r requirements.txt or uv sync

"HF_API_KEY not set"

  • Set your Hugging Face token as environment variable or in Space secrets

"Connection refused" (Ollama)

  • Make sure ollama serve is running
  • Check port 11434 is available

"Model not found" (Ollama)

  • Download the model: ollama pull mistral
  • List models: ollama list

Slow responses

  • Use smaller models (Llama 3.2 3B instead of 8B)
  • Check your internet connection for API calls
  • For local: Use faster models like llama3.2

๐Ÿ“ License

This project is open source. Check individual dependencies for their licenses.

๐Ÿค Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📚 Documentation

See EXECUTION_FLOW.md for a detailed walkthrough of the execution flow and deployment steps.

## 🎓 What Was Built

This project demonstrates:

- Multi-agent AI systems with CrewAI
- Parallel task execution
- Semantic search with vector databases
- Integration with multiple LLM providers
- A web interface with Gradio
- Free-tier deployment on Hugging Face Spaces

## 💡 Tips

- **First run**: Vector DB indexing takes time on first use
- **Large files**: Use semantic search for large datasets
- **Complex queries**: Use "Analyze with Question" for specific queries
- **Model selection**: Larger models give better quality but slower responses
- **Local testing**: Use Ollama for faster iteration

Built with โค๏ธ using CrewAI and open-source LLMs