---
title: NBA Analysis
emoji: 🔥
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
---

# 🏀 NBA Data Analysis with CrewAI

An intelligent NBA data analysis application powered by the CrewAI multi-agent framework. Upload your NBA CSV data and get comprehensive analysis with insights, statistics, and engaging storylines generated by AI agents.

## ✨ Features

- 🤖 **Multi-Agent AI System**: Three specialized agents (Engineer, Analyst, Storyteller) work together
- 📊 **Data Engineering**: Automatic data cleaning and preparation
- 🔍 **Intelligent Analysis**: AI-powered insights and pattern detection
- 📈 **Statistical Analysis**: Top performers, trends, and key metrics
- 🔎 **Semantic Search**: Natural-language queries on your data using vector embeddings
- 📝 **Storytelling**: Engaging headlines and narratives from data
- 🎯 **Parallel Processing**: Tasks run in parallel for faster results
- 🌐 **Web Interface**: Easy-to-use Gradio web app
- 🆓 **Free & Open Source**: Uses free-tier open-source LLM models

## 🏗️ Architecture

The application uses a multi-agent system with the following components:

- **Data Engineer Agent**: Processes and validates data
- **Data Analyst Agent**: Performs statistical analysis and extracts insights
- **Storyteller Agent**: Creates engaging narratives from analysis results

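
To make the division of labor concrete, here is a minimal sketch of how these three roles could be declared. The field names (`role`, `goal`, `backstory`) mirror the shape of a CrewAI agent definition, but the specific goals and backstories below are illustrative assumptions, not the actual contents of `agents.py`.

```python
from dataclasses import dataclass

# Hypothetical role specs; the real agents.py presumably builds
# crewai.Agent objects with similar fields.
@dataclass(frozen=True)
class AgentSpec:
    role: str
    goal: str
    backstory: str

ENGINEER = AgentSpec(
    role="Data Engineer",
    goal="Validate and clean the uploaded NBA CSV",
    backstory="Meticulous about schema, types, and missing values.",
)
ANALYST = AgentSpec(
    role="Data Analyst",
    goal="Extract statistics, trends, and top performers",
    backstory="Finds the numbers behind the narrative.",
)
STORYTELLER = AgentSpec(
    role="Storyteller",
    goal="Turn the analysis into headlines and narratives",
    backstory="Writes engaging copy grounded in the stats.",
)

CREW = [ENGINEER, ANALYST, STORYTELLER]
```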
### Tech Stack

- **CrewAI**: Multi-agent AI framework
- **Gradio**: Web interface
- **Pandas**: Data analysis
- **ChromaDB**: Vector database for semantic search
- **Sentence Transformers**: Embeddings for semantic search
- **Hugging Face / Ollama**: Open-source LLM providers

## 📋 Prerequisites

- Python 3.11 or 3.12
- pip or uv package manager
- (Optional) Ollama for local testing

## 🚀 Installation

### 1. Clone the Repository

```bash
git clone <your-repo-url>
cd NBA_Analysis
```

### 2. Install Dependencies

**Using uv (recommended):**

```bash
uv sync
```

**Using pip:**

```bash
pip install -r requirements.txt
```

### 3. Prepare Your Data

Place your NBA CSV file in the project directory, or upload it through the web interface.

## ⚙️ Configuration

### LLM Provider Setup

The application supports multiple LLM providers. Configure via environment variables:

#### Option 1: Hugging Face (Recommended for Deployment)

1. Get a free API token from [Hugging Face](https://huggingface.co/settings/tokens)
2. Set environment variables:

```bash
export LLM_PROVIDER=huggingface
export HF_API_KEY=your-hf-token
export HF_MODEL=meta-llama/Llama-3.1-8B-Instruct  # or any HF model
```

**Available Models:**

- `meta-llama/Llama-3.1-8B-Instruct` (default, best quality)
- `mistralai/Mistral-7B-Instruct-v0.2` (excellent quality)
- `Qwen/Qwen2.5-7B-Instruct` (multilingual, great quality)
- `meta-llama/Llama-3.2-3B-Instruct` (faster, smaller)

#### Option 2: Ollama (For Local Testing)

1. Install Ollama: https://ollama.ai
2. Start the Ollama service:

```bash
ollama serve
```

3. Download a model:

```bash
ollama pull mistral  # or llama3.2, qwen2.5:7b, etc.
```

4. Set environment variables:

```bash
export LLM_PROVIDER=ollama
export OLLAMA_MODEL=mistral
export OLLAMA_BASE_URL=http://localhost:11434/v1
```

#### Option 3: OpenRouter (Alternative Free Option)

1. Get a free API key from [OpenRouter](https://openrouter.ai)
2. Set environment variables:

```bash
export LLM_PROVIDER=openrouter
export OPENROUTER_API_KEY=your-key
export OPENROUTER_MODEL=google/gemma-2-2b-it:free
```

### Default Configuration

The application defaults to **Hugging Face** with the **Llama 3.1 8B Instruct** model, so once `HF_API_KEY` is set, no further configuration is needed.

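
The provider-selection logic can be sketched as a small lookup over these environment variables. The variable names and defaults below come from this README; the function itself is an assumption about how `config.py` might resolve them, not its actual code.

```python
# Hedged sketch of provider/model resolution from environment variables.
# Defaults mirror the README's documented defaults.
DEFAULTS = {
    "huggingface": "meta-llama/Llama-3.1-8B-Instruct",
    "ollama": "mistral",
    "openrouter": "google/gemma-2-2b-it:free",
}

MODEL_VARS = {
    "huggingface": "HF_MODEL",
    "ollama": "OLLAMA_MODEL",
    "openrouter": "OPENROUTER_MODEL",
}

def resolve_llm(env: dict) -> tuple[str, str]:
    """Return (provider, model) from an environment-style mapping."""
    provider = env.get("LLM_PROVIDER", "huggingface")
    model = env.get(MODEL_VARS[provider], DEFAULTS[provider])
    return provider, model
```

With no variables set, this falls through to the documented defaults; setting `LLM_PROVIDER=ollama` switches both the provider and the model default in one step.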
## 🎮 Usage

### Web Interface (Recommended)

```bash
python app.py
```

Then open your browser to the URL shown (usually `http://localhost:7860`).

**Features:**

- Upload a CSV file
- Enter an analysis query (or leave blank for a comprehensive analysis)
- Click "Analyze Dataset" for full analysis
- Click "Analyze with Question" for quick queries

### Command Line

```bash
python main.py
```

## 📖 Example Queries

- "Who are the top 5 three-point shooters?"
- "Show me the best scoring games this season"
- "Which players have the highest field goal percentage?"
- "Analyze team performance trends"
- "Find games with triple-doubles"
- "What are the most efficient shooters?"

## 🛠️ Project Structure

```
NBA_Analysis/
├── app.py              # Gradio web interface
├── main.py             # Command-line entry point
├── config.py           # LLM and configuration settings
├── agents.py           # AI agent definitions
├── crew.py             # CrewAI crew orchestration
├── tasks.py            # Task definitions
├── tools.py            # Data access tools for agents
├── vector_db.py        # Vector database for semantic search
├── requirements.txt    # Python dependencies
├── pyproject.toml      # Project configuration
├── test_local.sh       # Script for local testing with Ollama
├── EXECUTION_FLOW.md   # Detailed execution flow documentation
└── README.md           # This file
```

## 🔧 Available Tools

The agents have access to 5 data tools:

1. **read_nba_data**: Read sample rows to understand structure
2. **search_nba_data**: Filter and search CSV data
3. **get_nba_data_summary**: Get a comprehensive dataset overview
4. **semantic_search_nba_data**: Natural-language semantic search
5. **analyze_nba_data**: Execute pandas operations for advanced analysis

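
As an illustration of what a tool like `get_nba_data_summary` returns conceptually, here is a stdlib-only sketch. The real `tools.py` presumably uses pandas and exposes these functions to the agents via CrewAI's tool mechanism; the function below is a simplified stand-in, and `SAMPLE` is made-up data.

```python
import csv
import io

def summarize_csv(text: str) -> dict:
    """Toy dataset overview: row count and column names.
    Stand-in for a summary tool; not the real implementation."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    return {"rows": len(rows), "columns": columns}

# Hypothetical sample data for illustration only.
SAMPLE = "player,pts,ast\nCurry,30,6\nJokic,25,9\n"
```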
## 🚀 Deployment

### Hugging Face Spaces (Free)

1. **Get API Keys:**
   - Hugging Face token: https://huggingface.co/settings/tokens
   - (Optional) OpenRouter key: https://openrouter.ai
2. **Create Space:**
   - Go to https://huggingface.co/spaces
   - Create a new Space with the Gradio SDK
   - Push your code
3. **Set Secrets:**
   - Space Settings → Repository secrets
   - Add `HF_API_KEY` = your Hugging Face token
   - (Optional) Add `LLM_PROVIDER` = `huggingface`
   - (Optional) Add `HF_MODEL` = your preferred model
4. **Deploy:**

```bash
git remote add hf https://huggingface.co/spaces/yourusername/nba-analysis
git push hf main
```

See `EXECUTION_FLOW.md` for detailed deployment instructions.

## 🧪 Local Testing

### Quick Test with Ollama

```bash
# Make sure Ollama is running
ollama serve

# Run the test script
./test_local.sh
```

Or manually:

```bash
export LLM_PROVIDER=ollama
export OLLAMA_MODEL=mistral
export OLLAMA_BASE_URL=http://localhost:11434/v1
python app.py
```

## 📊 How It Works

1. **User Input**: Upload CSV + enter query
2. **Crew Creation**: Three agents are initialized with their roles
3. **Parallel Execution**:
   - Engineer validates data
   - Analyst performs analysis (runs in parallel)
   - Storyteller creates narrative (waits for Analyst)
4. **Tool Execution**: Agents use tools to access and analyze data
5. **LLM Processing**: AI generates insights and responses
6. **Result Aggregation**: All outputs are combined and formatted
7. **Display**: Results shown to user

See `EXECUTION_FLOW.md` for detailed flow documentation.

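
The task graph described above (Engineer ∥ Analyst, then Storyteller after the Analyst) can be sketched with `concurrent.futures`. The function bodies are placeholders, not the real agents; only the dependency structure is taken from this README.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder "agents"; the real app delegates these steps to
# LLM-backed CrewAI agents.
def engineer(data: str) -> str:
    return f"validated:{data}"

def analyst(data: str) -> str:
    return f"insights:{data}"

def storyteller(insights: str) -> str:
    return f"story({insights})"

def run_crew(data: str) -> tuple:
    with ThreadPoolExecutor() as pool:
        eng = pool.submit(engineer, data)    # runs concurrently...
        ana = pool.submit(analyst, data)     # ...with this task
        story = storyteller(ana.result())    # blocks on the Analyst only
        return eng.result(), ana.result(), story
```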
## 🎯 Key Features Explained

### Semantic Search

Uses vector embeddings to find semantically similar records. The first run indexes the CSV; subsequent runs reuse the cached embeddings.

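
The core "embed, then rank by similarity" idea looks like this toy sketch, which uses bag-of-words vectors so it stays self-contained. The real app uses Sentence Transformers embeddings stored in ChromaDB, which capture meaning rather than exact word overlap.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (the real app uses
    Sentence Transformers vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def search(query: str, docs: list[str]) -> str:
    """Return the document most similar to the query."""
    q = vectorize(query)
    return max(docs, key=lambda d: cosine(q, vectorize(d)))
```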
### Parallel Processing

Engineer and Analyst tasks run simultaneously for faster results. The Storyteller waits for the Analyst to complete.

### Multi-Agent Collaboration

Each agent has a specialized role:

- **Engineer**: Data quality and structure
- **Analyst**: Statistical analysis and insights
- **Storyteller**: Narrative and presentation

## 🔒 Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `LLM_PROVIDER` | LLM provider (`huggingface`, `ollama`, `openrouter`) | `huggingface` |
| `HF_API_KEY` | Hugging Face API token | Required if using HF |
| `HF_MODEL` | Hugging Face model name | `meta-llama/Llama-3.1-8B-Instruct` |
| `OLLAMA_MODEL` | Ollama model name | `mistral` |
| `OLLAMA_BASE_URL` | Ollama server URL | `http://localhost:11434/v1` |
| `OPENROUTER_API_KEY` | OpenRouter API key | Required if using OpenRouter |
| `OPENROUTER_MODEL` | OpenRouter model name | `google/gemma-2-2b-it:free` |

## 🐛 Troubleshooting

### "ModuleNotFoundError: No module named 'crewai'"

- Install dependencies: `pip install -r requirements.txt` or `uv sync`

### "HF_API_KEY not set"

- Set your Hugging Face token as an environment variable or in Space secrets

### "Connection refused" (Ollama)

- Make sure `ollama serve` is running
- Check that port 11434 is available

### "Model not found" (Ollama)

- Download the model: `ollama pull mistral`
- List installed models: `ollama list`

### Slow responses

- Use smaller models (Llama 3.2 3B instead of 8B)
- Check your internet connection for API calls
- For local runs: use faster models like `llama3.2`

## 📝 License

This project is open source. Check individual dependencies for their licenses.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📚 Documentation

- **Execution Flow**: See `EXECUTION_FLOW.md` for the detailed flow
- **CrewAI Docs**: https://docs.crewai.com
- **Gradio Docs**: https://gradio.app/docs

## 🎓 What Was Built

This project demonstrates:

- Multi-agent AI systems with CrewAI
- Parallel task execution
- Semantic search with vector databases
- Integration with multiple LLM providers
- A web interface with Gradio
- Free-tier deployment on Hugging Face Spaces

## 💡 Tips

- **First Run**: Vector DB indexing takes time on first use
- **Large Files**: Use semantic search for large datasets
- **Complex Queries**: Use "Analyze with Question" for specific queries
- **Model Selection**: Larger models = better quality, slower speed
- **Local Testing**: Use Ollama for faster iteration

## 🔗 Links

- **Hugging Face**: https://huggingface.co
- **Ollama**: https://ollama.ai
- **OpenRouter**: https://openrouter.ai
- **CrewAI**: https://docs.crewai.com

---

**Built with ❤️ using CrewAI and open-source LLMs**