---
title: Chat with GitHub Repository
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
python_version: 3.11
sdk_version: 4.44.0
app_file: app.py
pinned: false
---

# 🤖 Chat with GitHub Repository

An AI-powered application that lets you analyze any GitHub repository and ask questions about the codebase in natural language!

## 🌟 Features

- **Repository Analysis**: Clone and process any public GitHub repository
- **AI-Powered Chat**: Ask questions about the code using natural language
- **Smart Code Understanding**: Uses embeddings to capture code structure and context
- **Source References**: Get direct references to relevant code files
- **Multiple File Types**: Supports Python, JavaScript, TypeScript, Java, C++, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, Markdown, JSON, YAML, and more

## 🚀 How It Works

1. **Enter Repository URL**: Paste any public GitHub repository URL
2. **Processing**: The app clones the repo, extracts code files, and creates embeddings
3. **Ask Questions**: Chat with the AI about the codebase using natural language
4. **Get Answers**: Receive detailed answers with references to specific code files

## 💡 Example Questions

- "What is this project about?"
- "How is the code structured?"
- "What are the main functions/classes?"
- "How does authentication work?"
- "What dependencies does this project use?"
- "Are there any tests in this codebase?"
- "How is error handling implemented?"
- "What are the main API endpoints?"
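As an illustration of the processing step (step 2 above), here is a minimal, stdlib-only sketch of how candidate files might be collected before embedding. The `collect_files` helper, the extension set, and the size cutoff mirror the limits described in this README, but they are an assumption about the app's internals, not its actual code.

```python
from pathlib import Path

# Extensions the app indexes, per the Supported File Types section.
CODE_EXTS = {
    ".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".cpp", ".c", ".cs",
    ".go", ".rs", ".php", ".rb", ".swift", ".kt", ".scala",
    ".json", ".yaml", ".yml", ".toml", ".md", ".txt",
}
MAX_BYTES = 1_000_000  # files larger than ~1 MB are skipped


def collect_files(repo_dir: str) -> list[Path]:
    """Return indexable files under repo_dir, skipping .git and oversized files."""
    files = []
    for path in Path(repo_dir).rglob("*"):
        if ".git" in path.parts or not path.is_file():
            continue
        if path.suffix.lower() in CODE_EXTS and path.stat().st_size <= MAX_BYTES:
            files.append(path)
    return files
```

Each collected file would then be chunked and embedded before being stored in the vector database.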
## 🛠️ Technology Stack

- **Frontend**: Gradio for the user interface
- **AI/ML**: Groq API for fast LLM inference, Sentence Transformers for embeddings
- **Vector Database**: ChromaDB for storing code embeddings
- **Code Processing**: GitPython for repository cloning
- **Language Models**: Groq API with DeepSeek-R1-Distill-Llama-70B

## 📁 Supported File Types

The application processes the following file types:

- **Programming Languages**: `.py`, `.js`, `.ts`, `.jsx`, `.tsx`, `.java`, `.cpp`, `.c`, `.cs`, `.go`, `.rs`, `.php`, `.rb`, `.swift`, `.kt`, `.scala`
- **Configuration**: `.json`, `.yaml`, `.yml`, `.toml`
- **Documentation**: `.md`, `.txt`

## 🔧 Configuration

The app uses the Groq API for LLM inference and Sentence Transformers for embeddings. To get started:

1. Get a free API key from [Groq](https://console.groq.com/)
2. Set the `GROQ_API_KEY` environment variable to your key
3. The app is pre-configured to use the `deepseek-r1-distill-llama-70b` model

No additional setup is required - just add your Groq API key!

## 📝 Usage Tips

- **Repository Size**: Works best with small to medium-sized repositories
- **Processing Time**: Larger repositories take longer to process
- **Question Quality**: More specific questions tend to get better answers
- **File Limits**: Files larger than 1 MB are skipped to keep performance acceptable

## 🤝 Contributing

This project is open source and contributions are welcome! Feel free to:

- Report bugs
- Suggest new features
- Submit pull requests
- Improve documentation

## 📄 License

This project is licensed under the MIT License.

---

**Note**: This application processes public repositories only. Private repositories require authentication tokens.
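For readers curious how the chat step described above fits together: before the Groq model is called, retrieved code snippets are typically combined with the user's question into a single prompt. The following is a hypothetical, dependency-free sketch of that prompt-assembly step; `build_prompt` and the `(filename, snippet)` pair format are illustrative assumptions, not the app's actual code.

```python
def build_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    """Join retrieved (filename, snippet) pairs into a context block,
    then append the user's question for the LLM."""
    context = "\n\n".join(f"### {name}\n{snippet}" for name, snippet in chunks)
    return (
        "Answer the question using only the repository context below. "
        "Cite file names in your answer.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The resulting string would be sent as a chat message to the configured `deepseek-r1-distill-llama-70b` model, which is why answers can cite specific files.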