| --- |
| title: Chat with GitHub Repository |
| emoji: 🤖 |
| colorFrom: blue |
| colorTo: purple |
| sdk: gradio |
| python_version: 3.11 |
| sdk_version: 4.44.0 |
| app_file: app.py |
| pinned: false |
| --- |
| |
| # 🤖 Chat with GitHub Repository |
|
|
| An AI-powered application that analyzes any public GitHub repository and lets you ask questions about its codebase in natural language. |
|
|
| ## Features |
|
|
| - **Repository Analysis**: Clone and process any public GitHub repository |
| - **AI-Powered Chat**: Ask questions about the code using natural language |
| - **Smart Code Understanding**: Uses Sentence Transformers embeddings, stored in ChromaDB, to find the code most relevant to each question |
| - **Source References**: Get direct references to relevant code files |
| - **Multiple File Types**: Supports Python, JavaScript, TypeScript, Java, C++, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, Markdown, JSON, YAML, and more |
|
|
| ## How It Works |
|
|
| 1. **Enter Repository URL**: Paste any public GitHub repository URL |
| 2. **Processing**: The app clones the repo, extracts code files, and creates embeddings |
| 3. **Ask Questions**: Chat with the AI about the codebase using natural language |
| 4. **Get Answers**: Receive detailed answers with references to specific code files |
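
The retrieval idea behind steps 2 to 4 can be sketched in a few lines. This is a minimal illustration, not the app's actual code: the `embed` function below is a bag-of-words stand-in for the Sentence Transformers model, an in-memory dictionary stands in for ChromaDB, and the file contents and function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector (not a real model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, files: dict[str, str], k: int = 2) -> list[str]:
    """Return the paths of the k files most similar to the question."""
    q = embed(question)
    ranked = sorted(files, key=lambda path: cosine(q, embed(files[path])), reverse=True)
    return ranked[:k]


# Hypothetical repository contents, keyed by file path.
files = {
    "auth.py": "def login(user, password): check password and create session token",
    "db.py": "def connect(url): open database connection pool",
    "README.md": "Project overview and install instructions",
}

# The retrieved paths are what an answer would cite as source references.
sources = retrieve("How does login check the password?", files, k=1)
print(sources)
```

In the real pipeline the retrieved snippets would be passed to the LLM together with the question, and the file paths returned as source references.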
|
|
| ## 💡 Example Questions |
|
|
| - "What is this project about?" |
| - "How is the code structured?" |
| - "What are the main functions/classes?" |
| - "How does authentication work?" |
| - "What dependencies does this project use?" |
| - "Are there any tests in this codebase?" |
| - "How is error handling implemented?" |
| - "What are the main API endpoints?" |
|
|
| ## 🛠️ Technology Stack |
|
|
| - **Frontend**: Gradio for the user interface |
| - **AI/ML**: Groq API for fast LLM inference, Sentence Transformers for embeddings |
| - **Vector Database**: ChromaDB for storing code embeddings |
| - **Code Processing**: GitPython for repository cloning |
| - **Language Models**: Groq API with DeepSeek-R1-Distill-Llama-70B |
|
|
| ## Supported File Types |
|
|
| The application processes the following file types: |
| - **Programming Languages**: `.py`, `.js`, `.ts`, `.jsx`, `.tsx`, `.java`, `.cpp`, `.c`, `.cs`, `.go`, `.rs`, `.php`, `.rb`, `.swift`, `.kt`, `.scala` |
| - **Configuration**: `.json`, `.yaml`, `.yml`, `.toml` |
| - **Documentation**: `.md`, `.txt` |
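
As a rough sketch of how such a filter could look, the snippet below walks a cloned repository and keeps only files with the extensions listed above that are at most 1 MB (the limit mentioned under Usage Tips). The names `SUPPORTED_EXTENSIONS`, `MAX_FILE_BYTES`, and `collect_files` are hypothetical, and both the 1 MiB interpretation and the skipping of `.git` are assumptions rather than confirmed app behavior.

```python
from pathlib import Path

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = {
    ".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".cpp", ".c", ".cs",
    ".go", ".rs", ".php", ".rb", ".swift", ".kt", ".scala",
    ".json", ".yaml", ".yml", ".toml",
    ".md", ".txt",
}

# Assumption: "1MB" is read as 1 MiB (1,048,576 bytes).
MAX_FILE_BYTES = 1024 * 1024


def is_supported(path: Path) -> bool:
    """True if the file has a supported extension and is within the size limit."""
    return (
        path.suffix.lower() in SUPPORTED_EXTENSIONS
        and path.stat().st_size <= MAX_FILE_BYTES
    )


def collect_files(repo_root: Path) -> list[Path]:
    """Return supported files under repo_root in sorted order, skipping .git (an assumption)."""
    return sorted(
        p
        for p in repo_root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(repo_root).parts
        and is_supported(p)
    )
```

Calling `collect_files(Path("path/to/clone"))` would return the files that are candidates for embedding.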
|
|
| ## 🔧 Configuration |
|
|
| The app uses the Groq API for LLM inference and Sentence Transformers for embeddings. To get started: |
|
|
| 1. Get a free API key from [Groq](https://console.groq.com/) |
| 2. Set your API key in the environment variable `GROQ_API_KEY` |
| 3. The app is pre-configured to use the `deepseek-r1-distill-llama-70b` model |
|
|
| No additional setup is required beyond adding your Groq API key. |
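
A minimal sketch of reading the key at startup, assuming a hypothetical helper named `get_groq_api_key` (the app's own code may differ). Failing early with a clear message avoids a confusing error on the first chat request.

```python
import os


def get_groq_api_key() -> str:
    """Return the Groq API key from the environment, or fail with a clear message."""
    key = os.environ.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; create a key at https://console.groq.com/"
        )
    return key
```

Locally, the variable can be exported in the shell before launching `app.py`. On Hugging Face Spaces, it can be added as a secret in the Space settings, which exposes it to the app as an environment variable.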
|
|
| ## Usage Tips |
|
|
| - **Repository Size**: Works best with small to medium-sized repositories |
| - **Processing Time**: Larger repositories may take longer to process |
| - **Question Quality**: More specific questions tend to get better answers |
| - **File Limits**: Files larger than 1 MB are skipped to keep processing fast |
|
|
| ## 🤝 Contributing |
|
|
| This project is open source, and contributions are welcome. Feel free to: |
| - Report bugs |
| - Suggest new features |
| - Submit pull requests |
| - Improve documentation |
|
|
| ## License |
|
|
| This project is licensed under the MIT License. |
|
|
| --- |
|
|
| **Note**: This application processes public repositories only. Private repositories require authentication tokens and are therefore not supported. |
|
|