---
title: Chat with GitHub Repository
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: gradio
python_version: 3.11
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
# πŸ€– Chat with GitHub Repository
An AI-powered application that lets you analyze any public GitHub repository and ask questions about its codebase in natural language!
## 🌟 Features
- **Repository Analysis**: Clone and process any public GitHub repository
- **AI-Powered Chat**: Ask questions about the code using natural language
- **Smart Code Understanding**: Uses advanced embeddings to understand code structure and context
- **Source References**: Get direct references to relevant code files
- **Multiple File Types**: Supports Python, JavaScript, TypeScript, Java, C++, Go, Rust, PHP, Ruby, Swift, Kotlin, Scala, Markdown, JSON, YAML, and more
## πŸš€ How It Works
1. **Enter Repository URL**: Paste any public GitHub repository URL
2. **Processing**: The app clones the repo, extracts code files, and creates embeddings
3. **Ask Questions**: Chat with the AI about the codebase using natural language
4. **Get Answers**: Receive detailed answers with references to specific code files
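The processing step above can be sketched roughly as follows. This is a minimal illustration, not the app's actual implementation: the chunk size and overlap values are assumptions, and a real pipeline would embed each chunk (e.g. with Sentence Transformers) and store it in ChromaDB alongside its source file path.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split a file's contents into overlapping chunks for embedding.

    chunk_size/overlap are illustrative defaults, not the app's real settings.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back by the overlap amount
    return chunks
```

Overlapping chunks help preserve context that would otherwise be cut at chunk boundaries, which tends to improve retrieval quality.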
## πŸ’‘ Example Questions
- "What is this project about?"
- "How is the code structured?"
- "What are the main functions/classes?"
- "How does authentication work?"
- "What dependencies does this project use?"
- "Are there any tests in this codebase?"
- "How is error handling implemented?"
- "What are the main API endpoints?"
## πŸ› οΈ Technology Stack
- **Frontend**: Gradio for the user interface
- **AI/ML**: Groq API for fast LLM inference, Sentence Transformers for embeddings
- **Vector Database**: ChromaDB for storing code embeddings
- **Code Processing**: GitPython for repository cloning
- **Language Models**: Groq API with DeepSeek-R1-Distill-Llama-70B
## πŸ“ Supported File Types
The application processes the following file types:
- **Programming Languages**: `.py`, `.js`, `.ts`, `.jsx`, `.tsx`, `.java`, `.cpp`, `.c`, `.cs`, `.go`, `.rs`, `.php`, `.rb`, `.swift`, `.kt`, `.scala`
- **Configuration**: `.json`, `.yaml`, `.yml`, `.toml`
- **Documentation**: `.md`, `.txt`
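A filter matching the extension lists above might look like this. The extension set mirrors the README, but the helper function itself is an illustrative sketch, not the app's actual code:

```python
import os

# Extensions from the lists above (programming, configuration, documentation).
SUPPORTED_EXTENSIONS = {
    ".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".cpp", ".c", ".cs",
    ".go", ".rs", ".php", ".rb", ".swift", ".kt", ".scala",
    ".json", ".yaml", ".yml", ".toml",
    ".md", ".txt",
}

def is_supported(path: str) -> bool:
    """Return True if the file's extension is in the supported set."""
    return os.path.splitext(path)[1].lower() in SUPPORTED_EXTENSIONS
```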
## πŸ”§ Configuration
The app uses Groq's fast and reliable API for LLM inference and Sentence Transformers for embeddings. To get started:
1. Get a free API key from [Groq](https://console.groq.com/)
2. Set your API key in the environment variable `GROQ_API_KEY`
3. The app is pre-configured to use the `deepseek-r1-distill-llama-70b` model

No additional setup is required - just add your Groq API key!
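A minimal request sketch using these settings, assuming the official `groq` Python client. The prompt wording and helper name are illustrative, not the app's actual prompt; the network call is shown in comments rather than executed:

```python
import os

MODEL = "deepseek-r1-distill-llama-70b"

def build_chat_request(question: str, context: str) -> dict:
    """Assemble keyword arguments for a Groq chat completion.

    The system prompt here is an illustrative placeholder.
    """
    return {
        "model": MODEL,
        "messages": [
            {"role": "system",
             "content": "Answer questions about the following code:\n" + context},
            {"role": "user", "content": question},
        ],
    }

# With the official client (requires GROQ_API_KEY to be set):
# from groq import Groq
# client = Groq(api_key=os.environ["GROQ_API_KEY"])
# response = client.chat.completions.create(**build_chat_request(question, context))
```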
## πŸ“ Usage Tips
- **Repository Size**: Works best with small to medium-sized repositories
- **Processing Time**: Larger repositories may take longer to process
- **Question Quality**: More specific questions tend to get better answers
- **File Limits**: Files larger than 1MB are skipped to ensure optimal performance
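The 1MB skip mentioned above can be sketched as a simple guard. The helper name and the treatment of unreadable files are assumptions, not the app's exact behavior:

```python
import os

MAX_FILE_BYTES = 1_000_000  # the README's 1MB limit; exact app value assumed

def should_process(path: str) -> bool:
    """Skip files that exceed the size limit (or cannot be stat'ed)."""
    try:
        return os.path.getsize(path) <= MAX_FILE_BYTES
    except OSError:
        return False
```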
## 🀝 Contributing
This project is open source and contributions are welcome! Feel free to:
- Report bugs
- Suggest new features
- Submit pull requests
- Improve documentation
## πŸ“„ License
This project is licensed under the MIT License.
---
**Note**: This application processes public repositories only. Private repositories require authentication tokens.