# Hugging Face Space Configuration
This document contains the configuration needed to deploy this application as a Hugging Face Space.
## Space Configuration
### Basic Settings
- **Space Name**: `chat-with-github-repo`
- **Space Type**: `Gradio`
- **Python Version**: `3.11`
- **Visibility**: `Public`
### Environment Variables (Optional)
Set these as secrets in your Hugging Face Space settings. They raise API rate limits and enable access to private repositories:
```
HUGGINGFACE_API_KEY=your_hf_token_here
GITHUB_TOKEN=your_github_token_here
```
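The application can pick these values up from the process environment at startup. The following is a minimal sketch of that lookup, not the project's actual `config.py`; the helper name `load_tokens` is hypothetical, and treating a blank value as unset is an assumption.

```python
import os


def load_tokens(env=os.environ):
    """Return the optional tokens, using None for unset or blank variables."""

    def read(name):
        value = env.get(name, "").strip()
        return value or None

    return {
        "huggingface": read("HUGGINGFACE_API_KEY"),
        "github": read("GITHUB_TOKEN"),
    }


# Both entries are None when the Space has no secrets configured,
# so callers can fall back to anonymous access.
tokens = load_tokens()
```

Because both variables are optional, code that uses the tokens should handle `None` rather than fail at import time.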
### Hardware Requirements
- **CPU**: Basic (free tier works)
- **RAM**: 8GB+ recommended for larger repositories
- **Storage**: 10GB+ for model caching
## Deployment Steps
1. **Create a new Hugging Face Space**:
   - Go to https://huggingface.co/new-space
   - Choose "Gradio" as the Space SDK
   - Set the space name and visibility
2. **Upload files**:
   - Upload all files from this directory to your space
   - Ensure the main `app.py` file is in the root directory
3. **Configure environment variables** (optional):
   - Go to your space settings
   - Add the environment variables listed above
   - This improves rate limits and enables private repo access
4. **Deploy**:
   - The space will automatically build and deploy
   - First deployment may take 5-10 minutes due to model downloads
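The create and upload steps can also be scripted with the `huggingface_hub` library instead of the web interface. This is a sketch under assumptions: the helper names `find_missing_files` and `deploy_space` are hypothetical, the list of required files is taken from the file structure in this document, and you must already be authenticated (for example via `huggingface-cli login`) or pass a token.

```python
from pathlib import Path


def find_missing_files(folder, required=("app.py", "requirements.txt", "README.md")):
    """Return the required root-level files that are absent from `folder`."""
    root = Path(folder)
    return [name for name in required if not (root / name).is_file()]


def deploy_space(repo_id, folder=".", token=None):
    """Create the Gradio Space if it does not exist, then upload `folder`."""
    # Imported lazily so the file check works without huggingface_hub installed.
    from huggingface_hub import HfApi

    missing = find_missing_files(folder)
    if missing:
        raise FileNotFoundError(f"Missing from {folder}: {', '.join(missing)}")

    api = HfApi(token=token)
    api.create_repo(
        repo_id=repo_id, repo_type="space", space_sdk="gradio", exist_ok=True
    )
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="space")
```

Calling `deploy_space("your-username/chat-with-github-repo")` from the project root would create the Space and push the files, where `your-username` is a placeholder for your own account. The Space then builds on its own, as described in step 4.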
## File Structure for Hugging Face Space
```
your-space/
├── app.py                 # Main Gradio application
├── requirements.txt       # Python dependencies
├── README.md              # Space documentation
├── config.py              # Configuration settings
├── services/              # Service modules
│   ├── __init__.py
│   ├── github_service.py
│   ├── embedding_service.py
│   └── chat_service.py
├── utils/                 # Utility modules
│   ├── __init__.py
│   └── file_processor.py
└── models/                # Data models
    ├── __init__.py
    └── schemas.py
```
## Performance Optimization
### For Free Tier:
- Uses lightweight embedding model (`all-MiniLM-L6-v2`)
- Processes files in batches
- Implements file size limits
- Caches models locally
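To illustrate how batching and file size limits keep memory bounded, here is a minimal sketch. It is not the code in `embedding_service.py`: the constants `MAX_FILE_BYTES` and `BATCH_SIZE` and all function names are hypothetical, and `encode` stands for any callable that maps a list of texts to a list of vectors.

```python
from pathlib import Path

MAX_FILE_BYTES = 200_000  # hypothetical per-file limit
BATCH_SIZE = 32           # hypothetical batch size


def within_size_limit(path, max_bytes=MAX_FILE_BYTES):
    """Return True when the file is small enough to be processed."""
    return Path(path).stat().st_size <= max_bytes


def batched(items, batch_size=BATCH_SIZE):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, encode, batch_size=BATCH_SIZE):
    """Embed `texts` one batch at a time so only one batch is in flight."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(encode(batch))
    return vectors
```

With the `sentence-transformers` package installed, `encode` could be `SentenceTransformer("all-MiniLM-L6-v2").encode`. Lowering `batch_size` trades throughput for a smaller peak memory footprint.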
### For Better Performance:
- Upgrade to paid hardware
- Use larger embedding models
- Increase batch sizes
- Add Redis caching
## Troubleshooting
### Common Issues:
1. **Out of Memory**:
   - Reduce batch size in embedding service
   - Use smaller embedding model
   - Upgrade hardware
2. **Slow Processing**:
   - Add Hugging Face API token for better rate limits
   - Use GPU hardware
   - Optimize chunk sizes
3. **Git Clone Failures**:
   - Check the repository URL format
   - For private repositories, add a GitHub token
   - Without a token, ensure the repository is public
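A URL check before cloning makes format problems easier to diagnose. The sketch below is an illustration, not the logic in `github_service.py`: it assumes only HTTPS URLs of the form `https://github.com/<owner>/<repo>` are accepted, the function names are hypothetical, and placing a personal access token in the user part of the clone URL is one common approach rather than the only one.

```python
import re

# Accepts https://github.com/<owner>/<repo>, with optional ".git" or trailing "/".
GITHUB_URL = re.compile(
    r"^https://github\.com/(?P<owner>[A-Za-z0-9-]+)/(?P<repo>[A-Za-z0-9._-]+?)(?:\.git)?/?$"
)


def parse_repo_url(url):
    """Return (owner, repo) or raise ValueError for an unsupported URL."""
    match = GITHUB_URL.match(url.strip())
    if match is None:
        raise ValueError(f"Expected https://github.com/<owner>/<repo>, got: {url!r}")
    return match["owner"], match["repo"]


def clone_url(url, token=None):
    """Build the HTTPS clone URL, embedding the token when one is given."""
    owner, repo = parse_repo_url(url)
    auth = f"{token}@" if token else ""
    return f"https://{auth}github.com/{owner}/{repo}.git"
```

Avoid logging the result of `clone_url` when a token is set, since the token is part of the string.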
### Debug Mode:
Set `debug=True` in `demo.launch()` for detailed error messages.
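One way to avoid editing code each time is to drive the flag from an environment variable. This is a sketch only: `APP_DEBUG` is a hypothetical variable name, and the placeholder interface stands in for the real one built in `app.py`.

```python
import os


def debug_enabled(env=os.environ):
    """Return True when the hypothetical APP_DEBUG variable is switched on."""
    return env.get("APP_DEBUG", "").strip().lower() in {"1", "true", "yes"}


def main():
    # Imported lazily so debug_enabled() can be used without Gradio installed.
    import gradio as gr

    with gr.Blocks() as demo:
        gr.Markdown("Placeholder UI")  # stand-in for the real interface

    demo.launch(debug=debug_enabled())
```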
## Monitoring
Monitor your space performance:
- Check space logs for errors
- Monitor memory usage
- Track processing times
- Review user feedback
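Memory usage and processing times can be written to the Space logs with a small helper. The sketch below uses only the standard library; the name `track` is hypothetical. Note that `tracemalloc` counts only memory allocated through Python, so native allocations made by model libraries are not included.

```python
import time
import tracemalloc
from contextlib import contextmanager


@contextmanager
def track(label, log=print):
    """Log elapsed time and peak Python memory for the wrapped block."""
    tracemalloc.start()
    started = time.perf_counter()
    stats = {}
    try:
        yield stats
    finally:
        stats["seconds"] = time.perf_counter() - started
        _, stats["peak_bytes"] = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        log(f"{label}: {stats['seconds']:.2f}s, peak {stats['peak_bytes'] / 1e6:.1f} MB")
```

Wrapping a step as `with track("embedding"):` prints one line per run, which then appears in the Space logs. Nesting `track` blocks is not supported in this sketch, because the inner block stops tracing for the outer one.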
## Updates
To update your space:
1. Modify files locally
2. Upload changed files to your space
3. The Space rebuilds automatically
4. Test functionality after deployment