# Doctra Hugging Face Spaces Deployment Guide
## Quick Deployment
### Option 1: Direct Upload to Hugging Face Spaces
1. **Create a new Space**:
   - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Choose "Gradio" as the SDK
   - Set the title to "Doctra - Document Parser"
2. **Upload files**:
   - Upload all files from this `hf_space` folder to your Space
   - Make sure `app.py` is in the root directory
3. **Configure environment**:
   - Go to Settings → Secrets
   - Add `VLM_API_KEY` if you want to use VLM features
   - Set the value to your API key (OpenAI, Anthropic, Google, etc.)
### Option 2: Git Repository Deployment
1. **Create a Git repository**:
   ```bash
   git init
   git add .
   git commit -m "Initial Doctra HF Space deployment"
   git remote add origin <your-repo-url>
   git push -u origin main
   ```
2. **Connect to Hugging Face Spaces**:
   - Create a new Space
   - Choose "Git repository" as the source
   - Enter your repository URL
   - Set the app file to `app.py`
### Option 3: Docker Deployment
1. **Build the Docker image**:
   ```bash
   docker build -t doctra-hf-space .
   ```
2. **Run the container**:
   ```bash
   docker run -p 7860:7860 doctra-hf-space
   ```
## Configuration
### Environment Variables
Set these in your Hugging Face Space settings:
- `VLM_API_KEY`: Your API key for VLM providers
- `GRADIO_SERVER_NAME`: Server hostname (default: 0.0.0.0)
- `GRADIO_SERVER_PORT`: Server port (default: 7860)
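
As a minimal sketch of how an app might read the three variables above, the following uses only the standard library. `SpaceConfig` and `load_config` are hypothetical names for illustration and are not taken from the Doctra code; treating a blank `VLM_API_KEY` as "VLM features disabled" is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SpaceConfig:
    """Settings read from the environment variables listed above."""
    vlm_api_key: Optional[str]
    server_name: str
    server_port: int


def load_config(env: Mapping[str, str] = os.environ) -> SpaceConfig:
    """Build a SpaceConfig, falling back to the documented defaults."""
    # Assumption: an unset or blank key means VLM features are disabled.
    api_key = env.get("VLM_API_KEY", "").strip() or None
    port_text = env.get("GRADIO_SERVER_PORT", "7860")
    try:
        port = int(port_text)
    except ValueError:
        raise ValueError(
            f"GRADIO_SERVER_PORT must be an integer, got {port_text!r}"
        ) from None
    return SpaceConfig(
        vlm_api_key=api_key,
        server_name=env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        server_port=port,
    )
```

If `demo` is the Gradio app object in `app.py` (an assumption), the result could be passed on as `demo.launch(server_name=cfg.server_name, server_port=cfg.server_port)`.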
### Hardware Requirements
- **CPU**: Minimum 2 cores recommended
- **RAM**: Minimum 4GB, 8GB+ recommended
- **Storage**: 10GB+ for models and dependencies
- **GPU**: Optional but recommended for faster processing
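
A rough preflight check against the CPU and storage figures above could look like the sketch below. It is illustrative only: `check_hardware` is a hypothetical helper, it compares *free* disk space against the 10 GB figure (an approximation of the requirement), and it skips RAM because the standard library has no portable way to query it.

```python
import os
import shutil
from typing import List

MIN_CPU_CORES = 2        # from the hardware list above
MIN_FREE_DISK_GB = 10    # from the hardware list above (applied to free space)


def check_hardware(path: str = ".") -> List[str]:
    """Return a list of warnings; an empty list means both checks passed."""
    warnings: List[str] = []
    cores = os.cpu_count() or 1  # cpu_count() may return None
    if cores < MIN_CPU_CORES:
        warnings.append(
            f"only {cores} CPU core(s); {MIN_CPU_CORES}+ recommended"
        )
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < MIN_FREE_DISK_GB:
        warnings.append(
            f"only {free_gb:.1f} GB free disk; {MIN_FREE_DISK_GB}+ GB recommended"
        )
    return warnings


if __name__ == "__main__":
    for message in check_hardware():
        print("WARNING:", message)
```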
## Performance Optimization
### For Hugging Face Spaces
1. **Use CPU-optimized models** when GPU is not available
2. **Reduce DPI settings** for faster processing
3. **Process smaller documents** to avoid memory issues
4. **Enable caching** for repeated operations
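
One way to read "enable caching" is to reuse the result when the same file is submitted again with the same DPI. The sketch below keys an in-memory cache on a SHA-256 hash of the file contents plus the DPI. `cached_parse` and the `parse` callback are hypothetical stand-ins, not Doctra functions, and the cache is unbounded, so a real deployment would likely want an eviction policy.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Maps (content hash, dpi) to the parsed result.
_cache: Dict[Tuple[str, int], str] = {}


def cached_parse(data: bytes, dpi: int, parse: Callable[[bytes, int], str]) -> str:
    """Run parse(data, dpi) at most once per distinct (content, dpi) pair."""
    key = (hashlib.sha256(data).hexdigest(), dpi)
    if key not in _cache:
        _cache[key] = parse(data, dpi)
    return _cache[key]
```

Including the DPI in the key matters because lowering the DPI changes the output, so results produced at different settings should not be shared.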
### For Local Deployment
1. **Use GPU acceleration** when available
2. **Increase memory limits** for large documents
3. **Use SSD storage** for better I/O performance
4. **Configure proper logging** for debugging
## Troubleshooting
### Common Issues
1. **Import Errors**:
   - Check that all dependencies are in `requirements.txt`
   - Verify Python version compatibility
2. **Memory Issues**:
   - Reduce DPI settings
   - Process smaller documents
   - Increase available memory
3. **API Key Issues**:
   - Verify API key is correctly set
   - Check provider-specific requirements
   - Test API connectivity
4. **File Upload Issues**:
   - Check file size limits
   - Verify file format support
   - Ensure proper permissions
### Debug Mode
To enable debug mode, set:
```bash
export GRADIO_DEBUG=1
```
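
If you also want the app's own log output to follow this switch, a small sketch like the one below could be used. That the application's logging should follow `GRADIO_DEBUG` is an assumption, and `debug_enabled` and `configure_logging` are hypothetical helper names.

```python
import logging
import os
from typing import Mapping


def debug_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True when GRADIO_DEBUG is set to 1, as in the export above."""
    return env.get("GRADIO_DEBUG", "0").strip() == "1"


def configure_logging(env: Mapping[str, str] = os.environ) -> int:
    """Set the root logger to DEBUG in debug mode, INFO otherwise."""
    level = logging.DEBUG if debug_enabled(env) else logging.INFO
    # basicConfig is a no-op if handlers already exist, so set the level explicitly.
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logging.getLogger().setLevel(level)
    return level
```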
## Monitoring
### Health Checks
- Monitor CPU and memory usage
- Check disk space availability
- Verify API key validity
- Test document processing pipeline
### Logs
- Application logs: Check Gradio output
- Error logs: Monitor for exceptions
- Performance logs: Track processing times
- User logs: Monitor usage patterns
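
To track processing times, a small context manager that logs the duration of each step is often enough. This is a sketch, not part of Doctra; the logger name `doctra.space` and the helper `timed` are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("doctra.space")  # hypothetical logger name


@contextmanager
def timed(step: str):
    """Log how long the enclosed block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("%s finished in %.3f s", step, elapsed)
```

Usage would be along the lines of `with timed("parse"): ...` around the call that processes a document.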
## Updates
### Updating the Application
1. **Code updates**: Push changes to your repository
2. **Dependency updates**: Update `requirements.txt`
3. **Model updates**: Download new model versions
4. **Configuration updates**: Modify environment variables
### Version Control
- Use semantic versioning
- Tag releases appropriately
- Maintain changelog
- Test before deployment
## Security
### Best Practices
1. **API Keys**: Store them as Space secrets; never commit them to the repository
2. **File Uploads**: Validate file types and sizes
3. **Rate Limiting**: Implement to prevent abuse
4. **Input Validation**: Sanitize all user inputs
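
The file-upload check above can be sketched as a function that returns a list of problems for a given upload. The allowed extension and the size limit are illustrative assumptions, not values taken from Doctra, and `upload_problems` is a hypothetical helper.

```python
from pathlib import Path
from typing import List

# Both limits are illustrative assumptions; adjust them to your deployment.
ALLOWED_EXTENSIONS = {".pdf"}
MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # 20 MiB


def upload_problems(filename: str, size_bytes: int) -> List[str]:
    """Return the reasons to reject an upload; an empty list means accept."""
    problems: List[str] = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is larger than {MAX_UPLOAD_BYTES} bytes")
    return problems
```

Checking the extension alone does not prove the content is a valid document, so the parser should still handle malformed files gracefully.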
### Privacy
- No data is stored permanently
- Files are processed in temporary directories
- API calls are made securely
- User data is not logged
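
Processing in temporary directories can be sketched with the standard library's `tempfile.TemporaryDirectory`, which deletes the directory and its contents when the `with` block exits. `process_in_temp_dir` and the `parse` callback are hypothetical names; this is not the actual Doctra implementation.

```python
import tempfile
from pathlib import Path
from typing import Callable


def process_in_temp_dir(
    data: bytes, filename: str, parse: Callable[[Path], str]
) -> str:
    """Write the upload to a throwaway directory, parse it, then clean up."""
    with tempfile.TemporaryDirectory(prefix="doctra_") as tmp:
        # Keep only the final path component so client-supplied directory
        # parts (e.g. "../") cannot place the file outside tmp.
        safe_name = Path(filename).name or "upload.bin"
        target = Path(tmp) / safe_name
        target.write_bytes(data)
        # The directory is removed as soon as this with-block exits.
        return parse(target)
```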
## Support
For issues and questions:
1. **GitHub Issues**: Report bugs and feature requests
2. **Documentation**: Check the main README.md
3. **Community**: Join discussions on Hugging Face
4. **Email**: Contact the development team
## Next Steps
After successful deployment:
1. **Test all features** with sample documents
2. **Configure monitoring** and alerting
3. **Set up backups** for important data
4. **Plan for scaling** based on usage
5. **Gather user feedback** for improvements