---
title: RAG-Based-HR-Assistant
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit
---
# BLUESCARF AI HR Assistant
A retrieval-augmented generation (RAG) HR assistant powered by Google Gemini, built for BLUESCARF ARTIFICIAL INTELLIGENCE. It answers HR-related queries using context retrieved from company documents and policies.
## 🚀 Features
### Core Capabilities
- **RAG-Powered Intelligence**: Advanced retrieval-augmented generation using company documents
- **Google Gemini Integration**: State-of-the-art AI responses with company context
- **Document Learning**: Processes PDF policies, handbooks, and HR documents
- **Semantic Search**: Intelligent document retrieval with ChromaDB vector storage
- **Admin Management**: Secure document upload and knowledge base management
### Key Benefits
- **One-Time Learning**: Documents processed once, knowledge persists
- **Scope-Focused**: Only answers HR-related questions using company documents
- **Enterprise-Ready**: Built for production deployment with security features
- **Minimal Design**: Clean, professional interface optimized for efficiency
- **Real-Time Updates**: Add/remove documents after deployment
## 📋 Prerequisites
### Required
- Python 3.8 or higher
- Google Gemini API key ([Get yours here](https://makersuite.google.com/app/apikey))
- At least 2GB RAM
- 500MB storage space for vector database
### Recommended
- 4GB+ RAM for large document processing
- SSD storage for faster vector operations
- Stable internet connection for API calls
## πŸ› οΈ Installation & Setup
### Method 1: Hugging Face Spaces (Recommended)
1. **Clone or Download** this repository
2. **Upload files** to your Hugging Face Space
3. **Add your company logo** as `logo.png` (200x200px recommended)
4. **Deploy** - the app will automatically install dependencies
### Method 2: Local Development
```bash
# Clone the repository
git clone <repository-url>
cd bluescarf-hr-assistant
# Install dependencies
pip install -r requirements.txt
# Run the application
streamlit run app.py
```
### Method 3: Docker Deployment
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```
## βš™οΈ Configuration
### Environment Variables
Create a `.env` file for custom configuration:
```env
# Application Settings
COMPANY_NAME="BLUESCARF ARTIFICIAL INTELLIGENCE"
ENVIRONMENT=production
# Document Processing
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
# Maximum upload size in bytes (50MB)
MAX_FILE_SIZE=52428800
# Vector Database
MAX_CONTEXT_CHUNKS=5
SIMILARITY_THRESHOLD=0.5
# API Configuration
GEMINI_MODEL=gemini-pro
TEMPERATURE=0.3
```
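As a rough illustration of how these variables might be consumed, the sketch below reads them from the process environment and falls back to the values shown above. The `Settings` class and `load_settings` function are hypothetical names, not necessarily what `config.py` contains, and the sketch assumes the `.env` values have already been exported into the environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the .env example."""
    company_name: str
    environment: str
    chunk_size: int
    chunk_overlap: int
    max_file_size: int
    max_context_chunks: int
    similarity_threshold: float
    gemini_model: str
    temperature: float


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), using the README defaults."""
    env = os.environ if env is None else env
    settings = Settings(
        company_name=env.get("COMPANY_NAME", "BLUESCARF ARTIFICIAL INTELLIGENCE"),
        environment=env.get("ENVIRONMENT", "production"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        max_file_size=int(env.get("MAX_FILE_SIZE", "52428800")),
        max_context_chunks=int(env.get("MAX_CONTEXT_CHUNKS", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.5")),
        gemini_model=env.get("GEMINI_MODEL", "gemini-pro"),
        temperature=float(env.get("TEMPERATURE", "0.3")),
    )
    # An overlap as large as the chunk itself would never advance through the text.
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be non-negative and smaller than CHUNK_SIZE")
    return settings
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests.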
### Admin Access
**Default Admin Password**: `bluescarf_admin_2024`
⚠️ **IMPORTANT**: Change this password immediately after deployment!
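One possible way to avoid shipping a hardcoded password is to read it from an environment variable and compare it in constant time. This is only a sketch: the variable name `ADMIN_PASSWORD` and both function names are assumptions, not part of the project as documented.

```python
import hmac
import os


def get_admin_password(env=None) -> str:
    """Read the admin password from a (hypothetical) ADMIN_PASSWORD variable."""
    env = os.environ if env is None else env
    password = env.get("ADMIN_PASSWORD")
    if not password:
        raise RuntimeError("ADMIN_PASSWORD is not set")
    return password


def verify_admin_password(candidate: str, expected: str) -> bool:
    """Compare passwords in constant time to avoid leaking timing information."""
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))
```

On Hugging Face Spaces, such a variable could be supplied as a Space secret rather than committed to the repository.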
## 📚 Usage Guide
### For End Users
1. **Enter API Key**: Provide your Google Gemini API key
2. **Ask HR Questions**: Query about policies, benefits, procedures
3. **Get Contextual Answers**: Receive responses based on company documents
**Example Queries:**
- "What is our vacation policy?"
- "How do I apply for health insurance?"
- "What are the performance review procedures?"
- "Tell me about our remote work policy"
### For Administrators
1. **Access Admin Panel**: Click "Admin Access" and enter password
2. **Upload Documents**: Add PDF policies, handbooks, procedures
3. **Manage Knowledge Base**: View, delete, or update documents
4. **Monitor System**: Check health status and analytics
## πŸ“ Project Structure
```
bluescarf-hr-assistant/
├── app.py                  # Main Streamlit application
├── document_processor.py   # PDF processing and chunking
├── vector_store.py         # ChromaDB vector operations
├── admin.py                # Administrative interface
├── config.py               # Configuration management
├── utils.py                # Utility functions
├── requirements.txt        # Python dependencies
├── README.md               # This documentation
├── logo.png                # Company logo (add yours)
└── vector_db/              # Vector database storage (auto-created)
    ├── chroma.sqlite3      # ChromaDB database
    └── metadata/           # Document metadata
```
## 🔒 Security Features
### Authentication
- Password-protected admin panel
- API key validation and secure storage
- Session-based access control
### Data Protection
- Local vector storage (no external data sharing)
- Secure document hashing for deduplication
- Audit logging for administrative actions
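Hash-based deduplication can be sketched as follows. The choice of SHA-256 and the `DedupIndex` class are assumptions made for illustration; the project's actual hashing scheme is not documented here.

```python
import hashlib


def document_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class DedupIndex:
    """Hypothetical in-memory index mapping content hashes to file names."""

    def __init__(self):
        self._seen = {}

    def add(self, name: str, data: bytes) -> bool:
        """Register a document; return False if identical content was already added."""
        digest = document_hash(data)
        if digest in self._seen:
            return False
        self._seen[digest] = name
        return True
```

Because the hash covers the file contents rather than the file name, re-uploading the same PDF under a different name is still detected.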
### Access Control
- HR-only query filtering
- Document source validation
- Secure file upload handling
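HR-only query filtering could be as simple as a keyword check. The keyword set and the `is_hr_query` function below are illustrative assumptions; the list actually used by `utils.py` may differ.

```python
import re

# Hypothetical keyword list; extend it to match your own HR vocabulary.
HR_KEYWORDS = frozenset({
    "vacation", "leave", "benefits", "insurance", "payroll",
    "policy", "performance", "remote", "onboarding",
})


def is_hr_query(query: str, keywords=HR_KEYWORDS) -> bool:
    """Return True if the query contains at least one HR keyword as a whole word."""
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    return not tokens.isdisjoint(keywords)
```

A keyword check is cheap but coarse: it misses paraphrases and can match unrelated uses of a word, so it works best as a first gate before retrieval.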
## 🚀 Deployment Guide
### Hugging Face Spaces Deployment
1. **Create Space**: Visit [Hugging Face Spaces](https://huggingface.co/spaces)
2. **Choose Streamlit**: Select Streamlit as the SDK
3. **Upload Files**: Upload all project files
4. **Add Logo**: Replace `logo.png` with your company logo
5. **Configure Secrets**: Set environment variables if needed
6. **Deploy**: Space will build and deploy automatically
### Environment-Specific Optimizations
#### For Hugging Face Spaces:
- Automatic resource optimization
- Reduced memory footprint
- Optimized chunk sizes
#### For Private Servers:
- Full resource utilization
- Enhanced caching
- Advanced logging
## 📊 Performance Optimization
### Document Processing
- Intelligent chunking with semantic awareness
- Batch embedding generation
- Efficient vector storage with ChromaDB
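To show how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, here is a minimal sketch of fixed-size character chunking with overlap. It is a simplification: it does not implement the semantic awareness mentioned above, and `chunk_text` is a hypothetical helper rather than the function in `document_processor.py`.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window advances 800 characters, so the last 200 characters of one chunk are repeated at the start of the next.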
### Response Generation
- Context-aware retrieval
- Optimized prompt engineering
- Relevance scoring and ranking
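Relevance scoring and ranking can be illustrated with a small filter over scored chunks. The sketch assumes each score is a similarity where higher means more relevant; a vector store that returns distances would need converting first. `select_context` is a hypothetical helper, and its defaults mirror `SIMILARITY_THRESHOLD` and `MAX_CONTEXT_CHUNKS`.

```python
from typing import List, Tuple


def select_context(
    scored_chunks: List[Tuple[str, float]],
    threshold: float = 0.5,
    max_chunks: int = 5,
) -> List[Tuple[str, float]]:
    """Keep chunks scoring at or above threshold, best first, capped at max_chunks."""
    relevant = [(chunk, score) for chunk, score in scored_chunks if score >= threshold]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return relevant[:max_chunks]
```

Lowering `max_chunks` shortens the prompt sent to the model, which is one way to act on the advice to reduce the context chunk count.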
### System Resources
- Lazy loading of AI models
- Memory-efficient vector operations
- Automatic garbage collection
## 🔧 Customization
### Branding
- Replace `logo.png` with your company logo
- Update company name in `config.py`
- Customize colors in the CSS section of `app.py`
### Functionality
- Modify HR keywords in `utils.py`
- Adjust chunk sizes in `config.py`
- Customize response templates in `app.py`
### Integration
- Add SSO authentication
- Integrate with HR systems
- Connect to document management platforms
## 📈 Monitoring & Analytics
### Built-in Analytics
- Query classification and tracking
- Response quality metrics
- Document usage statistics
- Performance monitoring
### Health Checks
- Vector database integrity
- API connectivity status
- Storage availability
- Processing pipeline health
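A storage availability check might look like the sketch below, which uses the 500MB figure from the prerequisites as its default. The function name and the choice of the current directory as the default path are assumptions.

```python
import shutil

# 500MB, matching the storage prerequisite for the vector database.
REQUIRED_BYTES = 500 * 1024 * 1024


def has_free_space(path: str = ".", required_bytes: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem containing path has at least required_bytes free."""
    usage = shutil.disk_usage(path)
    return usage.free >= required_bytes
```

Pointing `path` at the `vector_db/` directory would check the volume that actually holds the database.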
## πŸ› Troubleshooting
### Common Issues
**API Key Invalid**
- Verify key format and permissions
- Check Gemini API quotas
- Ensure internet connectivity
**Document Processing Fails**
- Verify PDF is text-based (not scanned)
- Check file size limits (50MB default)
- Ensure readable content exists
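Some of these checks can be run before a file reaches the processing pipeline. The sketch below tests the size limit and the `%PDF-` signature at the start of the file; `validate_pdf_upload` is a hypothetical helper, and it cannot tell a scanned PDF from a text-based one.

```python
from typing import List

MAX_FILE_SIZE = 50 * 1024 * 1024  # 50MB default, as in the configuration example


def validate_pdf_upload(data: bytes, max_size: int = MAX_FILE_SIZE) -> List[str]:
    """Return a list of problems found in an uploaded file; an empty list means it passed."""
    problems = []
    if len(data) == 0:
        problems.append("file is empty")
    if len(data) > max_size:
        problems.append("file exceeds the size limit")
    if not data.startswith(b"%PDF-"):
        problems.append("file does not start with the PDF signature")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets the admin panel show a complete error message.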
**Vector Search Returns No Results**
- Check document relevance to HR domain
- Verify embedding model availability
- Restart application to refresh cache
**Admin Panel Access Denied**
- Use the current admin password (the default is `bluescarf_admin_2024` unless it has been changed)
- Clear browser cache/cookies
- Check for session timeouts
### Performance Issues
**Slow Document Processing**
- Reduce chunk size in configuration
- Process documents in smaller batches
- Increase available memory
**API Response Timeouts**
- Check internet connection stability
- Verify API key rate limits
- Reduce context chunk count
## 📞 Support & Contact
### Technical Support
- **Documentation**: Check this README and inline comments
- **Issues**: Review common troubleshooting steps
- **Performance**: Monitor system health checks
### Business Contact
- **Company**: BLUESCARF ARTIFICIAL INTELLIGENCE
- **Purpose**: HR Assistant Support
- **Access**: Through admin panel for system administrators
## 📄 License & Compliance
### Usage Terms
- Designed specifically for BLUESCARF AI internal use
- Ensure compliance with company data policies
- Maintain confidentiality of uploaded documents
### Data Handling
- All data processed locally
- No external sharing of company documents
- Secure storage and access controls
## 🔄 Version History
### v1.0.0 (Current)
- Initial release with full RAG functionality
- Google Gemini integration
- Admin panel for document management
- ChromaDB vector storage
- Professional UI with company branding
### Roadmap
- Multi-language support
- Advanced analytics dashboard
- Integration with HR systems
- Mobile-responsive enhancements
- Voice query capabilities
---
## 🚀 Quick Start Checklist
- [ ] Upload all project files to deployment platform
- [ ] Add your company logo as `logo.png`
- [ ] Obtain Google Gemini API key
- [ ] Change default admin password
- [ ] Upload initial HR documents via admin panel
- [ ] Test with sample HR queries
- [ ] Configure environment variables if needed
- [ ] Monitor system health and performance
**Ready to deploy!** Once every item above is checked, your BLUESCARF AI HR Assistant is configured for production use.
---
*Built with ❤️ for BLUESCARF ARTIFICIAL INTELLIGENCE*