| title: RAG-Based-HR-Assistant | |
| emoji: 🎯 | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: streamlit | |
| sdk_version: 1.28.0 | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| # BLUESCARF AI HR Assistant | |
| A retrieval-augmented generation (RAG) HR assistant powered by Google Gemini, built for BLUESCARF ARTIFICIAL INTELLIGENCE. It answers HR-related queries using context retrieved from company documents and policies. | |
| ## Features | |
| ### Core Capabilities | |
| - **RAG-Powered Intelligence**: Advanced retrieval-augmented generation using company documents | |
| - **Google Gemini Integration**: State-of-the-art AI responses with company context | |
| - **Document Learning**: Processes PDF policies, handbooks, and HR documents | |
| - **Semantic Search**: Intelligent document retrieval with ChromaDB vector storage | |
| - **Admin Management**: Secure document upload and knowledge base management | |
| ### Key Benefits | |
| - **One-Time Learning**: Documents processed once, knowledge persists | |
| - **Scope-Focused**: Only answers HR-related questions using company documents | |
| - **Enterprise-Ready**: Built for production deployment with security features | |
| - **Minimal Design**: Clean, professional interface optimized for efficiency | |
| - **Real-Time Updates**: Add/remove documents after deployment | |
| ## Prerequisites | |
| ### Required | |
| - Python 3.8 or higher | |
| - Google Gemini API key ([Get yours here](https://makersuite.google.com/app/apikey)) | |
| - At least 2GB RAM | |
| - 500MB storage space for vector database | |
| ### Recommended | |
| - 4GB+ RAM for large document processing | |
| - SSD storage for faster vector operations | |
| - Stable internet connection for API calls | |
| ## 🛠️ Installation & Setup | |
| ### Method 1: Hugging Face Spaces (Recommended) | |
| 1. **Clone or Download** this repository | |
| 2. **Upload files** to your Hugging Face Space | |
| 3. **Add your company logo** as `logo.png` (200x200px recommended) | |
| 4. **Deploy** - the app will automatically install dependencies | |
| ### Method 2: Local Development | |
| ```bash | |
| # Clone the repository | |
| git clone <repository-url> | |
| cd bluescarf-hr-assistant | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Run the application | |
| streamlit run app.py | |
| ``` | |
| ### Method 3: Docker Deployment | |
| ```dockerfile | |
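| FROM python:3.9-slim | |
| WORKDIR /app | |
| # Install dependencies first so this layer is cached when only code changes | |
| COPY requirements.txt . | |
| RUN pip install --no-cache-dir -r requirements.txt | |
| # Copy the application code | |
| COPY . . | |
| # Streamlit's default port | |
| EXPOSE 8501 | |
| CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"] | |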
| ``` | |
| ## ⚙️ Configuration | |
| ### Environment Variables | |
| Create a `.env` file for custom configuration: | |
| ```env | |
| # Application Settings | |
| COMPANY_NAME="BLUESCARF ARTIFICIAL INTELLIGENCE" | |
| ENVIRONMENT=production | |
| # Document Processing | |
| CHUNK_SIZE=1000 | |
| CHUNK_OVERLAP=200 | |
| # Maximum upload size in bytes (50MB) | |
| MAX_FILE_SIZE=52428800 | |
| # Vector Database | |
| MAX_CONTEXT_CHUNKS=5 | |
| SIMILARITY_THRESHOLD=0.5 | |
| # API Configuration | |
| GEMINI_MODEL=gemini-pro | |
| TEMPERATURE=0.3 | |
| ``` | |
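
As a rough sketch of how these variables might be read at startup, the snippet below parses them from the environment using only the standard library. The `Settings` class and `load_settings` function are hypothetical names (the project's actual `config.py` is not reproduced here), and treating the example values above as fallback defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables documented above."""
    company_name: str
    environment: str
    chunk_size: int
    chunk_overlap: int
    max_file_size: int
    max_context_chunks: int
    similarity_threshold: float
    gemini_model: str
    temperature: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    settings = Settings(
        company_name=env.get("COMPANY_NAME", "BLUESCARF ARTIFICIAL INTELLIGENCE"),
        environment=env.get("ENVIRONMENT", "production"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        max_file_size=int(env.get("MAX_FILE_SIZE", "52428800")),
        max_context_chunks=int(env.get("MAX_CONTEXT_CHUNKS", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.5")),
        gemini_model=env.get("GEMINI_MODEL", "gemini-pro"),
        temperature=float(env.get("TEMPERATURE", "0.3")),
    )
    # An overlap as large as the chunk itself would never advance through the text.
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be non-negative and smaller than CHUNK_SIZE")
    return settings
```

Python does not read a `.env` file on its own. A loader such as python-dotenv's `load_dotenv()` is one common way to populate the environment first; whether this project uses it is an assumption.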
| ### Admin Access | |
| **Default Admin Password**: `bluescarf_admin_2024` | |
| ⚠️ **IMPORTANT**: Change this password immediately after deployment! | |
| ## Usage Guide | |
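
One way to avoid keeping a replacement password in plain text is to store only a salted hash and compare candidates in constant time. The sketch below uses only the standard library. `hash_password`, `verify_password`, and the iteration count are illustrative choices, not code taken from this project's `admin.py`.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # PBKDF2 work factor; an illustrative value, tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a password, generating a random salt if none is given."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(candidate: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(candidate, salt)
    return hmac.compare_digest(digest, expected)
```

The salt and digest would be stored (for example as a deployment secret) instead of the password itself, and the admin login would call `verify_password` with the submitted value.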
| ### For End Users | |
| 1. **Enter API Key**: Provide your Google Gemini API key | |
| 2. **Ask HR Questions**: Query about policies, benefits, procedures | |
| 3. **Get Contextual Answers**: Receive responses based on company documents | |
| **Example Queries:** | |
| - "What is our vacation policy?" | |
| - "How do I apply for health insurance?" | |
| - "What are the performance review procedures?" | |
| - "Tell me about our remote work policy" | |
| ### For Administrators | |
| 1. **Access Admin Panel**: Click "Admin Access" and enter password | |
| 2. **Upload Documents**: Add PDF policies, handbooks, procedures | |
| 3. **Manage Knowledge Base**: View, delete, or update documents | |
| 4. **Monitor System**: Check health status and analytics | |
| ## Project Structure | |
| ``` | |
| bluescarf-hr-assistant/ | |
| ├── app.py                 # Main Streamlit application | |
| ├── document_processor.py  # PDF processing and chunking | |
| ├── vector_store.py        # ChromaDB vector operations | |
| ├── admin.py               # Administrative interface | |
| ├── config.py              # Configuration management | |
| ├── utils.py               # Utility functions | |
| ├── requirements.txt       # Python dependencies | |
| ├── README.md              # This documentation | |
| ├── logo.png               # Company logo (add yours) | |
| └── vector_db/             # Vector database storage (auto-created) | |
|     ├── chroma.sqlite3     # ChromaDB database | |
|     └── metadata/          # Document metadata | |
| ``` | |
| ## Security Features | |
| ### Authentication | |
| - Password-protected admin panel | |
| - API key validation and secure storage | |
| - Session-based access control | |
| ### Data Protection | |
| - Local vector storage (documents are not uploaded to an external database) | |
| - Secure document hashing for deduplication | |
| - Audit logging for administrative actions | |
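
Hash-based deduplication can be pictured as follows: hash the raw file bytes and refuse a second upload with the same digest. This is a minimal sketch. `document_hash` and `DocumentRegistry` are hypothetical names, and SHA-256 is an assumed choice of hash function.

```python
import hashlib
from typing import Dict


def document_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class DocumentRegistry:
    """In-memory map from content hash to the first filename seen with that content."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, str] = {}

    def add(self, filename: str, data: bytes) -> bool:
        """Register a document; return False if identical content was already added."""
        digest = document_hash(data)
        if digest in self._by_hash:
            return False
        self._by_hash[digest] = filename
        return True
```

Because the hash covers content rather than the filename, re-uploading the same PDF under a different name is still detected.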
| ### Access Control | |
| - HR-only query filtering | |
| - Document source validation | |
| - Secure file upload handling | |
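
HR-only query filtering could be as simple as a keyword check before retrieval. The keyword set and `is_hr_query` below are placeholders for illustration; the project's real keyword list lives in `utils.py` and is not reproduced here.

```python
import re

# Placeholder keywords; the real list in utils.py may differ.
HR_KEYWORDS = {
    "vacation", "leave", "benefits", "insurance",
    "payroll", "policy", "performance", "remote",
}


def is_hr_query(query: str) -> bool:
    """Return True if any whole word in the query is a known HR keyword."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(words & HR_KEYWORDS)
```

Exact word matching is cheap but misses inflections such as "policies", so a production filter would likely need stemming or a classifier on top of it.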
| ## Deployment Guide | |
| ### Hugging Face Spaces Deployment | |
| 1. **Create Space**: Visit [Hugging Face Spaces](https://huggingface.co/spaces) | |
| 2. **Choose Streamlit**: Select Streamlit as the SDK | |
| 3. **Upload Files**: Upload all project files | |
| 4. **Add Logo**: Replace `logo.png` with your company logo | |
| 5. **Configure Secrets**: Set environment variables if needed | |
| 6. **Deploy**: Space will build and deploy automatically | |
| ### Environment-Specific Optimizations | |
| #### For Hugging Face Spaces: | |
| - Automatic resource optimization | |
| - Reduced memory footprint | |
| - Optimized chunk sizes | |
| #### For Private Servers: | |
| - Full resource utilization | |
| - Enhanced caching | |
| - Advanced logging | |
| ## Performance Optimization | |
| ### Document Processing | |
| - Intelligent chunking with semantic awareness | |
| - Batch embedding generation | |
| - Efficient vector storage with ChromaDB | |
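
To illustrate how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, here is a minimal character-based sliding-window chunker. It is a simplification: it makes no attempt at semantic awareness, and `chunk_text` is a hypothetical helper rather than the function in `document_processor.py`.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the default values, each window starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next.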
| ### Response Generation | |
| - Context-aware retrieval | |
| - Optimized prompt engineering | |
| - Relevance scoring and ranking | |
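
The retrieval step can be pictured as: score each chunk against the query embedding, drop anything below `SIMILARITY_THRESHOLD`, and keep the top `MAX_CONTEXT_CHUNKS`. The sketch uses plain cosine similarity on toy vectors. In the app the search is delegated to ChromaDB, whose scoring may differ, and `select_context` is an illustrative name.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_context(
    query_vec: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    threshold: float = 0.5,
    max_chunks: int = 5,
) -> List[Tuple[str, float]]:
    """Return up to max_chunks (text, score) pairs at or above threshold, best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in chunks]
    relevant = [item for item in scored if item[1] >= threshold]
    relevant.sort(key=lambda item: item[1], reverse=True)
    return relevant[:max_chunks]
```

Lowering `max_chunks` shortens the prompt sent to the model, which is why reducing the context chunk count can help with slow responses.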
| ### System Resources | |
| - Lazy loading of AI models | |
| - Memory-efficient vector operations | |
| - Automatic garbage collection | |
| ## Customization | |
| ### Branding | |
| - Replace `logo.png` with your company logo | |
| - Update company name in `config.py` | |
| - Customize colors in the CSS section of `app.py` | |
| ### Functionality | |
| - Modify HR keywords in `utils.py` | |
| - Adjust chunk sizes in `config.py` | |
| - Customize response templates in `app.py` | |
| ### Integration | |
| - Add SSO authentication | |
| - Integrate with HR systems | |
| - Connect to document management platforms | |
| ## Monitoring & Analytics | |
| ### Built-in Analytics | |
| - Query classification and tracking | |
| - Response quality metrics | |
| - Document usage statistics | |
| - Performance monitoring | |
| ### Health Checks | |
| - Vector database integrity | |
| - API connectivity status | |
| - Storage availability | |
| - Processing pipeline health | |
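
As an example of what one of these checks might look like, the sketch below tests storage availability with the standard library. `storage_health` is a hypothetical function, and the 500MB floor is borrowed from the prerequisites rather than from the project's code.

```python
import shutil
from typing import Dict, Union


def storage_health(
    path: str = ".", min_free_bytes: int = 500 * 1024 * 1024
) -> Dict[str, Union[int, bool]]:
    """Report free space on the filesystem holding `path` and whether it meets the floor."""
    usage = shutil.disk_usage(path)
    return {"free_bytes": usage.free, "ok": usage.free >= min_free_bytes}
```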
| ## Troubleshooting | |
| ### Common Issues | |
| **API Key Invalid** | |
| - Verify key format and permissions | |
| - Check Gemini API quotas | |
| - Ensure internet connectivity | |
| **Document Processing Fails** | |
| - Verify PDF is text-based (not scanned) | |
| - Check file size limits (50MB default) | |
| - Ensure readable content exists | |
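
A pre-flight check along these lines can catch some of these failures before processing starts. `validate_upload` is a hypothetical helper: it checks the size limit and the `%PDF-` header bytes, but it cannot tell whether a PDF is scanned or contains extractable text.

```python
from typing import Optional

MAX_FILE_SIZE = 52_428_800  # 50MB, matching the documented default


def validate_upload(data: bytes, max_size: int = MAX_FILE_SIZE) -> Optional[str]:
    """Return an error message for an unusable upload, or None if basic checks pass."""
    if not data:
        return "File is empty"
    if len(data) > max_size:
        return f"File exceeds the {max_size // (1024 * 1024)} MB limit"
    # PDF files begin with the marker "%PDF-" followed by a version number.
    if not data.startswith(b"%PDF-"):
        return "File does not look like a PDF"
    return None
```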
| **Vector Search Returns No Results** | |
| - Check document relevance to HR domain | |
| - Verify embedding model availability | |
| - Restart application to refresh cache | |
| **Admin Panel Access Denied** | |
| - Use the current admin password (the default is `bluescarf_admin_2024` until it is changed) | |
| - Clear browser cache/cookies | |
| - Check for session timeouts | |
| ### Performance Issues | |
| **Slow Document Processing** | |
| - Reduce chunk size in configuration | |
| - Process documents in smaller batches | |
| - Increase available memory | |
| **API Response Timeouts** | |
| - Check internet connection stability | |
| - Verify API key rate limits | |
| - Reduce context chunk count | |
| ## Support & Contact | |
| ### Technical Support | |
| - **Documentation**: Check this README and inline comments | |
| - **Issues**: Review common troubleshooting steps | |
| - **Performance**: Monitor system health checks | |
| ### Business Contact | |
| - **Company**: BLUESCARF ARTIFICIAL INTELLIGENCE | |
| - **Purpose**: HR Assistant Support | |
| - **Access**: Through admin panel for system administrators | |
| ## License & Compliance | |
| ### Usage Terms | |
| - Designed specifically for BLUESCARF AI internal use | |
| - Ensure compliance with company data policies | |
| - Maintain confidentiality of uploaded documents | |
| ### Data Handling | |
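| - Documents and the vector database are stored locally | |
| - Queries and the retrieved document excerpts used as context are sent to the Google Gemini API; documents are not shared with any other external service | |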
| - Secure storage and access controls | |
| ## Version History | |
| ### v1.0.0 (Current) | |
| - Initial release with full RAG functionality | |
| - Google Gemini integration | |
| - Admin panel for document management | |
| - ChromaDB vector storage | |
| - Professional UI with company branding | |
| ### Roadmap | |
| - Multi-language support | |
| - Advanced analytics dashboard | |
| - Integration with HR systems | |
| - Mobile-responsive enhancements | |
| - Voice query capabilities | |
| --- | |
| ## Quick Start Checklist | |
| - [ ] Upload all project files to deployment platform | |
| - [ ] Add your company logo as `logo.png` | |
| - [ ] Obtain Google Gemini API key | |
| - [ ] Change default admin password | |
| - [ ] Upload initial HR documents via admin panel | |
| - [ ] Test with sample HR queries | |
| - [ ] Configure environment variables if needed | |
| - [ ] Monitor system health and performance | |
| **Ready to deploy!** Once every item above is checked, your BLUESCARF AI HR Assistant is configured for production use. | |
| --- | |
| *Built with ❤️ for BLUESCARF ARTIFICIAL INTELLIGENCE*