---
title: RAG-Based-HR-Assistant
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit
---

# BLUESCARF AI HR Assistant

A sophisticated RAG-based HR Assistant powered by Google Gemini AI, designed specifically for BLUESCARF ARTIFICIAL INTELLIGENCE. This system provides intelligent, context-aware responses to HR-related queries using company documents and policies.

## 🚀 Features

### Core Capabilities

- **RAG-Powered Intelligence**: Advanced retrieval-augmented generation using company documents
- **Google Gemini Integration**: State-of-the-art AI responses with company context
- **Document Learning**: Processes PDF policies, handbooks, and HR documents
- **Semantic Search**: Intelligent document retrieval with ChromaDB vector storage
- **Admin Management**: Secure document upload and knowledge base management

### Key Benefits

- **One-Time Learning**: Documents processed once, knowledge persists
- **Scope-Focused**: Only answers HR-related questions using company documents
- **Enterprise-Ready**: Built for production deployment with security features
- **Minimal Design**: Clean, professional interface optimized for efficiency
- **Real-Time Updates**: Add/remove documents after deployment

## 📋 Prerequisites

### Required

- Python 3.8 or higher
- Google Gemini API key ([Get yours here](https://makersuite.google.com/app/apikey))
- Minimum 2GB RAM for optimal performance
- 500MB storage space for vector database

### Recommended

- 4GB+ RAM for large document processing
- SSD storage for faster vector operations
- Stable internet connection for API calls

## 🛠️ Installation & Setup

### Method 1: Hugging Face Spaces (Recommended)

1. **Clone or Download** this repository
2. **Upload files** to your Hugging Face Space
3. **Add your company logo** as `logo.png` (200x200px recommended)
4. **Deploy** - the app will automatically install dependencies

### Method 2: Local Development

```bash
# Clone the repository
git clone <repository-url>
cd bluescarf-hr-assistant

# Install dependencies
pip install -r requirements.txt

# Run the application
streamlit run app.py
```

### Method 3: Docker Deployment

```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```

## ⚙️ Configuration

### Environment Variables

Create a `.env` file for custom configuration:

```env
# Application Settings
COMPANY_NAME="BLUESCARF ARTIFICIAL INTELLIGENCE"
ENVIRONMENT=production

# Document Processing
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
MAX_FILE_SIZE=52428800  # 50MB

# Vector Database
MAX_CONTEXT_CHUNKS=5
SIMILARITY_THRESHOLD=0.5

# API Configuration
GEMINI_MODEL=gemini-pro
TEMPERATURE=0.3
```

### Admin Access

**Default Admin Password**: `bluescarf_admin_2024`

⚠️ **IMPORTANT**: Change this password immediately after deployment!

## 📚 Usage Guide

### For End Users

1. **Enter API Key**: Provide your Google Gemini API key
2. **Ask HR Questions**: Query about policies, benefits, procedures
3. **Get Contextual Answers**: Receive responses based on company documents

**Example Queries:**

- "What is our vacation policy?"
- "How do I apply for health insurance?"
- "What are the performance review procedures?"
- "Tell me about our remote work policy"

### For Administrators

1. **Access Admin Panel**: Click "Admin Access" and enter password
2. **Upload Documents**: Add PDF policies, handbooks, procedures
3. **Manage Knowledge Base**: View, delete, or update documents
4. **Monitor System**: Check health status and analytics

## 📁 Project Structure

```
bluescarf-hr-assistant/
├── app.py                 # Main Streamlit application
├── document_processor.py  # PDF processing and chunking
├── vector_store.py        # ChromaDB vector operations
├── admin.py               # Administrative interface
├── config.py              # Configuration management
├── utils.py               # Utility functions
├── requirements.txt       # Python dependencies
├── README.md              # This documentation
├── logo.png               # Company logo (add yours)
└── vector_db/             # Vector database storage (auto-created)
    ├── chroma.sqlite3     # ChromaDB database
    └── metadata/          # Document metadata
```

## 🔒 Security Features

### Authentication

- Password-protected admin panel
- API key validation and secure storage
- Session-based access control

### Data Protection

- Local vector storage (no external data sharing)
- Secure document hashing for deduplication
- Audit logging for administrative actions

### Access Control

- HR-only query filtering
- Document source validation
- Secure file upload handling

## 🚀 Deployment Guide

### Hugging Face Spaces Deployment

1. **Create Space**: Visit [Hugging Face Spaces](https://huggingface.co/spaces)
2. **Choose Streamlit**: Select Streamlit as the SDK
3. **Upload Files**: Upload all project files
4. **Add Logo**: Replace `logo.png` with your company logo
5. **Configure Secrets**: Set environment variables if needed
6. **Deploy**: Space will build and deploy automatically

### Environment-Specific Optimizations

#### For Hugging Face Spaces:

- Automatic resource optimization
- Reduced memory footprint
- Optimized chunk sizes

#### For Private Servers:

- Full resource utilization
- Enhanced caching
- Advanced logging

## 📊 Performance Optimization

### Document Processing

- Intelligent chunking with semantic awareness
- Batch embedding generation
- Efficient vector storage with ChromaDB

### Response Generation

- Context-aware retrieval
- Optimized prompt engineering
- Relevance scoring and ranking

### System Resources

- Lazy loading of AI models
- Memory-efficient vector operations
- Automatic garbage collection

## 🔧 Customization

### Branding

- Replace `logo.png` with your company logo
- Update company name in `config.py`
- Customize colors in the CSS section of `app.py`

### Functionality

- Modify HR keywords in `utils.py`
- Adjust chunk sizes in `config.py`
- Customize response templates in `app.py`

### Integration

- Add SSO authentication
- Integrate with HR systems
- Connect to document management platforms

## 📈 Monitoring & Analytics

### Built-in Analytics

- Query classification and tracking
- Response quality metrics
- Document usage statistics
- Performance monitoring

### Health Checks

- Vector database integrity
- API connectivity status
- Storage availability
- Processing pipeline health

## 🐛 Troubleshooting

### Common Issues

**API Key Invalid**

- Verify key format and permissions
- Check Gemini API quotas
- Ensure internet connectivity

**Document Processing Fails**

- Verify PDF is text-based (not scanned)
- Check file size limits (50MB default)
- Ensure readable content exists

**Vector Search Returns No Results**

- Check document relevance to HR domain
- Verify embedding model availability
- Restart application to refresh cache

**Admin Panel Access Denied**

- Use the correct password (default: `bluescarf_admin_2024`, unless it has been changed)
- Clear browser cache/cookies
- Check for session timeouts

### Performance Issues

**Slow Document Processing**

- Reduce chunk size in configuration
- Process documents in smaller batches
- Increase available memory

**API Response Timeouts**

- Check internet connection stability
- Verify API key rate limits
- Reduce context chunk count

## 📞 Support & Contact

### Technical Support

- **Documentation**: Check this README and inline comments
- **Issues**: Review common troubleshooting steps
- **Performance**: Monitor system health checks

### Business Contact

- **Company**: BLUESCARF ARTIFICIAL INTELLIGENCE
- **Purpose**: HR Assistant Support
- **Access**: Through admin panel for system administrators

## 📄 License & Compliance

### Usage Terms

- Designed specifically for BLUESCARF AI internal use
- Ensure compliance with company data policies
- Maintain confidentiality of uploaded documents

### Data Handling

- All data processed locally
- No external sharing of company documents
- Secure storage and access controls

## 🔄 Version History

### v1.0.0 (Current)

- Initial release with full RAG functionality
- Google Gemini integration
- Admin panel for document management
- ChromaDB vector storage
- Professional UI with company branding

### Roadmap

- Multi-language support
- Advanced analytics dashboard
- Integration with HR systems
- Mobile-responsive enhancements
- Voice query capabilities

---

## 🚀 Quick Start Checklist

- [ ] Upload all project files to deployment platform
- [ ] Add your company logo as `logo.png`
- [ ] Obtain Google Gemini API key
- [ ] Change default admin password
- [ ] Upload initial HR documents via admin panel
- [ ] Test with sample HR queries
- [ ] Configure environment variables if needed
- [ ] Monitor system health and performance

**Ready to deploy!** Your BLUESCARF AI HR Assistant is now configured for production use.

---

*Built with ❤️ for BLUESCARF ARTIFICIAL INTELLIGENCE*
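
To make the `CHUNK_SIZE` and `CHUNK_OVERLAP` settings from the Configuration section more concrete, here is a minimal sketch of overlapping chunking. It is an illustration only, not the code in `document_processor.py`: the function names `load_chunk_settings` and `chunk_text` are hypothetical, and it assumes plain fixed-size character windows rather than the semantic-aware chunking this README describes.

```python
import os
from typing import List, Tuple


def load_chunk_settings() -> Tuple[int, int]:
    """Read chunk settings from the environment (hypothetical helper).

    Falls back to the values shown in the sample .env file.
    """
    chunk_size = int(os.environ.get("CHUNK_SIZE", "1000"))
    overlap = int(os.environ.get("CHUNK_OVERLAP", "200"))
    return chunk_size, overlap


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into character windows of chunk_size that share
    `overlap` characters with the previous window (assumed strategy)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    size, overlap = load_chunk_settings()
    sample = "x" * 2500  # stand-in for extracted PDF text
    pieces = chunk_text(sample, size, overlap)
    print(len(pieces), [len(p) for p in pieces])
```

With a window of 1000 characters and an overlap of 200, each window advances by 800 characters, so a larger overlap means more chunks per document. This is one reason the troubleshooting advice to adjust chunk settings affects processing time.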