title: RAG-Based-HR-Assistant
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit

BLUESCARF AI HR Assistant

A retrieval-augmented generation (RAG) HR assistant powered by Google Gemini and built for BLUESCARF ARTIFICIAL INTELLIGENCE. It answers HR-related questions with context drawn from company documents and policies.

🚀 Features

Core Capabilities

  • RAG-Powered Intelligence: Advanced retrieval-augmented generation using company documents
  • Google Gemini Integration: State-of-the-art AI responses with company context
  • Document Learning: Processes PDF policies, handbooks, and HR documents
  • Semantic Search: Intelligent document retrieval with ChromaDB vector storage
  • Admin Management: Secure document upload and knowledge base management

Key Benefits

  • One-Time Learning: Documents processed once, knowledge persists
  • Scope-Focused: Only answers HR-related questions using company documents
  • Enterprise-Ready: Built for production deployment with security features
  • Minimal Design: Clean, professional interface optimized for efficiency
  • Real-Time Updates: Add/remove documents after deployment

📋 Prerequisites

Required

  • Python 3.8 or higher
  • Google Gemini API key (available from Google AI Studio)
  • Minimum 2GB RAM for optimal performance
  • 500MB storage space for vector database

Recommended

  • 4GB+ RAM for large document processing
  • SSD storage for faster vector operations
  • Stable internet connection for API calls

🛠️ Installation & Setup

Method 1: Hugging Face Spaces (Recommended)

  1. Clone or Download this repository
  2. Upload files to your Hugging Face Space
  3. Add your company logo as logo.png (200x200px recommended)
  4. Deploy - the app will automatically install dependencies

Method 2: Local Development

# Clone the repository
git clone <repository-url>
cd bluescarf-hr-assistant

# Install dependencies
pip install -r requirements.txt

# Run the application
streamlit run app.py

Method 3: Docker Deployment

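FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer stays cached when only the code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application source
COPY . .

# Streamlit's default port
EXPOSE 8501

CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]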

⚙️ Configuration

Environment Variables

Create a .env file for custom configuration:

# Application Settings
COMPANY_NAME="BLUESCARF ARTIFICIAL INTELLIGENCE"
ENVIRONMENT=production

# Document Processing
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
MAX_FILE_SIZE=52428800  # 50MB

# Vector Database
MAX_CONTEXT_CHUNKS=5
SIMILARITY_THRESHOLD=0.5

# API Configuration
GEMINI_MODEL=gemini-pro
TEMPERATURE=0.3
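
As a rough sketch of how these variables might be read at startup, the snippet below loads them from the environment and falls back to the defaults listed above. The `Settings` dataclass and `load_settings` function are hypothetical names used for illustration; the project's actual config.py may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the .env example."""
    company_name: str
    environment: str
    chunk_size: int
    chunk_overlap: int
    max_file_size: int
    max_context_chunks: int
    similarity_threshold: float
    gemini_model: str
    temperature: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default), using the README defaults."""
    env = os.environ if env is None else env
    settings = Settings(
        company_name=env.get("COMPANY_NAME", "BLUESCARF ARTIFICIAL INTELLIGENCE"),
        environment=env.get("ENVIRONMENT", "production"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        max_file_size=int(env.get("MAX_FILE_SIZE", "52428800")),  # 50MB
        max_context_chunks=int(env.get("MAX_CONTEXT_CHUNKS", "5")),
        similarity_threshold=float(env.get("SIMILARITY_THRESHOLD", "0.5")),
        gemini_model=env.get("GEMINI_MODEL", "gemini-pro"),
        temperature=float(env.get("TEMPERATURE", "0.3")),
    )
    # An overlap as large as the chunk itself would make chunking loop in place
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    return settings
```

Passing an explicit mapping instead of relying on `os.environ` keeps the function easy to exercise in tests.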

Admin Access

Default Admin Password: bluescarf_admin_2024

⚠️ IMPORTANT: Change this password immediately after deployment!
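
One way to avoid running with the default password is to read the password from an environment variable (or a Space secret) and compare it in constant time. The sketch below assumes a hypothetical `ADMIN_PASSWORD` variable and hypothetical helper names; none of these are confirmed to exist in this project.

```python
import hmac
import os
from typing import Mapping, Optional

DEFAULT_ADMIN_PASSWORD = "bluescarf_admin_2024"  # the documented default


def _configured_password(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the password from ADMIN_PASSWORD (hypothetical name), else the default."""
    env = os.environ if env is None else env
    return env.get("ADMIN_PASSWORD", DEFAULT_ADMIN_PASSWORD)


def check_admin_password(candidate: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if `candidate` matches the configured admin password."""
    expected = _configured_password(env)
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))


def uses_default_password(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the deployment is still running with the documented default."""
    return _configured_password(env) == DEFAULT_ADMIN_PASSWORD
```

A startup check such as `uses_default_password()` could be used to show a warning in the admin panel until the password is changed.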

📚 Usage Guide

For End Users

  1. Enter API Key: Provide your Google Gemini API key
  2. Ask HR Questions: Query about policies, benefits, procedures
  3. Get Contextual Answers: Receive responses based on company documents

Example Queries:

  • "What is our vacation policy?"
  • "How do I apply for health insurance?"
  • "What are the performance review procedures?"
  • "Tell me about our remote work policy"

For Administrators

  1. Access Admin Panel: Click "Admin Access" and enter password
  2. Upload Documents: Add PDF policies, handbooks, procedures
  3. Manage Knowledge Base: View, delete, or update documents
  4. Monitor System: Check health status and analytics

📁 Project Structure

bluescarf-hr-assistant/
├── app.py                 # Main Streamlit application
├── document_processor.py  # PDF processing and chunking
├── vector_store.py        # ChromaDB vector operations
├── admin.py               # Administrative interface
├── config.py              # Configuration management
├── utils.py               # Utility functions
├── requirements.txt       # Python dependencies
├── README.md              # This documentation
├── logo.png               # Company logo (add yours)
└── vector_db/             # Vector database storage (auto-created)
    ├── chroma.sqlite3     # ChromaDB database
    └── metadata/          # Document metadata

🔒 Security Features

Authentication

  • Password-protected admin panel
  • API key validation and secure storage
  • Session-based access control

Data Protection

  • Local vector storage (no external data sharing)
  • Secure document hashing for deduplication
  • Audit logging for administrative actions
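
To illustrate the hashing-based deduplication mentioned above, here is a minimal sketch. It assumes SHA-256 over the raw file bytes and a hypothetical in-memory `DocumentRegistry`; the README does not say which hash or storage the project actually uses.

```python
import hashlib
from typing import Dict


def document_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class DocumentRegistry:
    """Hypothetical in-memory registry that rejects byte-identical uploads."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}  # digest -> filename of the first upload

    def register(self, filename: str, data: bytes) -> bool:
        """Record the document; return False if identical content was already registered."""
        digest = document_hash(data)
        if digest in self._seen:
            return False
        self._seen[digest] = filename
        return True
```

Hashing the content rather than the filename means a renamed copy of the same PDF is still detected as a duplicate.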

Access Control

  • HR-only query filtering
  • Document source validation
  • Secure file upload handling
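
HR-only query filtering could be approximated with a simple keyword check, sketched below. The `HR_KEYWORDS` set and `is_hr_query` function are illustrative assumptions; the project's real keyword list lives in utils.py and may use a different matching strategy.

```python
import re
from typing import AbstractSet

# Hypothetical keyword list for illustration only
HR_KEYWORDS = frozenset({
    "vacation", "leave", "benefits", "insurance", "payroll", "salary",
    "policy", "performance", "review", "remote", "onboarding", "holiday",
})


def is_hr_query(query: str, keywords: AbstractSet[str] = HR_KEYWORDS) -> bool:
    """Return True if the query contains at least one HR keyword as a whole word."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return not words.isdisjoint(keywords)
```

A keyword filter is cheap but coarse: it can miss paraphrased HR questions and accept off-topic ones that happen to contain a keyword, so it works best as a first gate ahead of retrieval.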

🚀 Deployment Guide

Hugging Face Spaces Deployment

  1. Create Space: Visit Hugging Face Spaces
  2. Choose Streamlit: Select Streamlit as the SDK
  3. Upload Files: Upload all project files
  4. Add Logo: Replace logo.png with your company logo
  5. Configure Secrets: Set environment variables if needed
  6. Deploy: Space will build and deploy automatically

Environment-Specific Optimizations

For Hugging Face Spaces:

  • Automatic resource optimization
  • Reduced memory footprint
  • Optimized chunk sizes

For Private Servers:

  • Full resource utilization
  • Enhanced caching
  • Advanced logging

📊 Performance Optimization

Document Processing

  • Intelligent chunking with semantic awareness
  • Batch embedding generation
  • Efficient vector storage with ChromaDB
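
For reference, the interaction between CHUNK_SIZE and CHUNK_OVERLAP can be sketched with a plain fixed-size sliding window. This is a simplified stand-in, not the semantic-aware chunking the project describes, and `chunk_text` is a hypothetical name.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into character windows of `chunk_size` that overlap by `chunk_overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window advances 800 characters, so the last 200 characters of one chunk reappear at the start of the next. Smaller chunks mean more embeddings to compute but less text per retrieved result.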

Response Generation

  • Context-aware retrieval
  • Optimized prompt engineering
  • Relevance scoring and ranking
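
The way SIMILARITY_THRESHOLD and MAX_CONTEXT_CHUNKS could shape the retrieved context is sketched below. It assumes each chunk comes with a similarity score where higher means more relevant; if the vector store reports distances instead, they would need converting first. `select_context` is a hypothetical helper, not a function confirmed to exist in this project.

```python
from typing import Iterable, List, Tuple


def select_context(
    scored_chunks: Iterable[Tuple[str, float]],
    similarity_threshold: float = 0.5,
    max_context_chunks: int = 5,
) -> List[Tuple[str, float]]:
    """Keep chunks at or above the threshold, best first, capped at max_context_chunks."""
    relevant = [(text, score) for text, score in scored_chunks
                if score >= similarity_threshold]
    # Highest similarity first so the most relevant text leads the prompt context
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return relevant[:max_context_chunks]
```

Raising the threshold trades recall for precision, while lowering the chunk cap shortens the prompt sent to the model.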

System Resources

  • Lazy loading of AI models
  • Memory-efficient vector operations
  • Automatic garbage collection

🔧 Customization

Branding

  • Replace logo.png with your company logo
  • Update company name in config.py
  • Customize colors in the CSS section of app.py

Functionality

  • Modify HR keywords in utils.py
  • Adjust chunk sizes in config.py
  • Customize response templates in app.py

Integration

  • Add SSO authentication
  • Integrate with HR systems
  • Connect to document management platforms

📈 Monitoring & Analytics

Built-in Analytics

  • Query classification and tracking
  • Response quality metrics
  • Document usage statistics
  • Performance monitoring

Health Checks

  • Vector database integrity
  • API connectivity status
  • Storage availability
  • Processing pipeline health
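
A storage availability check along these lines could be sketched as follows. The `storage_health` function is hypothetical, and the 500MB minimum simply mirrors the storage figure in the prerequisites; the project's actual health checks may differ.

```python
import shutil
from pathlib import Path
from typing import Dict, Union


def storage_health(db_dir: Union[str, Path],
                   min_free_bytes: int = 500 * 1024 * 1024) -> Dict[str, object]:
    """Report whether the vector DB directory exists and the disk has enough free space."""
    path = Path(db_dir)
    exists = path.is_dir()
    # Measure free space on the directory itself, or on its nearest existing parent
    probe = path if exists else next(p for p in path.resolve().parents if p.exists())
    free = shutil.disk_usage(probe).free
    return {
        "db_dir_exists": exists,
        "free_bytes": free,
        "enough_space": free >= min_free_bytes,
    }
```

Calling `storage_health("vector_db")` before processing an upload would catch a full disk early, before a partially written index becomes a problem.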

🐛 Troubleshooting

Common Issues

API Key Invalid

  • Verify key format and permissions
  • Check Gemini API quotas
  • Ensure internet connectivity

Document Processing Fails

  • Verify PDF is text-based (not scanned)
  • Check file size limits (50MB default)
  • Ensure readable content exists
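
Some of these failures can be caught before processing starts. The sketch below checks the size limit and the PDF file header using only the standard library; `validate_pdf_upload` is a hypothetical helper. It does not detect scanned PDFs, which would require attempting text extraction.

```python
from typing import List

MAX_FILE_SIZE = 52_428_800  # 50MB default from the configuration section


def validate_pdf_upload(data: bytes, max_file_size: int = MAX_FILE_SIZE) -> List[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems: List[str] = []
    if not data:
        problems.append("file is empty")
    if len(data) > max_file_size:
        problems.append(f"file is {len(data)} bytes, above the {max_file_size}-byte limit")
    # PDF files begin with the marker "%PDF-" followed by a version number
    if not data.startswith(b"%PDF-"):
        problems.append("file does not start with the PDF header '%PDF-'")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets the admin panel show a complete error message for a rejected upload.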

Vector Search Returns No Results

  • Check document relevance to HR domain
  • Verify embedding model availability
  • Restart application to refresh cache

Admin Panel Access Denied

  • Use the current admin password (the default is bluescarf_admin_2024 unless it has been changed)
  • Clear browser cache/cookies
  • Check for session timeouts

Performance Issues

Slow Document Processing

  • Reduce chunk size in configuration
  • Process documents in smaller batches
  • Increase available memory

API Response Timeouts

  • Check internet connection stability
  • Verify API key rate limits
  • Reduce context chunk count

📞 Support & Contact

Technical Support

  • Documentation: Check this README and inline comments
  • Issues: Review common troubleshooting steps
  • Performance: Monitor system health checks

Business Contact

  • Company: BLUESCARF ARTIFICIAL INTELLIGENCE
  • Purpose: HR Assistant Support
  • Access: Through admin panel for system administrators

📄 License & Compliance

Usage Terms

  • Designed specifically for BLUESCARF AI internal use
  • Ensure compliance with company data policies
  • Maintain confidentiality of uploaded documents

Data Handling

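  • Document storage and vector search run locally
  • Queries and the retrieved excerpts are sent to the Google Gemini API to generate answers; company documents are not otherwise shared externally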
  • Secure storage and access controls

🔄 Version History

v1.0.0 (Current)

  • Initial release with full RAG functionality
  • Google Gemini integration
  • Admin panel for document management
  • ChromaDB vector storage
  • Professional UI with company branding

Roadmap

  • Multi-language support
  • Advanced analytics dashboard
  • Integration with HR systems
  • Mobile-responsive enhancements
  • Voice query capabilities

🚀 Quick Start Checklist

  • Upload all project files to deployment platform
  • Add your company logo as logo.png
  • Obtain Google Gemini API key
  • Change default admin password
  • Upload initial HR documents via admin panel
  • Test with sample HR queries
  • Configure environment variables if needed
  • Monitor system health and performance

Ready to deploy! Your BLUESCARF AI HR Assistant is now configured for production use.


Built with ❤️ for BLUESCARF ARTIFICIAL INTELLIGENCE