# HuggingFace CI/CD Pipeline Documentation

## 🚀 Overview

This repository implements a CI/CD pipeline that deploys the **Corporate Policy Assistant** to HuggingFace Spaces, with automated testing and validation.

## 🏗️ Architecture

### Hybrid AI System

- **Embeddings**: HuggingFace Inference API (`intfloat/multilingual-e5-large`)
- **LLM**: OpenRouter (`microsoft/wizardlm-2-8x22b`)
- **Citation Validation**: Real-time hallucination detection
- **Vector Database**: ChromaDB for document storage

### CI/CD Components

1. **GitHub Actions**: Automated testing and deployment
2. **HuggingFace Spaces**: Production environment
3. **Comprehensive Test Suite**: 27+ tests covering all components
4. **Code Quality**: Black, isort, flake8 validation

## 📋 Pipeline Workflow

### 1. **Code Quality Checks**

```bash
# Formatting validation
black --check .
isort --check-only .
flake8 --max-line-length=88
```

### 2. **Comprehensive Testing**

```bash
# Run all tests
pytest -v --cov=src --cov-report=xml

# HF-specific tests
pytest tests/test_embedding/test_hf_embedding_service.py -v

# Citation validation tests
pytest -k citation -v
```

### 3. **Architecture Validation**

- Service initialization checks
- Import validation
- End-to-end pipeline testing
- Citation fix verification

### 4. **Deployment**

- **Primary**: `msse-team-3/ai-engineering-project`
- **Backup**: `sethmcknight/msse-ai-engineering`
- **Health Checks**: Automated smoke tests

## 🔧 Configuration Files

### `.github/workflows/hf-ci-cd.yml`

Main CI/CD pipeline with:

- Multi-Python version testing (3.10, 3.11)
- Comprehensive test suite
- Automatic HF deployment
- Post-deployment validation

### `.hf.yml`

HuggingFace Space configuration:

```yaml
title: MSSE AI Engineering - Corporate Policy Assistant
sdk: gradio
app_file: app.py
models:
  - intfloat/multilingual-e5-large
```

### `pytest.ini`

Test configuration with coverage and markers:

```ini
[pytest]
markers =
    unit: Unit tests
    integration: Integration tests
    hf: HuggingFace specific tests
    citation: Citation validation tests
```

## 🧪 Testing Strategy

### Unit Tests (Critical)

- ✅ **HF Embedding Service**: 12 comprehensive tests
- ✅ **Prompt Templates**: Citation fix validation
- ✅ **LLM Components**: Response processing
- ✅ **Context Formatting**: Fixed document numbering

### Integration Tests (Non-Critical)

- ⚠️ **API Integration**: Real HF/OpenRouter calls
- ⚠️ **End-to-End Pipeline**: Complete workflow
- ⚠️ **Service Validation**: Production readiness

### Coverage Requirements

- **Minimum**: 80% code coverage
- **Focus Areas**: Core business logic
- **Exclusions**: Test files, dev tools

## 🚦 Pipeline Triggers

### Automatic Deployment

- **Push to `main`**: Full pipeline + production deployment
- **Push to `hf-main-local`**: HF-specific testing + staging deployment

### Pull Request Validation

- **All PRs**: Full test suite without deployment
- **Pre-commit checks**: Code quality validation

### Manual Triggers

- **Emergency Deployment**: Manual sync workflow
- **Test-only Runs**: Validation without deployment

## 🔐 Required Secrets

Configure these in the GitHub repository settings:

```bash
# HuggingFace
HF_TOKEN=hf_xxxxxxxxxx

# OpenRouter (for production testing)
OPENROUTER_API_KEY=sk-or-xxxxxxxxxx

# Existing secrets
RENDER_API_KEY=rnd_xxxxxxxxxx
RENDER_SERVICE_ID=srv-xxxxxxxxxx
```

## 📊 Monitoring & Validation

### Automated Health Checks

```bash
# Production endpoints (each should return HTTP 200)
curl -fsS https://msse-team-3-ai-engineering-project.hf.space/health
curl -fsS https://sethmcknight-msse-ai-engineering.hf.space/health
```

### Citation Quality Monitoring

- Real-time hallucination detection
- Invalid citation logging
- Performance metrics tracking

### Test Execution

```bash
# Run comprehensive test suite
./scripts/hf_test_runner.sh

# Run specific test categories
pytest -m "hf and unit" -v
pytest -m "citation" -v
```

## 🎯 Key Features Validated

### ✅ Citation Hallucination Fix

- **Problem**: The LLM generated `document_1.md` instead of real filenames
- **Solution**: Enhanced prompt engineering + context formatting
- **Validation**: Automated tests verify proper citations

### ✅ Hybrid Architecture Support

- **HF Embeddings**: Production-ready API integration
- **OpenRouter LLM**: Reliable response generation
- **Error Handling**: Graceful degradation on failures

### ✅ Test Infrastructure

- **Mock Services**: CI-friendly testing
- **Integration Tests**: Real API validation
- **Coverage Reporting**: Quality metrics

## 🚀 Deployment Process

### 1. **Development**

```bash
# Create feature branch
git checkout -b feature/your-feature

# Make changes and test locally
pytest tests/

# Submit PR
git push origin feature/your-feature
```

### 2. **CI Validation**

- Automated testing on PR
- Code quality checks
- Architecture validation

### 3. **Production Deployment**

```bash
# Merge to main triggers deployment
git checkout main
git merge feature/your-feature
git push origin main
```

### 4. **Post-Deployment**

- Automated health checks
- Citation validation monitoring
- Performance tracking

## 🔧 Troubleshooting

### Common Issues

**Test Failures in CI**

```bash
# Check test runner output
./scripts/hf_test_runner.sh

# Run specific failing tests
pytest tests/test_embedding/ -v --tb=short
```

**HF Deployment Issues**

- Verify that the `HF_TOKEN` secret is configured
- Check HuggingFace Space settings
- Review deployment logs in GitHub Actions

**Citation Validation Warnings**

- Expected behavior: the system catches LLM hallucinations
- Check that actual policy filenames are being used
- Verify that the prompt template contains the citation fix

### Debug Commands

```bash
# Validate services locally
python scripts/validate_services.py

# Test citation fix
python scripts/test_e2e_pipeline.py

# Run full pipeline
./scripts/hf_test_runner.sh
```

## 📈 Performance Metrics

### Test Execution Times

- **Unit Tests**: ~30 seconds
- **Integration Tests**: ~2 minutes
- **Full Pipeline**: ~5 minutes

### Deployment Times

- **HuggingFace Build**: ~3-5 minutes
- **Health Check Validation**: ~2 minutes
- **Total Deployment**: ~7-10 minutes

## 🎉 Success Indicators

### ✅ All Tests Passing

- 27+ tests across all components
- 80%+ code coverage
- No critical linting errors

### ✅ Successful Deployment

- HuggingFace Spaces responding
- Health endpoints returning 200
- Citation validation working

### ✅ Quality Metrics

- Real policy filenames in citations
- No `document_1.md` hallucinations
- Proper error handling

---

**Last Updated**: October 25, 2025
**Pipeline Version**: 2.0
**Maintainer**: MSSE Team 3
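
## Appendix: Citation Check Sketch

The quality metrics above ask for real policy filenames in citations and no `document_1.md` placeholders. The snippet below is a minimal sketch of how such a check could look, not the repository's actual validator. The function name `find_invalid_citations`, the `known_files` parameter, and both regular expressions are hypothetical, and it assumes that citations appear in the response as plain `.md` filenames.

```python
import re

# Hypothetical pattern for generic placeholder names such as "document_1.md".
PLACEHOLDER = re.compile(r"^document_\d+\.md$", re.IGNORECASE)

# Hypothetical pattern for any Markdown filename mentioned in a response.
FILENAME = re.compile(r"\b[\w.-]+\.md\b")


def find_invalid_citations(response: str, known_files: set[str]) -> list[str]:
    """Return cited filenames that look like placeholders or are not known sources.

    `known_files` is assumed to hold the filenames of the documents that were
    actually retrieved for this query.
    """
    invalid: list[str] = []
    for name in FILENAME.findall(response):
        is_bad = bool(PLACEHOLDER.match(name)) or name not in known_files
        # Report each offending filename once, in order of first appearance.
        if is_bad and name not in invalid:
            invalid.append(name)
    return invalid
```

A test marked with the `citation` marker could call this function on a mocked LLM response and assert that the returned list is empty. Placeholder names are flagged even if they happen to be in `known_files`, since the pipeline treats them as a hallucination signature.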