# HuggingFace CI/CD Pipeline Documentation

## 🚀 Overview

This repository implements a comprehensive CI/CD pipeline for deploying the **Corporate Policy Assistant** to HuggingFace Spaces with automated testing and validation.

## 🏗️ Architecture

### Hybrid AI System
- **Embeddings**: HuggingFace Inference API (`intfloat/multilingual-e5-large`)
- **LLM**: OpenRouter (`microsoft/wizardlm-2-8x22b`)
- **Citation Validation**: Real-time hallucination detection
- **Vector Database**: ChromaDB for document storage

### CI/CD Components
1. **GitHub Actions**: Automated testing and deployment
2. **HuggingFace Spaces**: Production environment
3. **Comprehensive Test Suite**: 27+ tests covering all components
4. **Code Quality**: Black, isort, flake8 validation

## 📋 Pipeline Workflow

### 1. **Code Quality Checks**
```bash
# Formatting validation
black --check .
isort --check-only .
flake8 --max-line-length=88
```

### 2. **Comprehensive Testing**
```bash
# Run all tests
pytest -v --cov=src --cov-report=xml

# HF-specific tests
pytest tests/test_embedding/test_hf_embedding_service.py -v

# Citation validation tests
pytest -k citation -v
```

### 3. **Architecture Validation**
- Service initialization checks
- Import validation
- End-to-end pipeline testing
- Citation fix verification
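
The import validation step can be approximated with a few lines of Python. This is a minimal sketch rather than the project's actual validation script; `check_imports` is a hypothetical helper, and the module names in the usage lines are stand-ins for the project's real packages.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: error text} for any failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # catch any import-time error, not only ImportError
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in module names; replace with the packages the application imports.
failed = check_imports(["json", "os"])
for name, error in failed.items():
    print(f"FAILED {name}: {error}")
```

Returning a dictionary instead of raising on the first error lets a CI step report every broken import in one run.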

### 4. **Deployment**
- **Primary**: `msse-team-3/ai-engineering-project`
- **Backup**: `sethmcknight/msse-ai-engineering`
- **Health Checks**: Automated smoke tests

## 🔧 Configuration Files

### `.github/workflows/hf-ci-cd.yml`
Main CI/CD pipeline with:
- Testing on multiple Python versions (3.10, 3.11)
- Comprehensive test suite
- Automatic HF deployment
- Post-deployment validation

### `.hf.yml`
HuggingFace Space configuration:
```yaml
title: MSSE AI Engineering - Corporate Policy Assistant
sdk: gradio
app_file: app.py
models:
  - intfloat/multilingual-e5-large
```

### `pytest.ini`
Test configuration with coverage and markers:
```ini
[pytest]
markers =
    unit: Unit tests
    integration: Integration tests
    hf: HuggingFace specific tests
    citation: Citation validation tests
```

## 🧪 Testing Strategy

### Unit Tests (Critical)
- ✅ **HF Embedding Service**: 12 comprehensive tests
- ✅ **Prompt Templates**: Citation fix validation
- ✅ **LLM Components**: Response processing
- ✅ **Context Formatting**: Fixed document numbering

### Integration Tests (Non-Critical)
- ⚠️ **API Integration**: Real HF/OpenRouter calls
- ⚠️ **End-to-End Pipeline**: Complete workflow
- ⚠️ **Service Validation**: Production readiness

### Coverage Requirements
- **Minimum**: 80% code coverage
- **Focus Areas**: Core business logic
- **Exclusions**: Test files, dev tools
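
One way to check the 80% minimum is to read the overall `line-rate` attribute from the `coverage.xml` report produced by `--cov-report=xml`. The sketch below assumes the Cobertura-style layout that coverage.py emits; `coverage_percent` and `meets_minimum` are hypothetical helpers, and the sample XML is made up for illustration. pytest-cov can also enforce the threshold directly with `--cov-fail-under=80`.

```python
import xml.etree.ElementTree as ET


def coverage_percent(xml_text):
    """Return overall line coverage (0-100) from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    return float(root.attrib["line-rate"]) * 100.0


def meets_minimum(xml_text, minimum=80.0):
    """True when overall line coverage is at least `minimum` percent."""
    return coverage_percent(xml_text) >= minimum


# Made-up sample report, trimmed to the attribute this sketch reads.
sample = '<coverage line-rate="0.8421" branch-rate="0"><packages/></coverage>'
print(f"{coverage_percent(sample):.1f}%", meets_minimum(sample))
```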

## 🚦 Pipeline Triggers

### Automatic Deployment
- **Push to `main`**: Full pipeline + production deployment
- **Push to `hf-main-local`**: HF-specific testing + staging deployment

### Pull Request Validation
- **All PRs**: Full test suite without deployment
- **Pre-commit checks**: Code quality validation

### Manual Triggers
- **Emergency Deployment**: Manual sync workflow
- **Test-only Runs**: Validation without deployment
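
The push and pull request rules above can be summarized as a small decision function. This is only an illustration of the rules as described here; the real logic lives in the workflow YAML, and `plan_for_event` is a hypothetical name. Manual triggers are left out because their exact behavior is not specified in this document.

```python
def plan_for_event(event, branch=""):
    """Map a GitHub event and branch to (test scope, deployment target)."""
    if event == "pull_request":
        return ("full", None)  # full test suite, no deployment
    if event == "push" and branch == "main":
        return ("full", "production")
    if event == "push" and branch == "hf-main-local":
        return ("hf", "staging")  # HF-specific tests only
    return (None, None)  # anything else is not covered by the rules above


print(plan_for_event("push", "main"))
print(plan_for_event("pull_request", "feature/your-feature"))
```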

## 🔐 Required Secrets

Configure these as repository secrets in GitHub (Settings > Secrets and variables > Actions):

```bash
# HuggingFace
HF_TOKEN=hf_xxxxxxxxxx

# OpenRouter (for production testing)
OPENROUTER_API_KEY=sk-or-xxxxxxxxxx

# Render (existing secrets)
RENDER_API_KEY=rnd_xxxxxxxxxx
RENDER_SERVICE_ID=srv-xxxxxxxxxx
```

## 📊 Monitoring & Validation

### Automated Health Checks
```bash
# Production endpoints (each should return HTTP 200)
curl -fsS https://msse-team-3-ai-engineering-project.hf.space/health
curl -fsS https://sethmcknight-msse-ai-engineering.hf.space/health
```
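
A post-deployment smoke test has to tolerate the period while the Space is still building, so polling with retries is a common approach. The following is a sketch under assumptions, not the pipeline's actual check: `fetch_status` and `wait_until_healthy` are hypothetical helpers, and the attempt count and delay are illustrative defaults.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...


def wait_until_healthy(url, attempts=5, delay=15.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll url until it returns 200; False once all attempts are used."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next attempt
    return False
```

Calling `wait_until_healthy("https://msse-team-3-ai-engineering-project.hf.space/health")` would poll the primary endpoint. The `fetch` and `sleep` parameters exist so the retry logic can be tested without network access.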

### Citation Quality Monitoring
- Real-time hallucination detection
- Invalid citation logging
- Performance metrics tracking
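
Hallucination detection of this kind can be as simple as comparing the filenames cited in an answer against the documents that were actually retrieved. The sketch below is an assumption about how such a check might look, not the project's validator; `find_invalid_citations`, the regular expression, and the example filenames are all hypothetical.

```python
import re

# Matches filename-like tokens ending in ".md", e.g. "remote_work_policy.md".
CITATION_PATTERN = re.compile(r"[\w\-]+\.md\b")


def find_invalid_citations(answer, known_sources):
    """Return cited .md filenames that are not among the retrieved sources."""
    known = set(known_sources)
    invalid = []
    for name in CITATION_PATTERN.findall(answer):
        if name not in known and name not in invalid:
            invalid.append(name)  # keep first-seen order, no duplicates
    return invalid


# Hypothetical answer and source list for illustration.
answer = ("Remote work requires manager approval "
          "[Source: remote_work_policy.md]. Details are in document_1.md.")
sources = ["remote_work_policy.md", "pto_policy.md"]
print(find_invalid_citations(answer, sources))  # ['document_1.md']
```

Any name in the returned list is a candidate for the invalid citation log.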

### Test Execution
```bash
# Run comprehensive test suite
./scripts/hf_test_runner.sh

# Run specific test categories
pytest -m "hf and unit" -v
pytest -m "citation" -v
```

## 🎯 Key Features Validated

### ✅ Citation Hallucination Fix
- **Problem**: The LLM cited placeholder names such as `document_1.md` instead of real policy filenames
- **Solution**: Enhanced prompt engineering + context formatting
- **Validation**: Automated tests verify proper citations
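
The context formatting half of the fix amounts to labeling each retrieved chunk with its real filename, so the model has no generic numbering to copy. This is a minimal sketch, assuming each chunk is a dictionary with `filename` and `text` keys and a `[Source: ...]` label format; neither is confirmed to match the project's prompt template.

```python
def format_context(chunks):
    """Build prompt context that labels every chunk with its real filename."""
    sections = []
    for chunk in chunks:
        # Use the source filename as the label, never "document_N.md".
        sections.append(f"[Source: {chunk['filename']}]\n{chunk['text'].strip()}")
    return "\n\n".join(sections)


# Hypothetical retrieved chunks for illustration.
chunks = [
    {"filename": "remote_work_policy.md", "text": "Remote work requires approval.\n"},
    {"filename": "pto_policy.md", "text": "Employees accrue PTO monthly."},
]
print(format_context(chunks))
```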

### ✅ Hybrid Architecture Support
- **HF Embeddings**: Production-ready API integration
- **OpenRouter LLM**: Reliable response generation
- **Error Handling**: Graceful degradation on failures
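
Graceful degradation typically means catching a failed upstream call and returning a usable response instead of an unhandled error. The sketch below illustrates the pattern only; `answer_with_fallback`, the response shape, and the fallback message are hypothetical and not taken from the application code.

```python
FALLBACK_MESSAGE = "The assistant is temporarily unavailable. Please try again later."


def answer_with_fallback(question, generate, fallback=FALLBACK_MESSAGE):
    """Call generate(question); on any failure return a degraded response."""
    try:
        return {"status": "ok", "answer": generate(question)}
    except Exception as exc:
        # Record the error type for logging, but still return a usable answer.
        return {"status": "degraded", "answer": fallback, "error": type(exc).__name__}


def failing_llm(question):
    """Stand-in for an LLM call that times out."""
    raise TimeoutError("upstream timed out")


print(answer_with_fallback("What is the PTO policy?", failing_llm))
```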

### ✅ Test Infrastructure
- **Mock Services**: CI-friendly testing
- **Integration Tests**: Real API validation
- **Coverage Reporting**: Quality metrics
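
Mock services keep CI runs independent of API keys and network access. The sketch below uses `unittest.mock` from the standard library; `embed_query` and the `embed_texts` method name are hypothetical and may not match the real embedding service interface.

```python
from unittest.mock import Mock


def embed_query(service, text):
    """Hypothetical caller: ask the embedding service for a single vector."""
    vectors = service.embed_texts([text])
    return vectors[0]


# Replace the real service with a mock that returns a fixed vector.
mock_service = Mock()
mock_service.embed_texts.return_value = [[0.1, 0.2, 0.3]]

vector = embed_query(mock_service, "What is the PTO policy?")

# Verify the caller passed the text through as a one-item batch.
mock_service.embed_texts.assert_called_once_with(["What is the PTO policy?"])
print(vector)
```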

## 🚀 Deployment Process

### 1. **Development**
```bash
# Create feature branch
git checkout -b feature/your-feature

# Make changes and test locally
pytest tests/

# Push the branch, then open a PR against main on GitHub
git push origin feature/your-feature
```

### 2. **CI Validation**
- Automated testing on PR
- Code quality checks
- Architecture validation

### 3. **Production Deployment**
```bash
# Merge to main triggers deployment
git checkout main
git merge feature/your-feature
git push origin main
```

### 4. **Post-Deployment**
- Automated health checks
- Citation validation monitoring
- Performance tracking

## 🔧 Troubleshooting

### Common Issues

**Test Failures in CI**
```bash
# Reproduce the CI test run locally and inspect the output
./scripts/hf_test_runner.sh

# Run specific failing tests
pytest tests/test_embedding/ -v --tb=short
```

**HF Deployment Issues**
- Verify `HF_TOKEN` secret is configured
- Check HuggingFace Space settings
- Review deployment logs in GitHub Actions

**Citation Validation Warnings**
- These warnings are expected: they mean the system caught an LLM hallucination
- Check that actual policy filenames are being used
- Verify prompt template contains citation fix

### Debug Commands
```bash
# Validate services locally
python scripts/validate_services.py

# Test the citation fix end to end
python scripts/test_e2e_pipeline.py

# Run the full test suite
./scripts/hf_test_runner.sh
```

## 📈 Performance Metrics

### Test Execution Times
- **Unit Tests**: ~30 seconds
- **Integration Tests**: ~2 minutes
- **Full Pipeline**: ~5 minutes

### Deployment Times
- **HuggingFace Build**: ~3-5 minutes
- **Health Check Validation**: ~2 minutes
- **Total Deployment**: ~7-10 minutes

## 🎉 Success Indicators

### ✅ All Tests Passing
- 27+ tests across all components
- 80%+ code coverage
- No critical linting errors

### ✅ Successful Deployment
- HuggingFace Spaces responding
- Health endpoints returning 200
- Citation validation working

### ✅ Quality Metrics
- Real policy filenames in citations
- No `document_1.md` hallucinations
- Proper error handling

---

- **Last Updated**: October 25, 2025
- **Pipeline Version**: 2.0
- **Maintainer**: MSSE Team 3