# Complete Deployment Guide

## Table of Contents

1. [Local Development](#local-development)
2. [Docker Deployment](#docker-deployment)
3. [HuggingFace Spaces](#huggingface-spaces)
4. [AWS Deployment](#aws-deployment)
5. [Google Cloud](#google-cloud)
6. [Azure Deployment](#azure-deployment)

---

## Local Development

### Prerequisites

- Python 3.10+
- FFmpeg
- 4 GB+ RAM
- (Optional) CUDA-capable GPU

### Setup

```bash
# 1. Clone repository
git clone https://github.com/YOUR_USERNAME/whisper-german-asr.git
cd whisper-german-asr

# 2. Run quick start script
chmod +x scripts/quick_start.sh
./scripts/quick_start.sh

# 3. Start services
# Option A: Gradio demo
python demo/app.py

# Option B: FastAPI
uvicorn api.main:app --reload

# Option C: Both (separate terminals)
python demo/app.py &
uvicorn api.main:app --port 8000 &
```

### Testing

```bash
# Test the API
curl -X POST "http://localhost:8000/transcribe" \
  -F "file=@test_audio.wav"

# Test the demo: open http://localhost:7860 in a browser
```

---

## Docker Deployment

### Quick Start

```bash
# Build and run with docker-compose
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```

### Manual Docker Build

```bash
# Build image
docker build -t whisper-asr .

# Run API
docker run -d \
  -p 8000:8000 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  --name whisper-api \
  whisper-asr

# Run demo
docker run -d \
  -p 7860:7860 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  --name whisper-demo \
  whisper-asr python demo/app.py
```

### Docker with GPU

```bash
# Install the NVIDIA Container Toolkit (nvidia-docker2):
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

# Run with GPU
docker run -d \
  --gpus all \
  -p 8000:8000 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  whisper-asr
```

---

## HuggingFace Spaces

### Method 1: Gradio Space (Recommended)

#### Step 1: Create Space

1. Go to https://huggingface.co/spaces
2. Click "Create new Space"
3. Settings:
   - **Name:** whisper-german-asr
   - **SDK:** Gradio
   - **Hardware:** CPU Basic (free) or GPU T4 (paid)
   - **Visibility:** Public

#### Step 2: Prepare Files

```bash
# Create a new directory for the Space
mkdir hf-space
cd hf-space

# Copy demo app
cp ../demo/app.py app.py

# Create requirements.txt
cat > requirements.txt << EOF
torch>=2.2.0
transformers>=4.42.0
librosa>=0.10.1
gradio>=4.0.0
soundfile>=0.12.1
EOF

# Create README.md with frontmatter
cat > README.md << EOF
---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# Whisper German ASR

Fine-tuned Whisper model for German speech recognition.
Try it out by recording or uploading German audio!
EOF
```

#### Step 3: Update app.py

```python
# Modify model loading to pull from the HF Hub
def load_model(model_path="YOUR_USERNAME/whisper-small-german"):
    model = WhisperForConditionalGeneration.from_pretrained(model_path)
    processor = WhisperProcessor.from_pretrained(model_path)
    # ... rest of code
```

#### Step 4: Push Model to HF Hub (First Time)

```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("./whisper_test_tuned")
processor = WhisperProcessor.from_pretrained("openai/whisper-small")

# Push to Hub
model.push_to_hub("YOUR_USERNAME/whisper-small-german")
processor.push_to_hub("YOUR_USERNAME/whisper-small-german")
```

#### Step 5: Deploy to Space

```bash
# Clone Space repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/whisper-german-asr
cd whisper-german-asr

# Copy files
cp ../hf-space/* .

# Push to Space
git add .
git commit -m "Initial deployment"
git push
```

### Method 2: Docker Space

```dockerfile
# Dockerfile in the Space repository
FROM python:3.10-slim

WORKDIR /app

RUN apt-get update && apt-get install -y ffmpeg libsndfile1

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY app.py .

CMD ["python", "app.py"]
```

---

## AWS Deployment

### Option 1: ECS Fargate

#### Step 1: Push Docker Image to ECR

```bash
# Create ECR repository
aws ecr create-repository --repository-name whisper-asr

# Login to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com

# Tag and push
docker tag whisper-asr:latest \
  YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest
docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest
```

#### Step 2: Create ECS Task Definition

```json
{
  "family": "whisper-asr",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "whisper-api",
      "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "MODEL_PATH",
          "value": "/app/whisper_test_tuned"
        }
      ]
    }
  ]
}
```

#### Step 3: Create ECS Service

```bash
aws ecs create-service \
  --cluster default \
  --service-name whisper-asr \
  --task-definition whisper-asr \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx],assignPublicIp=ENABLED}"
```

### Option 2: Lambda + API Gateway

```python
# lambda_function.py
import base64
import io
import json

import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = None
processor = None

def load_model():
    global model, processor
    if model is None:
        model = WhisperForConditionalGeneration.from_pretrained("/tmp/model")
        processor = WhisperProcessor.from_pretrained("openai/whisper-small")

def lambda_handler(event, context):
    load_model()

    # Decode base64 audio
    audio_data = base64.b64decode(event['body'])
    audio, sr = librosa.load(io.BytesIO(audio_data), sr=16000)

    # Transcribe
    input_features = processor(audio, sampling_rate=16000,
                               return_tensors="pt").input_features
    predicted_ids = model.generate(input_features)
    transcription = processor.batch_decode(predicted_ids,
                                           skip_special_tokens=True)[0]

    return {
        'statusCode': 200,
        'body': json.dumps({'transcription': transcription})
    }
```

---

## Google Cloud

### Cloud Run Deployment

#### Step 1: Build and Push to GCR

```bash
# Enable APIs
gcloud services enable run.googleapis.com
gcloud services enable containerregistry.googleapis.com

# Build image
gcloud builds submit --tag gcr.io/PROJECT_ID/whisper-asr

# Or use Docker
docker tag whisper-asr gcr.io/PROJECT_ID/whisper-asr
docker push gcr.io/PROJECT_ID/whisper-asr
```

#### Step 2: Deploy to Cloud Run

```bash
gcloud run deploy whisper-asr \
  --image gcr.io/PROJECT_ID/whisper-asr \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 2Gi \
  --cpu 2 \
  --timeout 300
```

#### Step 3: Get Service URL

```bash
gcloud run services describe whisper-asr \
  --platform managed \
  --region us-central1 \
  --format 'value(status.url)'
```

---

## Azure Deployment

### Azure Container Instances

#### Step 1: Push to Azure Container Registry

```bash
# Create ACR
az acr create --resource-group myResourceGroup \
  --name whisperasr --sku Basic

# Login
az acr login --name whisperasr

# Tag and push
docker tag whisper-asr whisperasr.azurecr.io/whisper-asr:latest
docker push whisperasr.azurecr.io/whisper-asr:latest
```

#### Step 2: Deploy Container Instance

```bash
az container create \
  --resource-group myResourceGroup \
  --name whisper-asr \
  --image whisperasr.azurecr.io/whisper-asr:latest \
  --cpu 2 \
  --memory 4 \
  --registry-login-server whisperasr.azurecr.io \
  --registry-username <username> \
  --registry-password <password> \
  --dns-name-label whisper-asr \
  --ports 8000
```

---

## Production Considerations

### Security

- [ ] Use HTTPS (SSL/TLS certificates)
- [ ] Implement rate limiting
- [ ] Add authentication/API keys
- [ ] Validate file uploads
- [ ] Set CORS policies

### Monitoring

- [ ] Set up logging (CloudWatch, Stackdriver, etc.)
- [ ] Add health checks
- [ ] Monitor latency and error rates
- [ ] Track usage metrics

### Scaling

- [ ] Configure auto-scaling
- [ ] Use a load balancer
- [ ] Implement caching
- [ ] Consider a CDN for static assets

### Cost Optimization

- [ ] Use spot/preemptible instances
- [ ] Implement request batching
- [ ] Cache the model in memory
- [ ] Monitor and optimize resource usage

---

## Troubleshooting

### Common Issues

**Model Not Loading**

```bash
# Check model path
ls -la whisper_test_tuned/

# Check permissions
chmod -R 755 whisper_test_tuned/
```

**Out of Memory**

- Reduce batch size
- Use CPU instead of GPU
- Increase container memory

**Slow Inference**

- Use a GPU
- Reduce beam size
- Use a smaller model
- Implement caching

**Port Already in Use**

```bash
# Find the process
lsof -i :8000

# Kill the process (replace <PID> with the PID reported by lsof)
kill -9 <PID>

# Or use a different port
uvicorn api.main:app --port 8001
```

---

## Next Steps

1. Choose a deployment platform
2. Set up a CI/CD pipeline
3. Configure monitoring
4. Test in production
5. Optimize performance
6. Scale as needed

For more help, see:

- [README.md](README.md)
- [PROJECT_SUMMARY.md](PROJECT_SUMMARY.md)
- [CONTRIBUTING.md](CONTRIBUTING.md)
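
## Appendix: Smoke-Testing a Deployed Endpoint

Whichever platform you deploy to, the `/transcribe` endpoint can be exercised from Python using only the standard library. This is a minimal sketch: the `file` multipart field name and the localhost URL follow the curl example earlier in this guide, and the 300-second timeout mirrors the Cloud Run setting; treat all three as assumptions to adjust for your deployment.

```python
# Stdlib-only smoke test for the /transcribe endpoint.
# Assumptions: the API accepts a multipart "file" field (as in the
# curl example in this guide) and the default URL is the local dev server.
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes,
                    content_type: str = "audio/wav"):
    """Build a multipart/form-data body containing a single file part."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path: str,
               url: str = "http://localhost:8000/transcribe") -> str:
    """POST an audio file to the API and return the raw JSON response body."""
    with open(audio_path, "rb") as f:
        body, ctype = build_multipart("file", audio_path, f.read())
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": ctype},
                                 method="POST")
    with urllib.request.urlopen(req, timeout=300) as resp:
        return resp.read().decode()


# Example (requires a running API):
# print(transcribe("test_audio.wav"))
```

Because `build_multipart` is separate from the network call, the request encoding can be verified offline before pointing the script at a production URL.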