# Complete Deployment Guide

## Table of Contents
1. [Local Development](#local-development)
2. [Docker Deployment](#docker-deployment)
3. [HuggingFace Spaces](#huggingface-spaces)
4. [AWS Deployment](#aws-deployment)
5. [Google Cloud](#google-cloud)
6. [Azure Deployment](#azure-deployment)
7. [Production Considerations](#production-considerations)
8. [Troubleshooting](#troubleshooting)

---
## Local Development

### Prerequisites
- Python 3.10+
- FFmpeg
- 4GB+ RAM
- (Optional) CUDA-capable GPU
### Setup

```bash
# 1. Clone repository
git clone https://github.com/YOUR_USERNAME/whisper-german-asr.git
cd whisper-german-asr

# 2. Run quick start script
chmod +x scripts/quick_start.sh
./scripts/quick_start.sh

# 3. Start services
# Option A: Gradio demo
python demo/app.py

# Option B: FastAPI
uvicorn api.main:app --reload

# Option C: Both (separate terminals)
python demo/app.py &
uvicorn api.main:app --port 8000 &
```
### Testing

```bash
# Test the API
curl -X POST "http://localhost:8000/transcribe" \
  -F "file=@test_audio.wav"

# Test the demo: open http://localhost:7860 in a browser
```
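The same endpoint can also be exercised from Python. A minimal stdlib-only client sketch, assuming the route and field name from the curl call above and (an assumption, matching the Lambda example later in this guide) a JSON response with a `transcription` key:

```python
import json
import urllib.request
import uuid


def encode_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body for a single file field."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, url: str = "http://localhost:8000/transcribe") -> str:
    """POST a local audio file to the API and return the transcription."""
    with open(path, "rb") as f:
        body, content_type = encode_multipart("file", path, f.read())
    req = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["transcription"]
```

Using `requests` would shorten this considerably; the stdlib version is shown so the snippet has no extra dependencies.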
---

## Docker Deployment

### Quick Start

```bash
# Build and run with docker-compose
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```
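The compose file itself is not reproduced in this guide; a sketch of what it might contain, where the service names and the mounted model directory are assumptions derived from the manual `docker run` commands in this section:

```yaml
version: "3.8"
services:
  api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./whisper_test_tuned:/app/whisper_test_tuned:ro
  demo:
    build: .
    command: python demo/app.py
    ports:
      - "7860:7860"
    volumes:
      - ./whisper_test_tuned:/app/whisper_test_tuned:ro
```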
### Manual Docker Build

```bash
# Build image
docker build -t whisper-asr .

# Run API
docker run -d \
  -p 8000:8000 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  --name whisper-api \
  whisper-asr

# Run demo
docker run -d \
  -p 7860:7860 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  --name whisper-demo \
  whisper-asr python demo/app.py
```

### Docker with GPU

```bash
# Install the NVIDIA Container Toolkit first:
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

# Run with GPU
docker run -d \
  --gpus all \
  -p 8000:8000 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  whisper-asr
```
---

## HuggingFace Spaces

### Method 1: Gradio Space (Recommended)

#### Step 1: Create Space
1. Go to https://huggingface.co/spaces
2. Click "Create new Space"
3. Settings:
   - **Name:** whisper-german-asr
   - **SDK:** Gradio
   - **Hardware:** CPU Basic (free) or GPU T4 (paid)
   - **Visibility:** Public

#### Step 2: Prepare Files

```bash
# Create a new directory for the Space
mkdir hf-space
cd hf-space

# Copy demo app
cp ../demo/app.py app.py

# Create requirements.txt
cat > requirements.txt << EOF
torch>=2.2.0
transformers>=4.42.0
librosa>=0.10.1
gradio>=4.0.0
soundfile>=0.12.1
EOF

# Create README.md with frontmatter
cat > README.md << EOF
---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# Whisper German ASR

Fine-tuned Whisper model for German speech recognition.
Try it out by recording or uploading German audio!
EOF
```
#### Step 3: Update app.py

```python
# Modify model loading to pull from the HF Hub instead of a local path
def load_model(model_path="YOUR_USERNAME/whisper-small-german"):
    model = WhisperForConditionalGeneration.from_pretrained(model_path)
    processor = WhisperProcessor.from_pretrained(model_path)
    # ... rest of code
```
#### Step 4: Push Model to HF Hub (First Time)

```python
# In Python (requires prior authentication, e.g. via `huggingface-cli login`)
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("./whisper_test_tuned")
processor = WhisperProcessor.from_pretrained("openai/whisper-small")

# Push to Hub
model.push_to_hub("YOUR_USERNAME/whisper-small-german")
processor.push_to_hub("YOUR_USERNAME/whisper-small-german")
```
#### Step 5: Deploy to Space

```bash
# Clone the Space repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/whisper-german-asr
cd whisper-german-asr

# Copy files
cp ../hf-space/* .

# Push to the Space
git add .
git commit -m "Initial deployment"
git push
```
### Method 2: Docker Space

```dockerfile
# Dockerfile in the Space repository
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y ffmpeg libsndfile1
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
EXPOSE 7860
CMD ["python", "app.py"]
```

For Docker Spaces, the app must listen on the port configured for the Space (`app_port` in the README frontmatter, 7860 by default), so launch Gradio with `server_name="0.0.0.0"`.
---

## AWS Deployment

### Option 1: ECS Fargate

#### Step 1: Push Docker Image to ECR

```bash
# Create ECR repository
aws ecr create-repository --repository-name whisper-asr

# Login to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com

# Tag and push
docker tag whisper-asr:latest \
  YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest
docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest
```
#### Step 2: Create ECS Task Definition

```json
{
  "family": "whisper-asr",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "whisper-api",
      "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "MODEL_PATH",
          "value": "/app/whisper_test_tuned"
        }
      ]
    }
  ]
}
```

Fargate tasks also need an `executionRoleArn` with permission to pull from ECR and write logs; it is omitted here for brevity.
#### Step 3: Create ECS Service

```bash
aws ecs create-service \
  --cluster default \
  --service-name whisper-asr \
  --task-definition whisper-asr \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx],assignPublicIp=ENABLED}"
```
### Option 2: Lambda + API Gateway

Note that PyTorch and Transformers exceed Lambda's .zip deployment limits, so package this as a container image; the model is assumed to be staged under `/tmp/model` (e.g. pulled from S3 at cold start) or baked into the image.

```python
# lambda_function.py
import base64
import io
import json

import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = None
processor = None

def load_model():
    """Lazy-load once per container so warm invocations skip the load cost."""
    global model, processor
    if model is None:
        model = WhisperForConditionalGeneration.from_pretrained("/tmp/model")
        processor = WhisperProcessor.from_pretrained("openai/whisper-small")

def lambda_handler(event, context):
    load_model()

    # API Gateway delivers binary payloads base64-encoded
    audio_data = base64.b64decode(event['body'])
    audio, sr = librosa.load(io.BytesIO(audio_data), sr=16000)

    # Transcribe
    input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
    predicted_ids = model.generate(input_features)
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]

    return {
        'statusCode': 200,
        'body': json.dumps({'transcription': transcription})
    }
```
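Invoking the handler requires the audio to arrive base64-encoded in the event body. A stdlib sketch of building such an event, with a local round-trip check mirroring the handler's `base64.b64decode(event['body'])` step:

```python
import base64


def build_event(audio_path: str) -> dict:
    """Package a local audio file the way the Lambda handler expects it."""
    with open(audio_path, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    return {"body": payload, "isBase64Encoded": True}


def check_roundtrip(raw: bytes) -> bool:
    """Verify that encoding then decoding recovers the original bytes."""
    event = {"body": base64.b64encode(raw).decode("ascii"), "isBase64Encoded": True}
    return base64.b64decode(event["body"]) == raw
```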
---

## Google Cloud

### Cloud Run Deployment

#### Step 1: Build and Push to GCR

```bash
# Enable APIs
gcloud services enable run.googleapis.com
gcloud services enable containerregistry.googleapis.com

# Build image
gcloud builds submit --tag gcr.io/PROJECT_ID/whisper-asr

# Or use Docker
docker tag whisper-asr gcr.io/PROJECT_ID/whisper-asr
docker push gcr.io/PROJECT_ID/whisper-asr
```

Note: Container Registry (`gcr.io`) is deprecated in favor of Artifact Registry; new projects should push to an Artifact Registry repository instead.

#### Step 2: Deploy to Cloud Run

```bash
gcloud run deploy whisper-asr \
  --image gcr.io/PROJECT_ID/whisper-asr \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 2Gi \
  --cpu 2 \
  --timeout 300
```

#### Step 3: Get Service URL

```bash
gcloud run services describe whisper-asr \
  --platform managed \
  --region us-central1 \
  --format 'value(status.url)'
```
---

## Azure Deployment

### Azure Container Instances

#### Step 1: Push to Azure Container Registry

```bash
# Create ACR
az acr create --resource-group myResourceGroup \
  --name whisperasr --sku Basic

# Login
az acr login --name whisperasr

# Tag and push
docker tag whisper-asr whisperasr.azurecr.io/whisper-asr:latest
docker push whisperasr.azurecr.io/whisper-asr:latest
```

#### Step 2: Deploy Container Instance

```bash
az container create \
  --resource-group myResourceGroup \
  --name whisper-asr \
  --image whisperasr.azurecr.io/whisper-asr:latest \
  --cpu 2 \
  --memory 4 \
  --registry-login-server whisperasr.azurecr.io \
  --registry-username <username> \
  --registry-password <password> \
  --dns-name-label whisper-asr \
  --ports 8000
```
---

## Production Considerations

### Security
- [ ] Use HTTPS (SSL/TLS certificates)
- [ ] Implement rate limiting
- [ ] Add authentication/API keys
- [ ] Validate file uploads (type, size, duration)
- [ ] Set CORS policies
### Monitoring
- [ ] Set up logging (CloudWatch, Cloud Logging, etc.)
- [ ] Add health checks
- [ ] Monitor latency and error rates
- [ ] Track usage metrics
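A health check can be as small as an endpoint returning a status payload. A framework-agnostic sketch, where the `model` global mirrors the lazy-loading pattern used in the Lambda example above:

```python
import time

START_TIME = time.monotonic()
model = None  # set once the model finishes loading


def health() -> dict:
    """Payload for a /health endpoint: liveness plus model readiness."""
    return {
        "status": "ok",
        "model_loaded": model is not None,
        "uptime_seconds": round(time.monotonic() - START_TIME, 1),
    }
```

Orchestrators (ECS, Cloud Run, Kubernetes) can then probe the endpoint and restart containers that stop answering, or hold traffic until `model_loaded` is true.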
### Scaling
- [ ] Configure auto-scaling
- [ ] Use a load balancer
- [ ] Implement caching
- [ ] Consider a CDN for static assets

### Cost Optimization
- [ ] Use spot/preemptible instances
- [ ] Implement request batching
- [ ] Cache the model in memory
- [ ] Monitor and optimize resource usage
---

## Troubleshooting

### Common Issues

**Model Not Loading**

```bash
# Check model path
ls -la whisper_test_tuned/

# Check permissions
chmod -R 755 whisper_test_tuned/
```
**Out of Memory**
- Reduce the batch size
- Use CPU instead of GPU
- Increase container memory

**Slow Inference**
- Use a GPU
- Reduce the beam size
- Use a smaller model
- Implement caching
**Port Already in Use**

```bash
# Find the process
lsof -i :8000

# Kill it
kill -9 <PID>

# Or use a different port
uvicorn api.main:app --port 8001
```
---

## Next Steps

1. Choose a deployment platform
2. Set up a CI/CD pipeline
3. Configure monitoring
4. Test in production
5. Optimize performance
6. Scale as needed

For more help, see:
- [README.md](README.md)
- [PROJECT_SUMMARY.md](PROJECT_SUMMARY.md)
- [CONTRIBUTING.md](CONTRIBUTING.md)