# Complete Deployment Guide

## Table of Contents
1. [Local Development](#local-development)
2. [Docker Deployment](#docker-deployment)
3. [HuggingFace Spaces](#huggingface-spaces)
4. [AWS Deployment](#aws-deployment)
5. [Google Cloud](#google-cloud)
6. [Azure Deployment](#azure-deployment)
7. [Production Considerations](#production-considerations)
8. [Troubleshooting](#troubleshooting)

---
## Local Development

### Prerequisites
- Python 3.10+
- FFmpeg
- 4GB+ RAM
- (Optional) CUDA-capable GPU
### Setup

```bash
# 1. Clone repository
git clone https://github.com/YOUR_USERNAME/whisper-german-asr.git
cd whisper-german-asr

# 2. Run quick start script
chmod +x scripts/quick_start.sh
./scripts/quick_start.sh

# 3. Start services
# Option A: Gradio demo
python demo/app.py

# Option B: FastAPI
uvicorn api.main:app --reload

# Option C: Both (separate terminals)
python demo/app.py &
uvicorn api.main:app --port 8000 &
```
### Testing

```bash
# Test the API
curl -X POST "http://localhost:8000/transcribe" \
  -F "file=@test_audio.wav"

# Test the demo: open http://localhost:7860 in a browser
```
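The same endpoint can also be exercised from Python. A minimal stdlib-only client sketch, assuming the route and field name from the curl call above and (an assumption, matching the Lambda example later in this guide) a JSON response with a `transcription` key:

```python
import json
import urllib.request
import uuid


def encode_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body for a single file field."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, url: str = "http://localhost:8000/transcribe") -> str:
    """POST a local audio file to the API and return the transcription."""
    with open(path, "rb") as f:
        body, content_type = encode_multipart("file", path, f.read())
    req = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["transcription"]
```

Using `requests` would shorten this considerably; the stdlib version is shown so the snippet has no extra dependencies.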
---

## Docker Deployment

### Quick Start

```bash
# Build and run with docker-compose
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```
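The compose file itself is not reproduced in this guide; a sketch of what it might contain, where the service names and the mounted model directory are assumptions derived from the manual `docker run` commands in this section:

```yaml
version: "3.8"
services:
  api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./whisper_test_tuned:/app/whisper_test_tuned:ro
  demo:
    build: .
    command: python demo/app.py
    ports:
      - "7860:7860"
    volumes:
      - ./whisper_test_tuned:/app/whisper_test_tuned:ro
```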
### Manual Docker Build

```bash
# Build image
docker build -t whisper-asr .

# Run API
docker run -d \
  -p 8000:8000 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  --name whisper-api \
  whisper-asr

# Run demo
docker run -d \
  -p 7860:7860 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  --name whisper-demo \
  whisper-asr python demo/app.py
```

### Docker with GPU

```bash
# Install the NVIDIA Container Toolkit first:
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

# Run with GPU
docker run -d \
  --gpus all \
  -p 8000:8000 \
  -v $(pwd)/whisper_test_tuned:/app/whisper_test_tuned:ro \
  whisper-asr
```
---

## HuggingFace Spaces

### Method 1: Gradio Space (Recommended)

#### Step 1: Create Space
1. Go to https://huggingface.co/spaces
2. Click "Create new Space"
3. Settings:
   - **Name:** whisper-german-asr
   - **SDK:** Gradio
   - **Hardware:** CPU Basic (free) or GPU T4 (paid)
   - **Visibility:** Public

#### Step 2: Prepare Files

```bash
# Create a new directory for the Space
mkdir hf-space
cd hf-space

# Copy demo app
cp ../demo/app.py app.py

# Create requirements.txt
cat > requirements.txt << EOF
torch>=2.2.0
transformers>=4.42.0
librosa>=0.10.1
gradio>=4.0.0
soundfile>=0.12.1
EOF

# Create README.md with frontmatter
cat > README.md << EOF
---
title: Whisper German ASR
emoji: 🎙️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# Whisper German ASR

Fine-tuned Whisper model for German speech recognition.
Try it out by recording or uploading German audio!
EOF
```
#### Step 3: Update app.py

```python
# Modify model loading to pull from the HF Hub instead of a local path
def load_model(model_path="YOUR_USERNAME/whisper-small-german"):
    model = WhisperForConditionalGeneration.from_pretrained(model_path)
    processor = WhisperProcessor.from_pretrained(model_path)
    # ... rest of code
```
#### Step 4: Push Model to HF Hub (First Time)

```python
# In Python (requires prior authentication, e.g. via `huggingface-cli login`)
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("./whisper_test_tuned")
processor = WhisperProcessor.from_pretrained("openai/whisper-small")

# Push to Hub
model.push_to_hub("YOUR_USERNAME/whisper-small-german")
processor.push_to_hub("YOUR_USERNAME/whisper-small-german")
```
#### Step 5: Deploy to Space

```bash
# Clone the Space repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/whisper-german-asr
cd whisper-german-asr

# Copy files
cp ../hf-space/* .

# Push to the Space
git add .
git commit -m "Initial deployment"
git push
```
### Method 2: Docker Space

```dockerfile
# Dockerfile in the Space repository
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y ffmpeg libsndfile1
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
EXPOSE 7860
CMD ["python", "app.py"]
```

For Docker Spaces, the app must listen on the port configured for the Space (`app_port` in the README frontmatter, 7860 by default), so launch Gradio with `server_name="0.0.0.0"`.
---

## AWS Deployment

### Option 1: ECS Fargate

#### Step 1: Push Docker Image to ECR

```bash
# Create ECR repository
aws ecr create-repository --repository-name whisper-asr

# Login to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com

# Tag and push
docker tag whisper-asr:latest \
  YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest
docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest
```
#### Step 2: Create ECS Task Definition

```json
{
  "family": "whisper-asr",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "whisper-api",
      "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/whisper-asr:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "MODEL_PATH",
          "value": "/app/whisper_test_tuned"
        }
      ]
    }
  ]
}
```

Fargate tasks also need an `executionRoleArn` with permission to pull from ECR and write logs; it is omitted here for brevity.
#### Step 3: Create ECS Service

```bash
aws ecs create-service \
  --cluster default \
  --service-name whisper-asr \
  --task-definition whisper-asr \
  --desired-count 1 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx],assignPublicIp=ENABLED}"
```
### Option 2: Lambda + API Gateway

Note that PyTorch and Transformers exceed Lambda's .zip deployment limits, so package this as a container image; the model is assumed to be staged under `/tmp/model` (e.g. pulled from S3 at cold start) or baked into the image.

```python
# lambda_function.py
import base64
import io
import json

import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = None
processor = None

def load_model():
    """Lazy-load once per container so warm invocations skip the load cost."""
    global model, processor
    if model is None:
        model = WhisperForConditionalGeneration.from_pretrained("/tmp/model")
        processor = WhisperProcessor.from_pretrained("openai/whisper-small")

def lambda_handler(event, context):
    load_model()

    # API Gateway delivers binary payloads base64-encoded
    audio_data = base64.b64decode(event['body'])
    audio, sr = librosa.load(io.BytesIO(audio_data), sr=16000)

    # Transcribe
    input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
    predicted_ids = model.generate(input_features)
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]

    return {
        'statusCode': 200,
        'body': json.dumps({'transcription': transcription})
    }
```
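Invoking the handler requires the audio to arrive base64-encoded in the event body. A stdlib sketch of building such an event, with a local round-trip check mirroring the handler's `base64.b64decode(event['body'])` step:

```python
import base64


def build_event(audio_path: str) -> dict:
    """Package a local audio file the way the Lambda handler expects it."""
    with open(audio_path, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    return {"body": payload, "isBase64Encoded": True}


def check_roundtrip(raw: bytes) -> bool:
    """Verify that encoding then decoding recovers the original bytes."""
    event = {"body": base64.b64encode(raw).decode("ascii"), "isBase64Encoded": True}
    return base64.b64decode(event["body"]) == raw
```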
---

## Google Cloud

### Cloud Run Deployment

#### Step 1: Build and Push to GCR

```bash
# Enable APIs
gcloud services enable run.googleapis.com
gcloud services enable containerregistry.googleapis.com

# Build image
gcloud builds submit --tag gcr.io/PROJECT_ID/whisper-asr

# Or use Docker
docker tag whisper-asr gcr.io/PROJECT_ID/whisper-asr
docker push gcr.io/PROJECT_ID/whisper-asr
```

Note: Container Registry (`gcr.io`) is deprecated in favor of Artifact Registry; new projects should push to an Artifact Registry repository instead.

#### Step 2: Deploy to Cloud Run

```bash
gcloud run deploy whisper-asr \
  --image gcr.io/PROJECT_ID/whisper-asr \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 2Gi \
  --cpu 2 \
  --timeout 300
```

#### Step 3: Get Service URL

```bash
gcloud run services describe whisper-asr \
  --platform managed \
  --region us-central1 \
  --format 'value(status.url)'
```
---

## Azure Deployment

### Azure Container Instances

#### Step 1: Push to Azure Container Registry

```bash
# Create ACR
az acr create --resource-group myResourceGroup \
  --name whisperasr --sku Basic

# Login
az acr login --name whisperasr

# Tag and push
docker tag whisper-asr whisperasr.azurecr.io/whisper-asr:latest
docker push whisperasr.azurecr.io/whisper-asr:latest
```

#### Step 2: Deploy Container Instance

```bash
az container create \
  --resource-group myResourceGroup \
  --name whisper-asr \
  --image whisperasr.azurecr.io/whisper-asr:latest \
  --cpu 2 \
  --memory 4 \
  --registry-login-server whisperasr.azurecr.io \
  --registry-username <username> \
  --registry-password <password> \
  --dns-name-label whisper-asr \
  --ports 8000
```
---

## Production Considerations

### Security
- [ ] Use HTTPS (SSL/TLS certificates)
- [ ] Implement rate limiting
- [ ] Add authentication/API keys
- [ ] Validate file uploads (type, size, duration)
- [ ] Set CORS policies
### Monitoring
- [ ] Set up logging (CloudWatch, Cloud Logging, etc.)
- [ ] Add health checks
- [ ] Monitor latency and error rates
- [ ] Track usage metrics
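A health check can be as small as an endpoint returning a status payload. A framework-agnostic sketch, where the `model` global mirrors the lazy-loading pattern used in the Lambda example above:

```python
import time

START_TIME = time.monotonic()
model = None  # set once the model finishes loading


def health() -> dict:
    """Payload for a /health endpoint: liveness plus model readiness."""
    return {
        "status": "ok",
        "model_loaded": model is not None,
        "uptime_seconds": round(time.monotonic() - START_TIME, 1),
    }
```

Orchestrators (ECS, Cloud Run, Kubernetes) can then probe the endpoint and restart containers that stop answering, or hold traffic until `model_loaded` is true.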
### Scaling
- [ ] Configure auto-scaling
- [ ] Use a load balancer
- [ ] Implement caching
- [ ] Consider a CDN for static assets

### Cost Optimization
- [ ] Use spot/preemptible instances
- [ ] Implement request batching
- [ ] Cache the model in memory
- [ ] Monitor and optimize resource usage
---

## Troubleshooting

### Common Issues

**Model Not Loading**

```bash
# Check model path
ls -la whisper_test_tuned/

# Check permissions
chmod -R 755 whisper_test_tuned/
```
**Out of Memory**
- Reduce the batch size
- Use CPU instead of GPU
- Increase container memory

**Slow Inference**
- Use a GPU
- Reduce the beam size
- Use a smaller model
- Implement caching
**Port Already in Use**

```bash
# Find the process
lsof -i :8000

# Kill it
kill -9 <PID>

# Or use a different port
uvicorn api.main:app --port 8001
```
---

## Next Steps

1. Choose a deployment platform
2. Set up a CI/CD pipeline
3. Configure monitoring
4. Test in production
5. Optimize performance
6. Scale as needed

For more help, see:
- [README.md](README.md)
- [PROJECT_SUMMARY.md](PROJECT_SUMMARY.md)
- [CONTRIBUTING.md](CONTRIBUTING.md)