Chirapath committed 841f71f (verified) · 1 parent: a5568a5

Delete Developer.md

Files changed (1):
  1. Developer.md (+0, -1904)

Developer.md (DELETED):
# 🛠️ Azure Speech Transcription - Developer Guide

## 📋 Table of Contents

- [System Architecture](#-system-architecture)
- [Development Environment](#-development-environment)
- [Deployment Guide](#-deployment-guide)
- [API Documentation](#-api-documentation)
- [Database Schema](#-database-schema)
- [Security Implementation](#-security-implementation)
- [Monitoring & Maintenance](#-monitoring--maintenance)
- [Contributing Guidelines](#-contributing-guidelines)
- [Advanced Configuration](#-advanced-configuration)
- [Troubleshooting](#-troubleshooting)

---
## 🏗️ System Architecture

### Overview

The Azure Speech Transcription service is built with a modern, secure architecture focused on user privacy, PDPA compliance, and scalability.

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend UI   │    │   Backend API   │    │ Azure Services  │
│    (Gradio)     │◄──►│    (Python)     │◄──►│  Speech & Blob  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  User Session   │    │ SQLite Database │    │  User Storage   │
│   Management    │    │   (Metadata)    │    │   (Isolated)    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
### Core Components

#### 1. Frontend Layer (`gradio_app.py`)
- **Technology**: Gradio with custom CSS
- **Purpose**: User interface and session management
- **Features**: Authentication, file upload, real-time status, history management

#### 2. Backend Layer (`app_core.py`)
- **Technology**: Python with threading and async processing
- **Purpose**: Business logic, authentication, and Azure integration
- **Features**: User management, transcription processing, PDPA compliance

#### 3. Data Layer
- **Database**: SQLite with Azure Blob backup
- **Storage**: Azure Blob Storage with per-user separation
- **Security**: User-isolated folders and encrypted connections

#### 4. External Services
- **Azure Speech Services**: Transcription processing
- **Azure Blob Storage**: File and database storage
- **FFmpeg**: Audio/video conversion

### Data Flow

```
1. User uploads file → 2. Authentication check → 3. File validation
                                                         ↓
6. Process with Azure ← 5. Background processing ← 4. Save to user folder
         ↓
7. Store transcript  → 8. Download results     → 9. Update UI status
```
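The numbered stages above can be condensed into one illustrative function. This is a sketch only: the real logic is split across `gradio_app.py`, `app_core.py`, and a background worker, and the names below are hypothetical.

```python
# Illustrative walk-through of the data flow; stage numbers match the diagram.
def process_upload(file_bytes: bytes, user_authenticated: bool) -> dict:
    if not user_authenticated:                            # 2. Authentication check
        return {"status": "rejected", "reason": "not authenticated"}
    if not file_bytes:                                    # 3. File validation
        return {"status": "rejected", "reason": "empty file"}
    job = {"status": "pending", "size": len(file_bytes)}  # 4. Save to user folder
    job["status"] = "processing"                          # 5. Background processing
    job["transcript"] = f"<transcript of {job['size']} bytes>"  # 6-7. Azure + store
    job["status"] = "completed"                           # 8-9. Results ready for UI
    return job
```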

---

## 💻 Development Environment

### Prerequisites

- **Python**: 3.8 or higher
- **Azure Account**: With Speech Services and Blob Storage
- **FFmpeg**: For audio/video processing
- **Git**: For version control

### Environment Setup

#### 1. Clone Repository
```bash
git clone <repository-url>
cd azure-speech-transcription
```

#### 2. Virtual Environment
```bash
# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (macOS/Linux)
source venv/bin/activate
```

#### 3. Install Dependencies
```bash
pip install -r requirements.txt
```

#### 4. Environment Configuration
```bash
# Copy environment template
cp .env.example .env

# Edit with your Azure credentials
nano .env
```

#### 5. Install FFmpeg

**Windows (Chocolatey):**
```bash
choco install ffmpeg
```

**macOS (Homebrew):**
```bash
brew install ffmpeg
```

**Ubuntu/Debian:**
```bash
sudo apt update
sudo apt install ffmpeg
```

#### 6. Verify Installation
```bash
python -c "
import gradio as gr
from azure.storage.blob import BlobServiceClient
import subprocess
print('Gradio:', gr.__version__)
print('FFmpeg:', subprocess.run(['ffmpeg', '-version'], capture_output=True).returncode == 0)
print('Azure Blob:', 'OK')
"
```

### Development Server

```bash
# Start development server
python gradio_app.py

# Server will be available at:
# http://localhost:7860
```

### Development Tools

#### Recommended IDE Setup
- **VS Code**: With Python, Azure, and Git extensions
- **PyCharm**: Professional edition with Azure toolkit
- **Vim/Emacs**: With appropriate Python plugins

#### Useful Extensions
```json
{
  "recommendations": [
    "ms-python.python",
    "ms-vscode.azure-cli",
    "ms-azuretools.azure-cli-tools",
    "ms-python.black-formatter",
    "ms-python.flake8"
  ]
}
```

#### Code Quality Tools
```bash
# Install development tools
pip install black flake8 pytest mypy

# Format code
black .

# Lint code
flake8 .

# Type checking
mypy app_core.py gradio_app.py
```

---

## 🚀 Deployment Guide

### Production Deployment Options

#### Option 1: Traditional Server Deployment

**1. Server Preparation**
```bash
# Update system
sudo apt update && sudo apt upgrade -y

# Install Python and dependencies
sudo apt install python3 python3-pip python3-venv nginx ffmpeg -y

# Create application user
sudo useradd -m -s /bin/bash transcription
sudo su - transcription
```

**2. Application Setup**
```bash
# Clone repository
git clone <repository-url> /home/transcription/app
cd /home/transcription/app

# Set up virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with production values
```

**3. Systemd Service**
```ini
# /etc/systemd/system/transcription.service
[Unit]
Description=Azure Speech Transcription Service
After=network.target

[Service]
Type=simple
User=transcription
Group=transcription
WorkingDirectory=/home/transcription/app
Environment=PATH=/home/transcription/app/venv/bin
ExecStart=/home/transcription/app/venv/bin/python gradio_app.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

**4. Nginx Configuration**
```nginx
# /etc/nginx/sites-available/transcription
server {
    listen 80;
    server_name your-domain.com;
    client_max_body_size 500M;

    location / {
        proxy_pass http://127.0.0.1:7860;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
```

**5. SSL Certificate**
```bash
# Install Certbot
sudo apt install certbot python3-certbot-nginx -y

# Get SSL certificate
sudo certbot --nginx -d your-domain.com

# Verify auto-renewal
sudo certbot renew --dry-run
```

**6. Start Services**
```bash
# Enable and start application
sudo systemctl enable transcription
sudo systemctl start transcription

# Enable and restart nginx
sudo systemctl enable nginx
sudo systemctl restart nginx

# Check status
sudo systemctl status transcription
sudo systemctl status nginx
```

#### Option 2: Docker Deployment

**1. Dockerfile**
```dockerfile
FROM python:3.9-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create necessary directories
RUN mkdir -p uploads database temp

# Expose port
EXPOSE 7860

# Run application
CMD ["python", "gradio_app.py"]
```

**2. Docker Compose**
```yaml
# docker-compose.yml
version: '3.8'

services:
  transcription:
    build: .
    ports:
      - "7860:7860"
    environment:
      - AZURE_SPEECH_KEY=${AZURE_SPEECH_KEY}
      - AZURE_SPEECH_KEY_ENDPOINT=${AZURE_SPEECH_KEY_ENDPOINT}
      - AZURE_REGION=${AZURE_REGION}
      - AZURE_BLOB_CONNECTION=${AZURE_BLOB_CONNECTION}
      - AZURE_CONTAINER=${AZURE_CONTAINER}
      - AZURE_BLOB_SAS_TOKEN=${AZURE_BLOB_SAS_TOKEN}
      - ALLOWED_LANGS=${ALLOWED_LANGS}
    volumes:
      - ./uploads:/app/uploads
      - ./database:/app/database
      - ./temp:/app/temp
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/ssl/certs
    depends_on:
      - transcription
    restart: unless-stopped
```

**3. Deploy with Docker**
```bash
# Build and start
docker-compose up -d

# View logs
docker-compose logs -f transcription

# Update application
git pull
docker-compose build transcription
docker-compose up -d transcription
```

#### Option 3: Cloud Deployment (Azure Container Instances)

**1. Create Container Registry**
```bash
# Create ACR
az acr create --resource-group myResourceGroup \
  --name myregistry --sku Basic

# Log in to ACR
az acr login --name myregistry

# Build and push image
docker build -t myregistry.azurecr.io/transcription:latest .
docker push myregistry.azurecr.io/transcription:latest
```

**2. Deploy Container Instance**
```bash
# Create container instance
az container create \
  --resource-group myResourceGroup \
  --name transcription-app \
  --image myregistry.azurecr.io/transcription:latest \
  --cpu 2 --memory 4 \
  --port 7860 \
  --environment-variables \
    AZURE_SPEECH_KEY=$AZURE_SPEECH_KEY \
    AZURE_SPEECH_KEY_ENDPOINT=$AZURE_SPEECH_KEY_ENDPOINT \
    AZURE_REGION=$AZURE_REGION \
    AZURE_BLOB_CONNECTION="$AZURE_BLOB_CONNECTION" \
    AZURE_CONTAINER=$AZURE_CONTAINER \
    AZURE_BLOB_SAS_TOKEN="$AZURE_BLOB_SAS_TOKEN"
```

---

## 📡 API Documentation

### Core Classes and Methods

#### TranscriptionManager Class

**Purpose**: Main service class handling all transcription operations

```python
class TranscriptionManager:
    def __init__(self)

    # User authentication
    def register_user(email: str, username: str, password: str,
                      gdpr_consent: bool, data_retention_agreed: bool,
                      marketing_consent: bool) -> Tuple[bool, str, Optional[str]]

    def login_user(login: str, password: str) -> Tuple[bool, str, Optional[User]]

    # Transcription operations
    def submit_transcription(file_bytes: bytes, original_filename: str,
                             user_id: str, language: str,
                             settings: Dict) -> str

    def get_job_status(job_id: str) -> Optional[TranscriptionJob]

    # Data management
    def get_user_history(user_id: str, limit: int) -> List[TranscriptionJob]
    def get_user_stats(user_id: str) -> Dict
    def export_user_data(user_id: str) -> Dict
    def delete_user_account(user_id: str) -> bool
```
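The caller-facing contract is the `(success, message, payload)` tuple. The stub below mirrors that contract so the call pattern can be seen end to end; it is an assumption-laden sketch, not the real `TranscriptionManager` (which lives in `app_core.py`), and the payload is simplified to a user id string.

```python
from typing import Optional, Tuple

class StubTranscriptionManager:
    """Minimal stand-in mirroring the (success, message, payload) contract."""

    def __init__(self):
        self._users = {}

    def register_user(self, email: str, username: str, password: str,
                      gdpr_consent: bool, data_retention_agreed: bool,
                      marketing_consent: bool) -> Tuple[bool, str, Optional[str]]:
        if not gdpr_consent:
            return False, "GDPR/PDPA consent is required", None
        user_id = f"user-{len(self._users) + 1}"
        self._users[username] = (user_id, password)
        return True, "Registered", user_id

    def login_user(self, login: str, password: str) -> Tuple[bool, str, Optional[str]]:
        user = self._users.get(login)
        if user and user[1] == password:
            return True, "Welcome", user[0]
        return False, "Invalid credentials", None

mgr = StubTranscriptionManager()
ok, msg, user_id = mgr.register_user("a@b.co", "alice", "Secret123",
                                     True, True, False)
ok2, msg2, uid2 = mgr.login_user("alice", "Secret123")
```

Callers always check the boolean first, then branch on the message or payload.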

#### DatabaseManager Class

**Purpose**: Handles database operations and Azure Blob synchronization

```python
class DatabaseManager:
    def __init__(db_path: str = None)

    # User operations
    def create_user(...) -> Tuple[bool, str, Optional[str]]
    def authenticate_user(login: str, password: str) -> Tuple[bool, str, Optional[User]]
    def get_user_by_id(user_id: str) -> Optional[User]

    # Job operations
    def save_job(job: TranscriptionJob)
    def get_job(job_id: str) -> Optional[TranscriptionJob]
    def get_user_jobs(user_id: str, limit: int) -> List[TranscriptionJob]
    def get_pending_jobs() -> List[TranscriptionJob]
```

#### AuthManager Class

**Purpose**: Authentication utilities and validation (all methods are static)

```python
class AuthManager:
    @staticmethod
    def hash_password(password: str) -> str
    def verify_password(password: str, password_hash: str) -> bool
    def validate_email(email: str) -> bool
    def validate_username(username: str) -> bool
    def validate_password(password: str) -> Tuple[bool, str]
```

### Data Models

#### User Model
```python
@dataclass
class User:
    user_id: str
    email: str
    username: str
    password_hash: str
    created_at: str
    last_login: Optional[str] = None
    is_active: bool = True
    gdpr_consent: bool = False
    data_retention_agreed: bool = False
    marketing_consent: bool = False
```

#### TranscriptionJob Model
```python
@dataclass
class TranscriptionJob:
    job_id: str
    user_id: str
    original_filename: str
    audio_url: str
    language: str
    status: str  # pending, processing, completed, failed
    created_at: str
    completed_at: Optional[str] = None
    transcript_text: Optional[str] = None
    transcript_url: Optional[str] = None
    error_message: Optional[str] = None
    azure_trans_id: Optional[str] = None
    settings: Optional[Dict] = None
```
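Since the models are plain dataclasses, constructing one for a freshly submitted job is direct. A self-contained sketch (the job/user ids and filename are illustrative):

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Dict, Optional
import uuid

@dataclass
class TranscriptionJob:
    job_id: str
    user_id: str
    original_filename: str
    audio_url: str
    language: str
    status: str  # pending, processing, completed, failed
    created_at: str
    completed_at: Optional[str] = None
    transcript_text: Optional[str] = None
    transcript_url: Optional[str] = None
    error_message: Optional[str] = None
    azure_trans_id: Optional[str] = None
    settings: Optional[Dict] = None

# A new job starts life as 'pending' with all result fields unset.
job = TranscriptionJob(
    job_id=str(uuid.uuid4()),
    user_id="user-123",
    original_filename="meeting.wav",
    audio_url="",
    language="en-US",
    status="pending",
    created_at=datetime.now().isoformat(),
)
```

`asdict(job)` yields a plain dict, which is convenient when serializing the row for SQLite or a JSON export.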

### Configuration Parameters

#### Environment Variables
```python
# Required
AZURE_SPEECH_KEY: str
AZURE_SPEECH_KEY_ENDPOINT: str
AZURE_REGION: str
AZURE_BLOB_CONNECTION: str
AZURE_CONTAINER: str
AZURE_BLOB_SAS_TOKEN: str

# Optional
ALLOWED_LANGS: str  # JSON string
API_VERSION: str = "v3.2"
PASSWORD_SALT: str = "default_salt"
MAX_FILE_SIZE_MB: int = 500
```
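One way to make the required/optional split enforceable is to fail fast at startup. A stdlib-only sketch (the `load_config` helper is an assumption, not part of the codebase; variable names match the table above):

```python
import os

REQUIRED = [
    "AZURE_SPEECH_KEY", "AZURE_SPEECH_KEY_ENDPOINT", "AZURE_REGION",
    "AZURE_BLOB_CONNECTION", "AZURE_CONTAINER", "AZURE_BLOB_SAS_TOKEN",
]

def load_config(env=os.environ) -> dict:
    """Raise at startup if any required setting is missing; apply defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return {
        **{name: env[name] for name in REQUIRED},
        "API_VERSION": env.get("API_VERSION", "v3.2"),
        "PASSWORD_SALT": env.get("PASSWORD_SALT", "default_salt"),
        "MAX_FILE_SIZE_MB": int(env.get("MAX_FILE_SIZE_MB", "500")),
    }

# Example with a fake environment (real code passes os.environ):
fake_env = {name: "x" for name in REQUIRED}
config = load_config(fake_env)
```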

#### Transcription Settings
```python
settings = {
    'audio_format': str,            # wav, mp3, etc.
    'diarization_enabled': bool,    # Speaker identification
    'speakers': int,                # Max speakers (1-10)
    'profanity': str,               # masked, removed, raw
    'punctuation': str,             # automatic, dictated, none
    'timestamps': bool,             # Include timestamps
    'lexical': bool,                # Include lexical forms
    'language_id_enabled': bool,    # Auto language detection
    'candidate_locales': List[str]  # Language candidates
}
```
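A concrete instance of the schema above, for a hypothetical two-speaker Thai/English meeting recording (values are examples only, chosen from the allowed options listed in the comments):

```python
# Example settings dict for submit_transcription(); keys follow the
# reference above, values are one valid choice per key.
settings = {
    "audio_format": "wav",
    "diarization_enabled": True,
    "speakers": 2,
    "profanity": "masked",
    "punctuation": "automatic",
    "timestamps": True,
    "lexical": False,
    "language_id_enabled": True,
    "candidate_locales": ["th-TH", "en-US"],
}
```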

---

## 🗄️ Database Schema

### SQLite Database Structure

#### Users Table
```sql
CREATE TABLE users (
    user_id TEXT PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    username TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    created_at TEXT NOT NULL,
    last_login TEXT,
    is_active BOOLEAN DEFAULT 1,
    gdpr_consent BOOLEAN DEFAULT 0,
    data_retention_agreed BOOLEAN DEFAULT 0,
    marketing_consent BOOLEAN DEFAULT 0
);

-- Indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_username ON users(username);
```

#### Transcriptions Table
```sql
CREATE TABLE transcriptions (
    job_id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    original_filename TEXT NOT NULL,
    audio_url TEXT,
    language TEXT NOT NULL,
    status TEXT NOT NULL,
    created_at TEXT NOT NULL,
    completed_at TEXT,
    transcript_text TEXT,
    transcript_url TEXT,
    error_message TEXT,
    azure_trans_id TEXT,
    settings TEXT,
    FOREIGN KEY (user_id) REFERENCES users (user_id)
);

-- Indexes
CREATE INDEX idx_transcriptions_user_id ON transcriptions(user_id);
CREATE INDEX idx_transcriptions_status ON transcriptions(status);
CREATE INDEX idx_transcriptions_created_at ON transcriptions(created_at DESC);
CREATE INDEX idx_transcriptions_user_created ON transcriptions(user_id, created_at DESC);
```

### Azure Blob Storage Structure

```
Container: {AZURE_CONTAINER}/
├── shared/
│   └── database/
│       └── transcriptions.db      # Shared database backup
├── users/
│   ├── {user-id-1}/
│   │   ├── audio/                 # Processed audio files
│   │   │   ├── {job-id-1}.wav
│   │   │   └── {job-id-2}.wav
│   │   ├── transcripts/           # Transcript files
│   │   │   ├── {job-id-1}.txt
│   │   │   └── {job-id-2}.txt
│   │   └── originals/             # Original uploaded files
│   │       ├── {job-id-1}_(unknown).mp4
│   │       └── {job-id-2}_(unknown).wav
│   └── {user-id-2}/
│       ├── audio/
│       ├── transcripts/
│       └── originals/
```

### Database Operations

#### User Management Queries
```sql
-- Create user
INSERT INTO users (user_id, email, username, password_hash, created_at,
                   gdpr_consent, data_retention_agreed, marketing_consent)
VALUES (?, ?, ?, ?, ?, ?, ?, ?);

-- Authenticate user
SELECT * FROM users
WHERE (email = ? OR username = ?) AND is_active = 1;

-- Update last login
UPDATE users SET last_login = ? WHERE user_id = ?;

-- Get user stats
SELECT status, COUNT(*) FROM transcriptions
WHERE user_id = ? GROUP BY status;
```

#### Job Management Queries
```sql
-- Create job
INSERT INTO transcriptions (job_id, user_id, original_filename, language,
                            status, created_at, settings)
VALUES (?, ?, ?, ?, 'pending', ?, ?);

-- Update job status
UPDATE transcriptions
SET status = ?, completed_at = ?, transcript_text = ?, transcript_url = ?
WHERE job_id = ?;

-- Get user jobs
SELECT * FROM transcriptions
WHERE user_id = ?
ORDER BY created_at DESC LIMIT ?;

-- Get pending jobs for the background processor
SELECT * FROM transcriptions
WHERE status IN ('pending', 'processing');
```
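The job queries above can be exercised end to end against an in-memory SQLite database with the standard library. A sketch with the schema trimmed to the columns these queries touch (ids and filenames are illustrative):

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transcriptions (
        job_id TEXT PRIMARY KEY, user_id TEXT NOT NULL,
        original_filename TEXT NOT NULL, language TEXT NOT NULL,
        status TEXT NOT NULL, created_at TEXT NOT NULL,
        completed_at TEXT, transcript_text TEXT,
        transcript_url TEXT, settings TEXT)
""")

now = datetime.now().isoformat()

# Create job (status starts as 'pending')
conn.execute(
    "INSERT INTO transcriptions (job_id, user_id, original_filename, language,"
    " status, created_at, settings) VALUES (?, ?, ?, ?, 'pending', ?, ?)",
    ("job-1", "user-1", "meeting.wav", "en-US", now, "{}"),
)

# Update job status once transcription finishes
conn.execute(
    "UPDATE transcriptions SET status = ?, completed_at = ?,"
    " transcript_text = ?, transcript_url = ? WHERE job_id = ?",
    ("completed", now, "hello world", "users/user-1/transcripts/job-1.txt", "job-1"),
)

# Get user jobs, newest first
rows = conn.execute(
    "SELECT * FROM transcriptions WHERE user_id = ?"
    " ORDER BY created_at DESC LIMIT ?", ("user-1", 10),
).fetchall()
```

Parameterized `?` placeholders, as used throughout, are what keep these queries safe from SQL injection (see the Security section).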

---

## 🔒 Security Implementation

### Authentication Security

#### Password Security
```python
# Password hashing with salt
def hash_password(password: str) -> str:
    salt = os.environ.get("PASSWORD_SALT", "default_salt")
    return hashlib.sha256((password + salt).encode()).hexdigest()

# Password validation
def validate_password(password: str) -> Tuple[bool, str]:
    if len(password) < 8:
        return False, "Password must be at least 8 characters"
    if not re.search(r'[A-Z]', password):
        return False, "Password must contain an uppercase letter"
    if not re.search(r'[a-z]', password):
        return False, "Password must contain a lowercase letter"
    if not re.search(r'\d', password):
        return False, "Password must contain a number"
    return True, "Valid"
```

#### Session Management
```python
# User session state
session_state = {
    'user_id': str,
    'username': str,
    'logged_in_at': datetime,
    'last_activity': datetime
}

# Session validation
def validate_session(session_state: dict) -> bool:
    if not session_state or 'user_id' not in session_state:
        return False

    # Check session timeout (if implemented)
    last_activity = session_state.get('last_activity')
    if last_activity:
        timeout = timedelta(hours=24)  # 24-hour sessions
        if datetime.now() - last_activity > timeout:
            return False

    return True
```

### Data Security

#### Access Control
```python
# User data access verification
def verify_user_access(job_id: str, user_id: str) -> bool:
    job = get_job(job_id)
    return job is not None and job.user_id == user_id

# File path security
def get_user_blob_path(user_id: str, blob_type: str, filename: str) -> str:
    # Ensure users can only access their own folder
    safe_filename = os.path.basename(filename)  # Prevent path traversal
    return f"users/{user_id}/{blob_type}/{safe_filename}"
```

#### Data Encryption
```python
# Azure Blob Storage encryption (configured at the Azure level)
# - Encryption at rest: enabled by default
# - Encryption in transit: HTTPS enforced
# - Customer-managed keys: optional enhancement

# Database encryption (for sensitive fields)
from cryptography.fernet import Fernet

def encrypt_sensitive_data(data: str, key: bytes) -> str:
    f = Fernet(key)
    return f.encrypt(data.encode()).decode()

def decrypt_sensitive_data(encrypted_data: str, key: bytes) -> str:
    f = Fernet(key)
    return f.decrypt(encrypted_data.encode()).decode()
```

### Azure Security

#### Blob Storage Security
```python
# SAS token configuration for least privilege
sas_permissions = BlobSasPermissions(
    read=True,
    write=True,
    delete=True,
    list=True
)

# IP restrictions (optional)
sas_ip_range = "192.168.1.0/24"  # Restrict to a specific IP range

# Time-limited tokens
sas_expiry = datetime.utcnow() + timedelta(hours=1)
```

#### Speech Service Security
```python
# Secure API calls
headers = {
    "Ocp-Apim-Subscription-Key": AZURE_SPEECH_KEY,
    "Content-Type": "application/json"
}

# Request timeout; SSL verification stays on
response = requests.post(
    url,
    headers=headers,
    json=body,
    timeout=30,
    verify=True  # Verify SSL certificates
)
```
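Transient network failures against the Speech endpoint can be absorbed with a retry wrapper around calls like the one above. A stdlib-only sketch (the helper name and defaults are assumptions; in production you might instead attach `urllib3`'s `Retry` to a `requests.Session` adapter):

```python
import time

def with_retries(call, attempts=3, base_delay=0.5,
                 retriable=(ConnectionError, TimeoutError)):
    """Invoke call() up to `attempts` times, doubling the delay each retry."""
    for attempt in range(attempts):
        try:
            return call()
        except retriable:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Example with a flaky fake call that succeeds on the third try:
state = {"calls": 0}

def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = with_retries(flaky, attempts=3, base_delay=0.0)
```

Only retriable (transient) exception types should be listed; a 401 from a bad subscription key, for example, should fail immediately rather than retry.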

### Input Validation

#### File Upload Security
```python
def validate_uploaded_file(file_path: str, max_size: int = 500 * 1024 * 1024) -> Tuple[bool, str]:
    try:
        # Check file exists
        if not os.path.exists(file_path):
            return False, "File not found"

        # Check file size
        file_size = os.path.getsize(file_path)
        if file_size > max_size:
            return False, f"File too large: {file_size / 1024 / 1024:.1f}MB"

        # Check file type by content (not just extension)
        import magic  # python-magic
        mime_type = magic.from_file(file_path, mime=True)
        allowed_types = ['audio/', 'video/']
        if not any(mime_type.startswith(t) for t in allowed_types):
            return False, f"Invalid file type: {mime_type}"

        return True, "Valid"

    except Exception as e:
        return False, f"Validation error: {str(e)}"
```

#### SQL Injection Prevention
```python
# Use parameterized queries (already implemented)
cursor.execute(
    "SELECT * FROM users WHERE email = ? AND password_hash = ?",
    (email, password_hash)
)

# Input sanitization
def sanitize_input(user_input: str) -> str:
    # Escape HTML-special characters
    import html
    sanitized = html.escape(user_input)
    # Limit length
    return sanitized[:1000]
```

---

## 📊 Monitoring & Maintenance

### Application Monitoring

#### Health Checks
```python
def health_check() -> Dict[str, Any]:
    """System health check endpoint."""
    try:
        # Database check
        db_status = check_database_connection()

        # Azure services check
        blob_status = check_blob_storage()
        speech_status = check_speech_service()

        # FFmpeg check
        ffmpeg_status = check_ffmpeg_installation()

        # Disk space check
        disk_status = check_disk_space()

        return {
            'status': 'healthy' if all([db_status, blob_status, speech_status, ffmpeg_status]) else 'unhealthy',
            'timestamp': datetime.now().isoformat(),
            'services': {
                'database': db_status,
                'blob_storage': blob_status,
                'speech_service': speech_status,
                'ffmpeg': ffmpeg_status,
                'disk_space': disk_status
            }
        }

    except Exception as e:
        return {
            'status': 'error',
            'timestamp': datetime.now().isoformat(),
            'error': str(e)
        }

def check_database_connection() -> bool:
    try:
        with transcription_manager.db.get_connection() as conn:
            conn.execute("SELECT 1").fetchone()
        return True
    except Exception:
        return False

def check_blob_storage() -> bool:
    try:
        client = BlobServiceClient.from_connection_string(AZURE_BLOB_CONNECTION)
        client.list_containers(results_per_page=1)
        return True
    except Exception:
        return False
```

#### Logging Configuration
```python
import logging
import os
from logging.handlers import RotatingFileHandler

def setup_logging():
    """Configure application logging."""

    # Ensure the log directory exists
    os.makedirs('logs', exist_ok=True)

    # Create formatter
    formatter = logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
    )

    # Console handler
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    console_handler.setLevel(logging.INFO)

    # File handler with rotation
    file_handler = RotatingFileHandler(
        'logs/transcription.log',
        maxBytes=10*1024*1024,  # 10 MB
        backupCount=5
    )
    file_handler.setFormatter(formatter)
    file_handler.setLevel(logging.DEBUG)

    # Configure root logger
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)
    logger.addHandler(console_handler)
    logger.addHandler(file_handler)

    # Separate logger for sensitive operations
    auth_logger = logging.getLogger('auth')
    auth_handler = RotatingFileHandler(
        'logs/auth.log',
        maxBytes=5*1024*1024,  # 5 MB
        backupCount=10
    )
    auth_handler.setFormatter(formatter)
    auth_logger.addHandler(auth_handler)
    auth_logger.setLevel(logging.INFO)
```

#### Performance Monitoring
```python
import time
from functools import wraps

def monitor_performance(func):
    """Decorator to monitor function performance."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        try:
            result = func(*args, **kwargs)
            duration = time.time() - start_time
            logging.info(f"{func.__name__} completed in {duration:.2f}s")
            return result
        except Exception as e:
            duration = time.time() - start_time
            logging.error(f"{func.__name__} failed after {duration:.2f}s: {str(e)}")
            raise
    return wrapper

# Usage
@monitor_performance
def submit_transcription(self, file_bytes, filename, user_id, language, settings):
    # Implementation here
    pass
```

### Database Maintenance

#### Backup Strategy
```python
def backup_database():
    """Back up the database to Azure Blob Storage."""
    try:
        # Create timestamped backup
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        backup_name = f"shared/backups/transcriptions_backup_{timestamp}.db"

        # Upload current database
        blob_client = blob_service.get_blob_client(
            container=AZURE_CONTAINER,
            blob=backup_name
        )

        with open(db_path, "rb") as data:
            blob_client.upload_blob(data)

        logging.info(f"Database backup created: {backup_name}")

        # Clean old backups (keep the last 30 days)
        cleanup_old_backups()

    except Exception as e:
        logging.error(f"Database backup failed: {str(e)}")

def cleanup_old_backups():
    """Remove backups older than 30 days."""
    try:
        # blob.last_modified is timezone-aware, so compare against an aware cutoff
        cutoff_date = datetime.now(timezone.utc) - timedelta(days=30)
        container_client = blob_service.get_container_client(AZURE_CONTAINER)

        for blob in container_client.list_blobs(name_starts_with="shared/backups/"):
            if blob.last_modified < cutoff_date:
                container_client.delete_blob(blob.name)
                logging.info(f"Deleted old backup: {blob.name}")

    except Exception as e:
        logging.error(f"Backup cleanup failed: {str(e)}")
```

#### Database Optimization
```python
def optimize_database():
    """Optimize database performance."""
    try:
        with transcription_manager.db.get_connection() as conn:
            # Analyze tables
            conn.execute("ANALYZE")

            # Vacuum database (compact)
            conn.execute("VACUUM")

            # Update statistics
            conn.execute("PRAGMA optimize")

        logging.info("Database optimization completed")

    except Exception as e:
        logging.error(f"Database optimization failed: {str(e)}")

# Schedule optimization weekly and backups daily
import schedule

schedule.every().week.do(optimize_database)
schedule.every().day.at("02:00").do(backup_database)
```

### Resource Management

#### Cleanup Tasks
```python
def cleanup_temporary_files():
    """Clean up temporary files older than 24 hours."""
    try:
        cutoff_time = time.time() - (24 * 60 * 60)  # 24 hours ago
        temp_dirs = ['uploads', 'temp']

        for temp_dir in temp_dirs:
            if os.path.exists(temp_dir):
                for filename in os.listdir(temp_dir):
                    filepath = os.path.join(temp_dir, filename)
                    if os.path.isfile(filepath) and os.path.getmtime(filepath) < cutoff_time:
                        os.remove(filepath)
                        logging.info(f"Cleaned up temporary file: {filepath}")

    except Exception as e:
        logging.error(f"Temporary file cleanup failed: {str(e)}")

def monitor_disk_space():
    """Monitor and alert on disk space."""
    try:
        import shutil
        total, used, free = shutil.disk_usage("/")

        # Convert to GB
        free_gb = free // (1024**3)
        total_gb = total // (1024**3)
        usage_percent = (used / total) * 100

        if usage_percent > 85:
            logging.warning(f"High disk usage: {usage_percent:.1f}% ({free_gb}GB free)")

        if free_gb < 5:
            logging.critical(f"Low disk space: {free_gb}GB remaining")

    except Exception as e:
        logging.error(f"Disk space monitoring failed: {str(e)}")
```
1082
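
The `schedule` registrations above only fire if something periodically calls `schedule.run_pending()`. A minimal sketch of a background runner that drives any scheduler (the function name and signature here are illustrative, not existing app code):

```python
import threading

def start_scheduler_loop(run_pending, stop_event: threading.Event,
                         interval: float = 1.0) -> threading.Thread:
    """Drive a job scheduler in a background daemon thread until stop_event is set."""
    def loop():
        while not stop_event.is_set():
            run_pending()              # e.g. schedule.run_pending
            stop_event.wait(interval)  # wakes early if stop_event is set

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread

# In the app:
#   stop_event = threading.Event()
#   start_scheduler_loop(schedule.run_pending, stop_event)
```

Using `stop_event.wait()` instead of `time.sleep()` lets the loop shut down promptly on exit.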

### Monitoring Alerts

#### Email Alerts (Optional)
```python
import logging
import os
import smtplib
from email.mime.text import MIMEText

def send_alert(subject: str, message: str):
    """Send email alert for critical issues"""
    try:
        smtp_server = os.environ.get("SMTP_SERVER")
        smtp_port = int(os.environ.get("SMTP_PORT", "587"))
        smtp_user = os.environ.get("SMTP_USER")
        smtp_pass = os.environ.get("SMTP_PASS")
        alert_email = os.environ.get("ALERT_EMAIL")

        if not all([smtp_server, smtp_user, smtp_pass, alert_email]):
            return  # Email not configured

        msg = MIMEText(message)
        msg['Subject'] = f"[Transcription Service] {subject}"
        msg['From'] = smtp_user
        msg['To'] = alert_email

        with smtplib.SMTP(smtp_server, smtp_port) as server:
            server.starttls()
            server.login(smtp_user, smtp_pass)
            server.send_message(msg)

    except Exception as e:
        logging.error(f"Failed to send alert: {str(e)}")
```
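
To connect `monitor_disk_space()` with `send_alert()`, it helps to factor the thresholds used above (85% usage, 5 GB free) into a pure function that is trivial to unit-test; `disk_alert_level` is an illustrative name, not existing app code:

```python
def disk_alert_level(usage_percent: float, free_gb: float) -> str:
    """Classify disk state using the monitoring thresholds: 'critical' beats 'warning'."""
    if free_gb < 5:
        return "critical"
    if usage_percent > 85:
        return "warning"
    return "ok"

# monitor_disk_space() could then call send_alert() whenever the level
# is not "ok", keeping both thresholds in one testable place.
```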

---

## 🀝 Contributing Guidelines

### Development Workflow

#### 1. Setup Development Environment
```bash
# Fork repository
git clone https://github.com/your-username/azure-speech-transcription.git
cd azure-speech-transcription

# Create feature branch
git checkout -b feature/your-feature-name

# Setup environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
pip install -r requirements-dev.txt  # Development dependencies
```

#### 2. Code Quality Standards

**Python Style Guide**
- Follow PEP 8 style guidelines
- Use type hints for function parameters and return values
- Maximum line length: 88 characters (Black formatter)
- Use meaningful variable and function names

**Code Formatting**
```bash
# Install development tools
pip install black flake8 mypy pytest

# Format code
black .

# Check style
flake8 .

# Type checking
mypy app_core.py gradio_app.py

# Run tests
pytest tests/
```

**Documentation Standards**
- All functions must have docstrings
- Include type hints
- Document complex logic with inline comments
- Update README.md for new features

```python
def submit_transcription(
    self,
    file_bytes: bytes,
    original_filename: str,
    user_id: str,
    language: str,
    settings: Dict[str, Any]
) -> str:
    """
    Submit a new transcription job for processing.

    Args:
        file_bytes: Raw bytes of the audio/video file
        original_filename: Original name of the uploaded file
        user_id: ID of the authenticated user
        language: Language code for transcription (e.g., 'en-US')
        settings: Transcription configuration options

    Returns:
        str: Unique job ID for tracking transcription progress

    Raises:
        ValueError: If user_id is invalid or file is too large
        ConnectionError: If Azure services are unavailable
    """
```

#### 3. Testing Requirements

**Unit Tests**
```python
import pytest
from unittest.mock import Mock, patch
from app_core import TranscriptionManager, AuthManager

class TestAuthManager:
    def test_password_hashing(self):
        password = "TestPassword123"
        hashed = AuthManager.hash_password(password)

        assert hashed != password
        assert AuthManager.verify_password(password, hashed)
        assert not AuthManager.verify_password("wrong", hashed)

    def test_email_validation(self):
        assert AuthManager.validate_email("test@example.com")
        assert not AuthManager.validate_email("invalid-email")
        assert not AuthManager.validate_email("")

class TestTranscriptionManager:
    @patch('app_core.BlobServiceClient')
    def test_submit_transcription(self, mock_blob):
        manager = TranscriptionManager()

        job_id = manager.submit_transcription(
            b"fake audio data",
            "test.wav",
            "user123",
            "en-US",
            {"audio_format": "wav"}
        )

        assert isinstance(job_id, str)
        assert len(job_id) == 36  # UUID length
```

**Integration Tests**
```python
class TestIntegration:
    def test_full_transcription_workflow(self):
        # Test complete workflow from upload to download
        pass

    def test_user_registration_and_login(self):
        # Test complete auth workflow
        pass
```
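
Tests that touch the metadata store should not run against the production SQLite file. One option is an in-memory database with a simplified stand-in for the real `transcriptions` table (the schema below is an assumption for illustration; the real schema lives in `app_core`):

```python
import sqlite3

def make_test_db() -> sqlite3.Connection:
    """In-memory SQLite database with a simplified transcriptions table for tests."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transcriptions ("
        "  job_id TEXT PRIMARY KEY,"
        "  user_id TEXT NOT NULL,"
        "  status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn

# Example: verify a status transition without touching real data
conn = make_test_db()
conn.execute("INSERT INTO transcriptions (job_id, user_id) VALUES ('j1', 'u1')")
conn.execute("UPDATE transcriptions SET status = 'completed' WHERE job_id = 'j1'")
status = conn.execute(
    "SELECT status FROM transcriptions WHERE job_id = 'j1'"
).fetchone()[0]
```

Each test gets a fresh, isolated database with no file-system cleanup to worry about.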

#### 4. Commit Guidelines

**Commit Message Format**
```
type(scope): brief description

Detailed explanation of changes if needed

- List specific changes
- Include any breaking changes
- Reference issue numbers

Closes #123
```

**Commit Types**
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `style`: Code style changes (formatting, etc.)
- `refactor`: Code refactoring
- `test`: Adding or updating tests
- `chore`: Maintenance tasks

**Example Commits**
```bash
git commit -m "feat(auth): add password strength validation

- Implement password complexity requirements
- Add client-side validation feedback
- Update registration form UI

Closes #45"

git commit -m "fix(transcription): handle Azure service timeouts

- Add retry logic for failed API calls
- Improve error messages for users
- Log detailed error information

Fixes #67"
```

#### 5. Pull Request Process

**PR Checklist**
- [ ] Code follows style guidelines
- [ ] All tests pass
- [ ] Documentation updated
- [ ] Security considerations reviewed
- [ ] Performance impact assessed
- [ ] Breaking changes documented

**PR Template**
```markdown
## Description
Brief description of changes

## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Testing
- [ ] Unit tests added/updated
- [ ] Integration tests pass
- [ ] Manual testing completed

## Security
- [ ] No sensitive data exposed
- [ ] Input validation implemented
- [ ] Access controls maintained

## Performance
- [ ] No performance degradation
- [ ] Database queries optimized
- [ ] Resource usage considered
```

### Feature Development

#### Adding New Languages
```python
# 1. Update environment configuration
ALLOWED_LANGS = {
    "en-US": "English (United States)",
    "es-ES": "Spanish (Spain)",
    "new-LANG": "New Language Name"
}

# 2. Test language support
def test_new_language():
    # Verify Azure Speech Services supports the language
    # Test transcription accuracy
    # Update documentation
    ...
```

#### Adding New Audio Formats
```python
# 1. Update supported formats list
AUDIO_FORMATS = [
    "wav", "mp3", "ogg", "opus", "flac",
    "new_format"  # Add new format
]

# 2. Update FFmpeg conversion logic
def _convert_to_audio(self, input_path, output_path, audio_format="wav"):
    if audio_format == "new_format":
        # Add specific conversion parameters
        cmd = ["ffmpeg", "-i", input_path, "-codec", "new_codec", output_path]
```

#### Adding New Features
```python
# 1. Database schema updates
def upgrade_database_schema():
    with transcription_manager.db.get_connection() as conn:
        conn.execute("""
            ALTER TABLE transcriptions
            ADD COLUMN new_feature_data TEXT
        """)

# 2. API endpoint updates
def new_feature_endpoint(user_id: str, feature_data: Dict) -> Dict:
    # Implement new feature logic
    pass

# 3. UI updates
def add_new_feature_ui():
    new_feature_input = gr.Textbox(label="New Feature")
    new_feature_button = gr.Button("Use New Feature")
```
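
A bare `ALTER TABLE` fails once the column already exists, so repeated startups need migration tracking. A minimal sketch using SQLite's `PRAGMA user_version` (the `MIGRATIONS` list and function name are illustrative, not existing app code):

```python
import sqlite3

# Ordered list of one-off schema migrations; append new statements at the end
# and never reorder or remove entries that have shipped.
MIGRATIONS = [
    "ALTER TABLE transcriptions ADD COLUMN new_feature_data TEXT",
]

def apply_migrations(conn: sqlite3.Connection) -> int:
    """Apply pending migrations, tracking progress in PRAGMA user_version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for index, statement in enumerate(MIGRATIONS[version:], start=version + 1):
        conn.execute(statement)
        conn.execute(f"PRAGMA user_version = {index}")
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Running it at every startup is safe: already-applied migrations are skipped.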

---

## βš™οΈ Advanced Configuration

### Performance Optimization

#### Concurrent Processing
```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor

# Adjust worker thread pool size based on server capacity
class TranscriptionManager:
    def __init__(self, max_workers: int = None):
        if max_workers is None:
            # Auto-detect based on CPU cores
            max_workers = min(multiprocessing.cpu_count(), 10)

        self.executor = ThreadPoolExecutor(max_workers=max_workers)

# Configure based on server specs:
#   Small server:  max_workers=2-4
#   Medium server: max_workers=5-8
#   Large server:  max_workers=10+
```

#### Database Optimization
```python
import sqlite3

# SQLite performance tuning
def configure_database_performance(db_path: str):
    with sqlite3.connect(db_path) as conn:
        # Enable WAL mode for better concurrency
        conn.execute("PRAGMA journal_mode=WAL")

        # Increase cache size (10,000 pages; use a negative value to size in KiB)
        conn.execute("PRAGMA cache_size=10000")

        # Optimize synchronization
        conn.execute("PRAGMA synchronous=NORMAL")

        # Enable foreign keys
        conn.execute("PRAGMA foreign_keys=ON")
```

#### Memory Management
```python
import gc

import schedule

# Large file handling
def process_large_file(file_path: str):
    """Process large files in chunks to manage memory"""
    chunk_size = 64 * 1024 * 1024  # 64MB chunks

    with open(file_path, 'rb') as f:
        while chunk := f.read(chunk_size):
            # Process chunk
            yield chunk

# Garbage collection for long-running processes
def cleanup_memory():
    """Force garbage collection"""
    gc.collect()

# Schedule periodic cleanup
schedule.every(30).minutes.do(cleanup_memory)
```

### Security Hardening

#### Rate Limiting
```python
from collections import defaultdict
from time import time

class RateLimiter:
    def __init__(self, max_requests: int = 100, window: int = 3600):
        self.max_requests = max_requests
        self.window = window
        self.requests = defaultdict(list)

    def is_allowed(self, user_id: str) -> bool:
        now = time()
        user_requests = self.requests[user_id]

        # Clean old requests
        user_requests[:] = [req_time for req_time in user_requests
                            if now - req_time < self.window]

        # Check limit
        if len(user_requests) >= self.max_requests:
            return False

        user_requests.append(now)
        return True

# Usage in endpoints
rate_limiter = RateLimiter(max_requests=50, window=3600)  # 50 per hour

def submit_transcription(self, user_id: str, ...):
    if not rate_limiter.is_allowed(user_id):
        raise Exception("Rate limit exceeded")
```

#### Input Sanitization
```python
import os
import re

import bleach

def sanitize_filename(filename: str) -> str:
    """Sanitize uploaded filename"""
    # Remove path traversal attempts
    filename = os.path.basename(filename)

    # Remove dangerous characters
    filename = re.sub(r'[<>:"/\\|?*]', '_', filename)

    # Limit length
    if len(filename) > 255:
        name, ext = os.path.splitext(filename)
        filename = name[:250] + ext

    return filename

def sanitize_user_input(text: str) -> str:
    """Sanitize user text input"""
    # Remove HTML tags
    text = bleach.clean(text, tags=[], strip=True)

    # Limit length
    text = text[:1000]

    return text.strip()
```

#### Audit Logging
```python
import json
import logging
from datetime import datetime
from typing import Dict

class AuditLogger:
    def __init__(self):
        self.logger = logging.getLogger('audit')

    def log_user_action(self, user_id: str, action: str, details: Dict = None):
        """Log user actions for security auditing"""
        audit_entry = {
            'timestamp': datetime.now().isoformat(),
            'user_id': user_id,
            'action': action,
            'details': details or {},
            'ip_address': self._get_client_ip(),
            'user_agent': self._get_user_agent()
        }

        self.logger.info(json.dumps(audit_entry))

    def _get_client_ip(self) -> str:
        # Implementation depends on deployment setup
        return "unknown"

    def _get_user_agent(self) -> str:
        # Implementation depends on deployment setup
        return "unknown"

# Usage
audit = AuditLogger()
audit.log_user_action(user_id, "login", {"success": True})
audit.log_user_action(user_id, "transcription_submit", {"filename": filename})
```

### Custom Extensions

#### Plugin Architecture
```python
from typing import Dict, List

class TranscriptionPlugin:
    """Base class for transcription plugins"""

    def pre_process(self, file_bytes: bytes, settings: Dict) -> bytes:
        """Pre-process audio before transcription"""
        return file_bytes

    def post_process(self, transcript: str, settings: Dict) -> str:
        """Post-process transcript text"""
        return transcript

    def get_name(self) -> str:
        """Return plugin name"""
        raise NotImplementedError

class NoiseReductionPlugin(TranscriptionPlugin):
    def get_name(self) -> str:
        return "noise_reduction"

    def pre_process(self, file_bytes: bytes, settings: Dict) -> bytes:
        # Implement noise reduction using an audio processing library.
        # This is a placeholder - an actual implementation would use
        # libraries like librosa, scipy, or pydub
        return file_bytes

class LanguageDetectionPlugin(TranscriptionPlugin):
    def get_name(self) -> str:
        return "language_detection"

    def pre_process(self, file_bytes: bytes, settings: Dict) -> bytes:
        # Detect language and update settings
        # (_detect_language is left to the plugin author)
        detected_language = self._detect_language(file_bytes)
        settings['detected_language'] = detected_language
        return file_bytes

# Plugin manager
class PluginManager:
    def __init__(self):
        self.plugins: List[TranscriptionPlugin] = []

    def register_plugin(self, plugin: TranscriptionPlugin):
        self.plugins.append(plugin)

    def apply_pre_processing(self, file_bytes: bytes, settings: Dict) -> bytes:
        for plugin in self.plugins:
            file_bytes = plugin.pre_process(file_bytes, settings)
        return file_bytes

    def apply_post_processing(self, transcript: str, settings: Dict) -> str:
        for plugin in self.plugins:
            transcript = plugin.post_process(transcript, settings)
        return transcript
```
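
Wiring plugins into the pipeline might look like the sketch below; it redefines a stripped-down base class and manager so it runs standalone, and `WhitespaceCleanupPlugin` is a hypothetical example rather than a shipped plugin:

```python
from typing import Dict, List

class TranscriptionPlugin:
    def post_process(self, transcript: str, settings: Dict) -> str:
        return transcript

class WhitespaceCleanupPlugin(TranscriptionPlugin):
    """Hypothetical plugin: collapse runs of whitespace in the transcript."""
    def post_process(self, transcript: str, settings: Dict) -> str:
        return " ".join(transcript.split())

class PluginManager:
    def __init__(self) -> None:
        self.plugins: List[TranscriptionPlugin] = []

    def register_plugin(self, plugin: TranscriptionPlugin) -> None:
        self.plugins.append(plugin)

    def apply_post_processing(self, transcript: str, settings: Dict) -> str:
        # Plugins run in registration order, each seeing the previous output
        for plugin in self.plugins:
            transcript = plugin.post_process(transcript, settings)
        return transcript

manager = PluginManager()
manager.register_plugin(WhitespaceCleanupPlugin())
cleaned = manager.apply_post_processing("hello    world \n test", {})
```

Because plugins run in registration order, ordering matters when plugins depend on each other's output.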

---

## πŸ”§ Troubleshooting

### Common Development Issues

#### Environment Setup Problems

**Issue**: Azure connection fails
```bash
# Check environment variables
python -c "
import os
print('AZURE_SPEECH_KEY:', bool(os.getenv('AZURE_SPEECH_KEY')))
print('AZURE_BLOB_CONNECTION:', bool(os.getenv('AZURE_BLOB_CONNECTION')))
"

# Test Azure connection
python -c "
from azure.storage.blob import BlobServiceClient
client = BlobServiceClient.from_connection_string('$AZURE_BLOB_CONNECTION')
print('Containers:', list(client.list_containers()))
"
```

**Issue**: FFmpeg not found
```bash
# Check FFmpeg installation
ffmpeg -version

# Install FFmpeg (Ubuntu/Debian)
sudo apt update && sudo apt install ffmpeg

# Install FFmpeg (Windows with Chocolatey)
choco install ffmpeg

# Install FFmpeg (macOS with Homebrew)
brew install ffmpeg
```

**Issue**: Database initialization fails
```python
import os
import sqlite3

# Check database permissions
db_dir = "database"
if not os.path.exists(db_dir):
    os.makedirs(db_dir)
    print(f"Created directory: {db_dir}")

# Test database creation
conn = sqlite3.connect("database/test.db")
conn.execute("CREATE TABLE test (id INTEGER)")
conn.close()
print("Database test successful")
```
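
The checks above can be consolidated into a single startup preflight function. The sketch below reuses the environment variable names from the examples and is illustrative, not part of the app:

```python
import os
import shutil
from typing import List

def preflight_checks() -> List[str]:
    """Return human-readable problems found before startup (empty list = OK)."""
    problems: List[str] = []

    # External binaries
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Required configuration
    for var in ("AZURE_SPEECH_KEY", "AZURE_BLOB_CONNECTION"):
        if not os.environ.get(var):
            problems.append(f"missing environment variable: {var}")

    return problems

for problem in preflight_checks():
    print(f"⚠️ {problem}")
```

Running this at startup surfaces misconfiguration immediately instead of at the first transcription attempt.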

#### Runtime Issues

**Issue**: Memory errors with large files
```python
import psutil

# Monitor memory usage
def check_memory():
    memory = psutil.virtual_memory()
    print(f"Memory usage: {memory.percent}%")
    print(f"Available: {memory.available / 1024**3:.1f}GB")

# Implement file chunking for large uploads
def process_large_file_in_chunks(file_path: str, chunk_size: int = 64*1024*1024):
    with open(file_path, 'rb') as f:
        while chunk := f.read(chunk_size):
            yield chunk
```

**Issue**: Transcription jobs stuck
```python
from datetime import datetime

# Check pending jobs
def diagnose_stuck_jobs():
    pending_jobs = transcription_manager.db.get_pending_jobs()
    print(f"Pending jobs: {len(pending_jobs)}")

    for job in pending_jobs:
        duration = datetime.now() - datetime.fromisoformat(job.created_at)
        print(f"Job {job.job_id}: {job.status} for {duration}")

        if duration.total_seconds() > 3600:  # 1 hour
            print(f"⚠️ Job {job.job_id} may be stuck")

# Reset stuck jobs
def reset_stuck_jobs():
    with transcription_manager.db.get_connection() as conn:
        conn.execute("""
            UPDATE transcriptions
            SET status = 'pending', azure_trans_id = NULL
            WHERE status = 'processing'
              AND created_at < datetime('now', '-1 hour')
        """)
```

**Issue**: Azure API errors
```python
import requests

# Test Azure Speech Service
def test_azure_speech():
    try:
        url = f"{AZURE_SPEECH_KEY_ENDPOINT}/speechtotext/v3.2/transcriptions"
        headers = {"Ocp-Apim-Subscription-Key": AZURE_SPEECH_KEY}

        response = requests.get(url, headers=headers)
        print(f"Status: {response.status_code}")
        print(f"Response: {response.text[:200]}")

    except Exception as e:
        print(f"Azure Speech test failed: {e}")

# Check Azure service status
def check_azure_status():
    status_url = "https://status.azure.com/en-us/status"
    print(f"Check Azure status: {status_url}")
```
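
Transient Azure failures (timeouts, throttling) are usually best handled with retries rather than immediate errors, as the example commit in the Contributing section suggests. A generic sketch with exponential backoff and jitter; the helper name and defaults are illustrative, not existing app code:

```python
import random
import time

def with_retry(call, attempts: int = 4, base_delay: float = 1.0, jitter: float = 0.1):
    """Invoke `call` with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            # 1s, 2s, 4s, ... plus a little jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, jitter))

# Example:
#   response = with_retry(lambda: requests.get(url, headers=headers, timeout=30))
```

In production, catching a narrower exception type (connection and timeout errors only) avoids retrying on genuine bugs.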

### Debugging Tools

#### Debug Mode Configuration
```python
import logging
import os

# Enable debug mode
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"

if DEBUG:
    logging.basicConfig(level=logging.DEBUG)

# Enable Gradio debug mode
demo.launch(debug=True, show_error=True)
```

#### Performance Profiling
```python
import cProfile
import pstats

def profile_function(func):
    """Profile function performance"""
    profiler = cProfile.Profile()

    def wrapper(*args, **kwargs):
        profiler.enable()
        result = func(*args, **kwargs)
        profiler.disable()

        # Print stats
        stats = pstats.Stats(profiler)
        stats.sort_stats('cumulative')
        stats.print_stats(10)  # Top 10 functions

        return result

    return wrapper

# Usage
@profile_function
def submit_transcription(self, ...):
    # Function implementation
    pass
```

#### Log Analysis
```python
import re

def analyze_logs(log_file: str = "logs/transcription.log"):
    """Analyze application logs for issues"""
    errors = []
    warnings = []
    performance_issues = []

    with open(log_file, 'r') as f:
        for line in f:
            if 'ERROR' in line:
                errors.append(line.strip())
            elif 'WARNING' in line:
                warnings.append(line.strip())
            elif 'completed in' in line:
                # Extract timing information
                match = re.search(r'completed in (\d+\.\d+)s', line)
                if match and float(match.group(1)) > 30:  # > 30 seconds
                    performance_issues.append(line.strip())

    print(f"Errors: {len(errors)}")
    print(f"Warnings: {len(warnings)}")
    print(f"Performance issues: {len(performance_issues)}")

    return {
        'errors': errors[-10:],  # Last 10 errors
        'warnings': warnings[-10:],  # Last 10 warnings
        'performance_issues': performance_issues[-10:]
    }
```

### Production Troubleshooting

#### Service Health Check
```bash
#!/bin/bash
# health_check.sh

echo "=== System Health Check ==="

# Check service status
systemctl is-active transcription
systemctl is-active nginx

# Check disk space
df -h

# Check memory usage
free -h

# Check CPU usage
top -b -n1 | grep "Cpu(s)"

# Check logs for errors
tail -n 50 /home/transcription/app/logs/transcription.log | grep ERROR

# Check Azure connectivity
curl -s -o /dev/null -w "%{http_code}" https://azure.microsoft.com/

echo "=== Health Check Complete ==="
```

#### Database Recovery
```python
def recover_database():
    """Recover database from Azure backup"""
    try:
        # List available backups
        container_client = blob_service.get_container_client(AZURE_CONTAINER)
        backups = []

        for blob in container_client.list_blobs(name_starts_with="shared/backups/"):
            backups.append({
                'name': blob.name,
                'modified': blob.last_modified
            })

        # Sort by date (newest first)
        backups.sort(key=lambda x: x['modified'], reverse=True)

        if not backups:
            print("No backups found")
            return

        # Download latest backup
        latest_backup = backups[0]['name']
        print(f"Restoring from: {latest_backup}")

        blob_client = blob_service.get_blob_client(
            container=AZURE_CONTAINER,
            blob=latest_backup
        )

        # Download backup
        with open("database/transcriptions_restored.db", "wb") as f:
            f.write(blob_client.download_blob().readall())

        print("Database restored successfully")
        print("Restart the application to use restored database")

    except Exception as e:
        print(f"Database recovery failed: {str(e)}")
```

---

## πŸ“š Additional Resources

### Documentation Links
- [Azure Speech Services Documentation](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/)
- [Azure Blob Storage Documentation](https://docs.microsoft.com/en-us/azure/storage/blobs/)
- [Gradio Documentation](https://gradio.app/docs/)
- [SQLite Documentation](https://www.sqlite.org/docs.html)
- [FFmpeg Documentation](https://ffmpeg.org/documentation.html)

### Useful Tools
- **Azure Storage Explorer**: GUI for managing blob storage
- **DB Browser for SQLite**: Visual database management
- **Postman**: API testing and development
- **Azure CLI**: Command-line Azure management
- **Visual Studio Code**: Recommended IDE with Azure extensions

### Community Resources
- [Azure Speech Services Community](https://docs.microsoft.com/en-us/answers/topics/azure-speech-services.html)
- [Gradio Community](https://github.com/gradio-app/gradio/discussions)
- [Python Audio Processing Libraries](https://github.com/topics/audio-processing)

---

**This developer guide provides comprehensive information for setting up, developing, deploying, and maintaining the Azure Speech Transcription service. For additional help, refer to the linked documentation and community resources.** πŸš€