Spaces: Elvoro/Tools

Zumrat Kochshegulov committed on Oct 7, 2025

Commit e4d57c9 (unverified) · 2 parents: 0b94fac, cb9baf6

Merge pull request #2 from ElvoroLtd/feat/video-editor

Files changed (11)

.env.example +6 -59
API_SETUP_GUIDE.md +0 -316
QUICKSTART.md +0 -313
README.md +200 -261
config/api_keys.yaml +12 -5
requirements.txt +52 -13
src/api_clients.py +110 -40
src/asset_selector.py +233 -0
src/automation.py +330 -329
src/main.py +28 -25
src/video_renderer.py +382 -55

.env.example CHANGED

@@ -1,75 +1,22 @@
-# ============================================
-# SOMIRA CONTENT AUTOMATION - CONFIGURATION
-# ============================================
-# -------------------- API KEYS --------------------
-# Gemini API (Google AI) - For prompt enhancement and video selection
-# Get yours at: https://aistudio.google.com/app/apikey
 GEMINI_API_KEY=your_gemini_api_key_here
-# RunwayML API - For AI video generation
-# Get yours at: https://dev.runwayml.com/
-RUNWAYML_API_KEY=key_your_runwayml_api_key_here
-# Google Cloud - Service Account for TTS and Storage
-# Path to your service account JSON key file
 GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json
-# OR use Azure TTS (Alternative to Google TTS)
-# AZURE_SPEECH_KEY=your_azure_speech_key_here
-# AZURE_SPEECH_REGION=eastus
-# -------------------- CLOUD STORAGE --------------------
-# Google Cloud Storage bucket name for video storage
-# Create bucket at: https://console.cloud.google.com/storage
 GCS_BUCKET_NAME=your_bucket_name_here
-# -------------------- CONFIGURATION --------------------
-# Audio library size (number of background music tracks available)
 AUDIO_LIBRARY_SIZE=27
-# Video library size (number of product video clips available)
 VIDEO_LIBRARY_SIZE=47
-# Default TTS voice (Google Cloud TTS voices)
-# Options: en-US-AriaNeural, en-US-JennyNeural, en-US-GuyNeural, etc.
-# Full list: https://cloud.google.com/text-to-speech/docs/voices
 DEFAULT_VOICE=en-US-Neural2-F
-# Video rendering quality (low, medium, high, ultra)
 VIDEO_QUALITY=high
-# Enable debug logging (true/false)
 DEBUG_MODE=false
-# -------------------- OPTIONAL SETTINGS --------------------
-# Maximum video generation timeout (seconds)
 VIDEO_GENERATION_TIMEOUT=300
-# Maximum concurrent API requests
 MAX_CONCURRENT_REQUESTS=4
-# Retry attempts for failed API calls
 MAX_RETRY_ATTEMPTS=3
-# Output directory for generated videos
 OUTPUT_DIRECTORY=./output
-# Temp directory for intermediate files
 TEMP_DIRECTORY=/tmp/somira
-# -------------------- NOTES --------------------
-#
-# 1. Never commit this file with actual API keys to version control
-# 2. Copy this file to .env and fill in your actual values
-# 3. Make sure .env is listed in your .gitignore file
-# 4. See API_SETUP_GUIDE.md for detailed setup instructions
-#

+# API Keys
 GEMINI_API_KEY=your_gemini_api_key_here
+RUNWAYML_API_KEY=your_runwayml_api_key_here
+DEEPSEEK_API_KEY=your_deepseek_api_key_here
 GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json
+# Cloud Storage
 GCS_BUCKET_NAME=your_bucket_name_here
+# Configuration
 AUDIO_LIBRARY_SIZE=27
 VIDEO_LIBRARY_SIZE=47
 DEFAULT_VOICE=en-US-Neural2-F
 VIDEO_QUALITY=high
 DEBUG_MODE=false
+# Optional Settings
 VIDEO_GENERATION_TIMEOUT=300
 MAX_CONCURRENT_REQUESTS=4
 MAX_RETRY_ATTEMPTS=3
 OUTPUT_DIRECTORY=./output
 TEMP_DIRECTORY=/tmp/somira
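
The template above stores integers, a boolean flag, and paths as plain strings. The following is a minimal sketch of how a loader might parse them, not code from this commit: `Settings`, `load_settings`, and `missing_keys` are hypothetical names, the fallback values are simply the ones shown in the template, and treating the four keys the README lists as required is an assumption (the diff does not say whether `DEEPSEEK_API_KEY` is mandatory).

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Sequence

# Assumption: these are the keys the README lists as required.
REQUIRED_KEYS = (
    "GEMINI_API_KEY",
    "RUNWAYML_API_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "GCS_BUCKET_NAME",
)


@dataclass(frozen=True)
class Settings:
    """Typed view of the non-secret values in .env.example (hypothetical)."""
    audio_library_size: int
    video_library_size: int
    default_voice: str
    video_quality: str
    debug_mode: bool
    video_generation_timeout: int
    max_concurrent_requests: int
    max_retry_attempts: int
    output_directory: str
    temp_directory: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Parse settings from a mapping, falling back to the template's values."""
    return Settings(
        audio_library_size=int(env.get("AUDIO_LIBRARY_SIZE", "27")),
        video_library_size=int(env.get("VIDEO_LIBRARY_SIZE", "47")),
        default_voice=env.get("DEFAULT_VOICE", "en-US-Neural2-F"),
        video_quality=env.get("VIDEO_QUALITY", "high"),
        debug_mode=env.get("DEBUG_MODE", "false").strip().lower() == "true",
        video_generation_timeout=int(env.get("VIDEO_GENERATION_TIMEOUT", "300")),
        max_concurrent_requests=int(env.get("MAX_CONCURRENT_REQUESTS", "4")),
        max_retry_attempts=int(env.get("MAX_RETRY_ATTEMPTS", "3")),
        output_directory=env.get("OUTPUT_DIRECTORY", "./output"),
        temp_directory=env.get("TEMP_DIRECTORY", "/tmp/somira"),
    )


def missing_keys(env: Mapping[str, str],
                 required: Sequence[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in required if not env.get(key)]
```

Passing a plain dictionary instead of `os.environ` keeps the parser easy to test without touching the real environment.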

API_SETUP_GUIDE.md DELETED

@@ -1,316 +0,0 @@
-# API Setup Guide - Complete Instructions
-This guide will walk you through obtaining all necessary API keys for your Somira video generation system.
----
-## 1. Google Gemini API (Prompt Enhancement)
-### Purpose
-Enhances user prompts and analyzes scripts for intelligent video selection.
-### How to Get Your API Key
-1. **Go to Google AI Studio**
-   - Visit: https://aistudio.google.com/app/apikey
-   - Sign in with your Google account
-2. **Create API Key**
-   - Click "Get API key" button (top left)
-   - Click "Create API key"
-   - Choose "Create API key in new project" (or select existing project)
-   - Copy the API key immediately (shown only once!)
-3. **Add to Your Environment**
-   ```bash
-   export GEMINI_API_KEY="your_api_key_here"
-   ```
-### Pricing
-- Free tier available with rate limits
-- Model used: `gemini-2.0-flash-exp` (optimized for speed and cost)
-### Documentation
-- https://ai.google.dev/gemini-api/docs
----
-## 2. RunwayML API (Video Generation)
-### Purpose
-Generates AI videos from text prompts using Gen-4 model.
-### How to Get Your API Key
-1. **Create Developer Account**
-   - Visit: https://dev.runwayml.com/
-   - Sign up for a new account
-   - Create a new organization (corresponds to your integration)
-2. **Create API Key**
-   - Navigate to "API Keys" tab
-   - Click "Create new key"
-   - Give it a descriptive name (e.g., "Somira Production")
-   - Copy the key immediately and store securely (never shown again)
-3. **Add Credits**
-   - Go to "Billing" tab
-   - Add credits to your organization
-   - Minimum payment: $10 (at $0.01 per credit)
-4. **Add to Your Environment**
-   ```bash
-   export RUNWAYML_API_KEY="key_your_api_key_here"
-   ```
-### Pricing
-- Pay-per-use model with credits
-- Gen-4 Turbo: ~5-10 credits per 10-second video
-- Minimum: $10 to start
-### Documentation
-- https://docs.dev.runwayml.com/
----
-## 3. Google Cloud Text-to-Speech (Azure Alternative)
-### Purpose
-Converts text scripts to natural-sounding speech with timing data for lip-sync.
-### Option A: Google Cloud TTS (Recommended)
-#### How to Get Your API Key
-1. **Create Google Cloud Project**
-   - Visit: https://console.cloud.google.com/
-   - Create new project or select existing
-2. **Enable Text-to-Speech API**
-   - Go to "APIs & Services" > "Library"
-   - Search "Text-to-Speech API"
-   - Click "Enable"
-3. **Create Service Account**
-   - Go to "APIs & Services" > "Credentials"
-   - Click "Create Credentials" > "Service Account"
-   - Download JSON key file
-4. **Add to Your Environment**
-   ```bash
-   export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
-   ```
-#### Pricing
-- Free tier: 1 million characters/month (Standard voices)
-- $4 per million characters after (Standard)
-- $16 per million characters (Neural2/Studio voices)
-### Option B: Azure Cognitive Services TTS
-#### How to Get Your API Key
-1. **Create Azure Account**
-   - Visit: https://portal.azure.com/
-   - Sign up (free tier available)
-2. **Create Speech Service Resource**
-   - Search "Speech Services" in Azure Portal
-   - Click "Create"
-   - Select subscription, resource group, region
-   - Choose pricing tier (F0 for free)
-3. **Get Keys**
-   - Go to your Speech Service resource
-   - Navigate to "Keys and Endpoint"
-   - Copy Key 1 or Key 2
-   - Copy the Region (e.g., eastus)
-4. **Add to Your Environment**
-   ```bash
-   export AZURE_SPEECH_KEY="your_key_here"
-   export AZURE_SPEECH_REGION="eastus"
-   ```
-#### Pricing
-- Free tier: 5 audio hours/month
-- Standard: $1 per audio hour
-- Neural: $16 per million characters
-### Documentation
-- Google: https://cloud.google.com/text-to-speech/docs
-- Azure: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/
----
-## 4. Google Cloud Storage (Video Storage)
-### Purpose
-Stores generated videos, audio files, and video library.
-### How to Set Up
-1. **Create GCS Bucket**
-   - Go to: https://console.cloud.google.com/storage
-   - Click "Create Bucket"
-   - Choose unique name (e.g., "somira-videos")
-   - Select region (same as your app for best performance)
-   - Choose "Standard" storage class
-2. **Set Permissions**
-   - Make bucket public (if videos should be publicly accessible)
-   - Or configure IAM for service account access
-3. **Add to Your Environment**
-   ```bash
-   export GCS_BUCKET_NAME="somira-videos"
-   ```
-### Pricing
-- $0.020 per GB/month (Standard storage)
-- $0.12 per GB egress (after free tier)
-- Free tier: 5GB storage
----
-## Complete .env File Example
-Create a `.env` file in your project root:
-```bash
-# Gemini API (Prompt Enhancement)
-GEMINI_API_KEY=AIzaSyC_your_gemini_key_here
-# RunwayML API (Video Generation)
-RUNWAYML_API_KEY=key_1234567890abcdefghijklmnop
-# Google Cloud TTS (Option A - Recommended)
-GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
-# OR Azure TTS (Option B)
-# AZURE_SPEECH_KEY=your_azure_key_here
-# AZURE_SPEECH_REGION=eastus
-# Google Cloud Storage
-GCS_BUCKET_NAME=somira-videos
-# Configuration
-AUDIO_LIBRARY_SIZE=27
-VIDEO_LIBRARY_SIZE=47
-DEFAULT_VOICE=en-US-AriaNeural
-```
----
-## Security Best Practices
-### DO:
-- Store API keys in environment variables or secret managers
-- Never commit API keys to version control (add .env to .gitignore)
-- Use descriptive names for API keys so you can revoke them later
-- Rotate keys regularly
-- Use separate keys for development and production
-### DON'T:
-- Never expose API keys on the client-side or in client-side code
-- Never hard-code API keys directly in source code
-- Don't share keys in public repositories
----
-## Installation Steps
-1. **Install Dependencies**
-   ```bash
-   pip install -r requirements.txt
-   ```
-2. **Set Up Environment Variables**
-   ```bash
-   cp .env.example .env
-   # Edit .env with your actual keys
-   ```
-3. **Load Environment Variables**
-   ```python
-   from dotenv import load_dotenv
-   load_dotenv()
-   ```
-4. **Test API Connections**
-   ```python
-   from api_clients import APIClients
-   config = {
-       'gemini_api_key': os.getenv('GEMINI_API_KEY'),
-       'runwayml_api_key': os.getenv('RUNWAYML_API_KEY'),
-       'gcs_bucket_name': os.getenv('GCS_BUCKET_NAME'),
-       'video_library_size': 47,
-       'default_voice': 'en-US-AriaNeural'
-   }
-   clients = APIClients(config)
-   ```
----
-## Cost Estimates (Monthly)
-For a moderate usage scenario (100 videos/month):
-| Service | Usage | Cost |
-|---------|-------|------|
-| Gemini API | ~200K tokens | Free (within limits) |
-| RunwayML | 100 videos × 10 sec | ~$50-100 |
-| Google TTS | ~100K characters | Free (within limits) |
-| Google Cloud Storage | 50GB storage + egress | ~$2-5 |
-| **Total** | | **~$52-105/month** |
-Most of the cost comes from RunwayML video generation. Consider:
-- Using shorter video durations (5s instead of 10s)
-- Caching generated videos
-- Using Gen-4 Turbo for faster/cheaper results
----
-## Troubleshooting
-### Common Issues
-1. **"API key not found" errors**
-   - Check environment variables are loaded
-   - Verify .env file location
-   - Restart your application after adding keys
-2. **RunwayML "Insufficient credits"**
-   - Add credits in the billing tab of developer portal
-   - Minimum $10 required to start
-3. **Google Cloud authentication errors**
-   - Verify service account JSON path is correct
-   - Check service account has necessary permissions
-   - Ensure APIs are enabled in Cloud Console
-4. **Rate limiting**
-   - Implement exponential backoff
-   - Add delays between API calls
-   - Consider upgrading to paid tiers
----
-## Support Resources
-- **Gemini**: https://ai.google.dev/support
-- **RunwayML**: https://help.runwayml.com/
-- **Google Cloud**: https://cloud.google.com/support
-- **Azure**: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-text-to-speech
----
-## Next Steps
-1. Obtain all API keys following the instructions above
-2. Configure your .env file
-3. Test each API endpoint individually
-4. Run the full video generation pipeline
-5. Monitor usage and costs in each platform's dashboard
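
The deleted guide's rate-limiting advice ("Implement exponential backoff") is not spelled out anywhere in the removed text. The following is a minimal sketch of that technique, assuming a generic callable rather than any specific API client in this repository; `retry_with_backoff` and its parameters are hypothetical, and the default of three attempts only mirrors `MAX_RETRY_ATTEMPTS=3` from `.env.example`.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on any exception with exponentially growing waits.

    The wait before retry k (0-based) is min(max_delay, base_delay * 2**k)
    plus up to 10% random jitter. The last failure is re-raised unchanged.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

In real use the `except` clause would probably be narrowed to the client's rate-limit or transient-error exceptions, so that configuration errors such as an invalid key fail immediately instead of being retried.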

QUICKSTART.md DELETED

@@ -1,313 +0,0 @@
-# 🚀 Quick Start Guide
-Get your Somira Content Automation System up and running in 5 minutes!
----
-## Prerequisites
-- Python 3.8 or higher
-- pip (Python package manager)
-- API keys (see [API_SETUP_GUIDE.md](API_SETUP_GUIDE.md))
----
-## Installation
-### 1. Clone or Download the Project
-```bash
-cd somira-automation
-```
-### 2. Create Virtual Environment (Recommended)
-```bash
-# Create virtual environment
-python -m venv venv
-# Activate it
-# On macOS/Linux:
-source venv/bin/activate
-# On Windows:
-venv\Scripts\activate
-```
-### 3. Install Dependencies
-```bash
-pip install -r requirements.txt
-```
----
-## Configuration
-### 1. Set Up Environment Variables
-```bash
-# Copy example file
-cp .env.example .env
-# Edit with your API keys
-nano .env  # or use your favorite editor
-```
-**Required values in `.env`:**
-- `GEMINI_API_KEY` - Get from https://aistudio.google.com/app/apikey
-- `RUNWAYML_API_KEY` - Get from https://dev.runwayml.com/
-- `GOOGLE_APPLICATION_CREDENTIALS` - Path to GCP service account JSON
-- `GCS_BUCKET_NAME` - Your Google Cloud Storage bucket name
-### 2. Verify Configuration
-```bash
-python main.py --health-check
-```
-You should see:
-```
-✓ Gemini API: Connected
-✓ RunwayML API: Configured
-✓ TTS API: Configured
-✓ Google Cloud Storage: Connected
-✅ Health check passed
-```
----
-## Usage
-### Basic Usage (Default Content)
-```bash
-python main.py
-```
-This will:
-1. Generate a hook video using AI
-2. Select background music
-3. Choose 3 relevant product videos
-4. Generate text-to-speech audio
-5. Render the final video with subtitles
-6. Upload to Google Cloud Storage
-### Custom Content
-```bash
-python main.py \
-  --strategy example_strategy.json \
-  --script example_script.txt \
-  --output ./output/my_video
-```
-### Run a Quick Test
-```bash
-python main.py --test
-```
-This runs a minimal test to verify everything works without using many credits.
----
-## Command Line Options
-```bash
-python main.py [OPTIONS]
-Options:
-  --strategy FILE    Path to JSON file with content strategy
-  --script FILE      Path to text file with TTS script
-  --output DIR       Output directory for results
-  --health-check     Run health check on all services
-  --test             Run test pipeline with minimal resources
-  --verbose          Enable verbose logging
-  --help             Show help message
-```
----
-## Example Workflows
-### Create Multiple Videos from Different Scripts
-```bash
-# Video 1
-python main.py \
-  --script scripts/script1.txt \
-  --output output/video1
-# Video 2
-python main.py \
-  --script scripts/script2.txt \
-  --output output/video2
-# Video 3
-python main.py \
-  --script scripts/script3.txt \
-  --output output/video3
-```
-### Custom Strategy with Different Style
-Create `my_strategy.json`:
-```json
-{
-  "brand": "Somira",
-  "gemini_prompt": "Your custom prompt here...",
-  "runway_prompt": "Your custom RunwayML prompt...",
-  "style": "minimal",
-  "aspect_ratio": "16:9",
-  "duration": 10
-}
-```
-Then run:
-```bash
-python main.py --strategy my_strategy.json
-```
----
-## Understanding the Pipeline
-The automation runs in 4 steps:
-**Step 1: Asset Generation (Parallel)** ⚡
-- Generate hook video with AI (RunwayML)
-- Select background music (from library)
-- Select 3 product videos (AI-powered)
-- Generate voice-over (TTS)
-**Step 2: Video Rendering** 🎬
-- Merge all videos
-- Add audio tracks
-- Apply transitions and effects
-**Step 3: Subtitle Addition** 📝
-- Generate subtitles from TTS timing
-- Overlay on video
-**Step 4: Cloud Upload** ☁️
-- Upload to Google Cloud Storage
-- Generate public URL
----
-## File Structure
-```
-somira-automation/
-├── main.py                 # Main entry point
-├── automation.py           # Pipeline orchestrator
-├── api_clients.py          # API integrations
-├── video_renderer.py       # Video processing
-├── utils.py                # Utilities and logging
-├── requirements.txt        # Python dependencies
-├── .env                    # Your API keys (DO NOT COMMIT)
-├── .env.example            # Template for .env
-├── example_strategy.json   # Sample content strategy
-├── example_script.txt      # Sample TTS script
-├── API_SETUP_GUIDE.md      # Detailed API setup
-└── QUICKSTART.md           # This file
-```
----
-## Troubleshooting
-### "Module not found" errors
-```bash
-pip install -r requirements.txt
-```
-### "API key not found" errors
-```bash
-# Check your .env file exists and has the right keys
-cat .env
-# Make sure you've loaded it
-python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.getenv('GEMINI_API_KEY'))"
-```
-### RunwayML "Insufficient credits"
-- Add credits at https://dev.runwayml.com/ (minimum $10)
-### Google Cloud authentication errors
-```bash
-# Verify your service account JSON exists
-ls -l /path/to/service-account-key.json
-# Set it in your .env
-GOOGLE_APPLICATION_CREDENTIALS=/full/path/to/service-account-key.json
-```
-### Videos taking too long
-- RunwayML video generation takes 30-60 seconds typically
-- The `--test` command uses minimal resources for quick testing
----
-## Cost Estimates
-For 100 videos per month:
-| Service | Cost |
-|---------|------|
-| Gemini API | Free (within limits) |
-| RunwayML | ~$50-100 |
-| Google TTS | Free (within limits) |
-| Google Storage | ~$2-5 |
-| **Total** | **~$52-105/month** |
-💡 **Tip:** Use the `--test` command frequently to avoid unnecessary API costs during development.
----
-## Next Steps
-1. ✅ Complete API setup (see [API_SETUP_GUIDE.md](API_SETUP_GUIDE.md))
-2. ✅ Run health check: `python main.py --health-check`
-3. ✅ Run test: `python main.py --test`
-4. ✅ Generate your first video: `python main.py`
-5. 📚 Customize: Edit `example_strategy.json` and `example_script.txt`
-6. 🚀 Scale: Create multiple strategies and automate batch processing
----
-## Support
-- **API Issues:** See [API_SETUP_GUIDE.md](API_SETUP_GUIDE.md)
-- **Bugs:** Check logs in console output
-- **Questions:** Review code comments in `main.py` and `automation.py`
----
-## Tips for Best Results
-### Prompt Engineering
-- Be specific about visual details
-- Include camera movements
-- Specify lighting and mood
-- Mention aspect ratio for consistency
-### TTS Scripts
-- Keep sentences natural and conversational
-- Use pauses (commas, periods) for pacing
-- Test different voices in `DEFAULT_VOICE` setting
-- Aim for 15-30 seconds of speech
-### Video Selection
-- The AI analyzes your script for context
-- More descriptive scripts = better video selection
-- Review selected videos in logs
-### Performance
-- Parallel execution makes Step 1 fast
-- Most time is spent waiting for RunwayML
-- Use `--test` to verify setup without long waits
----
-Happy automating! 🎉
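
The removed quick-start describes Step 1 as running four asset tasks in parallel. The following is a minimal sketch of that pattern with `asyncio.gather`, using stand-in coroutines: every function name here is hypothetical, the return values are placeholders, and none of it reflects the actual code in `src/automation.py`.

```python
import asyncio
from typing import Dict, List


async def generate_hook_video(prompt: str) -> str:
    await asyncio.sleep(0)  # stand-in for the video-generation API call
    return f"hook:{prompt}"


async def select_music() -> str:
    await asyncio.sleep(0)  # stand-in for picking a track from the library
    return "music:track"


async def select_product_videos(script: str, count: int = 3) -> List[str]:
    await asyncio.sleep(0)  # stand-in for AI-assisted clip selection
    return [f"clip:{i}" for i in range(count)]


async def synthesize_voice(script: str) -> str:
    await asyncio.sleep(0)  # stand-in for the text-to-speech request
    return f"voice:{len(script)}"


async def generate_assets(prompt: str, script: str) -> Dict[str, object]:
    """Run the four Step 1 tasks concurrently and collect their results."""
    # gather returns results in the order the awaitables were passed in.
    hook, music, clips, voice = await asyncio.gather(
        generate_hook_video(prompt),
        select_music(),
        select_product_videos(script, count=3),
        synthesize_voice(script),
    )
    return {
        "hook_video": hook,
        "music": music,
        "product_videos": clips,
        "voice_over": voice,
    }
```

Because the tasks are independent, the total wait is roughly that of the slowest one, which matches the guide's note that most time is spent waiting for RunwayML.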

README.md CHANGED

@@ -1,359 +1,298 @@
-# 🎬 Somira Content Automation System
-**Automated video generation pipeline for product advertisements using AI**
-Transform text scripts into professional product videos with AI-generated content, voice-overs, and intelligent video selection - all automated end-to-end.
----
-## ✨ Features
-- **🤖 AI-Powered Video Generation** - Create unique hook videos using RunwayML Gen-4
-- **🧠 Intelligent Prompt Enhancement** - Gemini AI optimizes prompts for better results
-- **🎙️ Professional Text-to-Speech** - Natural voice-overs with Google Cloud TTS
-- **📹 Smart Video Selection** - AI analyzes scripts to select relevant product footage
-- **🎵 Automatic Music Integration** - Background music from curated library
-- **📝 Subtitle Generation** - Automatic subtitle overlay with timing
-- **⚡ Parallel Processing** - Concurrent API calls for maximum speed
-- **☁️ Cloud Storage** - Automatic upload to Google Cloud Storage
-- **🔄 Robust Error Handling** - Fallback mechanisms for reliability
----
-## 🎯 Use Cases
-- Product advertisement videos for social media
-- Instagram Reels and TikTok content
-- Automated marketing video generation
-- A/B testing different video hooks
-- Scalable video production pipelines
-- Content marketing automation
----
-## 📋 Requirements
-- **Python 3.8+**
-- **API Keys:**
-  - Google Gemini API (free tier available)
-  - RunwayML API ($10 minimum)
-  - Google Cloud Platform account (TTS + Storage)
-- **Storage:** ~1GB for video library
-- **RAM:** 4GB minimum
 ---
 ## 🚀 Quick Start
 ### 1. Installation
 ```bash
-# Clone repository
-git clone <your-repo-url>
 cd somira-automation
-# Create virtual environment
 python -m venv venv
-source venv/bin/activate  # On Windows: venv\Scripts\activate
 # Install dependencies
 pip install -r requirements.txt
 ```
-### 2. Configuration
 ```bash
-# Copy environment template
 cp .env.example .env
-# Edit with your API keys
-nano .env
 ```
-**Required API Keys:**
-- `GEMINI_API_KEY` - https://aistudio.google.com/app/apikey
-- `RUNWAYML_API_KEY` - https://dev.runwayml.com/
-- `GOOGLE_APPLICATION_CREDENTIALS` - GCP service account JSON
-- `GCS_BUCKET_NAME` - Your GCS bucket name
-### 3. Verify Setup
 ```bash
 python main.py --health-check
 ```
-### 4. Generate Your First Video
 ```bash
 python main.py
 ```
-**📚 For detailed setup instructions, see [QUICKSTART.md](QUICKSTART.md)**
 ---
-## 📖 Documentation
-| Document | Description |
-|----------|-------------|
-| [QUICKSTART.md](QUICKSTART.md) | Get started in 5 minutes |
-| [API_SETUP_GUIDE.md](API_SETUP_GUIDE.md) | Detailed API key setup |
-| [example_strategy.json](example_strategy.json) | Sample content strategy |
-| [example_script.txt](example_script.txt) | Sample TTS script |
----
-## 🏗️ Architecture
-```
-┌─────────────────────────────────────────────────────┐
-│                   MAIN PIPELINE                      │
-└─────────────────────────────────────────────────────┘
-                          │
-                          ▼
-┌─────────────────────────────────────────────────────┐
-│           STEP 1: Asset Generation (Parallel)        │
-├─────────────────────────────────────────────────────┤
-│  ┌──────────────┐  ┌──────────────┐                │
-│  │ Gemini API   │→ │ RunwayML API │                │
-│  │ (Enhance)    │  │ (Hook Video) │                │
-│  └──────────────┘  └──────────────┘                │
-│                                                      │
-│  ┌──────────────┐  ┌──────────────┐                │
-│  │ Music        │  │ Video        │                │
-│  │ Selection    │  │ Selection AI │                │
-│  └──────────────┘  └──────────────┘                │
-│                                                      │
-│  ┌──────────────┐                                   │
-│  │ Google TTS   │                                   │
-│  │ (Voice-over) │                                   │
-│  └──────────────┘                                   │
-└─────────────────────────────────────────────────────┘
-                          │
-                          ▼
-┌─────────────────────────────────────────────────────┐
-│          STEP 2: Video Rendering & Merging           │
-├─────────────────────────────────────────────────────┤
-│  • Merge hook + library videos                      │
-│  • Add background music                             │
-│  • Mix voice-over audio                             │
-│  • Apply transitions                                │
-└─────────────────────────────────────────────────────┘
-                          │
-                          ▼
-┌─────────────────────────────────────────────────────┐
-│            STEP 3: Subtitle Generation               │
-├─────────────────────────────────────────────────────┤
-│  • Extract timing from TTS                          │
-│  • Generate subtitle file                           │
-│  • Overlay on video                                 │
-└─────────────────────────────────────────────────────┘
-                          │
-                          ▼
-┌─────────────────────────────────────────────────────┐
-│             STEP 4: Cloud Storage Upload             │
-├─────────────────────────────────────────────────────┤
-│  • Upload to Google Cloud Storage                   │
-│  • Generate public URL                              │
-│  • Save metadata                                    │
-└─────────────────────────────────────────────────────┘
-```
 ---
-## 💻 Usage Examples
-### Basic Usage
 ```bash
-# Use default content
 python main.py
-# Output:
-# ✅ Pipeline completed successfully
-# 📹 Final Video: https://storage.googleapis.com/...
 ```
 ### Custom Content
 ```bash
-# Use custom strategy and script
-python main.py \
-  --strategy campaigns/holiday_2025.json \
-  --script scripts/holiday_promo.txt \
-  --output ./output/holiday_video
 ```
 ### Batch Processing
 ```python
 import asyncio
 from automation import ContentAutomation
-async def generate_multiple_videos():
     automation = ContentAutomation(config)
-    scripts = [
-        "scripts/script1.txt",
-        "scripts/script2.txt",
-        "scripts/script3.txt"
-    ]
     for script_file in scripts:
         with open(script_file) as f:
-            script = f.read()
-        result = await automation.execute_pipeline(
-            content_strategy=strategy,
-            tts_script=script
-        )
-        print(f"Generated: {result['final_url']}")
-asyncio.run(generate_multiple_videos())
-```
-### Health Check
-```bash
-python main.py --health-check
-# Output:
-# 🏥 Running health check...
-#   ✓ Gemini API: Connected
-#   ✓ RunwayML API: Configured
-#   ✓ TTS API: Configured
-#   ✓ Google Cloud Storage: Connected
-# ✅ All systems operational!
 ```
 ---
-## 🔧 Configuration
-### Content Strategy Format
-```json
-{
-  "brand": "Somira",
-  "gemini_prompt": "Descriptive prompt for enhancement",
-  "runway_prompt": "Specific prompt for video generation",
-  "style": "commercial",
-  "aspect_ratio": "9:16",
-  "duration": 5,
-  "platform": "Instagram Reels / TikTok"
-}
-```
-### Environment Variables
-| Variable | Required | Description |
-|----------|----------|-------------|
-| `GEMINI_API_KEY` | Yes | Google Gemini API key |
-| `RUNWAYML_API_KEY` | Yes | RunwayML API key |
-| `GOOGLE_APPLICATION_CREDENTIALS` | Yes | Path to GCP service account JSON |
-| `GCS_BUCKET_NAME` | Yes | Google Cloud Storage bucket |
-| `AUDIO_LIBRARY_SIZE` | No | Number of music tracks (default: 27) |
-| `VIDEO_LIBRARY_SIZE` | No | Number of video clips (default: 47) |
-| `DEFAULT_VOICE` | No | TTS voice name (default: en-US-Neural2-F) |
----
-## 📊 Performance
-- **Step 1 (Parallel):** 30-60 seconds (depends on RunwayML)
-- **Step 2 (Rendering):** 10-20 seconds
-- **Step 3 (Subtitles):** 5-10 seconds
-- **Step 4 (Upload):** 5-15 seconds
-**Total:** ~50-105 seconds per video
----
-## 💰 Cost Analysis
-### Per Video Cost
-| Service | Cost | Notes |
-|---------|------|-------|
-| Gemini API | ~$0.001 | Usually free tier |
-| RunwayML Gen-4 | $0.50-1.00 | Varies by duration |
-| Google TTS | ~$0.001 | Usually free tier |
-| GCS Storage | ~$0.001 | Per video |
-| **Total per video** | **~$0.50-1.00** | |
-### Monthly Estimates (100 videos)
-- Gemini: Free (within free tier)
-- RunwayML: $50-100
-- Google TTS: Free (within 1M chars/month)
-- GCS: $2-5
-- **Total: $52-105/month**
 ---
-## 🛡️ Error Handling
-The system includes comprehensive error handling:
-- ✅ **Automatic retries** for transient API failures
-- ✅ **Fallback mechanisms** for video/music selection
-- ✅ **Graceful degradation** when optional features fail
-- ✅ **Detailed logging** for debugging
-- ✅ **Partial results** saved on pipeline failure
----
-## 📁 Project Structure
 ```
-somira-automation/
-├── main.py                  # CLI entry point
-├── automation.py            # Pipeline orchestrator
-├── api_clients.py           # API integrations (Gemini, RunwayML, TTS, GCS)
-├── video_renderer.py        # Video processing and rendering
-├── utils.py                 # Logging and utility functions
-├── requirements.txt         # Python dependencies
-├── .env.example             # Environment variables template
-├── example_strategy.json    # Sample content strategy
-├── example_script.txt       # Sample TTS script
-├── README.md                # This file
-├── QUICKSTART.md            # Quick start guide
-└── API_SETUP_GUIDE.md       # Detailed API setup instructions
-```
 ---
-## 🔐 Security Best Practices
-1. **Never commit `.env` file** - Added to `.gitignore`
-2. **Use environment variables** - No hardcoded keys
-3. **Restrict API key permissions** - Minimum necessary access
-4. **Rotate keys regularly** - Every 90 days recommended
-5. **Monitor API usage** - Set up billing alerts
-6. **Use service accounts** - For GCP resources
----
-## 🐛 Troubleshooting
-### Common Issues
-**"Module not found"**
-```bash
-pip install -r requirements.txt
-```
-**"API key not valid"**
-- Check your `.env` file
-- Verify keys are correctly copied (no extra spaces)
-- Ensure APIs are enabled in respective consoles
-**"Insufficient credits" (RunwayML)**
-- Add credits at https://dev.runwayml.com/
-- Minimum $10 required
-**"Permission denied" (GCS)**
-- Check service account has Storage Admin role
-- Verify `GOOGLE_APPLICATION_CREDENTIALS` path is correct
-**Videos taking too long**

+# 🎬 Somira Content Automation
+**AI-powered video generation pipeline that transforms text scripts into professional product advertisements.**
 ---
 ## 🚀 Quick Start
 ### 1. Installation
 ```bash
+# Clone and setup
+git clone <your-repo>
 cd somira-automation
+# Create virtual environment (recommended)
 python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
 # Install dependencies
 pip install -r requirements.txt
 ```
+### 2. API Setup
+**You need these API keys:**
+#### Gemini API (Free)
+1. Go to https://aistudio.google.com/app/apikey
+2. Click "Create API Key"
+3. Copy the key
+#### RunwayML API ($10 minimum)
+1. Go to https://dev.runwayml.com/
+2. Sign up and create organization
+3. Go to "API Keys" → "Create new key"
+4. Add $10+ credits in "Billing" tab
+#### Google Cloud (Free tier available)
+1. Go to https://console.cloud.google.com/
+2. Create project → Enable "Text-to-Speech API"
+3. Create service account → Download JSON key
+4. Create storage bucket
+### 3. Configuration
 ```bash
+# Copy and edit environment file
 cp .env.example .env
 ```
+Edit `.env` with your keys:
+```bash
+# Required API Keys
+GEMINI_API_KEY=AIzaSyC_your_key_here
+RUNWAYML_API_KEY=key_your_runwayml_key_here
+GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
+GCS_BUCKET_NAME=your-bucket-name
+# Optional Settings
+DEFAULT_VOICE=en-US-Neural2-F
+AUDIO_LIBRARY_SIZE=27
+VIDEO_LIBRARY_SIZE=47
+```
+### 4. Verify Setup
 ```bash
 python main.py --health-check
 ```
+You should see: `✅ All systems operational!`
+### 5. Generate Your First Video
 ```bash
 python main.py
 ```
 ---
+## 🎯 What It Does
+This system automatically creates 15-second vertical videos (perfect for TikTok/Reels) by:
+1. **AI Video Generation** - Creates unique hook videos using RunwayML Gen-4
+2. **Smart Content Selection** - Gemini AI analyzes your script to pick relevant product footage
+3. **Professional Voice-overs** - Converts text to natural speech using Google TTS
+4. **Auto Editing** - Merges videos, adds background music, subtitles, and effects
+5. **Cloud Storage** - Uploads final videos to Google Cloud Storage
+**Pipeline Time**: ~1-2 minutes per video
 ---
+## 💻 Usage
+### Basic Commands
 ```bash
+# Generate video with default content
 python main.py
+# Test system (uses minimal credits)
+python main.py --test
+# Health check
+python main.py --health-check
+# Custom content
+python main.py --strategy strategy.json --script script.txt
 ```
 ### Custom Content
+Create `my_script.txt`:
+```
+I heard a pop and my neck was stuck. After one minute with Somira massager, the pain was gone. This product actually works!
+```
+Create `my_strategy.json`:
+```json
+{
+  "brand": "Somira",
+  "gemini_prompt": "A dramatic scene showing neck pain relief",
+  "runway_prompt": "Person experiencing neck pain then relief",
+  "style": "commercial",
+  "aspect_ratio": "9:16",
+  "duration": 5
+}
+```
+Run:
 ```bash
+python main.py --strategy my_strategy.json --script my_script.txt
 ```
 ### Batch Processing
 ```python
 import asyncio
 from automation import ContentAutomation
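+async def create_videos():
+    # `config` and `strategy` are assumed to be loaded beforehand,
+    # e.g. from .env and my_strategy.json.
     automation = ContentAutomation(config)
+    scripts = ["script1.txt", "script2.txt", "script3.txt"]
     for script_file in scripts:
         with open(script_file) as f:
+            result = await automation.execute_pipeline(strategy, f.read())
+        # On failure the pipeline returns 'success': False and 'error', with no 'final_url'
+        if result.get('success'):
+            print(f"Created: {result['final_url']}")
+        else:
+            print(f"Failed: {script_file}: {result.get('error')}")
+asyncio.run(create_videos())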
 ```
 ---
+## 💰 Pricing
+### Cost Per Video
+| Service | Cost |
+|---------|------|
+| RunwayML (5s video) | ~$0.50 |
+| Gemini API | ~$0.001 |
+| Google TTS | ~$0.001 |
+| Cloud Storage | ~$0.001 |
+| **Total** | **~$0.50** |
+### Monthly Estimate (100 videos)
+- **RunwayML**: $50
+- **Other services**: $2-5
+- **Total**: ~$55/month
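As a rough cross-check of the figures above, the per-video table can be multiplied out. This is only a sketch: the prices are the README's approximate values, and `monthly_cost` is a hypothetical helper, not part of the repository.

```python
# Approximate per-video prices copied from the README table (USD).
COST_PER_VIDEO = {
    "runwayml": 0.50,
    "gemini": 0.001,
    "google_tts": 0.001,
    "cloud_storage": 0.001,
}


def monthly_cost(videos: int) -> float:
    """Return the estimated spend for `videos` videos, rounded to cents."""
    return round(videos * sum(COST_PER_VIDEO.values()), 2)


if __name__ == "__main__":
    print(f"100 videos: ${monthly_cost(100):.2f}")
```

Going by the table alone, 100 videos come to roughly $50, so the ~$55/month estimate presumably leaves headroom for retries or usage that the table does not itemise.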
+---
+## 🏗️ How It Works
+### Pipeline Steps
+1. **Asset Generation** (30-60s)
+   - AI creates hook video from prompt
+   - Selects 3 relevant product videos
+   - Generates voice-over from script
+   - Picks background music
+2. **Video Composition** (10-20s)
+   - Merges all video clips
+   - Adds audio tracks and music
+   - Applies transitions
+3. **Subtitles** (5-10s)
+   - Generates animated subtitles
+   - Times them to voice-over
+4. **Cloud Upload** (5-15s)
+   - Uploads to Google Cloud Storage
+   - Returns public URL
+### Output Specifications
+- **Format**: MP4, H.264
+- **Aspect Ratio**: 9:16 (vertical)
+- **Duration**: 15 seconds max
+- **Resolution**: 1080x1920
+- **Audio**: 44.1kHz, stereo
+---
+## 🔧 Technical Details
+### Project Structure
+```
+somira-automation/
+├── main.py                 # CLI entry point
+├── automation.py           # Pipeline orchestrator
+├── api_clients.py          # Gemini, RunwayML, TTS, GCS
+├── video_renderer.py       # Video processing engine
+├── asset_selector.py       # AI video selection
+├── utils.py                # Logging & utilities
+├── requirements.txt        # Python dependencies
+└── config/
+    ├── api_keys.yaml       # API configurations
+    └── content_strategies.yaml
+```
+### Key Dependencies
+- `moviepy` - Video editing and composition
+- `google-generativeai` - Gemini API client
+- `google-cloud-texttospeech` - TTS service
+- `google-cloud-storage` - Cloud storage
+- `aiohttp` - Async HTTP requests
+- `pandas` - Data processing
+### API Requirements
+- **Gemini**: Free tier available
+- **RunwayML**: $10 minimum deposit
+- **Google Cloud**: $300 free credits for new accounts
+- **Storage**: 5GB free tier
 ---
+## 🐛 Troubleshooting
+### Common Issues
+**"API key not found"**
+- Check `.env` file exists and has correct keys
+- Restart terminal after adding keys to `.env`
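A quick way to narrow down an "API key not found" error is to list which required settings are unset. The sketch below assumes the required keys are the four listed above the `# Optional Settings` comment in the README's `.env` example; `missing_settings` is a hypothetical helper, not a function in this repository.

```python
import os
from typing import List, Mapping

# Keys assumed to be required (taken from the README's .env example).
REQUIRED_KEYS = (
    "GEMINI_API_KEY",
    "RUNWAYML_API_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "GCS_BUCKET_NAME",
)


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are unset or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    problems = missing_settings(os.environ)
    print("Missing:", ", ".join(problems) if problems else "none")
```

The check reads the process environment, so it only reflects `.env` values once they have been loaded into it.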
+**"Insufficient RunwayML credits"**
+- Add credits at https://dev.runwayml.com/
+- Minimum $10 required
+**"Google Cloud permission denied"**
+- Verify service account JSON path in `.env`
+- Check service account has "Storage Admin" role
+**"Module not found"**
+```bash
+pip install -r requirements.txt
 ```
+**Videos taking too long**
+- RunwayML generation takes 30-60 seconds
+- Use `--test` for quick verification
+### Performance Tips
+- Keep scripts under 200 characters for optimal TTS
+- Use specific, visual prompts for better AI videos
+- Test with `--test` flag before full runs
+- Monitor API usage in respective dashboards
 ---
+## 📞 Support
+### Debugging
+- Run with `--verbose` for detailed logs
+- Check console output for specific error messages
+- Verify all APIs are enabled in their consoles
+### Cost Control
+- Use `--test` frequently during development
+- Set billing alerts in Google Cloud & RunwayML
+- Monitor usage in API dashboards
+### Security
+- ✅ Never commit `.env` file (included in `.gitignore`)
+- ✅ Use environment variables for all keys
+- ✅ Rotate API keys every 90 days
+- ❌ Never hardcode keys in source files
+---
+## 🎉 Next Steps
+1. ✅ Complete API setup
+2. ✅ Run `python main.py --health-check`
+3. ✅ Test with `python main.py --test`
+4. ✅ Generate first video with `python main.py`
+5. 🚀 Customize scripts and strategies for your products
+6. 📈 Scale with batch processing for multiple videos
+**Need help?** Check the error messages in the console; they are designed to be specific about what went wrong.
+---
+*Happy video generating! 🎬*

config/api_keys.yaml CHANGED

@@ -1,17 +1,24 @@
-# API Configuration
 gemini:
   base_url: "https://generativelanguage.googleapis.com/v1beta"
-  model: "gemini-pro"
 runwayml:
   base_url: "https://api.runwayml.com/v1"
   timeout: 300
 tts:
-  provider: "azure"  # or "google", "amazon"
-  voice: "en-US-AriaNeural"
-  rate: "medium"
 gcs:
   bucket: "somira-videos"
   video_prefix: "automated-content/"

 gemini:
   base_url: "https://generativelanguage.googleapis.com/v1beta"
+  model: "gemini-2.0-flash-exp"
 runwayml:
   base_url: "https://api.runwayml.com/v1"
   timeout: 300
+deepseek:
+  base_url: "https://api.deepseek.com/v1"
+  model: "deepseek-chat"
 tts:
+  provider: "google"
+  voice: "en-US-Neural2-F"
 gcs:
   bucket: "somira-videos"
   video_prefix: "automated-content/"
+video:
+  max_duration: 15
+  aspect_ratio: "9:16"
+  target_resolution: "1080x1920"
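The new `video` block stores the aspect ratio and the resolution as separate strings, so the two can drift apart. A minimal consistency check, assuming the `"W:H"` and `"WxH"` formats shown above; `parse_pair` and `resolution_matches_ratio` are hypothetical names, not part of the repository.

```python
from fractions import Fraction
from typing import Tuple


def parse_pair(text: str, sep: str) -> Tuple[int, int]:
    """Split an 'AxB' or 'A:B' style string into two positive integers."""
    left, right = text.lower().split(sep)
    a, b = int(left), int(right)
    if a <= 0 or b <= 0:
        raise ValueError(f"expected positive integers in {text!r}")
    return a, b


def resolution_matches_ratio(resolution: str, aspect_ratio: str) -> bool:
    """True when width/height equals the declared aspect ratio exactly."""
    width, height = parse_pair(resolution, "x")
    ratio_w, ratio_h = parse_pair(aspect_ratio, ":")
    return Fraction(width, height) == Fraction(ratio_w, ratio_h)


if __name__ == "__main__":
    print(resolution_matches_ratio("1080x1920", "9:16"))
```

Using `Fraction` avoids floating-point comparison when checking the ratio.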

requirements.txt CHANGED

@@ -1,17 +1,56 @@
-# Core async HTTP
-aiohttp==3.9.5
 aiofiles==23.2.1
-# Google AI (Gemini)
-google-generativeai==0.8.3
-# Google Cloud Services
 google-cloud-storage==2.18.2
 google-cloud-texttospeech==2.17.2
-# Environment variables
 python-dotenv==1.0.1
-# Utilities
-asyncio==3.4.3
-typing-extensions==4.12.2

 aiofiles==23.2.1
+aiohttp==3.9.5
+aiosignal==1.4.0
+annotated-types==0.7.0
+attrs==25.3.0
+cachetools==5.5.2
+certifi==2025.8.3
+charset-normalizer==3.4.3
+decorator==4.4.2
+frozenlist==1.7.0
+google-ai-generativelanguage==0.6.10
+google-api-core==2.25.1
+google-api-python-client==2.184.0
+google-auth==2.40.3
+google-auth-httplib2==0.2.0
+google-cloud-core==2.4.3
 google-cloud-storage==2.18.2
 google-cloud-texttospeech==2.17.2
+google-crc32c==1.7.1
+google-generativeai==0.8.3
+google-resumable-media==2.7.2
+googleapis-common-protos==1.70.0
+grpcio==1.75.1
+grpcio-status==1.71.2
+httplib2==0.31.0
+idna==3.10
+imageio==2.37.0
+imageio-ffmpeg==0.6.0
+moviepy==1.0.3
+multidict==6.6.4
+numpy==1.26.4
+pandas==2.3.3
+pillow==11.3.0
+proglog==0.1.12
+propcache==0.4.0
+proto-plus==1.26.1
+protobuf==5.29.5
+pyasn1==0.6.1
+pyasn1_modules==0.4.2
+pydantic==2.11.10
+pydantic_core==2.33.2
+pyparsing==3.2.5
+python-dateutil==2.9.0.post0
 python-dotenv==1.0.1
+pytz==2025.2
+PyYAML==6.0.3
+requests==2.32.5
+rsa==4.9.1
+six==1.17.0
+tqdm==4.67.1
+typing-inspection==0.4.2
+typing_extensions==4.15.0
+tzdata==2025.2
+uritemplate==4.2.0
+urllib3==2.5.0
+yarl==1.21.0

src/api_clients.py CHANGED

@@ -5,7 +5,7 @@ import aiohttp
 import json
 import os
 from typing import Dict, List, Optional
-from google import genai
 from google.cloud import storage, texttospeech
 import asyncio
 from utils import logger
@@ -16,9 +16,8 @@ class APIClients:
         self.config = config
         # Initialize Gemini client
-        self.gemini_client = genai.Client(
-            api_key=config.get('gemini_api_key') or os.getenv('GEMINI_API_KEY')
-        )
         # Initialize GCS client
         self.gcs_client = storage.Client()
@@ -57,11 +56,9 @@ class APIClients:
             Return only the enhanced prompt, nothing else.
             """
-            response = self.gemini_client.models.generate_content(
-                model="gemini-2.0-flash-exp",
-                contents=enhancement_instruction
-            )
             enhanced_prompt = response.text.strip()
             logger.info(f"Enhanced prompt: {enhanced_prompt[:100]}...")
@@ -75,20 +72,14 @@ class APIClients:
     async def generate_video(self, prompt: str, duration: int = 10) -> Dict:
         """
         Generate video using RunwayML Gen-4 API
-        Args:
-            prompt: Text prompt for video generation
-            duration: Video duration in seconds (5 or 10)
-        Returns:
-            Dict with video URL and metadata
         """
         try:
             logger.info(f"Generating video with RunwayML: {prompt[:100]}...")
             headers = {
                 "Authorization": f"Bearer {self.runway_api_key}",
-                "Content-Type": "application/json"
             }
             payload = {
@@ -151,20 +142,13 @@ class APIClients:
     async def generate_tts(self, text: str, voice_name: Optional[str] = None) -> Dict:
         """
-        Generate TTS audio using Azure Cognitive Services
-        Args:
-            text: Text to convert to speech
-            voice_name: Azure voice name (default from config)
-        Returns:
-            Dict with audio URL, duration, and lip sync data
         """
         try:
             logger.info(f"Generating TTS for text: {text[:100]}...")
             if not voice_name:
-                voice_name = self.config.get('default_voice', 'en-US-AriaNeural')
             # Configure the speech synthesis request
             synthesis_input = texttospeech.SynthesisInput(text=text)
@@ -184,15 +168,16 @@ class APIClients:
                 pitch=0.0
             )
-            # Perform the text-to-speech request
             response = self.tts_client.synthesize_speech(
                 input=synthesis_input,
                 voice=voice,
-                audio_config=audio_config,
-                enable_time_pointing=[texttospeech.TimePointingType.SSML_MARK]
             )
             # Save audio to temporary file
             audio_filename = f"tts_{hash(text)}.mp3"
             audio_path = f"/tmp/{audio_filename}"
@@ -202,23 +187,111 @@ class APIClients:
             # Upload to GCS
             audio_url = await self.store_in_gcs(audio_path, 'audio')
-            # Extract timing information for lip sync
-            lip_sync_data = self._extract_timing_data(response)
             logger.info(f"TTS generated successfully: {audio_url}")
             return {
                 'audio_url': audio_url,
                 'duration': len(response.audio_content) / 32000,  # Approximate
-                'lip_sync_data': lip_sync_data,
                 'voice': voice_name,
-                'text': text
             }
         except Exception as e:
             logger.error(f"Error generating TTS: {e}")
             raise
     async def select_videos(self, tts_script: str, count: int = 3) -> List[Dict]:
         """
         AI agent selects videos based on script using Gemini
@@ -246,11 +319,8 @@ class APIClients:
             Return as JSON array with format:
             [{{"keyword": "...", "timing": "0-5", "style": "..."}}, ...]
             """
-            response = self.gemini_client.models.generate_content(
-                model="gemini-2.0-flash-exp",
-                contents=analysis_prompt
-            )
             # Parse Gemini response
             try:

 import json
 import os
 from typing import Dict, List, Optional
+import google.generativeai as genai
 from google.cloud import storage, texttospeech
 import asyncio
 from utils import logger
         self.config = config
         # Initialize Gemini client
+        self.gemini_client = genai
+        genai.configure(api_key=config.get('gemini_api_key') or os.getenv('GEMINI_API_KEY'))
         # Initialize GCS client
         self.gcs_client = storage.Client()
             Return only the enhanced prompt, nothing else.
             """
+            model = genai.GenerativeModel('gemini-2.0-flash-exp')
+            response = model.generate_content(enhancement_instruction)
             enhanced_prompt = response.text.strip()
             logger.info(f"Enhanced prompt: {enhanced_prompt[:100]}...")
     async def generate_video(self, prompt: str, duration: int = 10) -> Dict:
         """
         Generate video using RunwayML Gen-4 API
         """
         try:
             logger.info(f"Generating video with RunwayML: {prompt[:100]}...")
             headers = {
                 "Authorization": f"Bearer {self.runway_api_key}",
+                "Content-Type": "application/json",
+                "X-Runway-Version": "1.0.0"  # Add this required header
             }
             payload = {
     async def generate_tts(self, text: str, voice_name: Optional[str] = None) -> Dict:
         """
+        Generate TTS audio using Google Cloud TTS
         """
         try:
             logger.info(f"Generating TTS for text: {text[:100]}...")
             if not voice_name:
+                voice_name = self.config.get('default_voice', 'en-US-Neural2-F')
             # Configure the speech synthesis request
             synthesis_input = texttospeech.SynthesisInput(text=text)
                 pitch=0.0
             )
+            # Remove TimePointingType as it's not available in this version
             response = self.tts_client.synthesize_speech(
                 input=synthesis_input,
                 voice=voice,
+                audio_config=audio_config
+                # Remove: enable_time_pointing=[texttospeech.TimePointingType.SSML_MARK]
             )
             # Save audio to temporary file
+            import tempfile
             audio_filename = f"tts_{hash(text)}.mp3"
             audio_path = f"/tmp/{audio_filename}"
             # Upload to GCS
             audio_url = await self.store_in_gcs(audio_path, 'audio')
+            # Remove lip sync data extraction
             logger.info(f"TTS generated successfully: {audio_url}")
             return {
                 'audio_url': audio_url,
                 'duration': len(response.audio_content) / 32000,  # Approximate
                 'voice': voice_name,
+                'text': text,
+                'local_path': audio_path  # Add local path directly
             }
         except Exception as e:
             logger.error(f"Error generating TTS: {e}")
             raise
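One detail in `generate_tts` worth noting: Python's built-in `hash()` on strings is salted per process unless `PYTHONHASHSEED` is set, and it can be negative, so `tts_{hash(text)}.mp3` names differ between runs. A possible alternative is sketched below. It assumes a content digest is acceptable as the file name; `tts_cache_path` is a hypothetical helper, not code from this commit.

```python
import hashlib
import tempfile
from pathlib import Path


def tts_cache_path(text: str, voice: str = "en-US-Neural2-F") -> Path:
    """Build a deterministic temp-file path for a (voice, text) pair."""
    digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()[:16]
    # tempfile.gettempdir() avoids hardcoding /tmp on non-POSIX systems.
    return Path(tempfile.gettempdir()) / f"tts_{digest}.mp3"


if __name__ == "__main__":
    print(tts_cache_path("Hello from Somira"))
```

Because the name depends only on the voice and text, repeated runs with the same script map to the same file.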
+    async def download_file(self, url: str, filename: str) -> str:
+        """Download file from URL to local temporary file"""
+        import aiohttp
+        import tempfile
+        from pathlib import Path
+        local_path = Path(tempfile.gettempdir()) / filename
+        try:
+            async with aiohttp.ClientSession() as session:
+                async with session.get(url) as response:
+                    if response.status == 200:
+                        with open(local_path, 'wb') as f:
+                            f.write(await response.read())
+                        logger.info(f"✓ Downloaded {filename} from {url}")
+                        return str(local_path)
+                    else:
+                        raise Exception(f"Download failed: {response.status}")
+        except Exception as e:
+            logger.error(f"Failed to download {url}: {e}")
+            raise
+    async def health_check(self) -> Dict[str, bool]:
+        """
+        Check health of all API connections
+        Returns:
+            Dict with service health status
+        """
+        logger.info("🏥 Running health check...")
+        health = {
+            'gemini': False,
+            'runwayml': False,
+            'tts': False,
+            'gcs': False
+        }
+        try:
+            # Test Gemini with a simple prompt
+            test_prompt = "Hello"
+            enhanced = await self.enhance_prompt(test_prompt)
+            if enhanced and len(enhanced) > 0:
+                health['gemini'] = True
+                logger.info("  ✅ Gemini API: Connected")
+            else:
+                logger.error("  ❌ Gemini API: No response")
+        except Exception as e:
+            logger.error(f"  ❌ Gemini API: {e}")
+        try:
+            # Test GCS - Bucket.exists() returns False (it does not raise NotFound)
+            # when the bucket is missing, so check the return value
+            if self.gcs_bucket.exists():
+                health['gcs'] = True
+                logger.info("  ✅ Google Cloud Storage: Connected")
+            else:
+                logger.error("  ❌ Google Cloud Storage: Bucket not found")
+        except Exception as e:
+            logger.error(f"  ❌ Google Cloud Storage check failed: {e}")
+        # Check if API keys are configured (without making actual API calls)
+        if self.runway_api_key and len(self.runway_api_key) > 10:
+            health['runwayml'] = True
+            logger.info("  ✅ RunwayML API: Configured")
+        else:
+            logger.error("  ❌ RunwayML API: Not configured or invalid key")
+        if self.tts_client:
+            health['tts'] = True
+            logger.info("  ✅ TTS API: Configured")
+        else:
+            logger.error("  ❌ TTS API: Not configured")
+        # Check DeepSeek configuration
+        deepseek_key = self.config.get('deepseek_api_key')
+        if deepseek_key and len(deepseek_key) > 10:
+            logger.info("  ✅ DeepSeek API: Configured")
+        else:
+            logger.warning("  ⚠️ DeepSeek API: Not configured")
+        all_healthy = all(health.values())
+        status = "✅ All systems operational!" if all_healthy else "⚠️ Some services have issues"
+        logger.info(f"\n{status}")
+        return health
     async def select_videos(self, tts_script: str, count: int = 3) -> List[Dict]:
         """
         AI agent selects videos based on script using Gemini
             Return as JSON array with format:
             [{{"keyword": "...", "timing": "0-5", "style": "..."}}, ...]
             """
+            model = genai.GenerativeModel('gemini-2.0-flash-exp')
+            response = model.generate_content(analysis_prompt)
             # Parse Gemini response
             try:

src/asset_selector.py ADDED

@@ -0,0 +1,233 @@

+"""
+AI-powered asset selection using DeepSeek for contextual video matching
+"""
+import pandas as pd
+import aiohttp
+import json
+from typing import List, Dict, Optional
+from utils import logger
+class AssetSelector:
+    def __init__(self, config: Dict):
+        self.config = config
+        self.video_library = self._load_video_library()
+        self.audio_library = self._load_audio_library()
+    def _load_video_library(self) -> pd.DataFrame:
+        """Load video library (currently a hardcoded sample list, not a CSV file)"""
+        try:
+            # Create a simple video library from your provided data
+            video_data = [
+                {
+                    'url': 'https://storage.googleapis.com/somira/Somira%20Massager.mp4',
+                    'duration': 2,
+                    'alignment': 'product mention, solution, features',
+                    'energy': 5,
+                    'description': 'Product showcase'
+                },
+                {
+                    'url': 'https://storage.googleapis.com/somira/FemaleWomenPuttingOnNeckMassagerr.mp4',
+                    'duration': 2,
+                    'alignment': 'using the product, turning on, operation',
+                    'energy': 35,
+                    'description': 'Product usage demonstration'
+                },
+                {
+                    'url': 'https://storage.googleapis.com/somira/PersonEnjoyingTheNeckMassager.mp4',
+                    'duration': 1.5,
+                    'alignment': 'comfort, relaxation, satisfaction',
+                    'energy': 40,
+                    'description': 'User satisfaction'
+                },
+                # Add more videos as needed for testing
+            ]
+            return pd.DataFrame(video_data)
+        except Exception as e:
+            logger.error(f"Failed to load video library: {e}")
+            return pd.DataFrame()
+    def _load_audio_library(self) -> List[str]:
+        """Load audio library URLs"""
+        # NOTE: range(1, 27) yields 26 URLs (1.mp3 .. 26.mp3), while AUDIO_LIBRARY_SIZE is 27
+        return [f"https://storage.googleapis.com/somira/{i}.mp3" for i in range(1, 27)]
+    async def select_videos(self, tts_script: str, max_duration: int = 10) -> List[Dict]:
+        """
+        Select videos using AI analysis of TTS script
+        Args:
+            tts_script: The script to analyze
+            max_duration: Maximum total duration for selected videos
+        Returns:
+            List of selected video metadata
+        """
+        try:
+            logger.info(f"🤖 AI video selection for script: {tts_script[:100]}...")
+            # Use DeepSeek for intelligent selection
+            selected_videos = await self._analyze_with_deepseek(tts_script, max_duration)
+            if not selected_videos:
+                logger.warning("⚠️ AI selection failed, using fallback")
+                selected_videos = self._fallback_selection(tts_script, max_duration)
+            total_duration = sum(v['duration'] for v in selected_videos)
+            logger.info(f"✓ Selected {len(selected_videos)} videos, total: {total_duration}s")
+            return selected_videos
+        except Exception as e:
+            logger.error(f"❌ Video selection failed: {e}")
+            return self._fallback_selection(tts_script, max_duration)
+    async def _analyze_with_deepseek(self, tts_script: str, max_duration: int) -> List[Dict]:
+        """Use DeepSeek API for contextual video selection"""
+        try:
+            # Prepare video library context
+            video_context = "\n".join([
+                f"{i}. {row['description']} - {row['duration']}s - Alignment: {row['alignment']}"
+                for i, row in self.video_library.iterrows()
+            ])
+            prompt = f"""
+            TTS Script: "{tts_script}"
+            Available Videos:
+            {video_context}
+            Select 3-4 videos that best match the script content. Consider:
+            - Video alignment descriptions
+            - Logical flow (problem -> solution -> result)
+            - Total duration under {max_duration} seconds
+            - Energy level appropriateness
+            Return JSON format:
+            {{
+                "selected_videos": [
+                    {{
+                        "index": 0,
+                        "reason": "Matches product mention in script",
+                        "start_time": 0
+                    }}
+                ],
+                "total_duration": 8,
+                "rationale": "Overall selection strategy"
+            }}
+            """
+            # DeepSeek API call
+            headers = {
+                "Authorization": f"Bearer {self.config.get('deepseek_api_key')}",
+                "Content-Type": "application/json"
+            }
+            payload = {
+                "model": "deepseek-chat",
+                "messages": [
+                    {"role": "system", "content": "You are a video editor AI that selects the most relevant videos for advertising content."},
+                    {"role": "user", "content": prompt}
+                ],
+                "temperature": 0.3,
+                "max_tokens": 2000
+            }
+            async with aiohttp.ClientSession() as session:
+                async with session.post(
+                    "https://api.deepseek.com/v1/chat/completions",
+                    headers=headers,
+                    json=payload
+                ) as response:
+                    if response.status == 200:
+                        result = await response.json()
+                        selection = json.loads(result['choices'][0]['message']['content'])
+                        # Map to actual video data
+                        selected = []
+                        for item in selection['selected_videos']:
+                            if item['index'] < len(self.video_library):
+                                video = self.video_library.iloc[item['index']]
+                                selected.append({
+                                    'url': video['url'],
+                                    'duration': video['duration'],
+                                    'reason': item['reason'],
+                                    'alignment': video['alignment'],
+                                    'energy': video['energy']
+                                })
+                        return selected
+                    else:
+                        logger.error(f"DeepSeek API error: {response.status}")
+                        return []
+        except Exception as e:
+            logger.error(f"DeepSeek analysis failed: {e}")
+            return []
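Two things can go wrong when parsing the reply in `_analyze_with_deepseek`: chat models sometimes wrap JSON in a Markdown code fence, and a negative `index` passes the `< len(...)` check and would make `iloc` count from the end. The sketch below shows one defensive way to parse the reply, assuming the response shape requested in the prompt; `parse_selection` is a hypothetical helper, not part of this commit.

```python
import json
import re
from typing import List

# Matches a leading or trailing Markdown code fence (three backticks).
FENCE_RE = re.compile(r"^`{3}(?:json)?\s*|\s*`{3}$", re.IGNORECASE)


def parse_selection(content: str, library_size: int) -> List[int]:
    """Return the valid, de-duplicated video indices from a model reply."""
    cleaned = FENCE_RE.sub("", content.strip())
    data = json.loads(cleaned)
    indices: List[int] = []
    for item in data.get("selected_videos", []):
        index = item.get("index")
        is_int = isinstance(index, int) and not isinstance(index, bool)
        if is_int and 0 <= index < library_size and index not in indices:
            indices.append(index)
    return indices


if __name__ == "__main__":
    reply = '{"selected_videos": [{"index": 0}, {"index": 7}]}'
    print(parse_selection(reply, library_size=3))
```

Invalid JSON still raises `json.JSONDecodeError`, which the existing `except Exception` block would log before returning an empty list.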
+    def _fallback_selection(self, tts_script: str, max_duration: int) -> List[Dict]:
+        """Fallback selection based on keyword matching"""
+        script_lower = tts_script.lower()
+        selected = []
+        total_duration = 0
+        # Define keyword mappings for fallback
+        keyword_mappings = {
+            'pain': ['pop', 'stuck', 'neck', 'pain'],
+            'solution': ['somira', 'massager', 'solution', 'relief'],
+            'satisfaction': ['gone', 'comfort', 'satisfaction']
+        }
+        # Simple fallback videos
+        fallback_videos = [
+            {
+                'url': 'https://storage.googleapis.com/somira/Somira%20Massager.mp4',
+                'duration': 2,
+                'reason': 'Product showcase',
+                'alignment': 'product',
+                'energy': 5
+            },
+            {
+                'url': 'https://storage.googleapis.com/somira/FemaleWomenPuttingOnNeckMassagerr.mp4',
+                'duration': 2,
+                'reason': 'Usage demonstration',
+                'alignment': 'usage',
+                'energy': 35
+            },
+            {
+                'url': 'https://storage.googleapis.com/somira/PersonEnjoyingTheNeckMassager.mp4',
+                'duration': 1.5,
+                'reason': 'User satisfaction',
+                'alignment': 'satisfaction',
+                'energy': 40
+            }
+        ]
+        # Take the fallback videos in order until max_duration is reached
+        # (keyword_mappings above is not consulted yet)
+        for video in fallback_videos:
+            if total_duration + video['duration'] <= max_duration:
+                selected.append(video)
+                total_duration += video['duration']
+        return selected[:3]  # Max 3 videos
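`_fallback_selection` defines `keyword_mappings`, but the selection loop does not use it. If keyword-driven fallback is the intent, the matching step could look roughly like the sketch below. It assumes plain substring matching is good enough; `matched_categories` is a hypothetical helper, not part of this commit.

```python
from typing import Dict, List

# Same mapping as in `_fallback_selection`.
KEYWORD_MAPPINGS: Dict[str, List[str]] = {
    "pain": ["pop", "stuck", "neck", "pain"],
    "solution": ["somira", "massager", "solution", "relief"],
    "satisfaction": ["gone", "comfort", "satisfaction"],
}


def matched_categories(script: str) -> List[str]:
    """Return categories whose keywords occur in the script, in mapping order."""
    text = script.lower()
    return [
        category
        for category, keywords in KEYWORD_MAPPINGS.items()
        if any(keyword in text for keyword in keywords)
    ]


if __name__ == "__main__":
    print(matched_categories("My neck was stuck, then the pain was gone."))
```

The resulting categories could then be passed to `_find_video_for_category`. Substring matching is crude (for example, "pop" also matches "popular"), so word-boundary matching may be preferable in practice.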
+    def _find_video_for_category(self, category: str) -> Optional[Dict]:
+        """Find best video for a category"""
+        for _, row in self.video_library.iterrows():
+            if category in str(row['alignment']).lower():
+                return {
+                    'url': row['url'],
+                    'duration': row['duration'],
+                    'reason': f"Matches {category} category",
+                    'alignment': row['alignment'],
+                    'energy': row['energy']
+                }
+        return None
+    def select_background_music(self) -> str:
+        """Select background music at random from the audio library"""
+        import random
+        selected = random.choice(self.audio_library)
+        logger.info(f"🎵 Selected background music: {selected}")
+        return selected

src/automation.py CHANGED

@@ -1,12 +1,15 @@
 """
-Main automation orchestrator with full implementation
 """
 import asyncio
 import os
 import time
 from typing import Dict, List, Optional, Any
 from api_clients import APIClients
 from video_renderer import VideoRenderer
 from utils import logger
@@ -15,393 +18,391 @@ class ContentAutomation:
         self.config = config
         self.api_clients = APIClients(config)
         self.video_renderer = VideoRenderer(config)
-        self.current_audio_index = 0
         self.pipeline_start_time = None
-    async def execute_pipeline(
-        self,
-        content_strategy: Dict[str, str],
-        tts_script: str,
-        video_config: Optional[Dict] = None
-    ) -> Dict[str, Any]:
-        """
-        Execute the complete automation pipeline
-        Args:
-            content_strategy: Dict with prompts and style preferences
-            tts_script: Text script for voice-over
-            video_config: Optional video rendering configuration
-        Returns:
-            Dict with final video URL and metadata
         """
         self.pipeline_start_time = time.time()
-        logger.info("=" * 60)
-        logger.info("🚀 Starting Content Automation Pipeline")
-        logger.info("=" * 60)
         try:
-            # Step 1: Generate all assets simultaneously
-            logger.info("\n📦 STEP 1: Generating Assets (Parallel Execution)")
-            assets = await self.execute_step_1(content_strategy, tts_script)
-            self._log_step_completion(1, assets)
-            # Validate critical assets
-            if not self._validate_assets(assets):
-                raise Exception("Critical assets failed to generate")
-            # Step 2: Merge videos and audio
-            logger.info("\n🎬 STEP 2: Rendering Video")
-            rendered_video = await self.video_renderer.render_video(
-                assets,
-                video_config or {}
-            )
-            self._log_step_completion(2, {'rendered_video': rendered_video})
-            # Step 3: Add subtitles
-            logger.info("\n📝 STEP 3: Adding Subtitles")
-            subtitled_video = await self.video_renderer.add_subtitles(
-                rendered_video,
-                tts_script,
-                assets.get('tts_audio', {})
-            )
-            self._log_step_completion(3, {'subtitled_video': subtitled_video})
-            # Step 4: Store final video in GCS
-            logger.info("\n☁️  STEP 4: Uploading to Cloud Storage")
-            final_url = await self.api_clients.store_in_gcs(
-                subtitled_video,
-                content_type='video'
-            )
-            self._log_step_completion(4, {'final_url': final_url})
-            # Pipeline completion summary
             elapsed_time = time.time() - self.pipeline_start_time
-            logger.info("\n" + "=" * 60)
-            logger.info(f"✅ Pipeline Completed Successfully in {elapsed_time:.2f}s")
-            logger.info(f"📹 Final Video: {final_url}")
-            logger.info("=" * 60)
             return {
                 'success': True,
                 'final_url': final_url,
-                'local_path': subtitled_video,
-                'assets': assets,
                 'duration': elapsed_time,
-                'metadata': {
-                    'content_strategy': content_strategy,
-                    'tts_script': tts_script,
-                    'timestamp': time.time()
                 }
             }
         except Exception as e:
             elapsed_time = time.time() - self.pipeline_start_time if self.pipeline_start_time else 0
-            logger.error(f"\n❌ Pipeline Failed after {elapsed_time:.2f}s: {e}")
             return {
                 'success': False,
                 'error': str(e),
-                'duration': elapsed_time,
-                'partial_assets': locals().get('assets', {})
             }
-    async def execute_step_1(
-        self,
-        content_strategy: Dict[str, str],
-        tts_script: str
-    ) -> Dict[str, Any]:
-        """
-        Execute all step 1 processes simultaneously for maximum efficiency
-        Args:
-            content_strategy: Content generation strategy
-            tts_script: Text for TTS generation
-        Returns:
-            Dict containing all generated assets
-        """
-        logger.info("⚡ Launching parallel tasks...")
-        # Create all tasks
         tasks = {
-            'hook_video': self.generate_hook_video(content_strategy),
-            'background_music': self.select_background_music(),
-            'selected_videos': self.select_videos_from_library(tts_script),
-            'tts_audio': self.generate_tts_audio(tts_script)
         }
-        # Execute all tasks concurrently
-        start_time = time.time()
-        results = await asyncio.gather(
-            *tasks.values(),
-            return_exceptions=True
-        )
-        execution_time = time.time() - start_time
-        # Map results back to task names
-        assets = {}
-        for (task_name, _), result in zip(tasks.items(), results):
-            if isinstance(result, Exception):
-                logger.error(f"❌ {task_name} failed: {result}")
-                assets[task_name] = None
-            else:
                 logger.info(f"✓ {task_name} completed")
-                assets[task_name] = result
-        logger.info(f"\n⚡ Parallel execution completed in {execution_time:.2f}s")
-        return assets
-    async def generate_hook_video(self, strategy: Dict[str, str]) -> Optional[Dict]:
-        """
-        Generate hook video using AI APIs with prompt enhancement
-        Args:
-            strategy: Content strategy with prompts
-        Returns:
-            Dict with video URL and metadata, or None if failed
-        """
         try:
-            logger.info("🎥 Generating hook video...")
-            # Choose the right prompt
-            base_prompt = strategy.get('runway_prompt') or strategy.get('gemini_prompt')
-            if not base_prompt:
-                raise ValueError("No prompt found in strategy")
-            # Enhance prompt with Gemini for better video quality
-            logger.info("  → Enhancing prompt with Gemini AI...")
-            enhanced_prompt = await self.api_clients.enhance_prompt(base_prompt)
-            # Generate video with RunwayML
-            logger.info("  → Generating video with RunwayML Gen-4...")
             video_data = await self.api_clients.generate_video(
                 enhanced_prompt,
-                duration=strategy.get('duration', 5)  # Default 5s for hook
             )
-            logger.info(f"  ✓ Hook video generated: {video_data.get('task_id', 'N/A')}")
             return video_data
         except Exception as e:
-            logger.error(f"  ✗ Hook video generation failed: {e}")
             return None
-    async def select_background_music(self) -> str:
-        """
-        Select background music from library using linear rotation
-        Returns:
-            URL to background music file
-        """
-        try:
-            logger.info("🎵 Selecting background music...")
-            # Linear selection with rotation
-            audio_index = self.current_audio_index
-            self.current_audio_index = (self.current_audio_index + 1) % self.config['audio_library_size']
-            # Construct GCS URL
-            bucket_name = self.config.get('gcs_bucket_name', 'somira-videos')
-            audio_url = f"gs://{bucket_name}/audio-library/audio{audio_index + 1}.mp3"
-            logger.info(f"  ✓ Selected audio #{audio_index + 1}: {audio_url}")
-            return audio_url
-        except Exception as e:
-            logger.error(f"  ✗ Music selection failed: {e}")
-            # Return default/fallback audio
-            return f"gs://{self.config.get('gcs_bucket_name')}/audio-library/default.mp3"
-    async def select_videos_from_library(self, tts_script: str) -> List[Dict]:
-        """
-        AI agent selects 3 videos based on TTS script content
-        Args:
-            tts_script: The voice-over script to analyze
-        Returns:
-            List of selected video metadata dicts
-        """
-        try:
-            logger.info("🎬 Selecting videos from library...")
-            logger.info(f"  → Analyzing script: {tts_script[:80]}...")
-            # Use AI to select contextually relevant videos
-            selected_videos = await self.api_clients.select_videos(tts_script, count=3)
-            if not selected_videos:
-                logger.warning("  ⚠ No videos selected, using fallback")
-                return self._get_fallback_videos()
-            logger.info(f"  ✓ Selected {len(selected_videos)} videos:")
-            for i, video in enumerate(selected_videos, 1):
-                logger.info(f"    {i}. {video.get('keyword', 'N/A')} - {video.get('reason', 'N/A')}")
-            return selected_videos
-        except Exception as e:
-            logger.error(f"  ✗ Video selection failed: {e}")
-            return self._get_fallback_videos()
-    async def generate_tts_audio(self, tts_script: str) -> Optional[Dict]:
-        """
-        Generate TTS audio with timing data for lip-sync and subtitles
-        Args:
-            tts_script: Text to convert to speech
-        Returns:
-            Dict with audio URL, duration, and timing data
-        """
-        try:
-            logger.info("🎙️  Generating TTS audio...")
-            logger.info(f"  → Script length: {len(tts_script)} characters")
-            # Get voice from config
-            voice_name = self.config.get('default_voice', 'en-US-AriaNeural')
-            # Generate TTS with timing data
-            tts_result = await self.api_clients.generate_tts(
-                tts_script,
-                voice_name=voice_name
             )
-            if tts_result:
-                duration = tts_result.get('duration', 0)
-                logger.info(f"  ✓ TTS generated: {duration:.2f}s duration")
-                logger.info(f"  ✓ Audio URL: {tts_result.get('audio_url', 'N/A')}")
-            return tts_result
-        except Exception as e:
-            logger.error(f"  ✗ TTS generation failed: {e}")
-            return None
-    def _validate_assets(self, assets: Dict[str, Any]) -> bool:
-        """
-        Validate that critical assets were generated successfully
-        Args:
-            assets: Dict of generated assets
-        Returns:
-            True if valid, False otherwise
-        """
-        critical_assets = ['tts_audio', 'selected_videos']
-        optional_assets = ['hook_video', 'background_music']
-        # Check critical assets
-        for asset_name in critical_assets:
-            if not assets.get(asset_name):
-                logger.error(f"❌ Critical asset missing: {asset_name}")
-                return False
-        # Warn about optional assets
-        for asset_name in optional_assets:
-            if not assets.get(asset_name):
-                logger.warning(f"⚠️  Optional asset missing: {asset_name}")
-        logger.info("✓ Asset validation passed")
-        return True
-    def _get_fallback_videos(self) -> List[Dict]:
-        """
-        Get fallback videos if AI selection fails
-        Returns:
-            List of default video selections
-        """
-        bucket_name = self.config.get('gcs_bucket_name', 'somira-videos')
-        return [
-            {
-                'id': 1,
-                'url': f"gs://{bucket_name}/library/video1.mp4",
-                'keyword': 'product',
-                'timing': '0-5',
-                'style': 'general',
-                'reason': 'Fallback selection'
-            },
-            {
-                'id': 15,
-                'url': f"gs://{bucket_name}/library/video15.mp4",
-                'keyword': 'lifestyle',
-                'timing': '5-10',
-                'style': 'general',
-                'reason': 'Fallback selection'
-            },
-            {
-                'id': 30,
-                'url': f"gs://{bucket_name}/library/video30.mp4",
-                'keyword': 'usage',
-                'timing': '10-15',
-                'style': 'general',
-                'reason': 'Fallback selection'
-            }
-        ]
-    def _log_step_completion(self, step: int, data: Dict[str, Any]):
-        """Log step completion with summary"""
-        step_names = {
-            1: "Asset Generation",
-            2: "Video Rendering",
-            3: "Subtitle Addition",
-            4: "Cloud Upload"
-        }
-        elapsed = time.time() - self.pipeline_start_time if self.pipeline_start_time else 0
-        logger.info(f"✓ Step {step} ({step_names.get(step, 'Unknown')}) completed [{elapsed:.2f}s total]")
     async def health_check(self) -> Dict[str, bool]:
-        """
-        Check health of all API connections
-        Returns:
-            Dict with service health status
-        """
-        logger.info("🏥 Running health check...")
-        health = {
-            'gemini': False,
-            'runwayml': False,
-            'tts': False,
-            'gcs': False
-        }
         try:
-            # Test Gemini
-            test_prompt = "Hello"
-            await self.api_clients.enhance_prompt(test_prompt)
-            health['gemini'] = True
-            logger.info("  ✓ Gemini API: Connected")
         except Exception as e:
-            logger.error(f"  ✗ Gemini API: {e}")
         try:
-            # Test GCS (just check bucket exists)
-            bucket = self.api_clients.gcs_bucket
-            bucket.exists()
-            health['gcs'] = True
-            logger.info("  ✓ Google Cloud Storage: Connected")
         except Exception as e:
-            logger.error(f"  ✗ Google Cloud Storage: {e}")
-        # RunwayML and TTS are harder to test without using credits
-        # So we just check if API keys are configured
-        if self.api_clients.runway_api_key:
-            health['runwayml'] = True
-            logger.info("  ✓ RunwayML API: Configured")
-        else:
-            logger.error("  ✗ RunwayML API: Not configured")
-        if self.api_clients.tts_client:
-            health['tts'] = True
-            logger.info("  ✓ TTS API: Configured")
         else:
-            logger.error("  ✗ TTS API: Not configured")
-        all_healthy = all(health.values())
-        logger.info(f"\n{'✅' if all_healthy else '⚠️'} Health check {'passed' if all_healthy else 'failed'}")
-        return health

 """
+Main automation orchestrator with production-ready video pipeline
 """
 import asyncio
 import os
 import time
 from typing import Dict, List, Optional, Any
+from pathlib import Path
 from api_clients import APIClients
 from video_renderer import VideoRenderer
+from asset_selector import AssetSelector
 from utils import logger
         self.config = config
         self.api_clients = APIClients(config)
         self.video_renderer = VideoRenderer(config)
+        self.asset_selector = AssetSelector(config)
         self.pipeline_start_time = None
+    async def simple_demo(self):
+        """Simple demo with proper audio handling"""
+        logger.info("🎬 Starting Simple Demo with Audio Fix...")
+        try:
+            # Create videos
+            logger.info("1. Creating video clips...")
+            from moviepy.editor import ColorClip
+            # Create simple color videos
+            clip1 = ColorClip(size=(640, 480), color=(255, 0, 0), duration=2)
+            clip1 = clip1.set_fps(24)
+            clip1_path = '/tmp/simple_red.mp4'
+            clip1.write_videofile(clip1_path, verbose=False, logger=None)
+            clip1.close()
+            clip2 = ColorClip(size=(640, 480), color=(0, 255, 0), duration=2)
+            clip2 = clip2.set_fps(24)
+            clip2_path = '/tmp/simple_green.mp4'
+            clip2.write_videofile(clip2_path, verbose=False, logger=None)
+            clip2.close()
+            logger.info("   ✅ Videos created")
+            # Create proper audio files using a different approach
+            logger.info("2. Creating proper audio files...")
+            # Write WAV files directly with the standard-library wave module
+            import wave
+            import numpy as np
+            # Create a simple sine wave WAV file
+            def create_sine_wave(filename, duration=4, freq=440, sample_rate=44100):
+                # Generate sine wave
+                t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
+                audio_data = 0.3 * np.sin(2 * np.pi * freq * t)
+                # Convert to 16-bit PCM
+                audio_data = (audio_data * 32767).astype(np.int16)
+                # Write WAV file
+                with wave.open(filename, 'w') as wav_file:
+                    wav_file.setnchannels(1)  # Mono
+                    wav_file.setsampwidth(2)  # 16-bit
+                    wav_file.setframerate(sample_rate)
+                    wav_file.writeframes(audio_data.tobytes())
+            # Create audio files
+            tts_audio_path = '/tmp/tts_audio.wav'
+            bg_audio_path = '/tmp/bg_audio.wav'
+            create_sine_wave(tts_audio_path, duration=4, freq=440)  # A tone
+            create_sine_wave(bg_audio_path, duration=4, freq=220)   # Lower tone
+            logger.info("   ✅ Audio files created")
+            # Test video rendering
+            logger.info("3. Testing video rendering...")
+            simple_assets = {
+                'selected_videos': [
+                    {
+                        'local_path': clip1_path,
+                        'duration': 2,
+                        'reason': 'Red clip'
+                    },
+                    {
+                        'local_path': clip2_path,
+                        'duration': 2,
+                        'reason': 'Green clip'
+                    }
+                ],
+                'tts_audio': {
+                    'local_path': tts_audio_path,
+                    'duration': 4
+                },
+                'tts_script': 'Simple demo with proper audio.',
+                'background_music_local': bg_audio_path
+            }
+            output_path = await self.video_renderer.render_video(simple_assets)
+            logger.info("\n🎉 DEMO SUCCESSFUL!")
+            logger.info(f"📹 Video created: {output_path}")
+            return True
+        except Exception as e:
+            logger.error(f"❌ Demo failed: {e}")
+            import traceback
+            logger.error(f"📋 Debug: {traceback.format_exc()}")
+            return False
+    async def local_test(self):
+        """Run a local test without external APIs"""
+        logger.info("🧪 Running local functionality test...")
+        try:
+            # Test 1: Check if we can create basic video clips
+            logger.info("1. Testing video clip creation...")
+            from moviepy.editor import ColorClip
+            test_clip = ColorClip(size=(100, 100), color=(255, 0, 0), duration=1)
+            test_clip = test_clip.set_fps(24)  # Add FPS
+            test_clip.write_videofile('/tmp/test_color.mp4', verbose=False, logger=None)
+            test_clip.close()
+            logger.info("   ✅ Video clip creation: OK")
+            # Test 2: Check if we can create audio clips
+            logger.info("2. Testing audio clip creation...")
+            from moviepy.editor import AudioClip
+            import numpy as np
+            def make_tone(duration):
+                return lambda t: 0.1 * np.sin(440 * 2 * np.pi * t)
+            test_audio = AudioClip(make_tone(1), duration=1)
+            test_audio.write_audiofile('/tmp/test_audio.mp3', verbose=False, logger=None)
+            test_audio.close()
+            logger.info("   ✅ Audio clip creation: OK")
+            # Test 3: Check video rendering with simple assets
+            logger.info("3. Testing video rendering pipeline...")
+            test_assets = {
+                'selected_videos': [
+                    {
+                        'local_path': '/tmp/test_color.mp4',
+                        'duration': 1,
+                        'reason': 'Test video'
+                    }
+                ],
+                'tts_audio': {
+                    'local_path': '/tmp/test_audio.mp3',
+                    'duration': 1
+                },
+                'tts_script': 'Test script.',
+                'background_music_local': '/tmp/test_audio.mp3'
+            }
+            output_path = await self.video_renderer.render_video(test_assets)
+            logger.info(f"   ✅ Video rendering: OK - {output_path}")
+            logger.info("\n🎉 Local functionality test passed!")
+            return True
+        except Exception as e:
+            logger.error(f"❌ Local test failed: {e}")
+            return False
+    async def execute_pipeline(self, content_strategy: Dict[str, str], tts_script: str) -> Dict[str, Any]:
+        """
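+        Execute the complete production video pipeline: generate assets,
+        download them, render the final video, and upload it to cloud storage.
+        Returns a result dict with 'success' set to False instead of raising on failure.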
         """
         self.pipeline_start_time = time.time()
+        logger.info("🚀 Starting Production Video Pipeline")
         try:
+            # Step 1: Generate all assets in parallel
+            logger.info("\n📦 STEP 1: Parallel Asset Generation")
+            assets = await self._generate_assets_parallel(content_strategy, tts_script)
+            # Check if we have minimum required assets
+            if not assets.get('selected_videos') or not assets.get('tts_audio'):
+                raise ValueError("Missing critical assets: videos or TTS audio")
+            # Step 2: Download all remote assets
+            logger.info("\n⬇️ STEP 2: Downloading Remote Assets")
+            await self._download_assets(assets)
+            # Step 3: Render final video
+            logger.info("\n🎬 STEP 3: Video Composition & Rendering")
+            final_video_path = await self.video_renderer.render_video(assets)
+            # Step 4: Upload to cloud storage
+            logger.info("\n☁️ STEP 4: Cloud Storage Upload")
+            final_url = await self.api_clients.store_in_gcs(final_video_path, 'video')
+            # Pipeline completion
             elapsed_time = time.time() - self.pipeline_start_time
+            logger.info(f"\n✅ Pipeline completed in {elapsed_time:.2f}s")
             return {
                 'success': True,
                 'final_url': final_url,
+                'local_path': final_video_path,
                 'duration': elapsed_time,
+                'assets_metadata': {
+                    # hook_video is None when generation failed, so guard before .get()
+                    'hook_video': (assets.get('hook_video') or {}).get('task_id'),
+                    'selected_videos_count': len(assets.get('selected_videos', [])),
+                    'total_duration': sum(v.get('duration', 0) for v in assets.get('selected_videos', []))
                 }
             }
         except Exception as e:
             elapsed_time = time.time() - self.pipeline_start_time if self.pipeline_start_time else 0
+            logger.error(f"\n❌ Pipeline failed after {elapsed_time:.2f}s: {e}")
             return {
                 'success': False,
                 'error': str(e),
+                'duration': elapsed_time
             }
+    async def _generate_assets_parallel(self, content_strategy: Dict, tts_script: str) -> Dict:
+        """Generate all assets in parallel for maximum efficiency"""
         tasks = {
+            'hook_video': self._generate_hook_video(content_strategy),
+            'selected_videos': self.asset_selector.select_videos(tts_script),
+            'tts_audio': self.api_clients.generate_tts(tts_script),
         }
+        # Execute all async tasks concurrently; a failed task yields None
+        outcomes = await asyncio.gather(*tasks.values(), return_exceptions=True)
+        results = {}
+        for task_name, outcome in zip(tasks, outcomes):
+            if isinstance(outcome, Exception):
+                logger.error(f"❌ {task_name} failed: {outcome}")
+                results[task_name] = None
+            else:
+                results[task_name] = outcome
                 logger.info(f"✓ {task_name} completed")
+        # Add synchronous operations
+        results['background_music_url'] = self.asset_selector.select_background_music()
+        results['tts_script'] = tts_script
+        return results
+    async def _generate_hook_video(self, strategy: Dict) -> Optional[Dict]:
+        """Generate hook video using RunwayML"""
         try:
+            prompt = strategy.get('runway_prompt') or strategy.get('gemini_prompt')
+            if not prompt:
+                logger.warning("No prompt available for hook video")
+                return None
+            # Enhance prompt with Gemini
+            enhanced_prompt = await self.api_clients.enhance_prompt(prompt)
+            # Generate video
             video_data = await self.api_clients.generate_video(
                 enhanced_prompt,
+                duration=5  # 5-second hook video
             )
             return video_data
         except Exception as e:
+            logger.error(f"Hook video generation failed: {e}")
             return None
+    async def _download_assets(self, assets: Dict):
+        """Download all remote assets to local files"""
+        download_tasks = []
+        # Download hook video
+        if assets.get('hook_video') and assets['hook_video'].get('video_url'):
+            download_tasks.append(
+                self._download_to_local(
+                    assets['hook_video']['video_url'],
+                    'hook_video.mp4',
+                    assets['hook_video']
+                )
             )
+        # Download library videos
+        for i, video in enumerate(assets.get('selected_videos', [])):
+            if video.get('url'):
+                download_tasks.append(
+                    self._download_to_local(
+                        video['url'],
+                        f'library_video_{i}.mp4',
+                        video
+                    )
+                )
+        # Download background music
+        if assets.get('background_music_url'):
+            download_tasks.append(
+                self._download_to_local(
+                    assets['background_music_url'],
+                    'background_music.mp3',
+                    assets,
+                    'background_music_local'
+                )
+            )
+        # Download TTS audio
+        if assets.get('tts_audio') and assets['tts_audio'].get('audio_url'):
+            download_tasks.append(
+                self._download_to_local(
+                    assets['tts_audio']['audio_url'],
+                    'tts_audio.mp3',
+                    assets['tts_audio'],
+                    'local_path'
+                )
+            )
+        # Execute all downloads concurrently
+        if download_tasks:
+            await asyncio.gather(*download_tasks, return_exceptions=True)
+    async def _download_to_local(self, url: str, filename: str, target_dict: Dict, key: str = 'local_path'):
+        """Download file from URL and store local path in target dictionary"""
+        try:
+            local_path = await self.api_clients.download_file(url, filename)
+            target_dict[key] = local_path
+            logger.info(f"✓ Downloaded {filename} from {url}")
+        except Exception as e:
+            logger.error(f"❌ Failed to download {filename}: {e}")
     async def health_check(self) -> Dict[str, bool]:
+        """Comprehensive health check of all components"""
+        logger.info("🏥 Running comprehensive health check...")
+        # Check API clients
+        api_health = await self.api_clients.health_check()
+        # Check asset selector
         try:
+            asset_selector_healthy = len(self.asset_selector.video_library) > 0
+            if not asset_selector_healthy:
+                logger.warning("  ⚠️ Asset Selector: Video library is empty")
         except Exception as e:
+            asset_selector_healthy = False
+            logger.error(f"  ❌ Asset Selector: {e}")
+        # Check video renderer
         try:
+            video_renderer_healthy = self.video_renderer.temp_dir.exists()
+            if not video_renderer_healthy:
+                logger.warning("  ⚠️ Video Renderer: Temp directory issue")
         except Exception as e:
+            video_renderer_healthy = False
+            logger.error(f"  ❌ Video Renderer: {e}")
+        # Combine all health statuses
+        health_status = {
+            **api_health,
+            'asset_selector': asset_selector_healthy,
+            'video_renderer': video_renderer_healthy
+        }
+        # Print summary
+        operational_services = sum(health_status.values())
+        total_services = len(health_status)
+        print(f"\n📊 Health Summary: {operational_services}/{total_services} services operational")
+        if operational_services == total_services:
+            print("🎉 System is fully operational and ready for production!")
+        elif operational_services >= total_services - 2:
+            print("⚠️  System is mostly operational, but some features may be limited")
         else:
+            print("❌ System has significant issues that need attention")
+        return health_status
+    async def basic_test(self):
+        """Basic test without external APIs"""
+        logger.info("🧪 Running basic pipeline test...")
+        # Use local test assets
+        test_assets = {
+            'selected_videos': [
+                {
+                    'url': 'https://example.com/video1.mp4',
+                    'duration': 2,
+                    'reason': 'Test video 1',
+                    'local_path': '/tmp/test_video1.mp4'  # You'd need to create this
+                }
+            ],
+            'tts_audio': {
+                'local_path': '/tmp/test_audio.mp3',  # You'd need to create this
+                'duration': 10
+            },
+            'background_music_local': '/tmp/test_music.mp3',
+            'tts_script': 'Test script for video generation.'
+        }
+        try:
+            final_video_path = await self.video_renderer.render_video(test_assets)
+            logger.info(f"✅ Basic test passed: {final_video_path}")
+            return True
+        except Exception as e:
+            logger.error(f"❌ Basic test failed: {e}")
+            return False

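Note on `_generate_assets_parallel`: the helper builds a dict of named awaitables and is documented as running them in parallel. A minimal, self-contained sketch of one way to run such a dict concurrently while isolating per-task failures is shown here. `gather_named` and `_fake_asset` are hypothetical names used only for illustration; they are not part of this PR, and the real API calls are replaced by short sleeps.

```python
import asyncio
from typing import Any, Awaitable, Dict, Optional


async def gather_named(tasks: Dict[str, Awaitable[Any]]) -> Dict[str, Optional[Any]]:
    """Await all named awaitables concurrently; a failed task maps to None."""
    outcomes = await asyncio.gather(*tasks.values(), return_exceptions=True)
    results: Dict[str, Optional[Any]] = {}
    # dicts preserve insertion order, so names and outcomes line up
    for name, outcome in zip(tasks, outcomes):
        results[name] = None if isinstance(outcome, Exception) else outcome
    return results


async def _fake_asset(name: str, delay: float, fail: bool = False) -> str:
    # Hypothetical stand-in for an API call such as TTS or video generation
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return f"{name}-ok"


def demo() -> Dict[str, Optional[Any]]:
    async def _run() -> Dict[str, Optional[Any]]:
        return await gather_named({
            "hook_video": _fake_asset("hook_video", 0.02, fail=True),
            "selected_videos": _fake_asset("selected_videos", 0.01),
            "tts_audio": _fake_asset("tts_audio", 0.01),
        })
    return asyncio.run(_run())
```

With `return_exceptions=True`, one failing task does not cancel the others, which matches the pipeline's treatment of the hook video as optional.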
src/main.py CHANGED Viewed

@@ -159,41 +159,44 @@ async def run_pipeline(
 async def health_check_command(automation: ContentAutomation):
     """Run health check on all services"""
-    health_status = await automation.health_check()
-    if all(health_status.values()):
-        logger.info("\n✅ All systems operational!")
-        return 0
-    else:
-        logger.error("\n❌ Some systems are not operational")
         return 1
 async def test_command(automation: ContentAutomation):
-    """Run a quick test of the pipeline with minimal resources"""
-    logger.info("\n🧪 Running test pipeline...")
-    test_strategy = {
-        'gemini_prompt': 'A simple product shot of a modern massager device',
-        'runway_prompt': 'Static product shot of a sleek white massager on a clean background',
-        'style': 'minimal',
-        'aspect_ratio': '9:16',
-        'duration': 5,
-        'brand': 'Test'
-    }
-    test_script = "This is a test of the text-to-speech system. It should be brief."
-    result = await automation.execute_pipeline(test_strategy, test_script)
-    if result.get('success'):
-        logger.info("\n✅ Test completed successfully!")
         return 0
     else:
-        logger.error(f"\n❌ Test failed: {result.get('error', 'Unknown error')}")
         return 1
 def parse_arguments():
     """Parse command line arguments"""
     parser = argparse.ArgumentParser(

 async def health_check_command(automation: ContentAutomation):
     """Run health check on all services"""
+    try:
+        health_status = await automation.health_check()
+        print("\n" + "="*50)
+        print("🏥 SYSTEM HEALTH CHECK RESULTS")
+        print("="*50)
+        for service, status in health_status.items():
+            icon = "✅" if status else "❌"
+            print(f"{icon} {service.upper():<15} {'OPERATIONAL' if status else 'ISSUE DETECTED'}")
+        if all(health_status.values()):
+            print("\n🎉 All systems are ready for production!")
+            return 0
+        else:
+            print("\n⚠️  Some services need attention before running the pipeline.")
+            print("   Check the logs above for details.")
+            return 1
+    except Exception as e:
+        logger.error(f"Health check failed: {e}")
         return 1
 async def test_command(automation: ContentAutomation):
+    """Run simple demo test"""
+    logger.info("\n🧪 Running Simple Demo Test...")
+    success = await automation.simple_demo()
+    if success:
+        logger.info("\n✅ Demo test completed successfully!")
+        logger.info("🎉 Your video automation system is working!")
         return 0
     else:
+        logger.error("\n❌ Demo test failed")
         return 1
 def parse_arguments():
     """Parse command line arguments"""
     parser = argparse.ArgumentParser(

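Note on the demo fixtures: `test_command` now delegates to `simple_demo`, which writes sine-wave WAV files as audio fixtures. A hedged, standard-library-only sketch of the same idea, plus a read-back check of the resulting duration, might look like the following. `write_sine_wav` and `wav_duration` are illustrative names, not functions from this PR, and the sketch avoids NumPy on purpose.

```python
import math
import os
import struct
import tempfile
import wave


def write_sine_wav(path: str, duration: float = 1.0, freq: float = 440.0,
                   sample_rate: int = 44100, amplitude: float = 0.3) -> None:
    """Write a mono 16-bit PCM sine tone using only the standard library."""
    n_frames = int(sample_rate * duration)
    frames = bytearray()
    for i in range(n_frames):
        sample = amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
        # Scale to the signed 16-bit range and pack as little-endian
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


def wav_duration(path: str) -> float:
    """Return the duration in seconds of a WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


if __name__ == "__main__":
    fixture = os.path.join(tempfile.mkdtemp(), "tone.wav")
    write_sine_wav(fixture, duration=0.5)
    print(wav_duration(fixture))
```

Reading the file back before handing it to the renderer gives a cheap check that the fixture duration matches the `duration` value declared in the assets dict.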
src/video_renderer.py CHANGED Viewed

@@ -1,62 +1,389 @@
 """
-Video rendering and subtitle engine
 """
 import os
-from utils import logger
 class VideoRenderer:
-    def __init__(self, config):
         self.config = config
-    async def render_video(self, assets):
-        """Render final video by merging all assets"""
-        logger.info("Rendering video with assets...")
-        # Simplified implementation - replace with actual video rendering
-        # This would use moviepy or similar library
-        hook_video = assets.get('hook_video')
-        background_music = assets.get('background_music')
-        selected_videos = assets.get('selected_videos', [])
-        tts_audio = assets.get('tts_audio')
-        logger.info(f"Merging {len(selected_videos)} selected videos")
-        logger.info(f"Using hook video: {hook_video}")
-        logger.info(f"Using background music: {background_music}")
-        # Placeholder for actual video rendering logic
-        output_path = "outputs/videos/rendered_video.mp4"
-        logger.info(f"Video rendered to: {output_path}")
-        return output_path
-    async def add_subtitles(self, video_path, tts_script):
-        """Add subtitles to video"""
-        logger.info("Adding subtitles to video...")
-        # Simplified implementation - replace with actual subtitle engine
-        # This would add subtitles in the middle of the screen
-        subtitles = self._generate_subtitle_segments(tts_script)
-        logger.info(f"Generated {len(subtitles)} subtitle segments")
-        # Placeholder for actual subtitle rendering
-        output_path = video_path.replace('.mp4', '_subtitled.mp4')
-        logger.info(f"Subtitled video saved to: {output_path}")
-        return output_path
-    def _generate_subtitle_segments(self, text):
-        """Generate subtitle segments from text"""
-        sentences = [s.strip() + '.' for s in text.split('.') if s.strip()]
-        segments = []
-        for i, sentence in enumerate(sentences):
-            segments.append({
-                'text': sentence,
-                'start_time': i * 3,  # 3 seconds per segment
-                'end_time': (i + 1) * 3,
-                'position': 'middle'  # Your nuance: middle of screen
-            })
-        return segments

 """
+Production video rendering engine with proper error handling and resource management
 """
+# Compatibility shim: Pillow 10 removed Image.ANTIALIAS, which moviepy 1.x still
+# references when resizing. This must run before moviepy is imported.
+import PIL.Image
+if not hasattr(PIL.Image, 'ANTIALIAS'):
+    PIL.Image.ANTIALIAS = PIL.Image.LANCZOS
 import os
+import tempfile
+from typing import List, Dict, Optional
+from pathlib import Path
+# Third-party and project imports
+from moviepy.editor import VideoFileClip, AudioFileClip, CompositeVideoClip, concatenate_videoclips, TextClip, CompositeAudioClip
+import numpy as np
+import textwrap
+from utils import logger, format_duration
 class VideoRenderer:
+    def __init__(self, config: Dict):
         self.config = config
+        self.temp_dir = Path(tempfile.mkdtemp())
+        logger.info(f"Initialized VideoRenderer with temp dir: {self.temp_dir}")
+    async def render_video(self, assets: Dict, video_config: Optional[Dict] = None) -> str:
+        """
+        Render final video composition with all assets
+        Args:
+            assets: Dictionary containing all video/audio assets
+            video_config: Video configuration (aspect ratio, style, etc.)
+        Returns:
+            Path to rendered video file
+        """
+        try:
+            logger.info("🎬 Starting video rendering pipeline")
+            # Validate inputs
+            if not self._validate_assets(assets):
+                raise ValueError("Invalid assets provided for video rendering")
+            # Load and prepare all assets
+            video_clips = await self._prepare_video_clips(assets)
+            audio_clips = await self._prepare_audio_clips(assets)
+            # Create video sequence
+            final_video = await self._create_video_sequence(video_clips, video_config)
+            # Add audio
+            final_video = await self._add_audio_track(final_video, audio_clips)
+            # Add subtitles if script provided
+            if assets.get('tts_script'):
+                final_video = await self._add_subtitles(final_video, assets['tts_script'])
+            # Render final video
+            output_path = await self._render_final_video(final_video)
+            # Release every clip reader, including the audio clips
+            self._cleanup_temp_files(video_clips + audio_clips + [final_video])
+            logger.info(f"✅ Video rendering completed: {output_path}")
+            return output_path
+        except Exception as e:
+            logger.error(f"❌ Video rendering failed: {e}")
+            raise
+    async def _prepare_video_clips(self, assets: Dict) -> List[VideoFileClip]:
+        """Load and prepare all video clips"""
+        clips = []
+        try:
+            # Load RunwayML hook video
+            if assets.get('hook_video'):
+                hook_clip = VideoFileClip(assets['hook_video']['local_path'])
+                hook_clip = hook_clip.without_audio()
+                clips.append(('hook', hook_clip))
+                logger.info(f"✓ Loaded hook video: {hook_clip.duration:.2f}s")
+            # Load library videos
+            for i, lib_video in enumerate(assets.get('selected_videos', [])):
+                if lib_video.get('local_path'):
+                    lib_clip = VideoFileClip(lib_video['local_path'])
+                    lib_clip = lib_clip.without_audio()
+                    clips.append((f'library_{i}', lib_clip))
+                    logger.info(f"✓ Loaded library video {i}: {lib_clip.duration:.2f}s")
+            return [clip for _, clip in clips]
+        except Exception as e:
+            logger.error(f"❌ Failed to prepare video clips: {e}")
+            # Cleanup on error
+            for name, clip in clips:
+                clip.close()
+            raise
+    async def _prepare_audio_clips(self, assets: Dict) -> List[AudioFileClip]:
+        """Load and prepare all audio clips with proper error handling"""
+        clips = []
+        try:
+            # Load TTS audio
+            if assets.get('tts_audio') and assets['tts_audio'].get('local_path'):
+                try:
+                    tts_clip = AudioFileClip(assets['tts_audio']['local_path'])
+                    # Ensure the clip has proper duration
+                    if tts_clip.duration > 0:
+                        clips.append(('tts', tts_clip))
+                        logger.info(f"✓ Loaded TTS audio: {tts_clip.duration:.2f}s")
+                    else:
+                        logger.warning("⚠️ TTS audio has zero duration")
+                        tts_clip.close()
+                except Exception as e:
+                    logger.error(f"❌ Failed to load TTS audio: {e}")
+            # Load background music
+            if assets.get('background_music_local'):
+                try:
+                    bg_clip = AudioFileClip(assets['background_music_local'])
+                    # Ensure the clip has proper duration
+                    if bg_clip.duration > 0:
+                        # Lower background music to 30% volume so it sits under the TTS voice-over
+                        bg_clip = bg_clip.volumex(0.3)
+                        clips.append(('background', bg_clip))
+                        logger.info(f"✓ Loaded background music: {bg_clip.duration:.2f}s")
+                    else:
+                        logger.warning("⚠️ Background music has zero duration")
+                        bg_clip.close()
+                except Exception as e:
+                    logger.error(f"❌ Failed to load background music: {e}")
+            return [clip for _, clip in clips]
+        except Exception as e:
+            logger.error(f"❌ Failed to prepare audio clips: {e}")
+            # Cleanup on error
+            for name, clip in clips:
+                try:
+                    clip.close()
+                except Exception:
+                    pass
+            raise
+    async def _create_video_sequence(self, video_clips: List[VideoFileClip],
+                                   video_config: Optional[Dict]) -> VideoFileClip:
+        """Create the final video sequence with proper timing"""
+        try:
+            if not video_clips:
+                raise ValueError("No video clips available for sequence")
+            # Calculate total available duration (max 15 seconds)
+            max_duration = 15.0
+            current_duration = sum(clip.duration for clip in video_clips)
+            if current_duration > max_duration:
+                logger.warning(f"⚠️ Video sequence too long ({current_duration:.1f}s), will trim to {max_duration}s")
+                video_clips = self._trim_clips_to_fit(video_clips, max_duration)
+            # Resize all clips to target aspect ratio (9:16 vertical)
+            target_size = (1080, 1920)  # 9:16 vertical
+            resized_clips = [self._resize_for_vertical(clip, target_size) for clip in video_clips]
+            # Create sequence
+            final_sequence = concatenate_videoclips(resized_clips)
+            logger.info(f"✓ Created video sequence: {final_sequence.duration:.2f}s")
+            return final_sequence
+        except Exception as e:
+            logger.error(f"❌ Failed to create video sequence: {e}")
+            for clip in video_clips:
+                clip.close()
+            raise
+    def _resize_for_vertical(self, clip: VideoFileClip, target_size: tuple) -> VideoFileClip:
+        """Resize clip to fit vertical 9:16 aspect ratio"""
+        target_w, target_h = target_size
+        clip_aspect = clip.w / clip.h
+        target_aspect = target_w / target_h
+        if clip_aspect > target_aspect:
+            # Clip is wider, fit to height and crop width
+            new_clip = clip.resize(height=target_h)
+        else:
+            # Clip is taller, fit to width and crop height
+            new_clip = clip.resize(width=target_w)
+        # Center crop to the exact target size
+        try:
+            # Preferred form: crop around the center point
+            new_clip = new_clip.crop(
+                x_center=new_clip.w / 2,
+                y_center=new_clip.h / 2,
+                width=target_w,
+                height=target_h
+            )
+        except Exception:
+            # Fallback: crop with explicit corner coordinates
+            x1 = (new_clip.w - target_w) // 2
+            y1 = (new_clip.h - target_h) // 2
+            new_clip = new_clip.crop(x1=x1, y1=y1, x2=x1+target_w, y2=y1+target_h)
+        return new_clip
+    def _trim_clips_to_fit(self, clips: List[VideoFileClip], max_duration: float) -> List[VideoFileClip]:
+        """Trim video clips to fit within max duration"""
+        trimmed_clips = []
+        remaining_duration = max_duration
+        for clip in clips:
+            if remaining_duration <= 0:
+                break
+            use_duration = min(clip.duration, remaining_duration)
+            if use_duration < clip.duration:
+                trimmed_clip = clip.subclip(0, use_duration)
+                trimmed_clips.append(trimmed_clip)
+                logger.info(f"Trimmed clip from {clip.duration:.1f}s to {use_duration:.1f}s")
+            else:
+                trimmed_clips.append(clip)
+            remaining_duration -= use_duration
+        return trimmed_clips
+    async def _add_audio_track(self, video_clip: VideoFileClip, audio_clips: List[AudioFileClip]) -> VideoFileClip:
+        """Add audio track to video with proper timing"""
+        if not audio_clips:
+            return video_clip
+        try:
+            # Filter out invalid audio clips
+            valid_audio_clips = []
+            for clip in audio_clips:
+                if clip.duration > 0:
+                    valid_audio_clips.append(clip)
+                else:
+                    logger.warning("⚠️ Skipping audio clip with zero duration")
+                    clip.close()
+            if not valid_audio_clips:
+                return video_clip
+            # Mix all valid audio clips
+            mixed_audio = CompositeAudioClip(valid_audio_clips)
+            # Ensure audio doesn't exceed video duration
+            video_duration = video_clip.duration
+            if mixed_audio.duration > video_duration:
+                logger.info(f"Trimming audio from {mixed_audio.duration:.2f}s to {video_duration:.2f}s")
+                mixed_audio = mixed_audio.subclip(0, video_duration)
+            # Add audio to video
+            video_with_audio = video_clip.set_audio(mixed_audio)
+            logger.info(f"✓ Added audio track: {mixed_audio.duration:.2f}s")
+            return video_with_audio
+        except Exception as e:
+            logger.error(f"❌ Failed to add audio track: {e}")
+            # Cleanup audio clips
+            for clip in audio_clips:
+                try:
+                    clip.close()
+                except Exception:
+                    pass
+            return video_clip
+    async def _add_subtitles(self, video_clip: VideoFileClip, script: str) -> CompositeVideoClip:
+        """Add animated subtitles to video"""
+        try:
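+            phrases = self._split_script_into_phrases(script)
+            if not phrases:
+                # Avoid dividing by zero below when the script has no usable text
+                logger.warning("⚠️ No subtitle phrases found in script, skipping subtitles")
+                return video_clip
+            text_clips = []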
+            total_duration = video_clip.duration
+            duration_per_phrase = total_duration / len(phrases)
+            fade_duration = 0.3
+            target_width, target_height = video_clip.size
+            for i, phrase in enumerate(phrases):
+                start_time = i * duration_per_phrase
+                # Word wrapping for vertical format
+                max_chars_per_line = 25
+                wrapped_text = '\n'.join(textwrap.wrap(phrase, width=max_chars_per_line))
+                # Create text clip
+                text_clip = TextClip(
+                    txt=wrapped_text,
+                    fontsize=65,
+                    color='yellow' if i % 2 == 1 else 'white',
+                    font='Helvetica',
+                    stroke_color='black',
+                    stroke_width=4,
+                    method='caption',
+                    size=(int(target_width * 0.85), None)
+                )
+                # Position in center-upper area (safe zone for vertical video)
+                vertical_position = int(target_height * 0.40)
+                text_clip = text_clip.set_position(('center', vertical_position))
+                text_clip = text_clip.set_start(start_time)
+                text_clip = text_clip.set_duration(duration_per_phrase)
+                # Fade each caption in and out
+                text_clip = text_clip.crossfadein(fade_duration).crossfadeout(fade_duration)
+                text_clips.append(text_clip)
+            # Combine video with subtitles
+            final_video = CompositeVideoClip([video_clip] + text_clips)
+            logger.info(f"✓ Added {len(text_clips)} subtitle segments")
+            return final_video
+        except Exception as e:
+            logger.error(f"❌ Failed to add subtitles: {e}")
+            return video_clip
+    def _split_script_into_phrases(self, script: str) -> List[str]:
+        """Split script into subtitle phrases"""
+        # Naive split on '.' only ('?' and '!' do not end a phrase) - can be enhanced with NLP
+        sentences = [s.strip() + '.' for s in script.split('.') if s.strip()]
+        return sentences[:6]  # Limit to 6 phrases max
+    async def _render_final_video(self, video_clip: VideoFileClip) -> str:
+        """Render final video to file"""
+        output_path = self.temp_dir / "final_video.mp4"
+        try:
+            logger.info("📹 Rendering final video file...")
+            video_clip.write_videofile(
+                str(output_path),
+                codec='libx264',
+                audio_codec='aac',
+                temp_audiofile=str(self.temp_dir / 'temp_audio.m4a'),
+                remove_temp=True,
+                fps=24,
+                verbose=False,
+                logger=None  # Suppress moviepy progress bars
+            )
+            logger.info(f"✓ Final video rendered: {output_path}")
+            return str(output_path)
+        except Exception as e:
+            logger.error(f"❌ Final video rendering failed: {e}")
+            raise
+        finally:
+            video_clip.close()
+    def _validate_assets(self, assets: Dict) -> bool:
+        """Validate that required assets are present"""
+        required = ['selected_videos', 'tts_audio']
+        for req in required:
+            # An empty list or dict counts as missing
+            if not assets.get(req):
+                logger.error(f"Missing required asset: {req}")
+                return False
+        return True
+    def _cleanup_temp_files(self, clips: List):
+        """Clean up temporary video/audio clips"""
+        for clip in clips:
+            try:
+                clip.close()
+            except Exception:
+                pass
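+    def __del__(self):
+        """Remove the temp dir on destruction (this also deletes the rendered output file)"""
+        try:
+            import shutil
+            # temp_dir holds final_video.mp4, so callers must copy or upload it first
+            if self.temp_dir.exists():
+                shutil.rmtree(self.temp_dir, ignore_errors=True)
+        except Exception:
+            pass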
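
Review note on `_resize_for_vertical`: the scale-then-center-crop geometry can be checked without moviepy. The sketch below is not part of this commit. `cover_crop_box` is a hypothetical helper name, and the 1080x1920 defaults simply mirror the `target_size` used above. It computes the scaled size that fully covers the target frame and the crop box centered inside it, under the assumption that rounding up to the target size is acceptable.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]


def cover_crop_box(src_w: int, src_h: int,
                   target_w: int = 1080, target_h: int = 1920) -> Tuple[Tuple[int, int], Box]:
    """Return (scaled_size, crop_box) for a scale-to-cover plus center crop.

    Hypothetical helper for illustration only; it mirrors the intent of
    _resize_for_vertical but does not touch any moviepy objects.
    """
    if min(src_w, src_h, target_w, target_h) <= 0:
        raise ValueError("all dimensions must be positive")

    # Pick the larger scale factor so both scaled dimensions reach the target.
    scale = max(target_w / src_w, target_h / src_h)

    # Guard against float rounding leaving a dimension one pixel short.
    scaled_w = max(target_w, round(src_w * scale))
    scaled_h = max(target_h, round(src_h * scale))

    # Center the crop window inside the scaled frame.
    x1 = (scaled_w - target_w) // 2
    y1 = (scaled_h - target_h) // 2
    return (scaled_w, scaled_h), (x1, y1, x1 + target_w, y1 + target_h)


if __name__ == "__main__":
    # A landscape 1920x1080 source is scaled to full height, then cropped in width.
    print(cover_crop_box(1920, 1080))
```

The returned box uses the same `x1, y1, x2, y2` convention as the fallback branch in `_resize_for_vertical`, so the two can be compared directly.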