
System overview – Ytapp YouTube Music Mood-based Recommender

Independent FastAPI service for mood-based YouTube Music recommendations using emotionAI's DeepFace + Gemini API approach.

Features

  • Text emotion analysis (Gemini API) → YouTube Music song recommendations
  • Face emotion analysis (DeepFace fast / Gemini API accurate) → YouTube Music song recommendations
  • Search songs, artists, and get song details
  • Standalone service (independent of main VibeCheck app)

Setup

  1. Install dependencies:

    pip install -r requirements.txt --break-system-packages
    
  2. Configure Gemini API Key (Optional but recommended):

    • Create a .env file in the Ytapp directory
    • Add: GEMINI_API_KEY=your_api_key_here
    • Without a Gemini API key, face analysis falls back to DeepFace only
    • Text analysis requires a Gemini API key
  3. Optional: OAuth Authentication

    • For authenticated requests (library management, playlists, etc.), set up OAuth:
    • Get a client ID and secret for the YouTube Data API
    • Select OAuth client ID → TVs and Limited Input devices
    • Run: ytmusicapi oauth
    • Follow instructions to create oauth.json
    • Pass the oauth.json path (and, on recent ytmusicapi versions, the client credentials) to YTMusic() when authenticated calls are needed
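
The key-dependent behaviour described in step 2 can be sketched as a small capability check. This is a minimal illustration, not code taken from Ytapp: the function name `available_analyzers` and the returned keys are hypothetical, and it only reads an environment mapping (loading the .env file itself is assumed to happen elsewhere).

```python
import os


def available_analyzers(env=None):
    """Report which analysis paths are usable for a given environment.

    Hypothetical helper for illustration; the names are not from Ytapp.
    """
    env = os.environ if env is None else env
    # Treat a missing or blank GEMINI_API_KEY as "no key configured".
    has_key = bool(env.get("GEMINI_API_KEY", "").strip())
    return {
        "text": has_key,          # text analysis requires the Gemini API
        "face_gemini": has_key,   # accurate face analysis requires the Gemini API
        "face_deepface": True,    # DeepFace needs no API key
    }
```

Calling `available_analyzers()` with no argument inspects the current process environment, so it could be logged at startup to show which endpoints will work.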

Running Locally

python app.py

The service will run on http://localhost:7860

Docker

docker build -t ytapp .
docker run -p 7860:7860 -e GEMINI_API_KEY=your_key_here ytapp

API Endpoints

Health Check

  • GET /health - Service health status

Search

  • GET /search?query=...&limit=20 - Search for songs
  • GET /song/{video_id} - Get song details
  • GET /artists/search?query=...&limit=10 - Search for artists
  • GET /artists/{artist_id}/songs?limit=50 - Get artist songs
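
As a rough illustration of how a client might address the search endpoints above, the sketch below builds the request URLs with the standard library. The helper names are hypothetical, and the base URL assumes the default local address given in this README.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:7860"  # default local address from this README


def search_url(query, limit=20, base_url=BASE_URL):
    """Build the URL for GET /search (hypothetical helper)."""
    return f"{base_url}/search?{urlencode({'query': query, 'limit': limit})}"


def artist_songs_url(artist_id, limit=50, base_url=BASE_URL):
    """Build the URL for GET /artists/{artist_id}/songs (hypothetical helper)."""
    # Percent-encode the path segment so unusual IDs cannot break the path.
    return f"{base_url}/artists/{quote(artist_id, safe='')}/songs?{urlencode({'limit': limit})}"
```

With the service running, `urllib.request.urlopen(search_url("daft punk"))` would send the request; the shape of the search response is not documented here, so parsing it is left out.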

Mood-based Recommendations

  • POST /mood/text - Get song recommendation from text mood (uses Gemini API)
    {"text": "I feel happy today!"}
    
  • POST /mood/face - Get song recommendation from face image (tries Gemini, falls back to DeepFace)
    Form data: file (image/jpeg)
    
  • POST /mood/face/live - Fast face analysis using DeepFace only (for live video feeds)
    Form data: file (image/jpeg)
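
A call to POST /mood/text could be prepared along the following lines. This is a sketch using only the standard library; the helper name `mood_text_request` is hypothetical, and the base URL assumes the default local address.

```python
import json
from urllib.request import Request


def mood_text_request(text, base_url="http://localhost:7860"):
    """Prepare (but do not send) a POST /mood/text request."""
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        f"{base_url}/mood/text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. The face endpoints expect multipart form data with a `file` field, which the standard library does not build directly, so a separate HTTP client is usually more convenient there.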
    

Response Format

Mood endpoints return:

{
  "mood_label": "joy",
  "mood_score": 0.95,
  "video_id": "dQw4w9WgXcQ",
  "title": "Song Title",
  "artists": ["Artist Name"],
  "album": "Album Name",
  "duration": "3:45",
  "image_url": "https://...",
  "external_url": "https://music.youtube.com/watch?v=..."
}
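
On the client side, the payload above might be loaded into a typed structure as sketched below. The class and method names are hypothetical, and the sketch assumes every field is present and that `duration` is a colon-separated string as in the sample; whether the service can return null for fields such as `album` is not stated here.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class MoodRecommendation:
    mood_label: str
    mood_score: float
    video_id: str
    title: str
    artists: List[str]
    album: str
    duration: str
    image_url: str
    external_url: str

    @classmethod
    def from_dict(cls, payload):
        """Build an instance, failing loudly if a documented field is absent."""
        names = [f.name for f in fields(cls)]
        missing = [name for name in names if name not in payload]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        # Ignore any extra keys the service may add later.
        return cls(**{name: payload[name] for name in names})

    def duration_seconds(self):
        """Convert "m:ss" or "h:mm:ss" into a number of seconds."""
        total = 0
        for part in self.duration.split(":"):
            total = total * 60 + int(part)
        return total
```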

Emotion Detection Methods

Face Analysis

  1. Primary (if a Gemini API key is available): Uses Gemini 1.5 Flash for accurate emotion detection
  2. Fallback: Uses DeepFace for fast offline emotion detection
  3. Live mode: Uses DeepFace only for real-time video feeds
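
The three-step order above can be sketched as a fallback chain. This is an illustration under stated assumptions rather than the service's actual code: the analyzers are passed in as plain callables returning `(label, score)`, the function name `analyze_face` is hypothetical, and treating any Gemini exception as a reason to fall back is an assumption.

```python
def analyze_face(image_bytes, gemini_analyzer, deepface_analyzer, live=False):
    """Return (label, score, backend) following the documented order.

    gemini_analyzer may be None when no API key is configured.
    Both analyzers are hypothetical callables: bytes -> (label, score).
    """
    if not live and gemini_analyzer is not None:
        try:
            label, score = gemini_analyzer(image_bytes)
            return label, score, "gemini"
        except Exception:
            # Assumption: any Gemini failure (network, quota, parsing)
            # falls through to DeepFace instead of failing the request.
            pass
    # Live mode, a missing key, or a Gemini failure all end up here.
    label, score = deepface_analyzer(image_bytes)
    return label, score, "deepface"
```

Injecting the analyzers keeps the ordering logic testable without loading DeepFace models or calling the Gemini API.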

Text Analysis

  • Uses Gemini API to analyze emotional tone of text
  • Requires GEMINI_API_KEY to be set

Emotion Mapping

  • Joy/Happy → Happy, upbeat, dance music
  • Sad → Sad songs, ballads, emotional music
  • Anger → Rock, metal, intense music
  • Fear → Calm, ambient, meditation music
  • Surprise → Experimental, indie, alternative music
  • Neutral → Chill, background, easy listening
  • Disgust → Alternative rock, indie music
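
One way to express the mapping above is a lookup table from emotion label to a search phrase. The sketch below is illustrative only: the exact query strings, the alias table, and the fallback to "neutral" for unknown labels are assumptions, not behaviour confirmed by the Ytapp source. The aliases cover DeepFace's own label spellings ("happy", "angry").

```python
# Search phrases paraphrased from the mapping list; the wording is an assumption.
MOOD_QUERIES = {
    "joy": "happy upbeat dance music",
    "sad": "sad songs ballads emotional music",
    "anger": "rock metal intense music",
    "fear": "calm ambient meditation music",
    "surprise": "experimental indie alternative music",
    "neutral": "chill background easy listening",
    "disgust": "alternative rock indie music",
}

# Normalise label variants from different detectors onto the keys above.
ALIASES = {"happy": "joy", "angry": "anger", "sadness": "sad"}


def mood_to_query(label):
    """Map an emotion label to a search phrase (hypothetical helper)."""
    key = label.strip().lower()
    key = ALIASES.get(key, key)
    # Assumption: unrecognised labels are treated as neutral.
    return MOOD_QUERIES.get(key, MOOD_QUERIES["neutral"])
```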

Notes

  • DeepFace models are preloaded during Docker build for faster startup
  • Gemini API provides more accurate emotion detection but requires an API key
  • DeepFace works offline and is faster for live video feeds
  • No authentication required for basic searches and mood recommendations
  • OAuth optional for library management and playlist creation