# Frame Extraction & Character Matching

This package turns raw video into character reference catalogs and lets you match new frames against those references. It is designed to be deployed quickly (e.g., on Hugging Face Spaces) for interactive character discovery.

## Features

- Shot-aware frame sampling to keep only useful stills.
- Face detection, embedding, and clustering (MTCNN + InceptionResnet).
- Automatic reference selection per character (sharpest, most frontal crop).
- JSON catalog output and optional reference thumbnails.
- Matching API/CLI for user-uploaded frames with multi-character support.
- Gradio app template ready for Hugging Face hosting.

## Install

```bash
cd projects/UMO-Qwen-Edit/data_curation_scripts/frame_extraction
pip install -e .
```

## CLI Usage

### Build a catalog from a video

```bash
frame-catalog catalog \
  --video-path data/source.mp4 \
  --output-dir outputs/catalog \
  --frame-interval 12 \
  --min-track-length 5
```

### Match new frames against the catalog

```bash
frame-catalog match \
  --catalog-path outputs/catalog/catalog.json \
  --frames-dir uploads/ \
  --output-path outputs/matches.json
```

## Deploy on Hugging Face Spaces

1. Copy this folder to a new Space (Python SDK).
2. Install dependencies with `pip install -e .`.
3. Upload a pre-built `catalog/catalog.json` plus the `references/` images.
4. Set environment variables in the Space:
   - `FRAME_CATALOG=/home/user/app/catalog/catalog.json`
   - `FRAME_OUTPUT_DIR=/home/user/app/output`
5. Set the Space entrypoint to `python -m frame_extraction.app`.

## Outputs

- `catalog.json`: character reference metadata with embeddings and chosen frames.
- `references/`: cropped reference images per character.
- `matches.json`: mapping from user frames to character IDs with similarity scores.

## Roadmap

- Integrate more robust trackers (DeepSort/ByteTrack).
- Add active learning loop for manual character corrections.
- Expose REST endpoints for automated ingestion.
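
## Consuming the catalog programmatically

Because `catalog.json` is plain JSON with per-character embeddings, it can also be consumed outside this package, e.g. to score a new frame's face embedding against the catalog by cosine similarity. The sketch below illustrates that idea only; the field names (`characters`, `id`, `embedding`), the `best_match` helper, and the `threshold` value are assumptions for illustration, not the package's documented schema or API.

```python
import json
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def best_match(query_embedding, catalog, threshold=0.6):
    """Return (character_id, score) for the closest catalog entry,
    or (None, score) if nothing clears the threshold."""
    best_id, best_score = None, -1.0
    for char in catalog["characters"]:  # assumed schema
        score = cosine_similarity(query_embedding, char["embedding"])
        if score > best_score:
            best_id, best_score = char["id"], score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score

# Toy two-character catalog standing in for a real catalog.json:
catalog = json.loads("""
{
  "characters": [
    {"id": "char_0", "embedding": [1.0, 0.0, 0.0]},
    {"id": "char_1", "embedding": [0.0, 1.0, 0.0]}
  ]
}
""")

print(best_match([0.9, 0.1, 0.0], catalog))  # closest to char_0
```

In a real deployment the query embedding would come from the same MTCNN + InceptionResnet pipeline used to build the catalog, so that both sides live in the same embedding space; mixing embedding models would make the similarity scores meaningless.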