| --- | |
| title: SmolVLM2 Video Highlights | |
| emoji: "🎬" | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: docker | |
| pinned: false | |
| license: apache-2.0 | |
| app_port: 7860 | |
| --- | |
| # SmolVLM2 HuggingFace Segment-Based Video Highlights API | |
| Generate video highlights using Hugging Face's segment-based approach | |
| This FastAPI service applies Hugging Face's segment-based classification method with SmolVLM2-256M-Video-Instruct, so every video is analyzed the same way, one fixed-length segment at a time. | |
| 🚀 Features | |
| Segment-Based Analysis: Processes videos in fixed 5-second segments for consistent AI classification | |
| Dual Criteria Generation: Creates two different highlight criteria sets and selects the most selective one | |
| SmolVLM2-256M-Video-Instruct: Compact 256M-parameter model chosen for faster processing, with specialized video understanding | |
| Visual Effects: Optional fade transitions between segments for professional-quality output | |
| REST API: Upload a video and receive a generated video description plus the path to the analysis file | |
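The fixed-length segmentation described in the feature list can be sketched as follows. This is a minimal illustration, not the service's actual code: the function name `split_into_segments` is hypothetical, and treating the final, shorter remainder as its own segment is an assumption.

```python
from typing import List, Tuple


def split_into_segments(
    duration: float, segment_length: float = 5.0
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds covering the whole video.

    Hypothetical helper: the last segment may be shorter than
    segment_length when the duration is not an exact multiple.
    """
    if duration <= 0 or segment_length <= 0:
        raise ValueError("duration and segment_length must be positive")

    segments: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration:
        end = min(start + segment_length, duration)
        segments.append((start, end))
        start += segment_length
    return segments


print(split_into_segments(12.0))
# [(0.0, 5.0), (5.0, 10.0), (10.0, 12.0)]
```

Each `(start, end)` pair would then be classified independently, which is what keeps the per-segment input size constant for the model.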
| 🔗 API Endpoints | |
| POST /upload-video - Upload video and receive analysis response | |
| GET /health - Health check | |
| 📱 Usage | |
| Via API | |
| # Upload video with optional parameters | |
| curl -X POST \ | |
| -F "video=@your_video.mp4" \ | |
| -F "segment_length=5.0" \ | |
| -F "model_name=HuggingFaceTB/SmolVLM2-256M-Video-Instruct" \ | |
| -F "with_effects=true" \ | |
| https://your-space-url.hf.space/upload-video | |
| Example response: | |
| { | |
| "success": true, | |
| "message": "Video description generated successfully", | |
| "video_description": "A concise description of the uploaded video...", | |
| "analysis_file": "/tmp/outputs/<uuid>_analysis.json" | |
| } | |
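The same request can be made from Python. The sketch below assumes the third-party `requests` package is installed and that a failed upload also returns JSON with `success` set to false and a `message` field; the error shape is an assumption, since only the success response is documented here. The function names are hypothetical.

```python
import json
from typing import Any, Dict


def parse_upload_response(body: str) -> Dict[str, Any]:
    """Extract the documented fields from the /upload-video JSON response."""
    data = json.loads(body)
    if not data.get("success"):
        # Assumed error shape: {"success": false, "message": "..."}
        raise RuntimeError(data.get("message", "upload failed"))
    return {
        "description": data["video_description"],
        "analysis_file": data["analysis_file"],
    }


def upload_video(
    base_url: str,
    video_path: str,
    segment_length: float = 5.0,
    with_effects: bool = True,
) -> Dict[str, Any]:
    """POST a video as multipart form data, mirroring the curl example."""
    import requests  # third-party; imported lazily so parsing works without it

    with open(video_path, "rb") as fh:
        resp = requests.post(
            f"{base_url}/upload-video",
            files={"video": fh},
            data={
                "segment_length": str(segment_length),
                "with_effects": str(with_effects).lower(),
            },
            timeout=600,  # generous: model inference on a long video may be slow
        )
    resp.raise_for_status()
    return parse_upload_response(resp.text)
```

A call would look like `upload_video("https://your-space-url.hf.space", "your_video.mp4")`. Note that `analysis_file` is a path on the server, not a download URL.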
| Via Android App | |
| Use the provided Android client code to integrate with your mobile app. | |
| ⚙️ Configuration | |
| Default settings: | |
| Segment Length: 5 seconds (fixed segments for consistent classification) | |
| Model: SmolVLM2-256M-Video-Instruct (faster processing) | |
| Effects: Enabled (fade transitions between segments) | |
| Dual Criteria: Two prompt variations for robust selection | |
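One plausible reading of "most selective" is the criteria set that keeps the fewest segments while still keeping at least one. The sketch below illustrates that reading only; the function name, the data layout, and the tie-breaking rule are assumptions, not the service's documented behavior.

```python
from typing import Dict, List


def pick_most_selective(results: Dict[str, List[bool]]) -> str:
    """Pick the criteria set that flags the fewest segments as highlights.

    `results` maps a criteria-set name to one yes/no decision per segment.
    Sets that flag nothing are skipped unless every set flags nothing.
    Ties go to the set listed first.
    """
    if not results:
        raise ValueError("at least one criteria set is required")

    # Prefer sets that keep at least one segment (assumption: an empty
    # highlight reel is not a useful outcome).
    non_empty = {name: flags for name, flags in results.items() if any(flags)}
    candidates = non_empty or results
    return min(candidates, key=lambda name: sum(candidates[name]))


decisions = {
    "criteria_a": [True, True, False, True],
    "criteria_b": [False, True, False, False],
}
print(pick_most_selective(decisions))
# criteria_b
```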
| 🛠️ Technology Stack | |
| SmolVLM2-256M-Video-Instruct: Efficient vision-language model optimized for video understanding | |
| Hugging Face Transformers: Model loading and inference | |
| FastAPI: Modern web framework for APIs | |
| FFmpeg: Video processing with advanced filter support | |
| PyTorch: Deep learning framework with device optimization | |
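As an illustration of the FFmpeg filter support mentioned above, a fade-in and fade-out on a single segment can be expressed with FFmpeg's `fade` video filter. This is a sketch under assumptions: the helper names and the 0.5-second fade length are hypothetical, and the service may build its transitions differently (for example with crossfades between clips).

```python
from typing import List


def fade_filter(segment_duration: float, fade: float = 0.5) -> str:
    """Build a -vf string that fades a clip in at the start and out at the end."""
    # Never let the two fades overlap on very short clips.
    fade = min(fade, segment_duration / 2)
    out_start = segment_duration - fade
    return (
        f"fade=t=in:st=0:d={fade:g},"
        f"fade=t=out:st={out_start:g}:d={fade:g}"
    )


def ffmpeg_fade_command(
    src: str, dst: str, segment_duration: float, fade: float = 0.5
) -> List[str]:
    """Return an argument list suitable for subprocess.run()."""
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-vf", fade_filter(segment_duration, fade),
        "-c:a", "copy",  # audio is passed through unchanged
        dst,
    ]


print(fade_filter(5.0))
# fade=t=in:st=0:d=0.5,fade=t=out:st=4.5:d=0.5
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.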
| 🎯 Perfect For | |
| Social media content creators | |
| Educational video processing | |
| Meeting/lecture summarization | |
| Sports highlight generation | |
| Entertainment content curation | |
| 📄 License | |
| Apache 2.0 - Free for commercial and personal use | |
| 🤝 Contributing | |
| Built with ❤️ using Hugging Face Transformers and open-source AI models. | |