| title: English Accent Classifier | |
| emoji: 🗣️ | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 5.30.0 | |
| app_file: app.py | |
| pinned: false | |
| # English Accent Classifier with Video Analysis | |
| This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine. | |
| ## How it Works | |
| 1. **Input Video:** Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file. | |
| 2. **Video Processing:** The application downloads the video from the URL, or reads the uploaded file directly. | |
| 3. **Audio Extraction:** The full audio and a short segment (15 seconds) are extracted. | |
| 4. **Language Detection:** The short audio is transcribed, and the language is detected. | |
| 5. **Accent Classification (if English):** A longer audio segment (adjustable duration) is analyzed for English accent. | |
| 6. **Results:** The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed. | |
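The control flow of steps 4 and 5 can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the `analyze` function, the `AnalysisResult` type, and the three callables are hypothetical names, and the assumption that the language detector returns a label such as `"en"` or `"English"` is not confirmed by this README.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class AnalysisResult:
    language: str
    accent: Optional[str] = None
    confidence: Optional[float] = None


def analyze(
    short_audio: str,
    long_audio: str,
    transcribe: Callable[[str], str],
    detect_language: Callable[[str], str],
    classify_accent: Callable[[str], Tuple[str, float]],
) -> AnalysisResult:
    """Run language detection first; classify the accent only for English."""
    # Step 4: transcribe the short segment, then detect the language.
    text = transcribe(short_audio)
    language = detect_language(text)

    # Assumed label format: the real detector may use different labels.
    if language.strip().lower() not in {"en", "english"}:
        return AnalysisResult(language=language)

    # Step 5: analyze the longer segment only when the language is English.
    accent, confidence = classify_accent(long_audio)
    return AnalysisResult(language=language, accent=accent, confidence=confidence)
```

Passing the three model calls in as plain callables keeps the gating logic easy to test with stand-in functions, without downloading any model.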
| ## Features | |
| * **English Accent Classification:** Predicts the accent in English audio. | |
| * **Language Detection:** Ensures the audio is English before accent analysis. | |
| * **Flexible Video Input:** Supports URLs and file uploads. | |
| * **Adjustable Analysis Duration:** Users can set the audio analysis length. | |
| * **Audio Playback:** Allows users to listen to the extracted audio. | |
| ## Tech Stack | |
| * [Gradio](https://gradio.app/): Interactive web UI. | |
| * [Hugging Face Transformers](https://huggingface.co/transformers/): Pre-trained models and pipelines. | |
| * [Requests](https://requests.readthedocs.io/en/latest/): Downloading video files. | |
| * [MoviePy](https://zulko.github.io/moviepy/): Video editing for audio extraction. | |
| * [PyTorch](https://pytorch.org/): Underlying deep learning framework. | |
| * [Soundfile](https://pysoundfile.readthedocs.io/en/latest/): Audio file handling. | |
| ## Models Used | |
| * **Accent Classification:** `dima806/english_accents_classification` | |
| * **Language Detection:** `alexneakameni/language_detection` | |
| * **Automatic Speech Recognition:** `openai/whisper-tiny.en` | |
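One way to load these three models is through the Transformers `pipeline` helper. The sketch below is an assumption about how that could look, not a copy of `app.py`: the task name paired with each model (`audio-classification`, `text-classification`, `automatic-speech-recognition`) is inferred from the model's role and should be checked against each model card, and `load_pipelines` is a hypothetical helper.

```python
# Model identifiers taken from the list above.
MODEL_IDS = {
    "accent": "dima806/english_accents_classification",
    "language": "alexneakameni/language_detection",
    "asr": "openai/whisper-tiny.en",
}

# Assumed pipeline task for each model; verify against the model cards.
TASKS = {
    "accent": "audio-classification",
    "language": "text-classification",
    "asr": "automatic-speech-recognition",
}


def load_pipelines():
    """Build one Transformers pipeline per model (downloads weights on first use)."""
    # Imported lazily so this module can be inspected without transformers installed.
    from transformers import pipeline

    return {name: pipeline(TASKS[name], model=MODEL_IDS[name]) for name in MODEL_IDS}
```

Calling `load_pipelines()` once at startup and reusing the returned objects avoids reloading the weights on every request.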
| ## Usage | |
| You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results will be displayed below. | |
| ### Input Formats | |
| * **Uploaded Video Files:** `.mp4` | |
| * **Video URLs:** | |
|   * Direct MP4 links (ending in `.mp4`) | |
|   * Loom video share links (`https://www.loom.com/share/...`) | |
|   * Dropbox direct download links (MP4 links ending in `?dl=1`) | |
|   * Google Drive direct download links (`https://drive.google.com/uc?id=...&export=download`) | |
| ### Unsupported Formats | |
| * Webpages embedding videos (e.g., YouTube, news articles). | |
| * Dropbox shared folder links. | |
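The supported and unsupported URL rules above can be expressed as a small classifier. This is a sketch under stated assumptions rather than the application's actual validation: `classify_video_url` and its return labels are hypothetical, and the host and path checks are one reasonable reading of the formats listed here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def classify_video_url(url: str) -> Optional[str]:
    """Return a label for a supported URL kind, or None if unsupported."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or not host:
        return None

    path = parsed.path
    query = parse_qs(parsed.query)

    # Loom share links: https://www.loom.com/share/...
    if host == "loom.com" or host.endswith(".loom.com"):
        return "loom" if path.startswith("/share/") else None

    # Dropbox: only direct MP4 downloads (?dl=1); shared folders are rejected.
    if host == "dropbox.com" or host.endswith(".dropbox.com"):
        is_direct = query.get("dl") == ["1"] and path.lower().endswith(".mp4")
        return "dropbox" if is_direct else None

    # Google Drive: https://drive.google.com/uc?id=...&export=download
    if host == "drive.google.com":
        is_direct = (
            path == "/uc" and "id" in query and query.get("export") == ["download"]
        )
        return "google_drive" if is_direct else None

    # Any other host: accept only paths ending in .mp4.
    if path.lower().endswith(".mp4"):
        return "direct_mp4"

    # Webpages that merely embed a video (e.g., YouTube) end up here.
    return None
```

A `None` result corresponds to the "Invalid URL" case: the link is either malformed or points to a page rather than a video file.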
| ## FFmpeg Requirement | |
| This application requires [FFmpeg](https://ffmpeg.org/) to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website. | |
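A quick way to confirm that FFmpeg is reachable, and to see what an extraction command looks like, is sketched below. The application itself extracts audio through MoviePy; this example calls the `ffmpeg` CLI directly as an illustration. The helper names, the mono 16 kHz output settings, and the file names are assumptions, not values taken from `app.py`.

```python
import shutil
from typing import List, Optional


def find_ffmpeg() -> Optional[str]:
    """Return the path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def build_extract_command(
    video_path: str,
    audio_path: str,
    duration_s: Optional[float] = None,
    ffmpeg: str = "ffmpeg",
) -> List[str]:
    """Build an ffmpeg command that writes the video's audio track to audio_path."""
    cmd = [
        ffmpeg,
        "-y",            # overwrite the output file if it exists
        "-i", video_path,
        "-vn",           # drop the video stream
        "-ac", "1",      # mono (assumed setting)
        "-ar", "16000",  # 16 kHz sample rate (assumed setting)
    ]
    if duration_s is not None:
        cmd += ["-t", str(duration_s)]  # keep only the first duration_s seconds
    cmd.append(audio_path)
    return cmd


if __name__ == "__main__":
    if find_ffmpeg() is None:
        print("FFmpeg not found on PATH; install it before running the app.")
    else:
        # Pass the list to subprocess.run(..., check=True) to execute it.
        print(build_extract_command("input.mp4", "segment.wav", duration_s=15))
```

If `find_ffmpeg()` returns `None`, audio and video processing errors are expected until FFmpeg is installed and on `PATH`.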
| ## Troubleshooting | |
| * **"Invalid URL"**: Ensure the URL matches one of the supported formats listed under Input Formats. | |
| * **Audio/Video Processing Errors**: Likely due to missing or incorrectly configured FFmpeg. | |
| * **Transcription Errors**: Audio may be unclear or contain little speech in the initial 15 seconds. | |
| * **Non-English Language Detection**: Accent classification runs only when the detected language is English; for any other language, only the detected language is reported. | |
| ## Citation | |
| If you use this application in your work, please consider citing the original models and the libraries used. | |
| ```bibtex | |
| @misc{dima806_english_accents_classification, | |
| author = {dima806}, | |
| title = {dima806/english_accents_classification}, | |
| year = {2024}, | |
| note = {Oct 19, 2024}, | |
| howpublished = {https://huggingface.co/dima806/english_accents_classification} | |
| } | |
| @misc{alexneakameni_language_detection, | |
| author = {alexneakameni}, | |
| title = {language_detection}, | |
| year = {2024}, | |
| note = {Oct 19, 2024}, | |
| howpublished = {https://huggingface.co/alexneakameni/language_detection} | |
| } |