	---
	title: English Accent Classifier
	emoji: 🗣️
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 5.30.0
	app_file: app.py
	pinned: false
	---

	# English Accent Classifier with Video Analysis

	This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine.

	## How it Works

	1. Input Video: Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file.
	2. Video Processing: The application downloads/processes the video.
	3. Audio Extraction: The full audio and a short segment (15 seconds) are extracted.
	4. Language Detection: The 15-second segment is transcribed, and the language of the transcript is detected.
	5. Accent Classification (if English): A longer audio segment (adjustable duration) is analyzed for English accent.
	6. Results: The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed.
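The gating between steps 4 and 5 can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the `analyze` function, the `AnalysisResult` type, and the three callables are hypothetical stand-ins for the ASR, language-detection, and accent models, and the `"en"`/`"english"` label check is an assumption about what the language detector returns.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class AnalysisResult:
    language: str
    accent: Optional[str] = None
    accent_scores: Optional[Dict[str, float]] = None


def analyze(
    short_audio: str,
    long_audio: str,
    transcribe: Callable[[str], str],
    detect_language: Callable[[str], str],
    classify_accent: Callable[[str], Dict[str, float]],
) -> AnalysisResult:
    """Detect the language on the short clip; classify the accent only if English."""
    transcript = transcribe(short_audio)
    language = detect_language(transcript)

    # Assumption: the detector labels English as "en" or "english".
    if language.lower() not in ("en", "english"):
        return AnalysisResult(language=language)

    # Accent classification runs on the longer, user-adjustable segment.
    scores = classify_accent(long_audio)
    top_accent = max(scores, key=scores.get)
    return AnalysisResult(language=language, accent=top_accent, accent_scores=scores)
```

Passing the models in as callables keeps the control flow testable without downloading any weights; in a real app each callable would wrap the corresponding model.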

	## Features

	* English Accent Classification: Predicts the accent in English audio.
	* Language Detection: Ensures the audio is English before accent analysis.
	* Flexible Video Input: Supports URLs and file uploads.
	* Adjustable Analysis Duration: Users can set the audio analysis length.
	* Audio Playback: Allows users to listen to the extracted audio.

	## Tech Stack

	* [Gradio](https://gradio.app/): Interactive web UI.
	* [Hugging Face Transformers](https://huggingface.co/transformers/): Pre-trained models and pipelines.
	* [Requests](https://requests.readthedocs.io/en/latest/): Downloading video files.
	* [MoviePy](https://zulko.github.io/moviepy/): Video editing for audio extraction.
	* [PyTorch](https://pytorch.org/): Underlying deep learning framework.
	* [Soundfile](https://pysoundfile.readthedocs.io/en/latest/): Audio file handling.

	## Models Used

	* Accent Classification: `dima806/english_accents_classification`
	* Language Detection: `alexneakameni/language_detection`
	* Automatic Speech Recognition: `openai/whisper-tiny.en`
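One plausible way to load these three models is through the Transformers `pipeline` factory. The sketch below is an assumption about the wiring, not a copy of `app.py`: the model ids come from the list above, while the task names, the `MODEL_SPECS` table, and the `load_pipelines` helper are illustrative.

```python
from typing import Dict, Tuple

# role -> (pipeline task, model id).
# Model ids are taken from the list above; the task names are assumptions.
MODEL_SPECS: Dict[str, Tuple[str, str]] = {
    "accent": ("audio-classification", "dima806/english_accents_classification"),
    "language": ("text-classification", "alexneakameni/language_detection"),
    "asr": ("automatic-speech-recognition", "openai/whisper-tiny.en"),
}


def load_pipelines() -> dict:
    """Build one pipeline per role. Downloads model weights on first use."""
    # Imported lazily so this module can be inspected without transformers installed.
    from transformers import pipeline

    return {
        role: pipeline(task, model=model_id)
        for role, (task, model_id) in MODEL_SPECS.items()
    }
```

Calling `load_pipelines()` once at startup and reusing the returned objects avoids reloading weights on every request.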

	## Usage

	You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results will be displayed below.

	### Input Formats

	* Uploaded Video Files: `.mp4`
	* Video URLs:
	  * Direct MP4 links (ending in `.mp4`)
	  * Loom video share links (`https://www.loom.com/share/...`)
	  * Dropbox direct download links (MP4 links ending in `?dl=1`)
	  * Google Drive direct download links (`https://drive.google.com/uc?id=...&export=download`)
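The URL rules above could be checked with a small helper like the one below. It is a sketch using only the standard library; `classify_video_url` and its return labels are hypothetical, and the exact host and path checks are assumptions derived from the patterns listed above rather than the app's actual validation.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def _host_is(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def classify_video_url(url: str) -> Optional[str]:
    """Return a label for a supported URL form, or None if unsupported."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None

    host = parts.netloc.lower()
    path = parts.path
    query = parse_qs(parts.query)

    if _host_is(host, "loom.com"):
        return "loom" if path.startswith("/share/") else None

    if _host_is(host, "dropbox.com"):
        # Only direct-download MP4 links; shared folder links are rejected.
        is_direct = query.get("dl") == ["1"] and path.lower().endswith(".mp4")
        return "dropbox" if is_direct else None

    if host == "drive.google.com":
        is_direct = (
            path == "/uc"
            and "id" in query
            and query.get("export") == ["download"]
        )
        return "google_drive" if is_direct else None

    # Fallback: any other host serving a path that ends in .mp4.
    return "direct_mp4" if path.lower().endswith(".mp4") else None
```

A `None` result corresponds to the "Invalid URL" case, which covers embedding pages and Dropbox links without `?dl=1`.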

	### Unsupported Formats

	* Webpages embedding videos (e.g., YouTube, news articles).
	* Dropbox shared folder links.

	## FFmpeg Requirement

	This application requires [FFmpeg](https://ffmpeg.org/) to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website.
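To confirm that FFmpeg is reachable before processing a video, a check along these lines may help. The helper names are hypothetical, and the direct `ffmpeg` command is only an illustration of the extraction step (the app itself lists MoviePy for this); the 16 kHz mono output is an assumption commonly used for speech models.

```python
import shutil
from typing import List


def ffmpeg_available() -> bool:
    """True if an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_extract_command(
    video_path: str, wav_path: str, seconds: float = 15.0
) -> List[str]:
    """Build an ffmpeg command that writes the first `seconds` of audio as WAV."""
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it exists
        "-i", video_path,    # input video
        "-t", str(seconds),  # keep only the first `seconds` seconds
        "-vn",               # drop the video stream
        "-ac", "1",          # mono (assumption)
        "-ar", "16000",      # 16 kHz sample rate (assumption)
        wav_path,
    ]
```

The command list can be passed to `subprocess.run(cmd, check=True)` once `ffmpeg_available()` returns `True`.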

	## Troubleshooting

	* "Invalid URL": Ensure the URL meets the specified format requirements.
	* Audio/Video Processing Errors: Most likely caused by a missing or misconfigured FFmpeg installation.
	* Transcription Errors: Audio may be unclear or contain little speech in the initial 15 seconds.
	* Non-English Language Detection: Accent classification runs only when the detected language is English; other languages are reported without an accent prediction.

	## Citation

	If you use this application in your work, please consider citing the original models and the libraries used.

	```bibtex
	@misc{dima806_english_accents_classification,
	  author = {dima806},
	  title = {dima806/english\_accents\_classification},
	  year = {2024},
	  note = {Oct 19, 2024},
	  howpublished = {\url{https://huggingface.co/dima806/english_accents_classification}}
	}

	@misc{alexneakameni_language_detection,
	  author = {alexneakameni},
	  title = {language\_detection},
	  year = {2024},
	  note = {Oct 19, 2024},
	  howpublished = {\url{https://huggingface.co/alexneakameni/language_detection}}
	}
	```