import streamlit as st
from downloader import download_video
from extractor import extract_audio
from detector import detect_accent
import asyncio

# Set up async event loop
try:
    asyncio.get_running_loop()
except RuntimeError:
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)

st.set_page_config(page_title="Accentometer", page_icon="🎙️")

# Sidebar navigation
st.sidebar.title("🎙️ Accentometer")
page = st.sidebar.radio("", ["Home", "Info"])
# ---------------------------
# Page 1: Home
# ---------------------------
if page == "Home":
    st.title("🎙️ Accentometer")

    # ✅ Example video links
    st.markdown("### Try with these example MP4 links:")
    st.text("Australian: https://docs.google.com/uc?export=download&id=12dKoO-jgWgjor_aQpovbtwOulMQBEEps")
    st.text("British: https://docs.google.com/uc?export=download&id=10aiY_0dnsWXqhxeNL54n5LYe6l4XquR8")
    st.text("American: https://docs.google.com/uc?export=download&id=1k1wfSxmQ-ZbYCKNZrb0aF8xfM6JWEfKQ")

    st.markdown(
        """
        **Instructions**: Copy one of the MP4 URLs above, paste it into the field below, then click **Analyze** to detect the accent and get a transcript.
        """
    )
    st.markdown("---")

    st.write("Paste a direct, public MP4 video URL to **classify the English accent and get a transcript**.")
    video_url = st.text_input("Direct Video URL (MP4 only):")

    if st.button("Analyze"):
        if not video_url:
            st.error("Please provide a valid MP4 video URL.")
        else:
            try:
                # Run the pipeline step by step, showing progress for each stage
                with st.spinner("Downloading video..."):
                    video_file_path = download_video(video_url)
                with st.spinner("Extracting audio..."):
                    audio_file = extract_audio(video_file_path)
                with st.spinner("Transcribing & detecting accent..."):
                    accent, confidence, transcript = detect_accent(audio_file)

                st.success(f"Accent: **{accent}**")
                st.metric(label="Confidence", value=f"{confidence}%")
                st.text_area("Transcript", transcript, height=150, disabled=True)
            except Exception as e:
                st.error(f"Error: {e}")

# ---------------------------
# Page 2: Info
# ---------------------------
elif page == "Info":
    st.title("ℹ️ About Accentometer")
    st.markdown("""
### 🧠 What is Accentometer?

**Accentometer** is a simple **demo** application for English accent classification.
The accent classifier itself relies only on basic heuristics rather than a trained machine learning or deep learning model; the only model in the pipeline is Whisper, which handles transcription.

---

### ⚙️ How It Works

1. **Video Downloader**
   Accepts only **direct MP4 links**.
   (Loom URLs aren’t supported in this free demo version; integration would require a paid Loom API or other backend setup.)
2. **Audio Extraction**
   Uses the `moviepy` library to extract the audio from the video.
3. **Speech-to-Text**
   Applies the `openai/whisper-base` model from Hugging Face to transcribe the audio into text.
4. **Accent Detection**
   Runs simple **hand-crafted heuristic rules** over the transcribed text to classify the accent into:
   - 🇺🇸 American
   - 🇬🇧 British
   - 🇦🇺 Australian
   - 🇮🇳 Indian
   - ❓ Unknown
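
To give a feel for the heuristic step, here is a minimal sketch of keyword-based accent scoring over a transcript. The marker words, the `classify_accent` name, and the confidence formula are illustrative assumptions, not the rules this demo actually applies. A transcript with no marker words falls through to `Unknown` with a confidence of 0.

```python
# Hypothetical marker words per accent; chosen for illustration only.
MARKERS = {
    "American": {"color", "gotten", "sidewalk", "apartment"},
    "British": {"colour", "whilst", "lorry", "queue"},
    "Australian": {"mate", "arvo", "barbie", "reckon"},
    "Indian": {"prepone", "lakh", "crore", "kindly"},
}


def classify_accent(transcript: str) -> tuple[str, float]:
    # Lowercase the text and strip punctuation around each word.
    words = [w.strip(".,!?;:()") for w in transcript.lower().split()]

    # Count how many words match each accent's marker set.
    scores = {
        accent: sum(word in markers for word in words)
        for accent, markers in MARKERS.items()
    }

    total = sum(scores.values())
    if total == 0:
        # No marker word found: the heuristic cannot decide.
        return "Unknown", 0.0

    # Pick the accent with the most hits; confidence is its share of all hits.
    best = max(scores, key=scores.get)
    return best, round(100 * scores[best] / total, 1)
```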

---

### Possible Improvements

If you're interested in expanding this demo into a more robust application:

- **Support More Input Options**: Upload local MP4 files, YouTube links, Loom links (via API), additional video formats, etc.
- **Improve Accent Detection**: Train a custom model on a labeled accent dataset, or integrate a pretrained model such as `accent-id-commonaccent_xlsr-en-english`.

---
""")