import asyncio

import streamlit as st

from detector import detect_accent
from downloader import download_video
from extractor import extract_audio

# Ensure the current thread has an asyncio event loop set
try:
    asyncio.get_running_loop()
except RuntimeError:
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
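
# Illustrative sketch (not part of the original app): a hypothetical helper,
# `looks_like_http_url`, that could be called before `download_video` to
# reject input that is clearly not an http(s) URL. It deliberately does not
# require a ".mp4" suffix, because direct-download links may not end in one.
from urllib.parse import urlparse


def looks_like_http_url(url: str) -> bool:
    """Return True if `url` has an http(s) scheme and a host part."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)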

st.set_page_config(page_title="Accentometer", page_icon="🎙️")

# Sidebar navigation
st.sidebar.title("🎙️ Accentometer")
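# A non-empty label keeps the widget accessible; it is hidden visually.
page = st.sidebar.radio("Navigation", ["Home", "Info"], label_visibility="collapsed")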

# ---------------------------
# Page 1: Home
# ---------------------------
if page == "Home":
    st.title("🎙️ Accentometer")

    # ✅ Example video links
    st.markdown("### Try with these example MP4 links:")

    st.text("Australian: https://docs.google.com/uc?export=download&id=12dKoO-jgWgjor_aQpovbtwOulMQBEEps")
    st.text("British:    https://docs.google.com/uc?export=download&id=10aiY_0dnsWXqhxeNL54n5LYe6l4XquR8")
    st.text("American:   https://docs.google.com/uc?export=download&id=1k1wfSxmQ-ZbYCKNZrb0aF8xfM6JWEfKQ")

    st.markdown(
        """
        📌 **Instructions**: Copy one of the MP4 URLs above, paste it into the field below, and click **Analyze** to detect the accent and get a transcript.
        """
    )

    st.markdown("---")
    st.write("Paste a direct, public MP4 video URL to **classify the English accent and get a transcript**.")

    video_url = st.text_input("Direct Video URL (MP4 only):")

    if st.button("Analyze"):
        if not video_url:
            st.error("Please provide a valid MP4 video URL.")
        else:
            try:
                with st.spinner("Downloading video..."):
                    video_file_path = download_video(video_url)

                with st.spinner("Extracting audio..."):
                    audio_file = extract_audio(video_file_path)

                with st.spinner("Transcribing & detecting accent..."):
                    accent, confidence, transcript = detect_accent(audio_file)

                st.success(f"Accent: **{accent}**")
                st.metric(label="Confidence", value=f"{confidence}%")
                st.text_area("Transcript", transcript, height=150, disabled=True)

            except Exception as e:
                st.error(f"Error: {e}")


# ---------------------------
# Page 2: Info
# ---------------------------
elif page == "Info":

    st.title("ℹ️ About Accentometer")

    st.markdown("""
    ### 🧠 What is Accentometer?

    **Accentometer** is a simple **demo** application for English accent classification.  
    Transcription uses a pretrained speech model, but the accent classification itself relies only on basic heuristics, not on a trained accent model.

    ---

    ### ⚙️ How It Works

    1. **Video Downloader**  
    Accepts only **direct MP4 links**.  
    (Loom URLs aren't supported in this free demo version; integration would require a paid Loom API or other backend setup.)

    2. **Audio Extraction**  
    Uses the `moviepy` library to extract the audio from the video.

    3. **Speech-to-Text**  
    Applies the `openai/whisper-base` model from Hugging Face to transcribe the audio into text.

    4. **Accent Detection**  
    Runs simple **hand-crafted heuristic rules** over the transcribed text to classify the accent into:
    - 🇺🇸 American
    - 🇬🇧 British
    - 🇦🇺 Australian
    - 🇮🇳 Indian
    - ❓ Unknown

    ---

    ### 🚀 Possible Improvements

    If you're interested in expanding this demo into a more robust application:

    - **Support More Input Options**: Accept uploaded local MP4 files, YouTube links, Loom links (via API), and additional video formats.
    - **Improve Accent Detection**: Train a custom model on a labeled accent dataset, or integrate a pretrained model such as `accent-id-commonaccent_xlsr-en-english`.

    ---
    """)
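

# Illustrative sketch only: the Info page says accents are classified with
# hand-crafted heuristics over the transcript, but the real rules live in
# `detector.py` and are not shown in this file. The word lists, the function
# name `guess_accent_from_text`, and the scoring below are all hypothetical.
import re
from typing import Tuple

HYPOTHETICAL_ACCENT_CUES = {
    "American": {"color", "gotten", "sidewalk", "truck"},
    "British": {"colour", "whilst", "lorry", "queue"},
    "Australian": {"mate", "arvo", "barbie", "reckon"},
    "Indian": {"lakh", "crore", "prepone", "yaar"},
}


def guess_accent_from_text(transcript: str) -> Tuple[str, float]:
    """Return (accent, confidence %) from keyword hits, or ("Unknown", 0.0)."""
    # Lowercase and split into simple word tokens.
    words = re.findall(r"[a-z']+", transcript.lower())
    # Count how many tokens match each accent's cue words.
    hits = {
        accent: sum(word in cues for word in words)
        for accent, cues in HYPOTHETICAL_ACCENT_CUES.items()
    }
    total = sum(hits.values())
    if total == 0:
        return "Unknown", 0.0
    # Confidence is the winning accent's share of all cue hits.
    best = max(hits, key=hits.get)
    return best, round(100 * hits[best] / total, 1)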