Spaces: adiitya29/Multilingual-ASR
adiitya29 committed 8 days ago

Commit 8dced3a

1 parent: c17b8b3

readme.md file cleaned

Files changed (1)

README.md CHANGED

@@ -1,41 +0,0 @@
-# Multilingual Automatic Speech Recognition (ASR)
-This project provides a web application to upload audio files, detect spoken language, convert speech to text, and download transcripts. It leverages pre-trained Wav2Vec models from Hugging Face and uses Gradio for the frontend interface.
-## Features
-- Upload audio files
-- (Optional) Detect spoken language
-- Speech-to-text conversion via Hugging Face Wav2Vec
-- Save and manage transcription history
-- Download transcripts
-## Setup
-1. **Clone the repository** (or download the source code).
-2. **Create a virtual environment**:
-   ```bash
-   python -m venv venv
-   source venv/bin/activate
-   ```
-3. **Install dependencies**:
-   ```bash
-   pip install -r requirements.txt
-   ```
-## Usage
-To start the Gradio web interface, run:
-```bash
-python app.py
-```
-Open the local URL provided in the terminal in your browser.
-## Project Structure
-- `app.py`: Main entry point for the Gradio interface.
-- `app/`: Module containing logic for audio processing, ASR inference, language detection, and history management.
-- `data/`: Folder to hold sample audio files and exported histories.
-- `notebooks/`: Jupyter notebooks for experiments and fine-tuning.
-- `tests/`: Unit testing suite.
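
The removed README lists speech-to-text via Hugging Face Wav2Vec, transcription history, and transcript export, but the diff shows none of the code behind them. The following is a minimal sketch of how those three steps might fit together, not the project's actual implementation. The function names, the history entry fields, and the `facebook/wav2vec2-base-960h` checkpoint (an English-only model) are all assumptions chosen for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def transcribe(audio_path: str, model_name: str = "facebook/wav2vec2-base-960h") -> str:
    """Transcribe one audio file with a Wav2Vec2 checkpoint.

    The default model name is an assumption; the README does not say
    which checkpoint the project uses.
    """
    # Imported lazily so the history helpers below work even when
    # transformers is not installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_name)
    # For a single input the pipeline returns a dict with a "text" key.
    return asr(audio_path)["text"]


def append_history(history: list, filename: str, text: str) -> list:
    """Return a new history list with one more transcript entry (hypothetical schema)."""
    entry = {
        "file": filename,
        "text": text,
        "created": datetime.now(timezone.utc).isoformat(),
    }
    return history + [entry]


def export_history(history: list, out_path: Path) -> Path:
    """Write the history to a JSON file, creating parent folders as needed."""
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(
        json.dumps(history, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out_path
```

A call such as `export_history(append_history([], "sample.wav", transcribe("sample.wav")), Path("data/history.json"))` would write one entry under the `data/` folder the README mentions. Decoding audio from a file path in `transcribe` typically requires ffmpeg to be available on the system.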