adiitya29 committed on
Commit 8dced3a · 1 Parent(s): c17b8b3

readme.md file cleaned

Files changed (1)
  1. README.md +0 -41
README.md CHANGED
@@ -1,41 +0,0 @@
- # Multilingual Automatic Speech Recognition (ASR)
-
- This project provides a web application to upload audio files, detect spoken language, convert speech to text, and download transcripts. It leverages pre-trained Wav2Vec models from Hugging Face and uses Gradio for the frontend interface.
-
- ## Features
- - Upload audio files
- - (Optional) Detect spoken language
- - Speech-to-text conversion via Hugging Face Wav2Vec
- - Save and manage transcription history
- - Download transcripts
-
- ## Setup
-
- 1. **Clone the repository** (or download the source code).
- 2. **Create a virtual environment**:
- ```bash
- python -m venv venv
- source venv/bin/activate
- ```
- 3. **Install dependencies**:
- ```bash
- pip install -r requirements.txt
- ```
-
- ## Usage
-
- To start the Gradio web interface, run:
-
- ```bash
- python app.py
- ```
-
- Open the local URL provided in the terminal in your browser.
-
- ## Project Structure
-
- - `app.py`: Main entry point for the Gradio interface.
- - `app/`: Module containing logic for audio processing, ASR inference, language detection, and history management.
- - `data/`: Folder to hold sample audio files and exported histories.
- - `notebooks/`: Jupyter notebooks for experiments and fine-tuning.
- - `tests/`: Unit testing suite.