ConcertIDC/WhisperAI-Speech-To-Text

Features

- Multilingual Transcription: Automatically transcribes audio in various languages using OpenAI’s Whisper model.
- Speaker Diarization: Detects different speakers in the audio and labels the transcription accordingly.
- File Upload: Allows users to upload an audio file, which is then processed for transcription and speaker diarization.
- Timestamped File Naming: Uploaded files are saved with a unique timestamp in the filename.

Requirements

The app relies on Streamlit, the whisper package, and pyannote.audio. These and any other dependencies are installed with pip from the provided requirements.txt file, as described under Installation.

Installation

Clone the repository:

```
git clone https://github.com/your-repository-url.git
cd your-repository
```

Create and activate a virtual environment (optional but recommended):

```
python -m venv env
source env/bin/activate  # On Windows, use `env\Scripts\activate`
```

Install the dependencies:
```
pip install -r requirements.txt
```
Set up a Hugging Face access token for pyannote.audio (if required):
- Create an account on Hugging Face (https://huggingface.co/).
- Generate an access token in your account settings.
- Provide the token to the app, either through an environment variable or directly in the code. Prefer the environment variable so the token is not committed to the repository:
```
use_auth_token="your_token"
```
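One way to keep the token out of the source code is to read it from an environment variable at startup. The sketch below assumes a variable named `HF_TOKEN` and a helper called `get_hf_token`; both names are illustrative and are not taken from the app itself.

```python
import os


def get_hf_token(var_name="HF_TOKEN"):
    """Return the Hugging Face token from the environment, or None if unset.

    `HF_TOKEN` is an assumed variable name; use whatever name your setup defines.
    """
    token = os.environ.get(var_name, "").strip()
    return token or None


# The returned value can then be passed wherever the app expects the token,
# for example as use_auth_token=get_hf_token().
```

Set the variable in the shell before starting the app so that the helper can find it. If it returns `None`, the token is missing or empty.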

Usage

Run the Streamlit app:
```
streamlit run app.py
```
The app will launch in your browser. Select an audio file (MP3, WAV, or M4A format) from your system.
The file will be uploaded to the upload directory, and the transcription will begin.
After processing, the app will display:
- The detected language of the audio.
- The transcription with speaker labels.
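To illustrate how a transcription with speaker labels might be assembled, here is a minimal sketch that assigns each transcribed segment to the speaker whose turn overlaps it the most. The input shapes are assumptions: segments are dicts with `start`, `end`, and `text` keys (the shape Whisper returns under `segments`), and speaker turns are `(start, end, label)` tuples. The function name `label_segments` is hypothetical, and app.py may combine the two outputs differently.

```python
def label_segments(segments, turns, unknown="UNKNOWN"):
    """Attach a speaker label to each transcript segment by maximum time overlap."""
    labelled = []
    for seg in segments:
        best_label, best_overlap = unknown, 0.0
        for start, end, label in turns:
            # Length of the intersection of the two intervals; <= 0 means no overlap.
            overlap = min(seg["end"], end) - max(seg["start"], start)
            if overlap > best_overlap:
                best_label, best_overlap = label, overlap
        labelled.append((best_label, seg["text"].strip()))
    return labelled


# Made-up sample data, for illustration only.
segments = [
    {"start": 0.0, "end": 2.0, "text": " Hello there."},
    {"start": 2.0, "end": 5.0, "text": " Hi, how are you?"},
]
turns = [(0.0, 2.1, "SPEAKER_00"), (2.1, 5.0, "SPEAKER_01")]
for speaker, text in label_segments(segments, turns):
    print(f"{speaker}: {text}")
```

Segments that overlap no speaker turn keep the `unknown` label instead of being dropped, so no transcribed text is lost.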

Models

Whisper Model

Used for multilingual transcription.
The model is loaded using the whisper Python package.
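A minimal sketch of loading and running Whisper, assuming the `openai-whisper` package is installed. The `base` checkpoint and the helper name `transcribe_file` are assumptions, since the source does not say which model size the app loads. The optional `model` parameter lets an already loaded model be reused instead of reloading it on every call.

```python
def transcribe_file(audio_path, model=None, model_name="base"):
    """Transcribe an audio file and return (language, text, segments).

    `model_name="base"` is an assumption; the app may load a different size.
    """
    if model is None:
        import whisper  # provided by the openai-whisper package

        model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    # Whisper's result dict carries the detected language, the full text,
    # and a list of timestamped segments.
    return result["language"], result["text"].strip(), result["segments"]
```

The first call to `whisper.load_model` downloads the checkpoint, so it needs a working internet connection.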

PyAnnote Model

Used for speaker diarization to detect speakers in the audio.
The model is loaded using the pyannote.audio library.
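A sketch of loading a diarization pipeline and turning its output into simple tuples, written against the pyannote.audio 2.x/3.x interface. The checkpoint name `pyannote/speaker-diarization` and both helper names are assumptions, since the source does not say which checkpoint the app uses. Newer pyannote.audio releases may name the token argument differently from `use_auth_token`.

```python
def load_diarization_pipeline(token, checkpoint="pyannote/speaker-diarization"):
    """Load a pretrained diarization pipeline from the Hugging Face Hub.

    The checkpoint name is an assumption; substitute the one the app uses.
    """
    from pyannote.audio import Pipeline

    return Pipeline.from_pretrained(checkpoint, use_auth_token=token)


def speaker_turns(pipeline, audio_path):
    """Run diarization and return a list of (start, end, speaker) tuples."""
    diarization = pipeline(audio_path)
    return [
        (segment.start, segment.end, speaker)
        for segment, _track, speaker in diarization.itertracks(yield_label=True)
    ]
```

Returning plain tuples keeps the rest of the app independent of pyannote's own annotation types.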

Troubleshooting

Diarization Model Issues

If the diarization model fails to load, make sure you have:
- Installed the correct dependencies.
- Set up the Hugging Face token if required.

Model Load Failures

If the models fail to load, ensure that:
- The internet connection is stable.
- The model files are downloaded correctly.

File Upload Issues

If the file upload is not working correctly:
- Ensure the upload folder exists in your project directory.
- Make sure the file path is correct and accessible.
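A missing upload folder can be avoided by creating it on demand before writing the file. The sketch below does that and also shows one possible timestamped naming scheme. The folder name `upload`, the filename format, and the helper name `save_upload` are all assumptions rather than the app's actual behavior.

```python
import os
from datetime import datetime


def save_upload(data, original_name, upload_dir="upload"):
    """Write uploaded bytes to upload_dir under a timestamped filename; return the path."""
    os.makedirs(upload_dir, exist_ok=True)  # create the folder if it is missing
    # basename() discards any directory components in the client-supplied name.
    stem, ext = os.path.splitext(os.path.basename(original_name))
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")  # microseconds reduce collisions
    path = os.path.join(upload_dir, f"{stem}_{stamp}{ext}")
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```

With Streamlit, `data` would typically be `uploaded_file.getvalue()` and `original_name` would be `uploaded_file.name`, where `uploaded_file` is the object returned by `st.file_uploader`.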
