
Multilang ASR Captioner

A multilingual automatic speech recognition (ASR) and video captioning tool that runs faster-whisper on CPU.

Requirements and Installation

To run this tool you will need the following software installed on your computer:

From your desired working directory, run the following commands in your terminal:

git clone git@github.com:marquesafonso/multilang-asr-captioner.git

pip install pipenv

pipenv install

Note that cloning over SSH assumes a working Git installation and a configured SSH key.

Quick start

Command Line Interface

Run the following command to caption your example video using the CLI. The example passes a YouTube video URL, which is optional:

pipenv run python .\cli.py --invideo_filename '<your_file_name>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8

Font size, font, background color and text color arguments are also available:

pipenv run python .\cli.py --invideo_filename '<your_file>' --video_url 'https://www.youtube.com/watch?v=<your_youtube_video>' --max_words_per_line 8 --fontsize 28 --font "Arial-Bold" --bg_color None --text_color 'white'
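The commands above use a Windows-style path (.\cli.py) and shell-specific quoting. If you want to launch the CLI from a script instead, a minimal sketch is shown below. The helper name build_cli_command is hypothetical and not part of this project; it only uses the flags shown above, and it assumes you run it from the repository root so that cli.py resolves.

```python
import shlex
import subprocess  # only needed if you uncomment the run() call below
from typing import List, Optional


def build_cli_command(
    invideo_filename: str,
    video_url: Optional[str] = None,
    max_words_per_line: int = 8,
    **style: object,
) -> List[str]:
    """Assemble the argument list for cli.py.

    Hypothetical helper for illustration; not part of the project.
    """
    cmd = [
        "pipenv", "run", "python", "cli.py",
        "--invideo_filename", invideo_filename,
        "--max_words_per_line", str(max_words_per_line),
    ]
    # The video URL is optional, so only add the flag when one is given.
    if video_url is not None:
        cmd += ["--video_url", video_url]
    # Style options (fontsize, font, bg_color, text_color) are passed through
    # unchanged as "--name value" pairs.
    for name, value in style.items():
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_cli_command(
        "<your_file>",
        video_url="https://www.youtube.com/watch?v=<your_youtube_video>",
        fontsize=28,
        font="Arial-Bold",
        bg_color="None",
        text_color="white",
    )
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(cmd))
    # Uncomment to actually run it from the repository root:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids quoting differences between PowerShell, cmd and POSIX shells.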

API

A FastAPI-based API is also available.

To start the API, run:

pipenv run python main.py

Then open the landing page. From there you can reach the submit_video endpoint and the API documentation.