---
title: Multilang Asr Captioner
sdk: docker
emoji: 📚
colorFrom: red
colorTo: blue
app_file: main.py
app_port: 8000
pinned: true
license: cc-by-nc-4.0
short_description: Multilingual ASR and video captioning tool
---

## Multilang ASR Captioner

A multilingual automatic speech recognition (ASR) and video captioning tool built on faster-whisper.

Supports real-time translation to English. Runs on a consumer-grade CPU.

<video width="400" height="300" src="https://github.com/marquesafonso/multilang-asr-captioner/assets/79766107/fcff8ac1-cdfc-4400-821c-f797d84c2d8a"></video>

## Requirements and Installation

### Docker (preferred)

You'll need to install [Docker](https://www.docker.com/products/docker-desktop/).

Then, follow the steps below.

1. Clone the repo
```{bash}
git clone git@github.com:marquesafonso/multilang-asr-captioner.git
```
2. Build and run the container using Docker Compose
```{bash}
docker compose up
```

Then check the [landing page](http://127.0.0.1:8000).

From there you can reach the [submit_video endpoint](http://127.0.0.1:8000/submit_video/) and the [documentation](http://127.0.0.1:8000/docs/).

**Tip**: on Linux or macOS, `0.0.0.0` can be opened directly in the browser, but on Windows you will need to change it to `127.0.0.1` or `localhost`.

### Local

To run this tool locally on your computer, you will need the following software installed:
+ [ImageMagick](https://imagemagick.org/script/download.php)
+ [Python (3.11)](https://www.python.org/downloads/release/python-3116/)

Once you are in your desired working directory, run the following commands in your terminal:

```{bash}
git clone git@github.com:marquesafonso/multilang-asr-captioner.git

pip install pipenv

pipenv install
```

Note that this assumes a working Git installation and a configured SSH key for GitHub.

## Quick start (local)

### API

A FastAPI API is available. To start the API locally, run:

```{bash}
pipenv run python main.py
```

Then check the [landing page](http://127.0.0.1:8000).

From there you can reach the [submit_video endpoint](http://127.0.0.1:8000/submit_video/) and the [documentation](http://127.0.0.1:8000/docs/).

**Tip**: on Linux or macOS, `0.0.0.0` can be opened directly in the browser, but on Windows you will need to change it to `127.0.0.1` or `localhost`.
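
To confirm that the API is up without opening a browser, a small script along these lines may help. It is only a sketch: the helper names (`browser_host`, `endpoint_urls`, `is_reachable`) are hypothetical and not part of this repository, and the paths are simply the ones linked in this README. It uses only the Python standard library and applies the Windows address substitution from the tip above.

```python
import platform
import urllib.error
import urllib.request
from typing import Dict, Optional

# Paths copied from the links in this README; adjust if the app changes.
PATHS = {"landing": "/", "submit_video": "/submit_video/", "docs": "/docs/"}


def browser_host(host: str, system: Optional[str] = None) -> str:
    """Return a host a client can connect to (hypothetical helper).

    Follows the tip in this README: on Windows, 0.0.0.0 is replaced
    by 127.0.0.1; on other systems the host is returned unchanged.
    """
    system = system or platform.system()
    if host == "0.0.0.0" and system == "Windows":
        return "127.0.0.1"
    return host


def endpoint_urls(host: str = "127.0.0.1", port: int = 8000,
                  system: Optional[str] = None) -> Dict[str, str]:
    """Build the full URL for each path listed in PATHS."""
    base = f"http://{browser_host(host, system)}:{port}"
    return {name: base + path for name, path in PATHS.items()}


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if the server sends any HTTP response for the URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    for name, url in endpoint_urls().items():
        status = "up" if is_reachable(url) else "not reachable"
        print(f"{name}: {url} ({status})")
```

The check only tells you that something is answering on port 8000. It sends a plain GET request and does not upload a video, so it says nothing about whether captioning itself works.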