Timestamps info
README.md
CHANGED
@@ -278,13 +278,12 @@ can be run with batched inference. It can also be extended to predict sequence l
 >>> sample = ds[0]["audio"]

 >>> prediction = pipe(sample.copy(), batch_size=8)["text"]
-"mˈɪstɚ kwˈɪltɚ ˈɪz
+"mˈɪstɚ kwˈɪltɚ ˈɪz ðɪ əpˈɑsəl əv ðə ˈmɪdəl klˈæsɪz ˈænd wˈɪɹ glˈæd tˈɪ wˈɛlkəm ˈhɪz gˈɑspəl"

 >>> # we can also return timestamps for the predictions
->>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=
-
-Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.
+>>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps="word")["chunks"]

+[{'text': 'mˈɪstɚ', 'timestamp': (0.42, 0.78)}, {'text': ' kwˈɪltɚ', 'timestamp': (0.78, 1.2)}, {'text': ' ˈɪz', 'timestamp': (1.2, 1.4)}, {'text': ' ðɪ', 'timestamp': (1.4, 1.52)}, {'text': ' əpˈɑsəl', 'timestamp': (1.52, 2.08)}, {'text': ' əv', 'timestamp': (2.08, 2.26)}, {'text': ' ðə', 'timestamp': (2.26, 2.36)}, {'text': ' ˈmɪdəl', 'timestamp': (2.36, 2.6)}, {'text': ' klˈæsɪz', 'timestamp': (2.6, 3.22)}, {'text': ' ˈænd', 'timestamp': (3.22, 3.42)}, {'text': ' wˈɪɹ', 'timestamp': (3.42, 3.66)}, {'text': ' glˈæd', 'timestamp': (3.66, 4.02)}, {'text': ' tˈɪ', 'timestamp': (4.02, 4.18)}, {'text': ' wˈɛlkəm', 'timestamp': (4.18, 4.58)}, {'text': ' ˈhɪz', 'timestamp': (4.58, 4.82)}, {'text': ' gˈɑspəl', 'timestamp': (4.82, 5.38)}]
 ```

 Refer to the blog post [ASR Chunking](https://huggingface.co/blog/asr-chunking) for more details on the chunking algorithm.
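
The word-level output added in this change is a list of dictionaries, each with a `text` string and a `(start, end)` `timestamp` tuple in seconds, so it can be post-processed without re-running the pipeline. Below is a minimal sketch under that assumption. The helper name `word_durations` is hypothetical and not part of `transformers`, and skipping entries with a missing timestamp is a defensive choice, not documented behaviour of this model.

```python
from typing import List, Optional, Tuple

# First three entries copied from the example output in the README diff.
chunks = [
    {"text": "mˈɪstɚ", "timestamp": (0.42, 0.78)},
    {"text": " kwˈɪltɚ", "timestamp": (0.78, 1.2)},
    {"text": " ˈɪz", "timestamp": (1.2, 1.4)},
]


def word_durations(chunks: List[dict]) -> List[Tuple[str, float, float]]:
    """Return (word, start, duration) triples from word-level chunks.

    Hypothetical helper: assumes each chunk has a "text" key and a
    "timestamp" key holding a (start, end) pair in seconds.
    """
    rows = []
    for chunk in chunks:
        start: Optional[float]
        end: Optional[float]
        start, end = chunk["timestamp"]
        # Assumption: a chunk may lack a start or end time; skip it if so.
        if start is None or end is None:
            continue
        # Strip the leading space the pipeline keeps on non-initial words.
        rows.append((chunk["text"].strip(), start, round(end - start, 2)))
    return rows


for word, start, duration in word_durations(chunks):
    print(f"{start:5.2f}s  {word}  ({duration:.2f}s)")
```

Rounding the duration to two decimals avoids floating-point noise in the subtraction, since the timestamps in the example are given to two decimals.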