.. _punctuation_capitalization_models:

Punctuation and Capitalization Models
=====================================

Automatic Speech Recognition (ASR) systems typically generate text without punctuation or capitalization.
This raw ASR output causes two problems:

- it can be difficult to read and understand
- models for some downstream tasks, such as named entity recognition, machine translation, or text-to-speech synthesis, are
  usually trained on punctuated datasets, so using raw ASR output as their input can degrade their
  performance


NeMo provides two types of punctuation and capitalization models:

Lexical-only model:

.. toctree::
   :maxdepth: 1

   punctuation_and_capitalization


Lexical and audio model:

.. toctree::
   :maxdepth: 1

   punctuation_and_capitalization_lexical_audio