Convert timed word data into subtitle lines with speed analysis
Identify speakers in an audio transcript
Transcribe audio to word-level timestamps
Generate SRT subtitles from audio and timestamps
Align transcript words to audio and get timestamps
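
As a rough illustration of how timed words might become subtitle lines with a speed figure and then SRT text, here is a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from any particular tool: the Word and Cue types, the group_words, format_timestamp and to_srt helpers, the 42-character line limit, the 0.8-second pause threshold, and the reading of "speed" as characters per second.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with start/end times in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


@dataclass
class Cue:
    """One subtitle line spanning one or more words."""
    text: str
    start: float
    end: float

    @property
    def chars_per_second(self) -> float:
        # Assumed speed metric: characters shown per second of display time.
        duration = self.end - self.start
        return len(self.text) / duration if duration > 0 else float("inf")


def _make_cue(words):
    return Cue(" ".join(w.text for w in words), words[0].start, words[-1].end)


def group_words(words, max_chars=42, max_gap=0.8):
    """Group words into cues, breaking on a long pause or an over-long line."""
    cues, current = [], []
    for word in words:
        if current:
            candidate = " ".join(w.text for w in current) + " " + word.text
            gap = word.start - current[-1].end
            if len(candidate) > max_chars or gap > max_gap:
                cues.append(_make_cue(current))
                current = []
        current.append(word)
    if current:
        cues.append(_make_cue(current))
    return cues


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render cues as SRT text: index, time range, then the subtitle line."""
    blocks = [
        f"{i}\n{format_timestamp(c.start)} --> {format_timestamp(c.end)}\n{c.text}"
        for i, c in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = [Word("Hello", 0.0, 0.4), Word("world", 0.5, 1.0), Word("again", 2.5, 3.0)]
    for cue in group_words(sample):
        print(f"{cue.chars_per_second:.1f} cps: {cue.text}")
    print(to_srt(group_words(sample)))
```

A cue whose characters-per-second value is well above what viewers can comfortably read could be flagged for splitting or retiming; the threshold to use is a editorial choice and is not fixed by this sketch.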