# NeMo Forced Aligner (NFA)
Try it out: HuggingFace Space 🎤 | Tutorial: "How to use NFA?" 🚀 | Blog post: "How does forced alignment work?" 📚
NFA is a tool for generating token-, word- and segment-level timestamps of speech in audio using NeMo's CTC-based Automatic Speech Recognition (ASR) models. You can provide your own reference text, or use an ASR-generated transcription. You can use NeMo's ASR model checkpoints out of the box in [14+ languages](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/results.html#speech-recognition-languages), or train your own model. NFA can process long audio files of an hour or more (subject to your hardware and the ASR model used).
## Quickstart
1. Install [NeMo](https://github.com/NVIDIA/NeMo#installation).
2. Prepare a NeMo-style manifest containing the paths of audio files you would like to process, and (optionally) their text.
3. Run NFA's `align.py` script with the desired config, e.g.:
``` bash
python <path_to_NeMo>/tools/nemo_forced_aligner/align.py \
pretrained_name="stt_en_fastconformer_hybrid_large_pc" \
manifest_filepath=<path_to_manifest> \
output_dir=<path_to_output_dir>
```
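
The manifest from step 2 can be produced with a few lines of Python. The sketch below assumes the usual NeMo manifest convention of one JSON object per line, with an `audio_filepath` field and an optional `text` field; these field names are not stated above, so check them against the NFA documentation. The `write_manifest` helper and the example paths are hypothetical.

```python
import json
from pathlib import Path


def write_manifest(entries, manifest_path):
    """Write a JSON-lines manifest: one object per utterance.

    entries: iterable of (audio_filepath, text) pairs; text may be None
    when the transcription should come from the ASR model instead.
    Field names follow the assumed NeMo convention (audio_filepath, text).
    """
    manifest_path = Path(manifest_path)
    with manifest_path.open("w", encoding="utf-8") as f:
        for audio_filepath, text in entries:
            record = {"audio_filepath": str(audio_filepath)}
            if text is not None:
                # Reference text is optional, so only include it when given.
                record["text"] = text
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return manifest_path


if __name__ == "__main__":
    # Hypothetical audio paths, for illustration only.
    write_manifest(
        [
            ("/data/audio/utt1.wav", "hello world"),
            ("/data/audio/utt2.wav", None),
        ],
        "manifest.json",
    )
```

The resulting file path is what you would pass as `manifest_filepath` in step 3.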
## Documentation
For more details, see the [NFA documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tools/nemo_forced_aligner.html).