---
license: cc-by-nc-4.0
language:
- nl
pipeline_tag: automatic-speech-recognition
---

# Model

This repository contains the second version of our Automatic Speech Recognition (ASR) and Subtitle Generation model, which has an improved architecture and was trained on 14,000 hours of subtitled Flemish broadcast speech.
It can generate both an exact verbatim transcription with annotation tags and a fully formatted, cleaned-up subtitle transcription.
Each of the two output types is produced by a separate decoder.

This repository contains the large variant of the model with 180M parameters.

**Version**: April 2024

# Usage

This repository only hosts the pre-trained model itself and the configuration files. 
To download this model, see the instructions [here](https://huggingface.co/docs/hub/models-downloading).
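
As an illustration of the download step, the following minimal sketch fetches all files of a model repository with the `huggingface_hub` Python package. The `REPO_ID` value is a placeholder and the `download_model` helper is hypothetical (neither comes from this repository); substitute the actual `<namespace>/<model-name>` shown on the model page.

```python
from pathlib import Path
from typing import Optional

# Placeholder (assumption): replace with this repository's real "<namespace>/<model-name>" id.
REPO_ID = "<namespace>/<model-name>"


def download_model(repo_id: str, local_dir: Optional[str] = None) -> Path:
    """Download all files of a Hub model repository and return the local folder.

    Hypothetical helper for illustration; it only wraps `snapshot_download`.
    """
    # Reject the unedited placeholder or a malformed id before touching the network.
    if "<" in repo_id or "/" not in repo_id:
        raise ValueError(
            "Set repo_id to the real '<namespace>/<model-name>' of this repository."
        )

    # Imported lazily so this module loads even when the dependency is missing.
    # Install it with: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    # With local_dir=None the files are stored in the default Hugging Face cache.
    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))
```

Calling `download_model("<namespace>/<model-name>", local_dir="model")` with the real repository id would place the checkpoint and configuration files in a local `model` folder. The files are meant to be loaded through the project codebase, not through a generic pipeline.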

This model, like our other ASR models, is used through [our GitHub codebase](https://github.com/nelfproject/NeLF_Transcription_ASR).
Please refer to the GitHub repository for installation instructions.

# Webservice

This model can also be accessed through the [webservice of the NeLF Project](https://nelfproject.be/web_service.php). After requesting access, you can upload audio or video files and they will be transcribed according to the desired settings. 

# Citation

If you use this model, please cite the research paper:
```bibtex
@article{poncelet2024,
    author = {Poncelet, Jakob and Van hamme, Hugo},
    title = {Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling},
    year = {2024},
    journal = {arXiv preprint arXiv:2502.03212},
    url = {https://arxiv.org/abs/2502.03212}
}
```

# Contact

Jakob Poncelet: jakob.poncelet@kuleuven.be