---
library_name: transformers
license: apache-2.0
language:
- en
pipeline_tag: audio-to-audio
base_model:
- naver-ai/USTM
---

# Model Card

The Unified Spoken Dialog Model (USDM) is obtained by supervised fine-tuning of the Unified Speech-Text Model (USTM) on the DailyTalk dataset, which was preprocessed into single-turn conversations.

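
The card does not spell out how DailyTalk was turned into single-turn conversations. As a rough illustration only, the sketch below assumes that each pair of adjacent utterances from different speakers becomes one (input, response) example. The `Turn` and `SingleTurnExample` types, the field names, and the pairing rule are all hypothetical; the actual preprocessing in the repository may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str     # hypothetical speaker label, e.g. "A" or "B"
    audio_path: str  # hypothetical path to the utterance audio
    text: str        # transcript of the utterance


@dataclass
class SingleTurnExample:
    input_turn: Turn     # what the user said
    response_turn: Turn  # what the model should say back


def to_single_turn(dialog: List[Turn]) -> List[SingleTurnExample]:
    """Split a multi-turn dialog into adjacent (input, response) pairs.

    Assumption: only consecutive utterances from different speakers
    form a valid single-turn conversation.
    """
    examples = []
    for prev, nxt in zip(dialog, dialog[1:]):
        if prev.speaker != nxt.speaker:
            examples.append(SingleTurnExample(prev, nxt))
    return examples


if __name__ == "__main__":
    dialog = [
        Turn("A", "a0.wav", "Hi, how was your weekend?"),
        Turn("B", "b1.wav", "Pretty good, I went hiking."),
        Turn("A", "a2.wav", "That sounds fun."),
    ]
    pairs = to_single_turn(dialog)
    print(len(pairs))  # 2
```

Under this assumption, a dialog with N alternating turns yields N - 1 training examples, and every utterance except the first and last serves once as a response and once as an input.
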
## Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation [NeurIPS 2024]

- **Repository:** https://github.com/naver-ai/usdm
- **Paper:** https://openreview.net/forum?id=NjewXJUDYq
- **Project Page:** https://unifiedsdm.github.io/

## BibTeX

```
@inproceedings{
kim2024paralinguisticsaware,
title={Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation},
author={Heeseung Kim and Soonshin Seo and Kyeongseok Jeong and Ohsung Kwon and Soyoon Kim and Jungwhan Kim and Jaehong Lee and Eunwoo Song and Myungwoo Oh and Jung-Woo Ha and Sungroh Yoon and Kang Min Yoo},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=NjewXJUDYq}
}
```