Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue
Paper: arXiv 2603.11409
LoRA adapter for Qwen/Qwen2.5-7B fine-tuned on all three domains (AMI, Friends, SPGI) for proactive response prediction in multi-party conversations. Given a conversational context and a current utterance, the model predicts whether a target speaker will SPEAK next or remain SILENT.
This model was trained on all three datasets combined:
| Dataset | Description | Domain |
|---|---|---|
| AMI Corpus | Meeting transcripts with explicit addressee annotations | Meeting recordings (3–7 participants) |
| Friends Corpus | TV show transcripts with scene-level speaker turns and explicit addressee mentions | Scripted sitcom dialogue (Friends) |
| SPGI Corpus | Earnings call transcripts with addressee inferred heuristically from conversation structure (question-answer pairs, moderator patterns) | Financial Q&A (earnings calls) |
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")
model = PeftModel.from_pretrained(base_model, "kraken07/qwen2.5-7b-all-dataset")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

# Your input format should match training: context turns + current turn
# Output: SPEAK or SILENT prediction for the target speaker
```
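The exact prompt template used during training is not documented here. As an illustration only, a context-plus-current-turn input might be assembled as below; the `format_example` helper, the `Speaker: text` layout, and the final question line are assumptions, not the model's actual template:

```python
def format_example(context_turns, current_turn, target_speaker):
    """Assemble a turn-taking prompt from prior context turns, the current
    utterance, and the speaker whose next move we want to predict.
    NOTE: this layout is illustrative; match your actual training format."""
    lines = [f"{speaker}: {text}" for speaker, text in context_turns]
    speaker, text = current_turn
    lines.append(f"{speaker}: {text}")
    lines.append(f"Will {target_speaker} speak next? Answer SPEAK or SILENT.")
    return "\n".join(lines)

prompt = format_example(
    context_turns=[("Alice", "Shall we review the budget?"),
                   ("Bob", "Yes, I have the numbers ready.")],
    current_turn=("Alice", "Bob, can you walk us through them?"),
    target_speaker="Bob",
)
```

The resulting string can then be tokenized and passed to `model.generate`, checking whether the decoded continuation contains `SPEAK` or `SILENT`.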
If you use this model, please cite our work:
```bibtex
@misc{bhagtani2026speakstaysilentcontextaware,
  title={Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue},
  author={Bhagtani, Kratika and Anand, Mrinal and Xu, Yu Chen and Yadav, Amit Kumar Singh},
  year={2026},
  eprint={2603.11409},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2603.11409}
}
```
Base model: Qwen/Qwen2.5-7B