arxiv:2601.17277

PingPong: A Natural Benchmark for Multi-Turn Code-Switching Dialogues

Published on Jan 24

Submitted by Mohammad Rifqi Farhansyah on Jan 27


Authors: Mohammad Rifqi Farhansyah, Hanif Muhammad Zhafran, Farid Adilazuarda, Genta Indra Winata, Alham Fikri Aji

AI-generated summary

Code-switching presents complex challenges in multilingual communication that current language models struggle to address effectively.

Abstract

Code-switching is a widespread practice among the world's multilingual majority, yet few benchmarks accurately reflect its complexity in everyday communication. We present PingPong, a benchmark for natural multi-party code-switching dialogues covering five language-combination variations, some of which are trilingual. Our dataset consists of human-authored conversations among 2 to 4 participants covering authentic, multi-threaded structures where replies frequently reference much earlier points in the dialogue. We demonstrate that our data is significantly more natural and structurally diverse than machine-generated alternatives, offering greater variation in message length, speaker dominance, and reply distance. Based on these dialogues, we define three downstream tasks: Question Answering, Dialogue Summarization, and Topic Classification. Evaluations of several state-of-the-art language models on PingPong reveal that performance remains limited on code-switched inputs, underscoring the urgent need for more robust NLP systems capable of addressing the intricacies of real-world multilingual discourse.
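The abstract names three structural properties (message length, speaker dominance, and reply distance) without defining them. The sketch below shows one plausible way such statistics could be computed for a single dialogue. It is an illustration under stated assumptions, not the authors' method: the message fields `speaker`, `text`, and `reply_to`, the function `dialogue_stats`, and the specific definitions (dominance as the busiest speaker's share of messages, reply distance as the index gap between a message and the one it replies to) are all hypothetical, and the toy dialogue is invented for the example rather than taken from the dataset.

```python
from collections import Counter
from statistics import mean, pstdev


def dialogue_stats(messages):
    """Compute simple structural statistics for one dialogue.

    `messages` is a list of dicts with hypothetical keys:
      speaker  - speaker identifier
      text     - message text
      reply_to - index of the message being replied to, or None
    """
    # Message length measured in whitespace-separated tokens (an assumption).
    lengths = [len(m["text"].split()) for m in messages]

    # Speaker dominance: share of messages sent by the most active speaker.
    counts = Counter(m["speaker"] for m in messages)
    dominance = max(counts.values()) / len(messages)

    # Reply distance: how many positions back the referenced message sits.
    distances = [
        i - m["reply_to"]
        for i, m in enumerate(messages)
        if m["reply_to"] is not None
    ]

    return {
        "mean_length": mean(lengths),
        "length_std": pstdev(lengths),
        "speaker_dominance": dominance,
        "mean_reply_distance": mean(distances) if distances else 0.0,
    }


# Invented toy dialogue (not from PingPong) mixing Indonesian and English.
example = [
    {"speaker": "A", "text": "Eh besok jadi meeting?", "reply_to": None},
    {"speaker": "B", "text": "Jadi, jam 10 ya", "reply_to": None},
    {"speaker": "A", "text": "Okay noted", "reply_to": 1},
    {"speaker": "C", "text": "Wait, which meeting?", "reply_to": 0},
]

stats = dialogue_stats(example)
print(stats)
```

In this toy case the last message replies to the first one, three positions back, which is the kind of long-range reference the abstract describes as frequent in human-authored dialogues.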


