license: apache-2.0
title: 'SpeechIntentEval'
sdk: gradio
emoji: 🌖
colorFrom: indigo
colorTo: blue
pinned: true

SpeechIntentEval — Benchmark for Social & Indirect Speech Understanding

Large language models perform well on direct, literal instructions, but they often fail when people communicate through hints, emotions, politeness, hedging, or sarcasm.

Everyday speech is not explicit:

“It’s freezing in here.” → change the temperature
“I probably should finish that paper…” → plan or offer help
“Sure, whatever.” → sarcasm, dismissal, frustration

SpeechIntentEval evaluates whether a model understands these socially loaded signals.

What this demo does

Paste:

a User Utterance (how a real human would speak)
a Model Response

The system will:

infer the intent category (e.g., indirect request, emotional complaint)
provide a human-readable explanation
judge whether the response understood and acted on the implied meaning

This focuses on pragmatics, not keyword toxicity or raw accuracy.
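
The three steps above (infer an intent, explain it, judge the reply) could be sketched roughly as follows. This is only an illustration of a keyword-heuristic approach, not the demo's actual code: the cue lists, the intent labels, the `Evaluation` class, and the `evaluate` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cue phrases per intent; the demo's real rules are not published here.
INTENT_CUES = {
    "indirect_request": ("freezing", "cold in here", "hot in here"),
    "hedged_plan": ("probably should", "i guess i should", "ought to"),
    "sarcasm_or_dismissal": ("sure, whatever", "yeah, right"),
}

# Hypothetical words suggesting the reply acted on the implied meaning.
RESPONSE_CUES = {
    "indirect_request": ("turn up", "heat", "thermostat", "temperature"),
    "hedged_plan": ("plan", "help", "schedule"),
    "sarcasm_or_dismissal": ("sorry", "frustrat", "upset", "what's wrong"),
}


@dataclass
class Evaluation:
    intent: str
    explanation: str
    handled: Optional[bool]  # None when no intent could be inferred


def normalize(text: str) -> str:
    # Lowercase and unify curly apostrophes so cue matching is consistent.
    return text.lower().replace("\u2019", "'")


def evaluate(utterance: str, response: str) -> Evaluation:
    u, r = normalize(utterance), normalize(response)
    for intent, cues in INTENT_CUES.items():
        matched = next((cue for cue in cues if cue in u), None)
        if matched is not None:
            # The reply counts as "handled" if it contains any response cue.
            handled = any(cue in r for cue in RESPONSE_CUES[intent])
            return Evaluation(intent, f"utterance matched cue '{matched}'", handled)
    return Evaluation("literal_or_unknown", "no indirect-speech cue matched", None)
```

Under these assumed cue lists, `evaluate("It’s freezing in here.", "Let me turn up the heat for you.")` yields the intent `indirect_request` with `handled` set to `True`. Substring matching like this is brittle, which is why it only serves to illustrate the concept.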

How to use the demo

Paste a natural sentence, for example “It’s freezing in here.”
Paste your model’s reply, for example “Let me turn up the heat for you.”
Click Evaluate.

You’ll see:

inferred intent
explanation
verdict on whether the reply handled it correctly

Current version

This v1 relies on lightweight heuristics; the UI is meant to illustrate the concept.
Planned upgrades:

fine-tuned classifier
annotated dataset of speech acts
integration with FairEval and JailBreakDefense
regression testing for safety teams

Built by Kriti Behl