ducdatit2002/OA-RUMER



	---
	license: mit
	language:
	- vi
	tags:
	- vietnamese
	- speech
	- emotion-recognition
	- escalation-detection
	- customer-service
	- multimodal
	- model-assets
	- model-ready
	---

	# OA-RUMER Model Assets

	OA-RUMER is a Vietnamese customer-service modeling asset package for
	turn-level emotion recognition and escalation/de-escalation analysis. This
	repository contains metadata plus model-ready CSV/JSON splits; it does not
	include the full raw audio corpus.

	## Contents

	- `model_assets/metadata/calls_metadata.csv`: call-level audio metadata.
	- `model_assets/metadata/model_assets_summary.md`: high-level asset summary.
	- `model_assets/model_ready/oa_rumer_full_phowhisper_3class/`: primary 3-class
	turn-level splits and summaries.
	- `model_assets/model_ready/oa_rumer_full_phowhisper/`: original full label
	variant.
	- `model_assets/model_ready/text_only_transition_3class/`: customer-transition
	split files for escalation modeling.
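To check which split files a local copy actually contains, a minimal Python sketch like the one below can walk the `model_ready` directory. The directory names come from the list above; the helper name `list_split_files` and the assumption that the repository sits in the current working directory are illustrative only, and no specific file names inside each variant are assumed.

```python
from pathlib import Path


def list_split_files(root="model_assets/model_ready"):
    """Map each variant directory name to the sorted CSV/JSON file names in it."""
    root = Path(root)
    found = {}
    if not root.is_dir():
        return found  # nothing downloaded at this location
    for variant in sorted(p for p in root.iterdir() if p.is_dir()):
        found[variant.name] = sorted(
            f.name
            for f in variant.iterdir()
            if f.is_file() and f.suffix.lower() in {".csv", ".json"}
        )
    return found


if __name__ == "__main__":
    for variant, files in list_split_files().items():
        print(variant, files)
```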

	## Repository Layout

	| Path | Contents |
	|---|---|
	| `model_assets/metadata/` | Call-level metadata and model assets summary |
	| `model_assets/model_ready/` | Final CSV/JSON splits ready for modeling |

	## Labels

	- Emotion labels (3-class splits): `neutral`, `positive`, `negative`
	- Emotion labels in the original full label variant include `negative_low` and `negative_high`
	- Escalation labels: `stable`, `de-escalation`, `escalation`
	- Role labels: `customer`, `agent`
	- Overlap labels: `no_overlap`, `backchannel_overlap`,
	`interruption_overlap`, `conflict_overlap`, `uncertain_overlap`

	The cleaned split CSVs record annotation confidence in `label_confidence`.
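The label vocabularies above can be turned into a quick sanity check on a loaded split. The following is only a sketch: the column names `emotion`, `escalation`, `role`, and `overlap` are hypothetical (the actual CSV headers are not documented here), and the emotion set is the 3-class one.

```python
# Label vocabularies copied from the list above.
LABELS = {
    "emotion": {"neutral", "positive", "negative"},
    "escalation": {"stable", "de-escalation", "escalation"},
    "role": {"customer", "agent"},
    "overlap": {
        "no_overlap", "backchannel_overlap", "interruption_overlap",
        "conflict_overlap", "uncertain_overlap",
    },
}


def find_invalid_labels(rows, columns=LABELS):
    """Return (row_index, column, value) for every value outside its vocabulary.

    `rows` is an iterable of dicts, e.g. from csv.DictReader. Columns that are
    missing from a row are skipped, because the column names are assumptions.
    """
    problems = []
    for i, row in enumerate(rows):
        for column, allowed in columns.items():
            if column in row and row[column] not in allowed:
                problems.append((i, column, row[column]))
    return problems
```

Running this over a 3-class split should return an empty list; running it over the original full label variant would be expected to flag `negative_low` and `negative_high` unless the emotion set is extended.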

	## Audio Paths

	Raw WAV files are not included in this repository. The CSV files still point to
	`data_audio_set/*.wav`; place the local audio folder at `data_audio_set/` when
	running audio-based experiments.
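Before launching an audio-based run, it can help to confirm that the referenced WAV files are actually present. The sketch below assumes the paths in the CSVs are relative to a single project root; the function name `missing_audio` and the `project_root` parameter are illustrative, not part of this repository.

```python
from pathlib import Path


def missing_audio(relative_paths, project_root="."):
    """Return the referenced audio paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in relative_paths if not (root / p).is_file()]
```

With a split loaded through `csv.DictReader`, the path column (whatever it is named in the file) can be passed in as a list; an empty result means every referenced file was found.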

	## Notes

	- `negative_high` is merged into `negative` for the main 3-class runs.
	- Escalation can be modeled at the customer-transition level using
	`model_assets/model_ready/text_only_transition_3class/`.
	- Audio-based experiments need the raw WAV files in a local `data_audio_set/` folder (see Audio Paths).
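The label merge described in the first note can be sketched as a one-line mapping. This only covers what is stated above, namely that `negative_high` becomes `negative`; how `negative_low` is handled in the 3-class runs is not stated here, so the sketch passes it through unchanged. The function name `to_three_class` is hypothetical.

```python
from collections import Counter


def to_three_class(label):
    """Collapse `negative_high` into `negative`; pass every other label through."""
    return "negative" if label == "negative_high" else label


if __name__ == "__main__":
    # Toy labels for illustration only, not taken from the dataset.
    labels = ["neutral", "negative_high", "negative", "positive"]
    print(Counter(to_three_class(x) for x in labels))
```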

	## PhoBERT Context Baselines

	The local experiment runner at `experiments/run_phobert_context_baselines.sh`
	runs text-only ablations built on a frozen PhoBERT turn encoder, optionally
	followed by a Transformer context encoder. The table marks which inputs and
	components each model uses.

	| Model | Text | Audio | Role | Context | Overlap | MT |
	|---|---|---|---|---|---|---|
	| Text-only PhoBERT | Yes | No | No | No | No | No |
	| Text Context Transformer | Yes | No | No | Yes | No | No |
	| Text+Role Context | Yes | No | Yes | Yes | No | No |
	| Text+Role Transition Context | Yes | No | Yes | Yes | No | Yes |
	| Text+Role Agent Context Transition | Yes | No | Yes | Yes | No | No |
	| OA-RUMER | Yes | Yes | Yes | Yes | Yes | Yes |
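For scripting comparisons between variants, the table can be mirrored as a small set of feature flags. This is only an illustrative sketch: the `Variant` dataclass and `added_components` helper are hypothetical and do not describe how the experiment runner is actually configured; the flag values are copied from the table as written.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Variant:
    text: bool
    audio: bool
    role: bool
    context: bool
    overlap: bool
    mt: bool


# One entry per table row, in the same column order.
VARIANTS = {
    "Text-only PhoBERT": Variant(True, False, False, False, False, False),
    "Text Context Transformer": Variant(True, False, False, True, False, False),
    "Text+Role Context": Variant(True, False, True, True, False, False),
    "Text+Role Transition Context": Variant(True, False, True, True, False, True),
    "Text+Role Agent Context Transition": Variant(True, False, True, True, False, False),
    "OA-RUMER": Variant(True, True, True, True, True, True),
}


def added_components(base, other):
    """Names of flags that are on in `other` but off in `base`."""
    a, b = VARIANTS[base], VARIANTS[other]
    return [
        f.name for f in fields(Variant)
        if getattr(b, f.name) and not getattr(a, f.name)
    ]
```

For example, `added_components("Text-only PhoBERT", "Text Context Transformer")` isolates the single component that separates the first two rows.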

	Run a smoke trial:

	```bash
	DEVICE=auto experiments/run_phobert_context_baselines.sh trial
	```

	Run the full 3-class suite:

	```bash
	DEVICE=auto EPOCHS=8 experiments/run_phobert_context_baselines.sh full
	```