license: mit
language:
  - vi
tags:
  - vietnamese
  - speech
  - emotion-recognition
  - escalation-detection
  - customer-service
  - multimodal
  - model-assets
  - model-ready

OA-RUMER Model Assets

OA-RUMER is a Vietnamese customer-service modeling asset package for turn-level emotion recognition and escalation/de-escalation analysis. This repository contains metadata plus model-ready CSV/JSON splits; it does not include the full raw audio corpus.

Contents

  • model_assets/metadata/calls_metadata.csv: call-level audio metadata.
  • model_assets/metadata/model_assets_summary.md: high-level asset summary.
  • model_assets/model_ready/oa_rumer_full_phowhisper_3class/: primary 3-class turn-level splits and summaries.
  • model_assets/model_ready/oa_rumer_full_phowhisper/: original full label variant.
  • model_assets/model_ready/text_only_transition_3class/: customer-transition split files for escalation modeling.
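Before running experiments, it can help to confirm that a local checkout actually contains the paths listed above. The following is a minimal sketch using only the standard library; the `missing_assets` helper and its `repo_root` argument are illustrative names and are not part of this repository.

```python
from pathlib import Path

# Paths copied from the Contents list above.
EXPECTED_ASSETS = [
    "model_assets/metadata/calls_metadata.csv",
    "model_assets/metadata/model_assets_summary.md",
    "model_assets/model_ready/oa_rumer_full_phowhisper_3class",
    "model_assets/model_ready/oa_rumer_full_phowhisper",
    "model_assets/model_ready/text_only_transition_3class",
]


def missing_assets(repo_root):
    """Return the expected asset paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_ASSETS if not (root / rel).exists()]


if __name__ == "__main__":
    # Report anything missing from the current working directory.
    for rel in missing_assets("."):
        print(f"missing: {rel}")
```

An empty result means every listed file and directory was found under the given root.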

Repository Layout

| Path | Contents |
| --- | --- |
| model_assets/metadata/ | Call-level metadata and model assets summary |
| model_assets/model_ready/ | Final CSV/JSON splits ready for modeling |

Labels

  • Emotion labels: neutral, positive, negative
  • Original emotion labels include negative_low and negative_high
  • Escalation labels: stable, de-escalation, escalation
  • Role labels: customer, agent
  • Overlap labels: no_overlap, backchannel_overlap, interruption_overlap, conflict_overlap, uncertain_overlap

The cleaned split CSVs store annotation confidence in the label_confidence column.
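For classification, the label inventories above usually need to be turned into integer ids. The sketch below shows one way to do that; the label strings come from the list above, while the id order and the helper names `build_label_maps` and `encode` are assumptions made for illustration.

```python
# Label inventories copied from the Labels list; the id order is an assumption.
EMOTION_LABELS = ["neutral", "positive", "negative"]
ESCALATION_LABELS = ["stable", "de-escalation", "escalation"]
ROLE_LABELS = ["customer", "agent"]
OVERLAP_LABELS = [
    "no_overlap",
    "backchannel_overlap",
    "interruption_overlap",
    "conflict_overlap",
    "uncertain_overlap",
]


def build_label_maps(labels):
    """Build label-to-id and id-to-label dictionaries from an ordered list."""
    label2id = {label: idx for idx, label in enumerate(labels)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


def encode(label, label2id):
    """Look up a label id, failing loudly on values outside the inventory."""
    try:
        return label2id[label]
    except KeyError:
        raise ValueError(f"unknown label: {label!r}") from None
```

Raising on unknown values makes it easier to notice when a split from the original full label variant is fed into a 3-class pipeline by mistake.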

Audio Paths

Raw WAV files are not included in this repository. The CSV files still point to data_audio_set/*.wav; place the local audio folder at data_audio_set/ when running audio-based experiments.
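Because the WAV files ship separately, a quick check that every referenced path resolves locally can save a failed run. This is a minimal sketch under stated assumptions: `path_column="audio_path"` is a hypothetical column name (check the CSV header for the real one), and `audio_root` is the directory that contains your local data_audio_set/ folder.

```python
import csv
from pathlib import Path


def find_missing_audio(csv_path, audio_root=".", path_column="audio_path"):
    """Return CSV audio paths that do not resolve to a file under audio_root.

    path_column is a hypothetical name; replace it with the actual column
    that holds the data_audio_set/*.wav paths.
    """
    root = Path(audio_root)
    missing = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            rel = row[path_column]
            if not (root / rel).is_file():
                missing.append(rel)
    return missing
```

An empty list means every path in the CSV was found; text-only experiments do not need this check.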

Notes

  • negative_high is merged into negative for the main 3-class runs.
  • Escalation can be modeled at the customer-transition level using model_assets/model_ready/text_only_transition_3class/.
  • Audio paths in the CSV files point to data_audio_set/*.wav.
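The label merge described in the notes can be sketched as a small mapping. Only the negative_high to negative merge is stated here, so the sketch passes every other label through unchanged, including negative_low; adjust the mapping if your label variant treats it differently. The names `FINE_TO_3CLASS` and `to_3class` are illustrative.

```python
# Only this merge is stated in the notes; other labels are left untouched.
FINE_TO_3CLASS = {"negative_high": "negative"}


def to_3class(label):
    """Map an original emotion label to its 3-class label where a merge is known."""
    return FINE_TO_3CLASS.get(label, label)


def collapse_labels(labels):
    """Apply to_3class to a sequence of labels and return a new list."""
    return [to_3class(label) for label in labels]
```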

PhoBERT Context Baselines

The local experiment runner at experiments/run_phobert_context_baselines.sh runs text-only ablations built on a frozen PhoBERT turn encoder, optionally followed by a Transformer context encoder.

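| Model | Text | Audio | Role | Context | Overlap | MT |
| --- | --- | --- | --- | --- | --- | --- |
| Text-only PhoBERT | Yes | No | No | No | No | No |
| Text Context Transformer | Yes | No | No | Yes | No | No |
| Text+Role Context | Yes | No | Yes | Yes | No | No |
| Text+Role Transition Context | Yes | No | Yes | Yes | No | Yes |
| Text+Role Agent Context Transition | Yes | No | Yes | Yes | No | No |
| OA-RUMER | Yes | Yes | Yes | Yes | Yes | Yes |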
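The feature matrix above can also be written as data, which makes it easy to query which variants share an input. This is an illustrative sketch only: the `Variant` dataclass and `variants_with` helper are hypothetical and do not reflect how the experiment runner is configured internally. The `mt` field mirrors the MT column as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    text: bool
    audio: bool
    role: bool
    context: bool
    overlap: bool
    mt: bool  # "MT" column, copied as-is from the table


# One entry per table row, in the same column order.
VARIANTS = {
    "Text-only PhoBERT": Variant(True, False, False, False, False, False),
    "Text Context Transformer": Variant(True, False, False, True, False, False),
    "Text+Role Context": Variant(True, False, True, True, False, False),
    "Text+Role Transition Context": Variant(True, False, True, True, False, True),
    "Text+Role Agent Context Transition": Variant(True, False, True, True, False, False),
    "OA-RUMER": Variant(True, True, True, True, True, True),
}


def variants_with(flag):
    """Return the names of variants whose given feature flag is enabled."""
    return [name for name, variant in VARIANTS.items() if getattr(variant, flag)]
```

For example, `variants_with("audio")` returns only the full OA-RUMER row, since every baseline in the table is text-only.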

Run a smoke trial:

DEVICE=auto experiments/run_phobert_context_baselines.sh trial

Run the full 3-class suite:

DEVICE=auto EPOCHS=8 experiments/run_phobert_context_baselines.sh full