---
arxiv: 2603.15818
license: mit
tags:
  - multimodal
  - emotion-recognition
  - ambivalence
  - hesitancy
  - ABAW10
---

# ConflictAwareAH — Ambivalence/Hesitancy Recognition

Pre-trained weights for the Conflict-Aware Multimodal Fusion model (ABAW10 Challenge, AVGF1 0.715).

## Usage

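The snippet below imports `bah.models`, so the code from the GitHub repository needs to be importable first: https://github.com/Bekhouche/ConflictAwareAH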

```python
import torch
from bah.models import ConflictAwareAHModel
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(repo_id="Bekhouche/ConflictAwareAH", filename="best_model.pt")
ckpt = torch.load(ckpt_path, map_location="cpu")
args = ckpt["args"]

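# Prefer the fusion_type stored in args; otherwise infer it from the checkpoint keys:
# a "fusion_transformer" submodule implies "6token", anything else falls back to "concat".
state_keys = set(ckpt["model"].keys())
has_fusion_transformer = any("fusion_transformer" in k for k in state_keys)
fusion_type = args.get("fusion_type") or ("6token" if has_fusion_transformer else "concat")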

model = ConflictAwareAHModel(
    video_model=args["video_model"],
    audio_model=args["audio_model"],
    text_model=args["text_model"],
    dropout=0.0,
    freeze_encoders=args.get("freeze_encoders", True),
    unfreeze_top_k=args.get("unfreeze_top_k", 0),
    num_transformer_layers=args.get("num_layers", 2),
    fusion_type=fusion_type,
)
model.load_state_dict(ckpt["model"], strict=True)
model.eval()

# Inference-time text blend weight: read from the checkpoint, then from args, else 0.5
text_blend = ckpt.get("text_blend", args.get("text_blend", 0.5))
```
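
The `fusion_type` fallback in the snippet above can be pulled out into a small helper so it can be checked without downloading the checkpoint. This is a minimal sketch: `resolve_fusion_type` is a hypothetical name that is not part of the `bah` package, and it only mirrors the rule already shown (use the stored value if present, otherwise look for `fusion_transformer` in the state-dict keys).

```python
from typing import Iterable, Mapping


def resolve_fusion_type(args: Mapping[str, object], state_keys: Iterable[str]) -> str:
    """Hypothetical helper mirroring the fallback logic in the usage snippet."""
    # A stored, non-empty fusion_type always wins.
    stored = args.get("fusion_type")
    if stored:
        return str(stored)
    # Otherwise infer from parameter names: a fusion transformer implies "6token".
    if any("fusion_transformer" in key for key in state_keys):
        return "6token"
    return "concat"
```

With a loaded checkpoint, the call would be `resolve_fusion_type(ckpt["args"], ckpt["model"].keys())`.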

## Config

- Encoders: VideoMAE-Base, HuBERT-Base, RoBERTa-GoEmotions (frozen)
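- Dropout: 0.4 (the loading snippet passes `dropout=0.0`, since dropout is inactive once `model.eval()` is called)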
- Text blend (inference): 0.5