---
language: en
tags: [narrative-context, film-analysis, multi-task, pytorch, transformer, position-aware]
---
# Narrative Context Module 2 (Position-Aware)

A Transformer that models cross-scene feature evolution for film narrative understanding.
It consumes 256-d scene embeddings from [`wrathofgod/scene-perception-m1-unfreeze-deberta-small`](https://huggingface.co/wrathofgod/scene-perception-m1-unfreeze-deberta-small).

## Architecture Upgrades vs Previous M2

| Component | Old M2 | New M2 |
|-----------|--------|--------|
| Feature dim | 304-d | 308-d (sin/cos position) |
| Positional encoding | window-relative only | window-relative + film-absolute MLP |
| Feature evolution | none | DeltaEncoder (GLU) as extra token |
| Sequence length | 5 | 7 ([CLS] + 5 scenes + [DELTA]) |
| Context fusion | last token only | CLS ⊕ current-scene via fusion gate |
| Transformer depth | 4L × 8H, FFN=512 | 6L × 8H, FFN=768 |
| Label smoothing | none | ε=0.1 |

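
To make the label-smoothing row concrete, here is a minimal sketch of what ε=0.1 does to a classification target. It assumes the common convention in which ε of the probability mass is spread uniformly over all K classes (the one PyTorch's `CrossEntropyLoss(label_smoothing=...)` follows); the card does not say which convention training used, and the helper names `smoothed_target` and `smoothed_cross_entropy` are hypothetical.

```python
import math


def smoothed_target(true_class: int, num_classes: int, eps: float = 0.1) -> list[float]:
    """Mix a one-hot target with a uniform distribution over all classes.

    Assumed convention: (1 - eps) * one_hot + eps / num_classes.
    """
    if not 0 <= true_class < num_classes:
        raise ValueError("true_class out of range")
    target = [eps / num_classes] * num_classes
    target[true_class] += 1.0 - eps
    return target


def smoothed_cross_entropy(logits: list[float], target: list[float]) -> float:
    """Cross-entropy between a soft target and softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return -sum(t * (x - log_norm) for t, x in zip(target, logits))


# Example: a 5-class head with the true class at index 2.
target = smoothed_target(true_class=2, num_classes=5, eps=0.1)
loss = smoothed_cross_entropy([0.1, 0.3, 2.0, -0.5, 0.0], target)
```

With five classes, the true class keeps roughly 0.92 of the mass and each other class receives roughly 0.02, so the model is never pushed toward fully saturated logits.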
## Input
Each sample is a 5-scene causal window [t-4 … t] drawn from a single film.
Per-scene feature: 308-d (M1 embedding + metadata + sin/cos position).
The film position (a scalar in [0, 1]) is fed separately to FilmPositionEncoder.

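
The windowing described above can be sketched as follows. This is an illustration under stated assumptions, not the preprocessing actually used: the card does not specify how windows near the start of a film are padded (here scene 0 is repeated), how the film position is normalized (here `t / (num_scenes - 1)`), or how many sin/cos frequencies are used (here a single pair per window slot). All function names are hypothetical.

```python
import math

WINDOW = 5  # scenes per window, per the card: [t-4 ... t]


def causal_window_indices(t: int, window: int = WINDOW) -> list[int]:
    """Scene indices [t-window+1 ... t], looking only at the past.

    Assumption: indices before the first scene are clamped to 0,
    i.e. the first scene is repeated as left padding.
    """
    return [max(0, i) for i in range(t - window + 1, t + 1)]


def film_position(t: int, num_scenes: int) -> float:
    """Scalar in [0, 1]: 0.0 at the first scene, 1.0 at the last (assumed)."""
    if num_scenes <= 1:
        return 0.0
    return t / (num_scenes - 1)


def sincos_slot(slot: int, window: int = WINDOW) -> tuple[float, float]:
    """One sin/cos pair for a window slot (assumed single frequency).

    The angle spans [0, pi], so the cosine alone distinguishes every slot.
    """
    angle = math.pi * slot / (window - 1)
    return math.sin(angle), math.cos(angle)


# Example: scene 10 of a 101-scene film.
indices = causal_window_indices(10)           # window of past scenes
position = film_position(10, num_scenes=101)  # film-absolute scalar
slots = [sincos_slot(k) for k in range(WINDOW)]
```

Keeping the film-absolute scalar separate from the per-slot encoding mirrors the split in the card: the window-relative signal travels with each scene feature, while the film-level signal goes to its own encoder.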
## 7 Prediction Heads
| # | Head | Type | Output |
|---|------|------|--------|
| 1 | scene_valence_continuous | regression | -1.0 to 1.0 |
| 2 | tension_level | regression | 1 to 10 |
| 3 | arousal_level | regression | 1 to 10 |
| 4 | emotional_shift_trigger | binary | True / False |
| 5 | narrative_arc_position | 5-class | Setup / Rising / Climax / Falling / Resolution |
| 6 | foreshadowing_type | 4-class | None / Foreshadow / Payoff / Echo |
| 7 | transition_type | 5-class | attacca / fade / segue / silence / cut |
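
As an illustration of how the seven outputs might be turned into the values listed in the table, here is a hedged post-processing sketch. It assumes the model returns a plain dict keyed by head name, with raw scalars for the regression heads, a single logit for the binary head, and one logit per class for the classification heads, and that class indices follow the order shown in the table. None of this is confirmed by the card, and `decode_heads` is a hypothetical helper.

```python
ARC = ["Setup", "Rising", "Climax", "Falling", "Resolution"]
FORESHADOWING = ["None", "Foreshadow", "Payoff", "Echo"]
TRANSITION = ["attacca", "fade", "segue", "silence", "cut"]


def clamp(value: float, low: float, high: float) -> float:
    """Restrict a regression output to its documented range."""
    return max(low, min(high, value))


def argmax(values: list[float]) -> int:
    """Index of the largest logit (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def decode_heads(raw: dict) -> dict:
    """Map raw head outputs to the ranges and labels in the table."""
    return {
        "scene_valence_continuous": clamp(raw["scene_valence_continuous"], -1.0, 1.0),
        "tension_level": clamp(raw["tension_level"], 1.0, 10.0),
        "arousal_level": clamp(raw["arousal_level"], 1.0, 10.0),
        # A logit >= 0 is the same as sigmoid(logit) >= 0.5 (assumed threshold).
        "emotional_shift_trigger": raw["emotional_shift_trigger"] >= 0.0,
        "narrative_arc_position": ARC[argmax(raw["narrative_arc_position"])],
        "foreshadowing_type": FORESHADOWING[argmax(raw["foreshadowing_type"])],
        "transition_type": TRANSITION[argmax(raw["transition_type"])],
    }


# Example with made-up raw outputs.
decoded = decode_heads({
    "scene_valence_continuous": -1.3,
    "tension_level": 7.4,
    "arousal_level": 11.2,
    "emotional_shift_trigger": 0.8,
    "narrative_arc_position": [0.1, 0.2, 2.5, 0.0, -1.0],
    "foreshadowing_type": [1.5, 0.2, 0.1, 0.0],
    "transition_type": [0.0, 0.1, 0.2, 0.3, 1.9],
})
```

Clamping is only a safety net for out-of-range regression outputs; if the heads already apply a bounded activation, it has no effect.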