arxiv:2605.18734

EgoExoMem: Cross-View Memory Reasoning over Synchronized Egocentric and Exocentric Videos

Published on May 18

Abstract

EgoExoMem introduces a benchmark for cross-view memory reasoning using synchronized egocentric and exocentric videos, with E²-Select enabling improved frame selection for dual-view retrieval.

Egocentric memory is widely used in embodied intelligence, but it may be insufficient for comprehensive spatial-temporal reasoning. Inspired by human recall from both field and observer perspectives, we introduce EgoExoMem, the first benchmark for cross-view memory reasoning over synchronized egocentric and exocentric videos. EgoExoMem contains 2.6K high-quality MCQs across eight temporal, spatial, and cross-view QA types. To support dual-view retrieval, we propose E²-Select, a training-free frame selection method for synchronized ego-exo videos. It combines relevance-based budget allocation with per-view k-DPP sampling to handle view asymmetry and cross-view temporal consistency. Experiments show that ego and exo views provide complementary memory cues, while existing MLLMs remain far from solving the benchmark: the best model reaches only 55.3%. E²-Select achieves state-of-the-art performance of 58.2% over frame-selection and RAG-based memory baselines. Further analysis reveals systematic view-preference conflicts between question framing and answer grounding, underscoring the novelty and challenge of cross-view memory reasoning.
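
To make the idea of relevance-based budget allocation followed by per-view diverse selection more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the allocation rule (budget split in proportion to mean query relevance), the kernel (relevance-weighted cosine similarity), and all function names are assumptions made for illustration. It also swaps exact k-DPP sampling for a greedy log-determinant maximization, a common deterministic stand-in, and it does not model cross-view temporal consistency.

```python
import numpy as np


def allocate_budget(ego_rel, exo_rel, total, min_per_view=1):
    """Split `total` frames between views in proportion to mean relevance.

    Assumed rule for illustration; the source does not specify one.
    """
    ego_mass = float(np.mean(ego_rel))
    exo_mass = float(np.mean(exo_rel))
    share = ego_mass / (ego_mass + exo_mass)
    k_ego = int(round(total * share))
    # Keep at least `min_per_view` frames for each view.
    k_ego = max(min_per_view, min(total - min_per_view, k_ego))
    return k_ego, total - k_ego


def greedy_kdpp(features, relevance, k):
    """Greedy stand-in for k-DPP sampling (not exact sampling).

    Builds the kernel L = diag(r) * S * diag(r), where S is cosine similarity,
    then repeatedly adds the frame that most increases log det of the
    selected submatrix, trading relevance against redundancy.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    kernel = relevance[:, None] * (feats @ feats.T) * relevance[None, :]
    kernel = kernel + 1e-6 * np.eye(len(kernel))  # jitter for numerical stability
    selected = []
    for _ in range(min(k, len(kernel))):
        best, best_logdet = -1, -np.inf
        for i in range(len(kernel)):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        if best < 0:
            break
        selected.append(best)
    return sorted(selected)


def query_relevance(features, query):
    """Cosine similarity to the query, mapped to (0, 1] so it stays positive."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return (1.0 + feats @ q) / 2.0 + 1e-6


def select_frames(ego_feats, exo_feats, query, total=8):
    """Allocate the frame budget across views, then select within each view."""
    ego_rel = query_relevance(ego_feats, query)
    exo_rel = query_relevance(exo_feats, query)
    k_ego, k_exo = allocate_budget(ego_rel, exo_rel, total)
    return {
        "ego": greedy_kdpp(ego_feats, ego_rel, k_ego),
        "exo": greedy_kdpp(exo_feats, exo_rel, k_exo),
    }
```

Calling `select_frames(ego_feats, exo_feats, query, total=8)` with per-frame embedding arrays of shape `(num_frames, dim)` and a query embedding of shape `(dim,)` returns sorted frame indices for each view. Because the two videos are synchronized, indices from both views refer to the same timeline, which is where a consistency term could be added.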
