AURA: Always-On Understanding and Real-Time Assistance via Video Streams
Paper: arXiv:2604.04184
A real-time multimodal streaming system powered by our AURA model, supporting continuous video understanding with speech interaction.
For demo deployment instructions and full source code, please refer to our GitHub repository.
@article{aura2026,
  title={AURA: Always-On Understanding and Real-Time Assistance via Video Streams},
  author={Lu, Xudong and Bo, Yang and Chen, Jinpeng and Li, Shuhan and Guo, Xintong and Guan, Huankang and Liu, Fang and Xu, Dunyuan and Sun, Peiwen and Sun, Heyang and Liu, Rui and Li, Hongsheng},
  journal={arXiv preprint arXiv:2604.04184},
  year={2026}
}
Base model: Qwen/Qwen3-VL-8B-Instruct