PersonaVLM: Long-Term Personalized Multimodal LLMs (CVPR 2026)

🎉 News: Our paper "PersonaVLM: Long-Term Personalized Multimodal LLMs" has been accepted to CVPR 2026!

🌟 Introduction

PersonaVLM is a multimodal agent framework for long-term personalization. It turns a general-purpose multimodal LLM (MLLM) into a personalized assistant by integrating three key capabilities:

  1. Remembering: Proactively extracts and summarizes multimodal memories into a personalized database.
  2. Reasoning: Conducts multi-turn reasoning by retrieving relevant memories from a multi-type memory architecture (core, semantic, episodic, and procedural).
  3. Response Alignment: Infers the user's evolving personality using a Momentum-based Personality Evolving Mechanism (PEM) to ensure aligned outputs.
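
As a rough illustration of the capabilities listed above, the sketch below pairs a toy memory store that has one bucket per memory type with a generic momentum (exponential moving average) update over personality trait scores. It is not the PersonaVLM implementation: the names `MemoryStore`, `momentum_update`, the word-overlap retriever, the `openness` trait, and the momentum value of 0.9 are all assumptions made for this example, and the paper's PEM may be defined differently.

```python
from dataclasses import dataclass, field

# The four memory types named in the introduction.
MEMORY_TYPES = ("core", "semantic", "episodic", "procedural")


@dataclass
class MemoryStore:
    """Hypothetical store with one bucket per memory type (illustration only)."""
    buckets: dict = field(default_factory=lambda: {t: [] for t in MEMORY_TYPES})

    def remember(self, memory_type: str, text: str) -> None:
        if memory_type not in self.buckets:
            raise ValueError(f"unknown memory type: {memory_type}")
        self.buckets[memory_type].append(text)

    def retrieve(self, query: str, top_k: int = 2) -> list:
        """Rank memories by word overlap with the query (a stand-in for a real retriever)."""
        query_words = set(query.lower().split())
        scored = []
        for memory_type, texts in self.buckets.items():
            for text in texts:
                overlap = len(query_words & set(text.lower().split()))
                if overlap > 0:
                    scored.append((overlap, memory_type, text))
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(memory_type, text) for _, memory_type, text in scored[:top_k]]


def momentum_update(previous: dict, observed: dict, momentum: float = 0.9) -> dict:
    """Generic EMA over trait scores: new = m * old + (1 - m) * observed.

    Assumed form for illustration; traits missing from `observed` stay unchanged.
    """
    return {
        trait: momentum * previous[trait]
        + (1.0 - momentum) * observed.get(trait, previous[trait])
        for trait in previous
    }


store = MemoryStore()
store.remember("core", "User is vegetarian")
store.remember("episodic", "User shared a photo of a hiking trip in May")
hits = store.retrieve("vegetarian dinner ideas")

# A single strong observation moves the stored score only slightly.
profile = momentum_update({"openness": 0.5}, {"openness": 1.0}, momentum=0.9)
```

With a high momentum value, one unusual conversation shifts the stored profile only a little, while a sustained change in the user's behavior accumulates over many turns. That is the general motivation for momentum-style updates when tracking a slowly evolving signal.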

📊 Persona-MME Benchmark

We establish Persona-MME, a comprehensive benchmark comprising over 2,000 curated interaction cases across 14 fine-grained tasks to assess long-term MLLM personalization.

🔗 Official Resources

This project consists of several components. You can access the model weights, training data, benchmark, and code via the links below:

Resource            Link
🌐 Project Page      https://PersonaVLM.github.io
💻 Official Code     GitHub: PersonaVLM
🤗 Model Weights     HF: PersonaVLM (Qwen2.5-VL-7B)
📊 Benchmark         HF: Persona-MME (2,000+ cases)
📂 Training Data     HF: PersonaVLM-Dataset (80k+ samples)

✒️ Citation

If you find our work helpful, please cite our paper:

@inproceedings{nie2026personavlm,
  title={PersonaVLM: Long-Term Personalized Multimodal LLMs},
  author={Nie, Chang and Fu, Chaoyou and Zhang, Yifan and Yang, Haihua and Shan, Caifeng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}