| | --- |
| | license: mit |
| | datasets: |
| | - ShuhuaiRen/TimeIT |
| | language: |
| | - en |
| | --- |
| | |
| | # TimeChat Model Card |
| |
|
| | ## Model details |
| |
|
| | **Model type:** |
| | TimeChat is an open-source chatbot trained by fine-tuning LLaMA-2 on time-sensitive, video-centric instruction-following data (see [TimeIT-Instruct-104k](https://huggingface.co/datasets/ShuhuaiRen/TimeIT)). |
| | It is an auto-regressive language model, based on the transformer architecture. |
| |
|
| | **Model date:** |
| | TimeChat-7B was trained in November 2023. |
| |
|
| | **Paper or resources for more information:** |
| | [Paper](https://arxiv.org/abs/2312.02051), [Code](https://github.com/RenShuhuai-Andy/TimeChat) |
| |
|
| | ## License |
| | Llama 2 is licensed under the LLAMA 2 Community License, |
| | Copyright (c) Meta Platforms, Inc. All Rights Reserved. |
| |
|
| | **Where to send questions or comments about the model:** |
| | https://github.com/RenShuhuai-Andy/TimeChat/issues |
| |
|
| | ## Intended use |
| | **Primary intended uses:** |
| | The primary use of TimeChat is research on large multimodal models and chatbots. |
| |
|
| | **Primary intended users:** |
| | The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence. |
| |
|
| | ## Training dataset |
| | - 104K time-sensitive, video-centric instruction-tuning samples from [TimeIT-Instruct-104k](https://huggingface.co/datasets/ShuhuaiRen/TimeIT). |
| | - 73K video instruction-tuning samples from [Valley-Instruct-73k](https://huggingface.co/datasets/luoruipu1/Valley-Instruct-73k). |
| |
|
| | ## Evaluation dataset |
| | TimeChat is evaluated on three long-video understanding tasks: dense video captioning (YouCook2), temporal grounding (Charades-STA), and highlight detection (QVHighlights). |
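
Temporal grounding benchmarks such as Charades-STA are commonly scored by the temporal intersection-over-union (IoU) between a predicted segment and the ground-truth segment, often summarized as recall at a fixed IoU threshold. The snippet below is a minimal sketch of that metric, assuming each segment is a `(start, end)` pair in seconds. The function names `temporal_iou` and `recall_at_1` are hypothetical and do not come from the TimeChat codebase; refer to the paper and code for the exact evaluation protocol.

```python
from typing import Sequence, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds; an assumed representation


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Return the temporal IoU between a predicted and a ground-truth segment."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the two segments do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_1(
    predictions: Sequence[Segment],
    ground_truths: Sequence[Segment],
    threshold: float = 0.5,
) -> float:
    """Fraction of queries whose top prediction reaches the IoU threshold."""
    if not ground_truths:
        return 0.0
    hits = sum(
        temporal_iou(pred, gt) >= threshold
        for pred, gt in zip(predictions, ground_truths)
    )
    return hits / len(ground_truths)
```

For example, `recall_at_1([(0.0, 10.0), (2.0, 8.0)], [(0.0, 10.0), (20.0, 30.0)])` counts the first prediction as a hit and the second as a miss at the default threshold.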
| |
|
| | ## Citation |
| | If you find our project useful, please star our repo and cite our paper as follows: |
| |
|
| | ``` |
| | @article{Ren2023TimeChat, |
| | title={TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding}, |
| | author={Shuhuai Ren and Linli Yao and Shicheng Li and Xu Sun and Lu Hou}, |
| | journal={ArXiv}, |
| | year={2023}, |
| | volume={abs/2312.02051}, |
| | } |
| | ``` |