Add model card for Chain-of-Frames

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
+ ---
+ pipeline_tag: video-text-to-text
+ library_name: transformers
+ ---
+
+ # Chain-of-Frames
+
+ Chain-of-Frames (CoF) is a framework for obtaining video LLMs whose reasoning steps are grounded in, and explicitly refer to, relevant frames. It reasons in a single stage and cites frame IDs explicitly, which helps reduce temporal inconsistencies in the reasoning process without relying on auxiliary modules for frame selection or caption generation.
+
+ - **Paper:** [Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning](https://huggingface.co/papers/2506.00318)
+ - **Repository:** [https://github.com/SaraGhazanfari/CoF](https://github.com/SaraGhazanfari/CoF)
+
+ ## Sample Usage
+
+ Model loading and evaluation are similar to the procedures used in the InternVL repository. You can load the model with the `transformers` library as follows:
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModel
+
+ model_path = "path/to/CoF-model"  # replace with specific checkpoint path or repo ID
+ model = AutoModel.from_pretrained(
+     model_path,
+     torch_dtype=torch.bfloat16,
+     low_cpu_mem_usage=True,
+     use_flash_attn=True,
+     trust_remote_code=True,
+ ).eval().cuda()
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, use_fast=False)
+ generation_config = dict(max_new_tokens=2048, do_sample=False)
+ ```
+
+ ## Citation
+
+ If you use this work, please consider citing:
+
+ ```bibtex
+ @article{ghazanfari2025chainofframes,
+   title={Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning},
+   author={Sara Ghazanfari and Francesco Croce and Nicolas Flammarion and Prashanth Krishnamurthy and Farshad Khorrami and Siddharth Garg},
+   year={2025},
+   journal={arXiv preprint arXiv:2506.00318}
+ }
+ ```
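
The loading snippet in the README stops before any video is passed to the model. As a rough sketch of what the input side might look like, assuming CoF keeps the InternVL convention of uniformly sampled frames and a `Frame{i}: <image>` prefix per frame (this PR does not confirm either), the helpers below pick frame indices and build a frame-numbered prompt. `sample_frame_indices` and `build_video_prompt` are hypothetical names used for illustration only; they are not part of the CoF or InternVL code.

```python
def sample_frame_indices(total_frames: int, num_segments: int = 8) -> list[int]:
    """Split the video into equal segments and take the middle frame of each.

    Hypothetical helper: a simplified stand-in for uniform frame sampling.
    """
    if total_frames <= 0 or num_segments <= 0:
        raise ValueError("total_frames and num_segments must be positive")
    seg = total_frames / num_segments
    # Clamp to the last valid index in case of rounding at the tail.
    return [min(total_frames - 1, int(seg * i + seg / 2)) for i in range(num_segments)]


def build_video_prompt(question: str, num_frames: int) -> str:
    """Prefix the question with one numbered image placeholder per frame.

    Assumption: the model expects 1-based "Frame{i}: <image>" lines, as in
    the InternVL video examples, so that answers can cite frame IDs.
    """
    prefix = "".join(f"Frame{i + 1}: <image>\n" for i in range(num_frames))
    return prefix + question


indices = sample_frame_indices(total_frames=80, num_segments=8)
prompt = build_video_prompt("What happens after the door opens?", len(indices))
print(indices)
print(prompt)
```

Numbering the frames in the prompt gives the model stable IDs to refer to in its reasoning, which is the property the CoF description emphasizes. The actual preprocessing and prompt format should be taken from the linked repository.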