findcard12138 committed on
Commit abc5575 · verified · 1 Parent(s): 51907cf

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +9 -5
README.md CHANGED
@@ -25,11 +25,6 @@ This repo contains the **pretrained weights** that are intended to serve as the
  - **Offline SFT**: instruction-following and reasoning on full video segments
  - **Realtime SFT**: low-latency streaming video understanding and response
 
- > [!IMPORTANT]
- > ### 🌟 Our Mission & Community Invitation
- > **We have filled the gap in cross-attention-based foundation models for video understanding.**
- >
- > We warmly welcome experts in **Representation Learning** and **Model Efficiency** to explore, experiment, and innovate on top of our architecture. Let's push the boundaries of video intelligence and advance the open-source community together!
 
 
  #### Model Architecture
@@ -189,6 +184,15 @@ For full environment setup (including optional FlashAttention2 extras), see the
  - This is a **base** model directory. Quality/latency characteristics (offline SFT, real-time streaming, etc.) depend on the specific fine-tuned checkpoints and inference pipeline.
  - The Python source files in this directory are referenced via `auto_map` in `config.json`, so `trust_remote_code=True` is typically required when loading from this local folder.
 
+
+ > [!IMPORTANT]
+ > ### 🌟 Our Mission & Community Invitation
+ > **We have filled the gap in cross-attention-based foundation models for video understanding.**
+ >
+ > We warmly welcome experts in **Representation Learning** and **Model Efficiency** to explore, experiment, and innovate on top of our architecture. Let's push the boundaries of video intelligence and advance the open-source community together!
+
+
+
  ## Citation
 
  ```bibtex