Upload README.md with huggingface_hub
README.md
CHANGED
|
@@ -25,11 +25,6 @@ This repo contains the **pretrained weights** that are intended to serve as the
|
|
| 25 |
- **Offline SFT**: instruction-following and reasoning on full video segments
|
| 26 |
- **Realtime SFT**: low-latency streaming video understanding and response
|
| 27 |
|
| 28 |
-
> [!IMPORTANT]
|
| 29 |
-
> ### 🌟 Our Mission & Community Invitation
|
| 30 |
-
> **We have filled the gap in cross-attention-based foundation models for video understanding.**
|
| 31 |
-
>
|
| 32 |
-
> We warmly welcome experts in **Representation Learning** and **Model Efficiency** to explore, experiment, and innovate on top of our architecture. Let's push the boundaries of video intelligence and advance the open-source community together!
|
| 33 |
|
| 34 |
|
| 35 |
#### Model Architecture
|
|
@@ -189,6 +184,15 @@ For full environment setup (including optional FlashAttention2 extras), see the
|
|
| 189 |
- This is a **base** model directory. Quality/latency characteristics (offline SFT, real-time streaming, etc.) depend on the specific fine-tuned checkpoints and inference pipeline.
|
| 190 |
- The Python source files in this directory are referenced via `auto_map` in `config.json`, so `trust_remote_code=True` is typically required when loading from this local folder.
|
| 191 |
|
| 192 |
## Citation
|
| 193 |
|
| 194 |
```bibtex
|
|
|
|
| 25 |
- **Offline SFT**: instruction-following and reasoning on full video segments
|
| 26 |
- **Realtime SFT**: low-latency streaming video understanding and response
|
| 27 |
|
| 28 |
|
| 29 |
|
| 30 |
#### Model Architecture
|
|
|
|
| 184 |
- This is a **base** model directory. Quality/latency characteristics (offline SFT, real-time streaming, etc.) depend on the specific fine-tuned checkpoints and inference pipeline.
|
| 185 |
- The Python source files in this directory are referenced via `auto_map` in `config.json`, so `trust_remote_code=True` is typically required when loading from this local folder.
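
The `auto_map` note above can be checked before loading. The following is a minimal sketch, not part of the repository: the helper name `needs_remote_code` and the demo config contents are hypothetical, and it only assumes that a non-empty `auto_map` key in `config.json` signals custom Python code.

```python
import json
import tempfile
from pathlib import Path


def needs_remote_code(model_dir):
    """Return True if config.json in model_dir declares a non-empty auto_map.

    Hypothetical helper for illustration; it only inspects the JSON file
    and does not import or run any model code.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return bool(config.get("auto_map"))


# Demo on a throwaway directory with a made-up config (not this repo's real one).
with tempfile.TemporaryDirectory() as tmp:
    demo_config = {
        "model_type": "demo_model",
        "auto_map": {"AutoConfig": "configuration_demo.DemoConfig"},
    }
    (Path(tmp) / "config.json").write_text(json.dumps(demo_config), encoding="utf-8")
    demo_result = needs_remote_code(tmp)

print(demo_result)
```

With `transformers` installed, the result could be passed along as `AutoModel.from_pretrained(path, trust_remote_code=needs_remote_code(path))`; which `Auto*` class applies depends on the entries in the repository's own `auto_map`.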
|
| 186 |
|
| 187 |
+
|
| 188 |
+
> [!IMPORTANT]
|
| 189 |
+
> ### 🌟 Our Mission & Community Invitation
|
| 190 |
+
> **We have filled the gap in cross-attention-based foundation models for video understanding.**
|
| 191 |
+
>
|
| 192 |
+
> We warmly welcome experts in **Representation Learning** and **Model Efficiency** to explore, experiment, and innovate on top of our architecture. Let's push the boundaries of video intelligence and advance the open-source community together!
|
| 193 |
+
|
| 194 |
+
|
| 195 |
+
|
| 196 |
## Citation
|
| 197 |
|
| 198 |
```bibtex
|