findcard12138 committed on
Commit 8e0065c · verified · 1 Parent(s): 8a2c43f

Upload moss-video-sft

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -30,7 +30,7 @@ This checkpoint is intended for:
 
 #### Model Architecture
 
-MOSS-Video-Preview is built on a **Llama-3.2-Vision** backbone, featuring a **Pioneering Image-Video Isomorphic Cross-Attention Architecture**:
+MOSS-Video-Preview is built on a **Llama-3.2-Vision** backbone, featuring a **Pioneering Image-Video Unified Cross-Attention Architecture**:
 
 - **Native Unified Design**: Unlike traditional projection methods, our architecture provides native, unified support for both image and video understanding, ensuring seamless temporal consistency.
 - **Deep Multimodal Fusion**: Leveraging specialized Cross-Attention mechanisms to achieve high-fidelity alignment between visual temporal features and linguistic context.