Upload folder using huggingface_hub
README.md CHANGED
@@ -296,9 +296,7 @@ MOSS-VL-Instruct-0408 represents an early milestone in the MOSS-VL roadmap, and
 - 🧮 **Math & Code Reasoning** – While the current checkpoint already exhibits solid general reasoning, we plan to substantially strengthen its mathematical reasoning and code reasoning capabilities, especially in multimodal contexts.
 - ⚡ **Real-Time Streaming Variant** – The upcoming **MOSS-VL-RealTime** will extend MOSS-VL to low-latency, streaming video understanding, enabling interactive applications such as live video chat, real-time event detection, and online assistants.
 - 🎯 **RL Post-Training** – We are working on a reinforcement learning post-training stage to further align the model with human preferences and to unlock stronger multi-step reasoning behaviors on top of the SFT foundation.
 
-- 🔊 **Audio Modality Integration** – Bringing audio understanding into the pipeline, so MOSS-VL can jointly reason over the visual and acoustic streams of a video: speech, ambient sound, music, and their interaction with on-screen events.
-- 📏 **Parameter Scaling** – Releasing additional model sizes across the MOSS-VL series to cover a wider range of compute budgets and deployment scenarios.
 
 > [!NOTE]
 > We welcome community feedback and contributions on any of these directions.