BWGZK/DeepIntuit

BWGZK committed on Mar 16

Commit f3f9382 (verified) · 1 parent: 63cfbe2

Create README.md

Files changed (1)

README.md (added) +56 -0

	@@ -0,0 +1,56 @@

+---
+license: apache-2.0
+datasets:
+- violetcliff/SmartHome-Bench
+base_model:
+- Qwen/Qwen2.5-VL-7B-Instruct
+---
+# DeepIntuit
+## Model Description
+**DeepIntuit** is a reasoning-enhanced video understanding model designed for **open-instance video classification**. Instead of directly mapping visual features to labels, the model learns to generate **intrinsic reasoning traces** that guide the final classification decision, improving robustness under large intra-class variation.
+The model is introduced in:
+
+**From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification**
+
+📄 Paper: [https://arxiv.org/abs/2603.10300](https://arxiv.org/abs/2603.10300)
+
+💻 Code: [https://github.com/BWGZK-keke/DeepIntuit](https://github.com/BWGZK-keke/DeepIntuit)
+
+---
+## Training Pipeline
+DeepIntuit is trained through a three-stage pipeline:
+1. **Cold Start Alignment**
+   Supervised training to initialize structured reasoning generation.
+2. **Reasoning Refinement (GRPO)**
+   Reinforcement learning improves reasoning quality and prediction consistency.
+3. **Intuitive Calibration**
+   A lightweight classifier is trained on generated reasoning traces for stable prediction.
+---
+## Intended Use
+DeepIntuit is designed for research on:
+* video understanding
+* open-instance video classification
+* reasoning-enhanced multimodal learning
+* safety-sensitive video analysis
+## Citation
+```bibtex
+@article{zhang2026deepintuit,
+  title={From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification},
+  author={Zhang, Ke and Zhao, Xiangchen and Tian, Yunjie and Zheng, Jiayu and Patel, Vishal M and Fu, Di},
+  journal={arXiv preprint arXiv:2603.10300},
+  year={2026}
+}
+```
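
As a rough illustration of the Reasoning Refinement (GRPO) stage named in the README's training pipeline, the sketch below shows the group-relative advantage computation that GRPO-style training generally relies on: several responses are sampled for the same input, each is scored, and each score is normalized against its own group. The README does not specify the reward, so `toy_reward`, the label strings, and the `eps` value are hypothetical placeholders rather than details of DeepIntuit itself.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its sampling group.

    This is the generic GRPO-style normalization; whether DeepIntuit uses
    exactly this form is an assumption.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def toy_reward(predicted_label, true_label):
    """Hypothetical reward: 1.0 for a correct final label, 0.0 otherwise."""
    return 1.0 if predicted_label == true_label else 0.0


# Four sampled reasoning traces for one video, reduced to their final labels.
# The label names are made up for this example.
sampled_labels = ["normal", "abnormal", "abnormal", "normal"]
rewards = [toy_reward(p, "abnormal") for p in sampled_labels]
advantages = group_relative_advantages(rewards)
```

Traces that end in the correct label receive a positive advantage and the others a negative one; when every trace in a group earns the same reward, all advantages are zero and that group contributes no learning signal.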