
	---
	license: apache-2.0
	datasets:
	- violetcliff/SmartHome-Bench
	base_model:
	- Qwen/Qwen2.5-VL-7B-Instruct
	---

	# DeepIntuit

	## Model Description

	DeepIntuit is a reasoning-enhanced video understanding model designed for open-instance video classification. Instead of directly mapping visual features to labels, the model learns to generate intrinsic reasoning traces that guide the final classification decision, improving robustness under large intra-class variation.

	The model is introduced in the paper *From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification*.

	* 📄 Paper: [https://arxiv.org/abs/2603.10300](https://arxiv.org/abs/2603.10300)
	* 💻 Code: [https://github.com/BWGZK-keke/DeepIntuit](https://github.com/BWGZK-keke/DeepIntuit)

	---

	## Training Pipeline

	DeepIntuit is trained through a three-stage pipeline:

	1. Cold Start Alignment:
	supervised training initializes structured reasoning generation.

	2. Reasoning Refinement (GRPO):
	reinforcement learning improves reasoning quality and prediction consistency.

	3. Intuitive Calibration:
	a lightweight classifier is trained on the generated reasoning traces to stabilize the final prediction.
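
The GRPO stage can be illustrated with the group-relative advantage that gives the method its name. This is a minimal sketch, not the DeepIntuit training code: several reasoning traces are sampled for the same video, each gets a scalar reward, and each reward is normalized against the others in its group. The function name `group_relative_advantages`, the binary correctness reward, the use of the population standard deviation, and the `eps` value are all assumptions made for illustration. This model card does not specify the reward design.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples drawn for the same input.

    Each advantage is (reward - group mean) / (group std + eps), so traces
    that score above the group average get a positive advantage and the
    rest get a negative one. Population std is an illustrative choice.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled reasoning traces on one video:
# 1.0 if the trace led to the correct label, 0.0 otherwise (assumed reward).
rewards = [1.0, 0.0, 1.0, 1.0]
advantages = group_relative_advantages(rewards)
```

When every trace in a group receives the same reward, all advantages are zero, so that group contributes no learning signal. The `eps` term only guards against division by zero in that case.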

	---

	## Intended Use

	DeepIntuit is designed for research on:

	* video understanding
	* open-instance video classification
	* reasoning-enhanced multimodal learning
	* safety-sensitive video analysis


	---

	## Citation

	```bibtex
	@article{zhang2026deepintuit,
	  title={From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification},
	  author={Zhang, Ke and Zhao, Xiangchen and Tian, Yunjie and Zheng, Jiayu and Patel, Vishal M. and Fu, Di},
	  journal={arXiv preprint arXiv:2603.10300},
	  year={2026}
	}
	```