---
pipeline_tag: video-classification
library_name: transformers
tags:
- video-reasoning
- VLM
- reinforcement-learning
---

# DeepIntuit

[DeepIntuit](https://bwgzk-keke.github.io/DeepIntuit/) is a progressive framework for **open-instance video classification** that evolves models from simple feature imitation to intrinsic reasoning.

- **Paper:** [From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification](https://huggingface.co/papers/2603.10300)
- **Repository:** [BWGZK-keke/DeepIntuit](https://github.com/BWGZK-keke/DeepIntuit)
- **Project Page:** [https://bwgzk-keke.github.io/DeepIntuit/](https://bwgzk-keke.github.io/DeepIntuit/)

## Model Description

DeepIntuit bridges the gap between traditional video encoders and the generalization capabilities of vision-language models (VLMs). Instead of predicting labels directly from visual features, it uses a three-stage reasoning pipeline:

1. **Cold-start supervised alignment:** Initializes reasoning capability using supervised traces generated by a teacher model.
2. **Intrinsic reasoning refinement (Stage 1):** Refines reasoning with **Group Relative Policy Optimization (GRPO)** reinforcement learning to enhance coherence.
3. **Intuitive calibration (Stage 2):** Trains a classifier on the intrinsic reasoning traces to ensure stable knowledge transfer and accurate classification.

This approach decouples reasoning generation from final decision-making, significantly improving robustness in scenarios with large intra-class variation.

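The GRPO step in Stage 1 scores a group of sampled reasoning traces and normalizes each trace's reward against the group's statistics, so no separate value network is needed. The group-relative advantage computation at the core of GRPO can be sketched as follows (an illustrative, generic sketch; the `grpo_advantages` function and the reward values are hypothetical, not the repository's implementation):

```python
import statistics

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages for one group of sampled traces.

    Each trace's advantage is its reward, centered on the group mean
    and scaled by the group's (population) standard deviation.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: three reasoning traces scored by a reward function.
advs = grpo_advantages([1.0, 0.5, 0.0])
```

Because the normalization is relative within each group, above-average traces get positive advantages and below-average traces negative ones, and the advantages in a group sum to zero.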
## Installation

The repository provides a separate environment for each stage. For inference with the final model, set up the Stage 2 environment:

```bash
git clone https://github.com/BWGZK-keke/DeepIntuit.git
cd DeepIntuit/stage2_model
pip install -r requirements.txt
```

## Sample Usage

After setting up the environment, run inference from the `stage2_model` directory using the command provided in the official repository:

```bash
python inference.py \
    --model_path BWGZK/DeepIntuit \
    --video_path path_to_your_video.mp4
```

## Citation

If you find this work useful, please cite:

```bibtex
@article{zhang2026deepintuit,
  title={From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification},
  author={Zhang, Ke and Zhao, Xiangchen and Tian, Yunjie and Zheng, Jiayu and Patel, Vishal M and Fu, Di},
  journal={arXiv preprint arXiv:2603.10300},
  year={2026}
}
```