BWGZK/DeepIntuit

Add model card for DeepIntuit

by nielsr HF Staff - opened Mar 16

base: refs/heads/main ← from: refs/pr/1

README.md +61 -0

README.md ADDED

	@@ -0,0 +1,61 @@

+---
+pipeline_tag: video-classification
+library_name: transformers
+tags:
+- video-reasoning
+- VLM
+- reinforcement-learning
+---
+# DeepIntuit
+[DeepIntuit](https://bwgzk-keke.github.io/DeepIntuit/) is a progressive framework for **open-instance video classification** that evolves models from simple feature imitation to intrinsic reasoning.
+- **Paper:** [From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification](https://huggingface.co/papers/2603.10300)
+- **Repository:** [BWGZK-keke/DeepIntuit](https://github.com/BWGZK-keke/DeepIntuit)
+- **Project Page:** [https://bwgzk-keke.github.io/DeepIntuit/](https://bwgzk-keke.github.io/DeepIntuit/)
+## Model Description
+DeepIntuit bridges the gap between traditional video encoders and the generalization capabilities of vision-language models (VLMs). Instead of predicting labels directly from visual features, it uses a three-stage reasoning pipeline:
+1.  **Cold-start supervised alignment:** Initializes reasoning capability using supervised traces generated by a teacher model.
+2.  **Intrinsic reasoning refinement (Stage 1):** Refines the reasoning ability using **Group Relative Policy Optimization (GRPO)** reinforcement learning to enhance coherence.
+3.  **Intuitive calibration (Stage 2):** Trains a classifier on the intrinsic reasoning traces to ensure stable knowledge transfer and accurate classification results.
+This approach decouples reasoning generation from final decision-making, which is designed to improve robustness in scenarios with large intra-class variation.
+## Installation
+The repository contains separate environments for each stage. For inference using the final model, set up the stage 2 environment:
+```bash
+git clone https://github.com/BWGZK-keke/DeepIntuit.git
+cd DeepIntuit/stage2_model
+pip install -r requirements.txt
+```
+## Sample Usage
+After setting up the environment, you can run inference using the following command provided in the official repository:
+```bash
+cd stage2_model  # run from the repository root; skip this if you are already in stage2_model
+python inference.py \
+  --model_path BWGZK/DeepIntuit \
+  --video_path path_to_your_video.mp4
+```
+## Citation
+If you find this work useful, please cite:
+```bibtex
+@article{zhang2026deepintuit,
+  title={From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification},
+  author={Zhang, Ke and Zhao, Xiangchen and Tian, Yunjie and Zheng, Jiayu and Patel, Vishal M and Fu, Di},
+  journal={arXiv preprint arXiv:2603.10300},
+  year={2026}
+}
+```
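
Note on the GRPO refinement stage named in the Model Description: the core idea of Group Relative Policy Optimization is that several outputs are sampled for the same input, and each output's reward is normalized against the mean and standard deviation of its own group. The sketch below illustrates only that normalization step. It is not taken from the DeepIntuit repository: the function name `group_relative_advantages`, the `eps` constant, the binary reward values, and the use of a population standard deviation are all assumptions made for illustration, and the project's actual reward design may differ.

```python
# Illustrative sketch only: NOT code from the DeepIntuit repository.
# All names and reward values here are hypothetical.
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    `rewards` holds one scalar per reasoning trace sampled for the SAME video.
    Returns (reward - group mean) / (group std + eps) for each trace.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four traces sampled for one video
# (1.0 = the trace ended in the correct label, 0.0 = it did not).
example_rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(example_rewards)
print(advantages)
```

Traces that score above their group's average receive a positive advantage and the rest a negative one. When every trace in a group earns the same reward, all advantages are zero and that group contributes no learning signal.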