Commit f90a24c (verified), committed by nielsr (HF Staff)
1 Parent(s): 63cfbe2

Add model card for DeepIntuit

Hi! I'm Niels from the Hugging Face community science team. I noticed that this repository was missing a model card and metadata. I've opened this PR to add a description of the project, links to the paper and code, and the necessary metadata tags. This will help users find and use your model more effectively on the Hub.

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ pipeline_tag: video-classification
+ library_name: transformers
+ tags:
+ - video-reasoning
+ - VLM
+ - reinforcement-learning
+ ---
+ ---
+
+ # DeepIntuit
+
+ [DeepIntuit](https://bwgzk-keke.github.io/DeepIntuit/) is a progressive framework for **open-instance video classification** that evolves models from simple feature imitation to intrinsic reasoning.
+
+ - **Paper:** [From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification](https://huggingface.co/papers/2603.10300)
+ - **Repository:** [BWGZK-keke/DeepIntuit](https://github.com/BWGZK-keke/DeepIntuit)
+ - **Project Page:** [https://bwgzk-keke.github.io/DeepIntuit/](https://bwgzk-keke.github.io/DeepIntuit/)
+
+ ## Model Description
+
+ DeepIntuit bridges the gap between traditional video encoders and the generalization capabilities of vision-language models (VLMs). Instead of predicting labels directly from visual features, it uses a three-stage reasoning pipeline:
+
+ 1. **Cold-start supervised alignment:** Initializes the model's reasoning capability with supervised traces generated by a teacher model.
+ 2. **Intrinsic reasoning refinement (Stage 1):** Refines that reasoning with reinforcement learning, using **Group Relative Policy Optimization (GRPO)**, to improve coherence.
+ 3. **Intuitive calibration (Stage 2):** Trains a classifier on the intrinsic reasoning traces for stable knowledge transfer and accurate classification.
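
Stage 1 names GRPO, but this card does not describe the reward or the update rule. As a rough illustration only: GRPO samples a group of responses for the same input and scores each one relative to the rest of the group instead of relying on a learned value function. The minimal sketch below shows that normalization step. The reward values, the function name `group_relative_advantages`, and the choice of population standard deviation are assumptions for illustration, not details taken from the DeepIntuit code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and spread of its own group."""
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    # eps keeps the division finite when every sample gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four reasoning traces sampled for one video
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
```

Traces that score above the group mean get a positive advantage and are reinforced; traces below the mean get a negative one. When all traces in a group receive the same reward, every advantage is zero and that group contributes no learning signal.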
+
+ This design decouples reasoning generation from the final decision, aiming to improve robustness in scenarios with large intra-class variation.
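
The decoupling described above can be pictured as two separate callables: one that turns a video into a reasoning trace and one that turns the trace into a label. The sketch below is a toy illustration under that assumption. `classify_video`, `toy_reasoner`, and `toy_classifier` are hypothetical names with stand-in logic; they are not the DeepIntuit models or API.

```python
from typing import Callable, Tuple

Reasoner = Callable[[str], str]    # video path -> reasoning trace
Classifier = Callable[[str], str]  # reasoning trace -> class label


def classify_video(video_path: str, reasoner: Reasoner,
                   classifier: Classifier) -> Tuple[str, str]:
    """Generate a reasoning trace first, then decide the label from the trace."""
    trace = reasoner(video_path)
    label = classifier(trace)
    return trace, label


def toy_reasoner(video_path: str) -> str:
    # Stand-in for the reasoning model: returns a fixed textual description
    return f"In {video_path}, a person repeatedly strikes a ball with a racket."


def toy_classifier(trace: str) -> str:
    # Stand-in for the calibrated classifier: keys off the trace, not raw pixels
    return "racket sport" if "racket" in trace else "unknown"


trace, label = classify_video("clip.mp4", toy_reasoner, toy_classifier)
```

Because the classifier only sees the trace, either component can be retrained or swapped without touching the other.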
+
+ ## Installation
+
+ The repository contains separate environments for each stage. For inference using the final model, set up the stage 2 environment:
+
+ ```bash
+ git clone https://github.com/BWGZK-keke/DeepIntuit.git
+ cd DeepIntuit/stage2_model
+ pip install -r requirements.txt
+ ```
+
+ ## Sample Usage
+
+ After setting up the environment, run inference with the following command from the official repository:
+
+ ```bash
+ # Run from the repository root; skip this cd if you are already in stage2_model
+ cd stage2_model
+ python inference.py \
+   --model_path BWGZK/DeepIntuit \
+   --video_path path_to_your_video.mp4
+ ```
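
To run the same command over several videos, a small wrapper can be enough. The sketch below is a minimal example that assumes `inference.py` accepts exactly the two flags shown above and is launched from inside `stage2_model`. The helper names `build_command` and `run_on_directory` are hypothetical, and because this card does not document the script's output format, the wrapper just captures the raw standard output.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video_path, model_path="BWGZK/DeepIntuit"):
    """Build the argument list for one inference.py call (flags as shown above)."""
    return [
        sys.executable, "inference.py",
        "--model_path", model_path,
        "--video_path", str(video_path),
    ]


def run_on_directory(video_dir, stage2_dir="stage2_model"):
    """Run inference on every .mp4 in video_dir; return {name: (exit code, stdout)}."""
    results = {}
    for video in sorted(Path(video_dir).glob("*.mp4")):
        # Use an absolute path because the subprocess runs with cwd=stage2_dir
        completed = subprocess.run(
            build_command(video.resolve()),
            cwd=stage2_dir,
            capture_output=True,
            text=True,
            check=False,  # record failures instead of raising
        )
        results[video.name] = (completed.returncode, completed.stdout)
    return results
```

For example, `run_on_directory("videos")` called from the repository root would invoke the script once per `.mp4` file in a `videos` folder (a hypothetical directory name).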
+
+ ## Citation
+
+ If you find this work useful, please cite:
+
+ ```bibtex
+ @article{zhang2026deepintuit,
+ title={From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification},
+ author={Zhang, Ke and Zhao, Xiangchen and Tian, Yunjie and Zheng, Jiayu and Patel, Vishal M and Fu, Di},
+ journal={arXiv preprint arXiv:2603.10300},
+ year={2026}
+ }
+ ```