DeepGlint-AI/UniDoc-RL-3B


Add model card and metadata

#1

by nielsr HF Staff - opened Apr 18

base: refs/heads/main ← from: refs/pr/1



README.md +32 -0


	@@ -0,0 +1,32 @@

+---
+library_name: transformers
+pipeline_tag: image-text-to-text
+---
+# UniDoc-RL-3B
+**UniDoc-RL** is a unified reinforcement learning framework for **visual document RAG**, where an LVLM agent jointly performs retrieval, reranking, active visual perception, and reasoning within a single decision process.
+This model is the 3B variant of the UniDoc-RL framework, built upon the Qwen2.5-VL architecture.
+- **Paper:** [UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards](https://huggingface.co/papers/2604.14967)
+- **Repository:** [https://github.com/deepglint/UniDoc-RL](https://github.com/deepglint/UniDoc-RL)
+## Overview
+UniDoc-RL formulates visual evidence acquisition as a hierarchical sequential decision-making problem. The model interacts with an external environment through structured actions such as `<search>`, `<select>`, `<bbox>`, and `<answer>`.
+This design enables the agent to progressively gather evidence from coarse page-level retrieval to fine-grained region inspection, allowing it to suppress irrelevant content and attend to information-dense regions. This approach is particularly effective for complex reasoning over charts, tables, and multi-page documents.
+The model was trained with Group Relative Policy Optimization (GRPO), which does not require a separate value network, using a dense multi-reward scheme that aligns agent behavior with multiple objectives.
+## Citation
+```bibtex
+@misc{unidocrl2026,
+      title={UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards},
+      author={Jun Wang and Shuo Tan and Zelong Sun and Tiancheng Gu and Yongle Zhao and Ziyong Feng and Kaicheng Yang and Cewu Lu},
+      year={2026},
+      url={https://huggingface.co/papers/2604.14967}
+}
+```
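
As a reading aid for the action format described in the Overview of the added README, here is a minimal sketch of how an environment might extract structured actions from model output. The tag names (`<search>`, `<select>`, `<bbox>`, `<answer>`) come from the README. The payload formats, including the four comma-separated numbers for `<bbox>`, and the helpers `parse_actions` and `parse_bbox` are assumptions for illustration, not the repository's actual parser.

```python
import re
from typing import List, Tuple

# Tag names are taken from the README; everything else here is a hypothetical sketch.
ACTION_TAGS = ("search", "select", "bbox", "answer")

# Match <tag>payload</tag> for any known tag; DOTALL lets payloads span lines.
_ACTION_RE = re.compile(
    r"<(?P<tag>" + "|".join(ACTION_TAGS) + r")>(?P<body>.*?)</(?P=tag)>",
    re.DOTALL,
)


def parse_actions(text: str) -> List[Tuple[str, str]]:
    """Return (tag, payload) pairs in the order they appear in `text`."""
    return [
        (m.group("tag"), m.group("body").strip())
        for m in _ACTION_RE.finditer(text)
    ]


def parse_bbox(payload: str) -> Tuple[float, float, float, float]:
    """Parse an ASSUMED 'x1, y1, x2, y2' payload into four floats."""
    parts = [float(p) for p in payload.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 coordinates, got {len(parts)}")
    x1, y1, x2, y2 = parts
    return x1, y1, x2, y2


if __name__ == "__main__":
    sample = (
        "<search>quarterly revenue table</search>"
        "<select>2</select>"
        "<bbox>10, 20, 300, 400</bbox>"
        "<answer>example answer</answer>"
    )
    for tag, payload in parse_actions(sample):
        print(tag, payload)
```

Text outside the known tags is ignored, so free-form reasoning between actions would not disturb the parse. The real payload conventions should be checked against the repository before relying on this.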