Gnonymous/Web-CogReasoner

@@ -1,16 +1,46 @@
 ---
-license: apache-2.0
-language:
-- en
-- zh
 base_model:
 - Qwen/Qwen2.5-VL-7B-Instruct
 datasets:
 - Gnonymous/Web-CogDataset
 ---
-This model was the Web-CogReasoner model mentioned in the paper [Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents](https://huggingface.co/papers/2508.01858).
-Web-CogReasoner is trained using our [Web-CogDataset](https://huggingface.co/datasets/Gnonymous/Web-CogDataset).
-It achieves 84.4 @ Web-CogBench, 86.3 @ VisualWebBench, 30.2% @ WebVoyager, 17.0% and 10.1% @ Online Multimodal-Mind2Web Cross-Tasks and Cross-Webs

 ---
 base_model:
 - Qwen/Qwen2.5-VL-7B-Instruct
 datasets:
 - Gnonymous/Web-CogDataset
+language:
+- en
+- zh
+license: apache-2.0
+pipeline_tag: image-text-to-text
 ---
+# Web-CogReasoner
+[**Web-CogReasoner**](https://huggingface.co/papers/2508.01858) is a knowledge-driven multimodal agent designed for cognitive reasoning in web environments. It builds agent capabilities systematically through a two-stage training process: knowledge content learning (Factual, Conceptual) and cognitive process learning (Procedural).
+- **Paper:** [Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents](https://huggingface.co/papers/2508.01858)
+- **Project Page:** [https://eohan.me/Web-CogReasoner](https://eohan.me/Web-CogReasoner)
+- **Repository:** [https://github.com/Gnonymous/Web-CogReasoner](https://github.com/Gnonymous/Web-CogReasoner)
+Web-CogReasoner is trained using the [Web-CogDataset](https://huggingface.co/datasets/Gnonymous/Web-CogDataset) and employs a novel knowledge-driven Chain-of-Thought (CoT) reasoning framework to generalize to unseen web tasks.
+## Performance
+Web-CogReasoner achieves the following scores on the benchmarks reported in the paper:
+| Benchmark | Score |
+| :--- | :---: |
+| Web-CogBench | 84.4 |
+| VisualWebBench | 86.3 |
+| WebVoyager | 30.2% |
+| Online Multimodal-Mind2Web (Cross-Tasks) | 17.0% |
+| Online Multimodal-Mind2Web (Cross-Webs) | 10.1% |
+## Citation
+If you find this work helpful, please cite the following paper:
+```bibtex
+@article{guo2025web,
+  title={Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents},
+  author={Guo, Yuhan and Guo, Cong and Sun, Aiwen and He, Hongliang and Yang, Xinyu and Lu, Yue and Zhang, Yingji and Guo, Xuntao and Zhang, Dong and Liu, Jianzhuang and others},
+  journal={arXiv preprint arXiv:2508.01858},
+  year={2025}
+}
+```