dle666
/

GeoFocus-7B

Model card Files Files and versions

Add model card for GeoFocus

#1

by nielsr HF Staff - opened Feb 10

base: refs/heads/main

←

from: refs/pr/1

Discussion Files changed

Files changed (1) hide show

README.md +44 -0

README.md ADDED Viewed

	@@ -0,0 +1,44 @@

+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: image-text-to-text
+datasets:
+- dle666/Global_Perceptor
+- dle666/Local_Perceptor
+- dle666/GeoFocus-test
+base_model: Qwen/Qwen2.5-VL-7B-Instruct
+---
+# GeoFocus-7B
+[GeoFocus](https://huggingface.co/papers/2602.08524) is a specialized Large Multimodal Model (LMM) designed for geometry problem-solving. It enhances global-to-local perception by blending global topology recognition with critical local structure perception.
+## Model Description
+Geometry problem-solving remains a significant challenge for LMMs, requiring not only global shape recognition but also attention to intricate local relationships. GeoFocus addresses this through two core modules:
+1.  **Critical Local Perceptor**: Automatically identifies and emphasizes critical local structures (e.g., angles, parallel lines, comparative distances) through thirteen theory-based perception templates.
+2.  **VertexLang**: A compact topology formal language that encodes global figures through vertex coordinates and connectivity relations, reducing training time while improving accuracy compared to bulky code-based encodings.
+GeoFocus-7B is built upon the **Qwen2.5-VL-7B** architecture and achieves significant improvements on geometry benchmarks like Geo3K, GeoQA, and FormalGeo7K.
+- **Paper:** [GeoFocus: Blending Efficient Global-to-Local Perception for Multimodal Geometry Problem-Solving](https://huggingface.co/papers/2602.08524)
+- **Repository:** [GitHub - dle666/GeoFocus](https://github.com/dle666/GeoFocus)
+## Training Data
+The model utilizes the following datasets:
+- [Global_Perceptor_Data](https://huggingface.co/datasets/dle666/Global_Perceptor)
+- [Local_Perceptor_Data](https://huggingface.co/datasets/dle666/Local_Perceptor)
+- [GeoFocus-test](https://huggingface.co/datasets/dle666/GeoFocus-test)
+## Acknowledgement
+This work benefits from several open-source projects including [Qwen2.5](https://github.com/QwenLM/Qwen2.5), [EasyR1](https://github.com/hiyouga/EasyR1), [verl](https://github.com/volcengine/verl), [NoisyRollout](https://github.com/NUS-TRAIL/NoisyRollout), and [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
+## Citation
+```bibtex
+@article{geofocus2026,
+  title={GeoFocus: Blending Efficient Global-to-Local Perception for Multimodal Geometry Problem-Solving},
+  author={Jiuhai Chen and Jianwei Yang and Haiping Wu and Dianqi Li and Jianfeng Gao and Tianyi Zhou and Bin Xiao},
+  journal={arXiv preprint arXiv:2602.08524},
+  year={2026}
+}
+```