Commit 3a1f597 by nielsr (HF Staff) · verified · 1 parent: d07399b

Add model card for GeoFocus

Hi! I'm Niels from the community science team at Hugging Face.

This PR adds a model card for **GeoFocus-7B**. Based on the repository's configuration and the associated paper, I've added:
- Metadata tags (`pipeline_tag`, `library_name`, `license`, and `datasets`).
- A description of the GeoFocus framework, highlighting the Critical Local Perceptor and VertexLang modules.
- Links to the paper and the official GitHub repository.

This will help users discover and use your model more effectively on the Hub. Let me know if you'd like any changes!
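
As a rough illustration of what the metadata tags listed above give downstream tooling, the sketch below reads a front-matter block like the one in this PR using only the Python standard library. The `parse_front_matter` helper is hypothetical and handles only the flat `key: value` and `- item` shapes that appear in this card; a real consumer would more likely use a full YAML parser or the Hub client library.

```python
# Minimal sketch: read the flat YAML front matter of a model card.
# `parse_front_matter` is a hypothetical helper, not part of any library.

CARD = """---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
datasets:
- dle666/Global_Perceptor
- dle666/Local_Perceptor
- dle666/GeoFocus-test
base_model: Qwen/Qwen2.5-VL-7B-Instruct
---

# GeoFocus-7B
"""


def parse_front_matter(text):
    """Return the front matter as a dict of strings and lists of strings."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present

    meta = {}
    current_list_key = None  # key whose list items are being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter ends the front matter
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_list_key is None:
                raise ValueError("list item without a preceding key")
            meta[current_list_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_list_key = None
        else:
            # A key with no inline value starts a list in this card.
            meta[key] = []
            current_list_key = key
    return meta


metadata = parse_front_matter(CARD)
print(metadata["pipeline_tag"])
```

The Hub reads these same fields to place the model under the right task filter and to link it to its base model and datasets.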

Files changed (1)
  1. README.md (added, +44 -0)
@@ -0,0 +1,44 @@
+ ---
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: image-text-to-text
+ datasets:
+ - dle666/Global_Perceptor
+ - dle666/Local_Perceptor
+ - dle666/GeoFocus-test
+ base_model: Qwen/Qwen2.5-VL-7B-Instruct
+ ---
+
+ # GeoFocus-7B
+
+ [GeoFocus](https://huggingface.co/papers/2602.08524) is a specialized Large Multimodal Model (LMM) designed for geometry problem-solving. It enhances global-to-local perception by blending global topology recognition with critical local structure perception.
+
+ ## Model Description
+ Geometry problem-solving remains a significant challenge for LMMs, requiring not only global shape recognition but also attention to intricate local relationships. GeoFocus addresses this through two core modules:
+
+ 1. **Critical Local Perceptor**: Automatically identifies and emphasizes critical local structures (e.g., angles, parallel lines, comparative distances) through thirteen theory-based perception templates.
+ 2. **VertexLang**: A compact topology formal language that encodes global figures through vertex coordinates and connectivity relations, reducing training time while improving accuracy compared to bulky code-based encodings.
+
+ GeoFocus-7B is built upon the **Qwen2.5-VL-7B** architecture and achieves significant improvements on geometry benchmarks like Geo3K, GeoQA, and FormalGeo7K.
+
+ - **Paper:** [GeoFocus: Blending Efficient Global-to-Local Perception for Multimodal Geometry Problem-Solving](https://huggingface.co/papers/2602.08524)
+ - **Repository:** [GitHub - dle666/GeoFocus](https://github.com/dle666/GeoFocus)
+
+ ## Training Data
+ The model utilizes the following datasets:
+ - [Global_Perceptor_Data](https://huggingface.co/datasets/dle666/Global_Perceptor)
+ - [Local_Perceptor_Data](https://huggingface.co/datasets/dle666/Local_Perceptor)
+ - [GeoFocus-test](https://huggingface.co/datasets/dle666/GeoFocus-test)
+
+ ## Acknowledgement
+ This work benefits from several open-source projects including [Qwen2.5](https://github.com/QwenLM/Qwen2.5), [EasyR1](https://github.com/hiyouga/EasyR1), [verl](https://github.com/volcengine/verl), [NoisyRollout](https://github.com/NUS-TRAIL/NoisyRollout), and [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
+
+ ## Citation
+ ```bibtex
+ @article{geofocus2026,
+ title={GeoFocus: Blending Efficient Global-to-Local Perception for Multimodal Geometry Problem-Solving},
+ author={Jiuhai Chen and Jianwei Yang and Haiping Wu and Dianqi Li and Jianfeng Gao and Tianyi Zhou and Bin Xiao},
+ journal={arXiv preprint arXiv:2602.08524},
+ year={2026}
+ }
+ ```
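
The VertexLang item in the card says figures are encoded through vertex coordinates and connectivity relations. The sketch below illustrates that general idea only: the `Figure` class, its methods, and the serialized text format are all hypothetical and are not the VertexLang syntax defined in the paper or repository.

```python
# Hypothetical illustration of a "vertices plus connectivity" figure encoding.
# This is NOT the actual VertexLang syntax; names and format are assumptions.
from dataclasses import dataclass, field


@dataclass
class Figure:
    # Vertex name -> (x, y) coordinate.
    vertices: dict = field(default_factory=dict)
    # Each edge is an unordered pair of vertex names.
    edges: set = field(default_factory=set)

    def add_vertex(self, name, x, y):
        self.vertices[name] = (x, y)

    def connect(self, a, b):
        """Record a segment between two existing vertices."""
        for name in (a, b):
            if name not in self.vertices:
                raise KeyError(f"unknown vertex: {name}")
        self.edges.add(frozenset((a, b)))

    def serialize(self):
        """Emit a compact, deterministic text form of the figure."""
        points = "; ".join(
            f"{name}({x:g},{y:g})"
            for name, (x, y) in sorted(self.vertices.items())
        )
        segments = ", ".join(
            "".join(sorted(edge)) for edge in sorted(self.edges, key=sorted)
        )
        return f"points: {points} | segments: {segments}"


# A 3-4-5 right triangle with the right angle at A.
triangle = Figure()
triangle.add_vertex("A", 0, 0)
triangle.add_vertex("B", 4, 0)
triangle.add_vertex("C", 0, 3)
triangle.connect("A", "B")
triangle.connect("B", "C")
triangle.connect("A", "C")

print(triangle.serialize())
# points: A(0,0); B(4,0); C(0,3) | segments: AB, AC, BC
```

A textual form like this stays short because it lists each point and segment once, which is the kind of compactness the card contrasts with code-based encodings.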