Marco711 and nielsr (HF Staff) committed
Commit 21495cc · verified · 1 parent: ef7ea85

Add metadata for pipeline tag, library name and link to paper (#1)


- Add metadata for pipeline tag, library name and link to paper (243ff1ab6b49c940ae318cb05405268df38c93c4)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +78 -65
README.md CHANGED
@@ -1,65 +1,78 @@
- ---
- license: cc-by-nc-sa-4.0
- ---
-
- # 🌤️ Introduction
-
- While Vision Language Models (VLMs) show advancing reasoning capabilities, their application in meteorology is constrained by a domain gap and a reasoning faithfulness gap. Mainstream Reinforcement Fine-Tuning (RFT) can induce Self-Contradictory Reasoning (Self-Contra), where the reasoning process contradicts the final answer, which is unacceptable in this high-stakes domain.
-
- To address these challenges, we construct WeatherQA, a multimodal multiple-choice benchmark for meteorology comprising 15,400 entries that cover four themes and seven imaging modality tasks. We propose Logically Consistent Reinforcement Fine-Tuning (LoCo-RFT), which introduces a logical consistency reward to resolve Self-Contra. Based on this paradigm and WeatherQA, we present Weather-R1, the first reasoning VLM with logical faithfulness in meteorology, to the best of our knowledge. Weather-R1 (7B) achieves 52.9% accuracy on WeatherQA, a 9.8 percentage point gain over the baseline model Qwen2.5-VL-7B; it surpasses Supervised Fine-Tuning and RFT baselines, exceeds the original Qwen2.5-VL-32B, and improves out-of-domain ScienceQA performance by 4.98 percentage points.
-
- <div align="center">
- <img src="./asserts/Case_Study.png" width="70%" />
- <p><em>Response Comparison.</em></p>
- </div>
-
- # 🗂️ Folder Structure
- This repository provides model checkpoints organized by training strategy and task:
-
- ```
- Weather-R1/
- ├─ LoCo-RFT/ # Weather-R1 checkpoints
- │ ├─ WeatherQA-500hPa/
- │ ├─ WeatherQA-850hPa/
- │ ├─ WeatherQA-Land/
- │ ├─ WeatherQA-Max-Temp/
- │ ├─ WeatherQA-Min-Temp/
- │ ├─ WeatherQA-Phenom/
- │ └─ WeatherQA-Rain/
- ├─ RFT/ # Standard RFT checkpoints
- │ ├─ WeatherQA-500hPa/
- │ ├─ WeatherQA-850hPa/
- │ ├─ WeatherQA-Land/
- │ ├─ WeatherQA-Max-Temp/
- │ ├─ WeatherQA-Min-Temp/
- │ ├─ WeatherQA-Phenom/
- │ └─ WeatherQA-Rain/
- └─ asserts/ # Figures used in README
- ```
-
- Each task folder contains HuggingFace-style model files such as `config.json`,
- `tokenizer.json`, and sharded weights like `model-00001-of-00004.safetensors`.
-
- # 🚀 Training and Evaluation
-
- Please refer to our official repository: [Weather-R1](https://github.com/Marcowky/Weather-R1)
-
- # 🙏 Acknowledgements
-
- Training code is built on [EasyR1](https://github.com/hiyouga/EasyR1).
-
- # 📝 Citation
-
- If you use Weather-R1 resources, please cite the following paper:
-
- ```bibtex
- @misc{wu2026weatherr1logicallyconsistentreinforcement,
- title={Weather-R1: Logically Consistent Reinforcement Fine-Tuning for Multimodal Reasoning in Meteorology},
- author={Kaiyu Wu and Pucheng Han and Hualong Zhang and Naigeng Wu and Keze Wang},
- year={2026},
- eprint={2601.14044},
- archivePrefix={arXiv},
- primaryClass={cs.CV},
- url={https://arxiv.org/abs/2601.14044},
- }
- ```

+ ---
+ license: cc-by-nc-sa-4.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
+ tags:
+ - meteorology
+ - reasoning
+ - vlm
+ - weather
+ ---
+
+ # 🌤️ Weather-R1: Multimodal Reasoning in Meteorology
+
+ This repository contains the checkpoints for **Weather-R1**, as presented in the paper [Weather-R1: Logically Consistent Reinforcement Fine-Tuning for Multimodal Reasoning in Meteorology](https://huggingface.co/papers/2601.14044).
+
+ [**Paper (arXiv)**](https://arxiv.org/abs/2601.14044) | [**Code (GitHub)**](https://github.com/Marcowky/Weather-R1)
+
+ # 🌤️ Introduction
+
+ While Vision Language Models (VLMs) show advancing reasoning capabilities, their application in meteorology is constrained by a domain gap and a reasoning faithfulness gap. Mainstream Reinforcement Fine-Tuning (RFT) can induce Self-Contradictory Reasoning (Self-Contra), where the reasoning process contradicts the final answer, which is unacceptable in this high-stakes domain.
+
+ To address these challenges, we construct WeatherQA, a multimodal multiple-choice benchmark for meteorology comprising 15,400 entries that cover four themes and seven imaging modality tasks. We propose Logically Consistent Reinforcement Fine-Tuning (LoCo-RFT), which introduces a logical consistency reward to resolve Self-Contra. Based on this paradigm and WeatherQA, we present Weather-R1, the first reasoning VLM with logical faithfulness in meteorology, to the best of our knowledge. Weather-R1 (7B) achieves 52.9% accuracy on WeatherQA, a 9.8 percentage point gain over the baseline model Qwen2.5-VL-7B; it surpasses Supervised Fine-Tuning and RFT baselines, exceeds the original Qwen2.5-VL-32B, and improves out-of-domain ScienceQA performance by 4.98 percentage points.
+
+ <div align="center">
+ <img src="https://huggingface.co/Marcowky/Weather-R1/resolve/main/asserts/Case_Study.png" width="70%" />
+ <p><em>Response Comparison.</em></p>
+ </div>
+
+ # 🗂️ Folder Structure
+ This repository provides model checkpoints organized by training strategy and task:
+
+ ```
+ Weather-R1/
+ ├─ LoCo-RFT/ # Weather-R1 checkpoints
+ │ ├─ WeatherQA-500hPa/
+ │ ├─ WeatherQA-850hPa/
+ │ ├─ WeatherQA-Land/
+ │ ├─ WeatherQA-Max-Temp/
+ │ ├─ WeatherQA-Min-Temp/
+ │ ├─ WeatherQA-Phenom/
+ │ └─ WeatherQA-Rain/
+ ├─ RFT/ # Standard RFT checkpoints
+ │ ├─ WeatherQA-500hPa/
+ │ ├─ WeatherQA-850hPa/
+ │ ├─ WeatherQA-Land/
+ │ ├─ WeatherQA-Max-Temp/
+ │ ├─ WeatherQA-Min-Temp/
+ │ ├─ WeatherQA-Phenom/
+ │ └─ WeatherQA-Rain/
+ └─ asserts/ # Figures used in README
+ ```
+
+ Each task folder contains HuggingFace-style model files such as `config.json`,
+ `tokenizer.json`, and sharded weights like `model-00001-of-00004.safetensors`.
+
+ # 🚀 Training and Evaluation
+
+ Please refer to our official repository: [Weather-R1](https://github.com/Marcowky/Weather-R1)
+
+ # 🙏 Acknowledgements
+
+ Training code is built on [EasyR1](https://github.com/hiyouga/EasyR1).
+
+ # 📝 Citation
+
+ If you use Weather-R1 resources, please cite the following paper:
+
+ ```bibtex
+ @misc{wu2026weatherr1logicallyconsistentreinforcement,
+ title={Weather-R1: Logically Consistent Reinforcement Fine-Tuning for Multimodal Reasoning in Meteorology},
+ author={Kaiyu Wu and Pucheng Han and Hualong Zhang and Naigeng Wu and Keze Wang},
+ year={2026},
+ eprint={2601.14044},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2601.14044},
+ }
+ ```
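
Since the checkpoints in this README live in per-strategy, per-task subfolders of one repository, they would typically be loaded via the `subfolder` argument of `from_pretrained`. Below is a minimal sketch; the Hub repo id `Marcowky/Weather-R1` is an assumption inferred from the GitHub link, and the loading lines are illustrative only (they require network access and are left commented out):

```python
# Strategy and task names taken verbatim from the folder structure above.
STRATEGIES = ("LoCo-RFT", "RFT")
TASKS = (
    "WeatherQA-500hPa", "WeatherQA-850hPa", "WeatherQA-Land",
    "WeatherQA-Max-Temp", "WeatherQA-Min-Temp",
    "WeatherQA-Phenom", "WeatherQA-Rain",
)


def checkpoint_subfolder(strategy: str, task: str) -> str:
    """Return the repo subfolder for one checkpoint, validating the names."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return f"{strategy}/{task}"


if __name__ == "__main__":
    sub = checkpoint_subfolder("LoCo-RFT", "WeatherQA-Rain")
    print(sub)  # LoCo-RFT/WeatherQA-Rain
    # Loading one checkpoint would then look like this (assumed repo id,
    # requires `transformers` and network access; not run here):
    # from transformers import AutoProcessor, AutoModelForImageTextToText
    # model = AutoModelForImageTextToText.from_pretrained(
    #     "Marcowky/Weather-R1", subfolder=sub)
    # processor = AutoProcessor.from_pretrained(
    #     "Marcowky/Weather-R1", subfolder=sub)
```

The validation step is just a convenience so a typo in a task name fails fast instead of triggering a missing-file error from the Hub.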