wayadventurer committed on
Commit ef7ea85 · 1 Parent(s): d79f352

feat: add README

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +62 -0
  3. asserts/Case_Study.png +3 -0
.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ asserts/*.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,65 @@
  ---
  license: cc-by-nc-sa-4.0
  ---
+
+ # 🌤️ Introduction
+
+ While Vision Language Models (VLMs) show advancing reasoning capabilities, their application in meteorology is constrained by a domain gap and a reasoning faithfulness gap. Mainstream Reinforcement Fine-Tuning (RFT) can induce Self-Contradictory Reasoning (Self-Contra), where the reasoning process contradicts the final answer, which is unacceptable in this high-stakes domain.
+
+ To address these challenges, we construct WeatherQA, a multimodal multiple-choice benchmark for meteorology comprising 15,400 entries that cover four themes and seven imaging modality tasks. We propose Logically Consistent Reinforcement Fine-Tuning (LoCo-RFT), which introduces a logical consistency reward to resolve Self-Contra. Based on this paradigm and WeatherQA, we present Weather-R1, the first reasoning VLM with logical faithfulness in meteorology, to the best of our knowledge. Weather-R1 (7B) achieves 52.9% accuracy on WeatherQA, a 9.8 percentage point gain over the baseline model Qwen2.5-VL-7B; it surpasses Supervised Fine-Tuning and RFT baselines, exceeds the original Qwen2.5-VL-32B, and improves out-of-domain ScienceQA performance by 4.98 percentage points.
+
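As a toy illustration of what "the reasoning process contradicts the final answer" can mean in a multiple-choice setting, the sketch below scores a response 1.0 only when the option concluded in the reasoning matches the final answer. This is not the reward defined by LoCo-RFT, whose exact form is not given in this README. The `<think>`/`<answer>` tags, the A-D option labels, the "answer is X" phrasing, and the function name `consistency_reward` are all assumptions made for illustration.

```python
import re

# Assumed (hypothetical) output format: <think>...</think><answer>X</answer>,
# with options labelled A-D. None of this is specified in the README.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>\s*([A-D])\s*</answer>")
# Assumed convention: the reasoning states its conclusion as "the answer is X".
# Only the phrase is case-insensitive; the option letter must be uppercase.
CONCLUSION_RE = re.compile(r"(?i:answer is)\s*\(?([A-D])\b")


def consistency_reward(response: str) -> float:
    """Return 1.0 if the reasoning's last stated option equals the final answer, else 0.0."""
    think = THINK_RE.search(response)
    answer = ANSWER_RE.search(response)
    if think is None or answer is None:
        # Malformed output: no reasoning block or no parsable answer.
        return 0.0
    conclusions = CONCLUSION_RE.findall(think.group(1))
    if not conclusions:
        # The reasoning never commits to an option, so agreement cannot be checked.
        return 0.0
    # Compare the last conclusion stated in the reasoning with the final answer.
    return 1.0 if conclusions[-1] == answer.group(1) else 0.0


consistent = "<think>The trough lies to the east, so the answer is B.</think><answer>B</answer>"
contradictory = "<think>The trough lies to the east, so the answer is B.</think><answer>C</answer>"
print(consistency_reward(consistent), consistency_reward(contradictory))
```

A string-matching check like this only catches explicit contradictions; a practical reward would likely need a more robust judgment of what the reasoning actually concludes.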
+ <div align="center">
+ <img src="./asserts/Case_Study.png" width="70%" />
+ <p><em>Response Comparison.</em></p>
+ </div>
+
+ # 🗂️ Folder Structure
+ This repository provides model checkpoints organized by training strategy and task:
+
+ ```
+ Weather-R1/
+ ├─ LoCo-RFT/ # Weather-R1 checkpoints
+ │ ├─ WeatherQA-500hPa/
+ │ ├─ WeatherQA-850hPa/
+ │ ├─ WeatherQA-Land/
+ │ ├─ WeatherQA-Max-Temp/
+ │ ├─ WeatherQA-Min-Temp/
+ │ ├─ WeatherQA-Phenom/
+ │ └─ WeatherQA-Rain/
+ ├─ RFT/ # Standard RFT checkpoints
+ │ ├─ WeatherQA-500hPa/
+ │ ├─ WeatherQA-850hPa/
+ │ ├─ WeatherQA-Land/
+ │ ├─ WeatherQA-Max-Temp/
+ │ ├─ WeatherQA-Min-Temp/
+ │ ├─ WeatherQA-Phenom/
+ │ └─ WeatherQA-Rain/
+ └─ asserts/ # Figures used in README
+ ```
+
+ Each task folder contains HuggingFace-style model files such as `config.json`,
+ `tokenizer.json`, and sharded weights like `model-00001-of-00004.safetensors`.
+
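Because each checkpoint lives in a `<strategy>/<task>` subfolder rather than at the repository root, a loader needs the relative path of the checkpoint it wants. The sketch below builds and validates that path from the folder names in the tree above. The helper names `checkpoint_subfolder` and `all_checkpoint_subfolders` are hypothetical and not part of the repository.

```python
# Strategy and task folder names copied from the folder tree in the README.
STRATEGIES = ("LoCo-RFT", "RFT")
TASKS = (
    "WeatherQA-500hPa",
    "WeatherQA-850hPa",
    "WeatherQA-Land",
    "WeatherQA-Max-Temp",
    "WeatherQA-Min-Temp",
    "WeatherQA-Phenom",
    "WeatherQA-Rain",
)


def checkpoint_subfolder(strategy: str, task: str) -> str:
    """Return the repository-relative folder of one checkpoint (hypothetical helper)."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy {strategy!r}; expected one of {STRATEGIES}")
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return f"{strategy}/{task}"


def all_checkpoint_subfolders() -> list[str]:
    """List every strategy/task combination shown in the tree."""
    return [checkpoint_subfolder(s, t) for s in STRATEGIES for t in TASKS]


print(checkpoint_subfolder("LoCo-RFT", "WeatherQA-500hPa"))
```

With Hugging Face `transformers`, the returned string could be passed as the `subfolder` argument of `from_pretrained`, together with this repository's Hub ID (not stated in this README) and a model class suited to the checkpoint.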
+ # 🚀 Training and Evaluation
+
+ Please refer to our official repository: [Weather-R1](https://github.com/Marcowky/Weather-R1)
+
+ # 🙏 Acknowledgements
+
+ Training code is built on [EasyR1](https://github.com/hiyouga/EasyR1).
+
+ # 📝 Citation
+
+ If you use Weather-R1 resources, please cite the following paper:
+
+ ```bibtex
+ @misc{wu2026weatherr1logicallyconsistentreinforcement,
+ title={Weather-R1: Logically Consistent Reinforcement Fine-Tuning for Multimodal Reasoning in Meteorology},
+ author={Kaiyu Wu and Pucheng Han and Hualong Zhang and Naigeng Wu and Keze Wang},
+ year={2026},
+ eprint={2601.14044},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2601.14044},
+ }
+ ```
asserts/Case_Study.png ADDED

Git LFS Details

  • SHA256: 8f8f1115b1bb580eed54e1302e7018e65fee66be04bd78c9b618a870a4c64a21
  • Pointer size: 132 Bytes
  • Size of remote file: 1.19 MB