Improve model card: Add pipeline tag, license, and detailed content

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +41 -52
README.md CHANGED
@@ -1,58 +1,46 @@
  ---
  library_name: transformers
- tags: []
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->


  ## Model Details

  ### Model Description

- <!-- Provide a longer summary of what this model is. -->
-
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

- ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

  ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
  ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]

  ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
  [More Information Needed]

  ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
  [More Information Needed]

  ## Bias, Risks, and Limitations
@@ -69,9 +57,7 @@ Users (both direct and downstream) should be made aware of the risks, biases and

  ## How to Get Started with the Model

- Use the code below to get started with the model.
-
- [More Information Needed]

  ## Training Details

@@ -79,12 +65,14 @@ Use the code below to get started with the model.
  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

  #### Preprocessing [optional]

  [More Information Needed]
@@ -92,7 +80,7 @@ Use the code below to get started with the model.

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

  #### Speeds, Sizes, Times [optional]

@@ -102,10 +90,10 @@ Use the code below to get started with the model.
  ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
-
  ### Testing Data, Factors & Metrics

  #### Testing Data

  <!-- This should link to a Dataset Card if possible. -->
@@ -126,7 +114,7 @@ Use the code below to get started with the model.
  ### Results

- [More Information Needed]

  #### Summary

@@ -144,11 +132,11 @@ Use the code below to get started with the model.
  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

  ## Technical Specifications [optional]

@@ -168,17 +156,18 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
  [More Information Needed]

- ## Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]

  ## Glossary [optional]

  ---
  library_name: transformers
+ tags:
+ - reasoning
+ - qwen
+ license: apache-2.0
+ pipeline_tag: text-generation
  ---

+ # Model Card for Variational Reasoning for Language Models

+ This repository contains models for **Variational Reasoning for Language Models**, as presented in the paper [Variational Reasoning for Language Models](https://huggingface.co/papers/2509.22637).
+ We introduce a variational reasoning framework for language models that treats thinking traces as latent variables and optimizes them through variational inference. This work extends the evidence lower bound (ELBO) to a multi-trace objective for tighter bounds and proposes a forward-KL formulation that stabilizes the training of the variational posterior. It further shows that rejection sampling finetuning and binary-reward RL can be interpreted as local forward-KL objectives. Empirically validated on Qwen 2.5 and Qwen 3 model families across a wide range of reasoning tasks, this work provides a principled probabilistic perspective unifying variational inference with RL-style methods for improving reasoning ability.
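
In standard variational-inference notation, the objectives summarized above can be sketched as follows, with $x$ a question, $y$ an answer, and $z$ a latent thinking trace (notation assumed for illustration; see the paper for the exact objectives):

```latex
% Single-trace ELBO: the thinking trace z is treated as a latent variable
\log \pi_\theta(y \mid x)
  \;\ge\; \mathbb{E}_{q_\phi(z \mid x, y)}
  \left[ \log \frac{\pi_\theta(y, z \mid x)}{q_\phi(z \mid x, y)} \right]

% Multi-trace (importance-weighted) extension: tighter bound as K grows
\log \pi_\theta(y \mid x)
  \;\ge\; \mathbb{E}_{z_1, \dots, z_K \sim q_\phi}
  \left[ \log \frac{1}{K} \sum_{k=1}^{K}
  \frac{\pi_\theta(y, z_k \mid x)}{q_\phi(z_k \mid x, y)} \right]

% Forward-KL training of the variational posterior
\min_\phi \; \mathrm{KL}\!\left( p(z \mid x, y) \,\|\, q_\phi(z \mid x, y) \right)
```

The first two bounds are the standard single-trace ELBO and its importance-weighted generalization; the forward-KL line reflects the abstract's statement that the posterior is trained against a forward (rather than reverse) KL.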

  ## Model Details

  ### Model Description

+ The models in this repository are designed to enhance the reasoning capabilities of language models through a novel variational inference framework. They are built on the Qwen 2.5 and Qwen 3 model families; examples include `Variational-Reasoning-4B-Acc` and `Variational-Reasoning-8B-Acc`, which use Qwen3-4B-Base and Qwen3-8B-Base as backbones, respectively.


+ - **Developed by:** Xiangxin Zhou, Zichen Liu, Haonan Wang, Chao Du, Min Lin, Chongxuan Li, Liang Wang, and Tianyu Pang.
+ - **Model type:** Causal Language Model (`Qwen3ForCausalLM`).
+ - **Language(s) (NLP):** English
+ - **License:** Apache 2.0
+ - **Finetuned from model:** Qwen 2.5 and Qwen 3 model families (e.g., Qwen3-4B-Base, Qwen2.5-7B-Instruct).

+ ### Model Sources
+ - **Repository:** [https://github.com/sail-sg/variational-reasoning](https://github.com/sail-sg/variational-reasoning)
+ - **Paper:** [https://huggingface.co/papers/2509.22637](https://huggingface.co/papers/2509.22637)

  ## Uses

  ### Direct Use

+ These models are intended to be used for advanced text generation tasks that require strong reasoning abilities. They can be integrated into various analytical or conversational AI scenarios to generate thoughtful and coherent responses.
 
 
  ### Downstream Use [optional]

  [More Information Needed]

  ### Out-of-Scope Use

  [More Information Needed]

  ## Bias, Risks, and Limitations

  ## How to Get Started with the Model

+ For detailed instructions on setting up environments, training, and evaluation, please refer to the [official GitHub repository](https://github.com/sail-sg/variational-reasoning).
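
As a minimal inference sketch (assumptions: the checkpoints are hosted under the `sail-sg` organization with the names listed above and ship a standard chat template; verify the exact repo ids on the Hub before use):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repo id -- check the model collection linked from the GitHub repo.
model_id = "sail-sg/Variational-Reasoning-4B-Acc"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning traces can be long, so allow a generous generation budget.
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```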
 
 
  ## Training Details

  ### Training Data

  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

+ The models are trained using various mixed datasets, such as `Variational-Posterior-4B-Acc-mix` and `Variational-Posterior-4B-GML-mix`, which are linked from the [GitHub repository](https://github.com/sail-sg/variational-reasoning).

  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

+ The training procedure involves multiple steps: training an initial reasoning model ($\pi_{\theta_0}$), training a variational posterior ($q_\phi$), sampling from the posterior, estimating log likelihoods, and finally training the reasoning model ($\pi_\theta$) using accuracy-based or geometric mean likelihood estimators. Detailed scripts and configurations can be found in the [LLaMA-Factory subdirectory of the GitHub repository](https://github.com/sail-sg/variational-reasoning).
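
As an illustration of the two trace-scoring styles named above, here is a hedged sketch (function names and shapes are my own, not the repository's API): the accuracy estimator scores a trace by whether its final answer is correct, while the geometric-mean estimator length-normalizes the likelihood so long and short traces stay comparable.

```python
import math


def accuracy_score(response: str, reference: str) -> float:
    """Binary accuracy estimator: 1.0 iff the final answer matches the reference."""
    return 1.0 if response.strip() == reference.strip() else 0.0


def geometric_mean_likelihood(token_logprobs: list[float]) -> float:
    """Geometric mean of per-token likelihoods, i.e. exp(mean log-prob).

    Averaging log-probs before exponentiating length-normalizes the score,
    so a long reasoning trace is not penalized merely for being long.
    """
    if not token_logprobs:
        raise ValueError("empty trace")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


# Example: a 3-token trace with log-probs -0.1, -0.2, -0.3 -> exp(-0.2)
print(round(geometric_mean_likelihood([-0.1, -0.2, -0.3]), 4))  # 0.8187
```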
+
  #### Preprocessing [optional]

  [More Information Needed]

  #### Training Hyperparameters
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

  #### Speeds, Sizes, Times [optional]

  ## Evaluation

  ### Testing Data, Factors & Metrics

+ Detailed evaluation instructions and scripts can be found in `SkyThought/variational_reasoning/eval/eval.sh` in the [GitHub repository](https://github.com/sail-sg/variational-reasoning).
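
Reasoning benchmarks of this kind are commonly scored by sampling several responses per problem and averaging correctness (avg@n). A minimal sketch of that metric, not the repository's actual eval code:

```python
def avg_at_n(per_problem_correct: list[list[bool]]) -> float:
    """avg@n: mean over problems of the fraction of correct samples.

    Each inner list holds the correctness of the n sampled responses
    for one problem; averaging over samples reduces variance from
    stochastic decoding.
    """
    if not per_problem_correct:
        raise ValueError("no problems")
    per_problem = [sum(c) / len(c) for c in per_problem_correct]
    return sum(per_problem) / len(per_problem)


# Two problems, 4 samples each: 3/4 and 2/4 correct -> (0.75 + 0.5) / 2
print(avg_at_n([[True, True, True, False], [True, False, False, True]]))  # 0.625
```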
+
  #### Testing Data

  <!-- This should link to a Dataset Card if possible. -->

  ### Results

+ Quantitative results and analysis are provided in the [paper](https://huggingface.co/papers/2509.22637).

  #### Summary

  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]

  ## Technical Specifications [optional]

  [More Information Needed]

+ ## Citation

+ If you find this work useful, please consider citing our paper:

+ ```bibtex
+ @article{zhou2025variationalreasoninglanguagemodels,
+   title={Variational Reasoning for Language Models},
+   author={Xiangxin Zhou and Zichen Liu and Haonan Wang and Chao Du and Min Lin and Chongxuan Li and Liang Wang and Tianyu Pang},
+   journal={arXiv preprint arXiv:2509.22637},
+   year={2025}
+ }
+ ```

  ## Glossary [optional]