Improve model card: Add pipeline tag, library name, and prominent GitHub link

This PR enhances the model card for `xl-zhao/PromptCoT-2.0-Prompt-Generation-Model` by:

* Adding `pipeline_tag: text-generation` for better discoverability on the Hub.
* Adding `library_name: transformers` to enable the automated "Use in Transformers" widget with code snippets.
* Making the GitHub repository link (https://github.com/inclusionAI/PromptCoT) more prominent by placing it directly under the model title.

These updates improve the model's visibility and usability on the Hugging Face Hub.
README.md (CHANGED)
---
base_model:
- Qwen/Qwen2.5-32B
language:
- en
license: mit
pipeline_tag: text-generation
library_name: transformers
---

# PromptCoT 2.0 — Problem Generation Model

This repository hosts the **Problem Generation Model (PGM)** used in [**PromptCoT 2.0**](https://arxiv.org/abs/2509.19894), a framework for **scalable prompt synthesis** that advances LLM reasoning in **mathematics** and **programming**.

**Code:** https://github.com/inclusionAI/PromptCoT

---

## ✨ Overview

This checkpoint is the **Problem Generation Model (PGM)** of PromptCoT 2.0.

- **Input:** a set of domain concepts (math or programming) and an optional difficulty tag.
- **Output:** a **rationale** (the structured “thinking process” that connects the concepts) **followed by** a fully formed **problem** (an Olympiad-level math problem or a coding task).

**How it fits into PromptCoT 2.0:**
PromptCoT 2.0 jointly trains two models via an EM optimization loop:

- **Rationale Generator** (*E-step*): infers rationales given concepts and problems, updated via reinforcement learning with reward signals.
- **Problem Generation Model (PGM)** (*M-step*): learns to produce rationale–problem pairs conditioned only on concepts.
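The alternation above can be sketched schematically. This is a toy illustration only, with placeholder functions I have invented for exposition; the actual E-step is a reinforcement-learning update and the actual M-step is large-scale fine-tuning:

```python
# Toy sketch of the EM alternation; every function is an illustrative stand-in,
# not the PromptCoT 2.0 training code.
from dataclasses import dataclass


@dataclass
class Example:
    concepts: list
    problem: str
    rationale: str = ""


def e_step(examples):
    """Rationale Generator: infer a rationale for each (concepts, problem) pair."""
    for ex in examples:
        # Placeholder "inference": a real system samples and scores rationales.
        ex.rationale = "Connect " + " with ".join(ex.concepts)
    return examples


def m_step(examples):
    """PGM update: build (concepts -> rationale + problem) training pairs."""
    return [(ex.concepts, ex.rationale + "\n" + ex.problem) for ex in examples]


def em_loop(examples, rounds=2):
    pairs = []
    for _ in range(rounds):
        examples = e_step(examples)   # E-step: fill in rationales
        pairs = m_step(examples)      # M-step: (re)train the PGM on the pairs
    return pairs


seed = [Example(["modular arithmetic", "the pigeonhole principle"],
                "Show that among any n+1 integers, two differ by a multiple of n.")]
pairs = em_loop(seed)
```

The key structural point the sketch captures is that only the M-step's product (the PGM) is needed at inference time.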
At inference time, the PGM is all you need: provide **concepts** and it will generate **(rationale → problem)** in one pass, without any handcrafted templates or domain-specific heuristics.
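As a concrete, hedged sketch of this one-pass interface, the snippet below shows how a caller might assemble a concept prompt and split the generated text back into rationale and problem parts. Both the prompt wording and the `Problem:` delimiter are assumptions made for illustration; the official repository defines the real template:

```python
# Illustrative one-pass interface: build a concept prompt and split the
# completion into (rationale, problem). The prompt wording and "Problem:"
# delimiter are assumptions, not the official PromptCoT 2.0 format.

def build_prompt(concepts, difficulty=None):
    """Assemble a generation prompt from domain concepts (hypothetical format)."""
    lines = ["Concepts: " + ", ".join(concepts)]
    if difficulty is not None:
        lines.append("Difficulty: " + difficulty)
    lines.append("First write a rationale connecting the concepts, then the problem.")
    return "\n".join(lines)


def split_output(text, marker="Problem:"):
    """Split a completion into (rationale, problem) at the first marker."""
    rationale, sep, problem = text.partition(marker)
    if not sep:  # marker absent: treat the whole completion as rationale
        return text.strip(), ""
    return rationale.strip(), problem.strip()


prompt = build_prompt(["graph theory", "dynamic programming"], difficulty="olympiad")
# `prompt` would be fed to the model; the completion below is a mocked
# stand-in for the model's output.
completion = ("We pair shortest-path structure with a DP over subproblems...\n"
              "Problem: Given a weighted DAG, count the minimum-cost paths between two vertices.")
rationale, problem = split_output(completion)
```

In practice the prompt would be tokenized and passed to `model.generate` via `transformers` (the `library_name` metadata enables the Hub's auto-generated loading snippet) or served with an engine such as vLLM.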
## 📦 Model Details

- **Model type:** Causal language model for problem generation.
- **Training data:** Concept–rationale–problem triples synthesized and refined via PromptCoT 2.0.
- **Domains:** Mathematics (Olympiad-level) and programming (competitive programming).
- **Initialization:** Warm-started from `Qwen2.5-32B-Base` with cold-start annotations (concepts and rationales) generated by instruction-tuned models.

---
[…]

The PGM is the **core component** powering the creation of:

* **Self-Play datasets** (math/code problems paired with verifiable answers or unit tests).
* **SFT datasets** (problems with complete reasoning traces distilled from teacher models).

---
[…]

PromptCoT 2.0 demonstrates that rationale-driven prompt synthesis yields **harder and more diverse problems** than existing datasets.

* **Self-Play (30B-A3B):**
  Achieves strong gains in both mathematics and programming.
  - **Math:** 92.1 on AIME24, 89.8 on AIME25, 76.7 on HMMT Feb25.
  - **Code:** 74.2 on LiveCodeBench v5, 71.0 on v6, and 2079 Elo on Codeforces.

  Overall, performance is competitive with Gemini 2.5 Pro / OpenAI o3 and surpasses strong open-source baselines.

* **SFT (7B, 100% synthetic):**
  Demonstrates that fully synthetic data can rival or outperform human-written datasets.
  - **Math:** 73.1 on AIME24, 65.6 on AIME25, 46.5 on HMMT Feb25.
  - **Code:** 53.4 on LiveCodeBench v5, 48.9 on v6, and 1815 Elo on Codeforces.

  These results exceed human-written baselines such as **OpenMathReasoning** and **OpenCodeReasoning**, highlighting the scalability of synthetic data.

---
## 📂 Resources

* 📄 [Paper (arXiv:2509.19894)](https://arxiv.org/abs/2509.19894)
* 🤗 [HF Collection](https://huggingface.co/collections/xl-zhao/promptcot-20-68d27cd73f2faef5a12f777d)
* 📚 [PromptCoT 2.0 SFT Data (4.8M prompts)](https://huggingface.co/datasets/xl-zhao/PromptCoT-2.0-SFT-4.8M)
* 🤖 [PromptCoT 2.0 SFT Model (7B)](https://huggingface.co/xl-zhao/PromptCoT-2.0-SFT-7B)
* 🎮 [Self-Play Models (4B, 30B-A3B)](https://huggingface.co/collections/xl-zhao/promptcot-20-68d27cd73f2faef5a12f777d)

---