davmel committed
Commit 01f712a · verified · 1 parent: 464585e

Update README.md

Files changed (1): README.md (+108 −83)
README.md CHANGED
@@ -1,4 +1,10 @@
- # Active UltraFeedback

  **Active UltraFeedback** is a scalable pipeline for generating high-quality preference datasets to align large language models (LLMs), requiring only a set of prompts as input.

@@ -12,7 +18,7 @@ Our experiments demonstrate that **Active Learning strategies (specifically DRTS

  The datasets generated by our pipeline for **DRTS** and **DeltaUCB** consistently beat the actual `ultrafeedback_binarized_cleaned` and `tulu3` preference mixture datasets on our DPO/RM training setups with LoRA.

- Below are the detailed results across 4 different prompt distributions. For acquisition functions with multiple hyperparameter configurations, we report the best-performing setting.

  ---

@@ -33,7 +39,7 @@ Below are the detailed results across 4 different prompt distributions. For acqu
  | **InfoMax** | +0.463 | +0.287 | +0.096 | **+0.129** | +0.509 | +0.296 | +0.297 |
  | **MaxMinLCB** | +0.390 | -0.025 | **+0.244** | +0.070 | +0.453 | +0.250 | +0.230 |

- **DPO Performance** (Best HPs selected)
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |
@@ -56,31 +62,31 @@ Below are the detailed results across 4 different prompt distributions. For acqu
  | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | | | |
- | Random | +0.443 | +0.209 | +0.156 | +0.133 | +0.417 | +0.310 | +0.278 |
- | UltraFeedback | +0.443 | +0.188 | +0.213 | +0.114 | +0.481 | +0.284 | +0.287 |
- | MaxMin | +0.377 | **+0.483** | +0.156 | +0.123 | +0.370 | **+0.400** | **+0.318** |
- | DeltaQwen | +0.195 | -0.034 | +0.028 | +0.067 | +0.216 | +0.126 | +0.100 |
  | *Ours* | | | | | | | |
- | **DRTS** | +0.396 | +0.090 | +0.107 | +0.033 | +0.344 | +0.225 | +0.199 |
  | **DeltaUCB** | +0.370 | +0.319 | **+0.194** | +0.033 | +0.346 | +0.310 | +0.262 |
- | **DTS** | +0.417 | -0.021 | +0.148 | **+0.077** | +0.450 | +0.245 | +0.219 |
- | **InfoMax** | **+0.429** | +0.122 | +0.162 | +0.030 | **+0.495** | +0.227 | +0.244 |
  | **MaxMinLCB** | +0.371 | -0.016 | +0.145 | +0.039 | +0.395 | +0.167 | +0.184 |

- **DPO Performance** (Best HPs selected)
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |
- | Random | +0.020 | +0.004 | +0.004 | +0.025 | +0.013 |
- | UltraFeedback | +0.023 | +0.021 | +0.003 | +0.031 | +0.019 |
- | MaxMin | +0.043 | **+0.041** | +0.017 | +0.114 | +0.053 |
- | DeltaQwen | +0.043 | +0.030 | +0.023 | +0.183 | +0.069 |
  | *Ours* | | | | | |
- | **DRTS** | +0.065 | +0.019 | **+0.055** | **+0.197** | **+0.083** |
- | **DeltaUCB** | **+0.074** | +0.028 | +0.045 | +0.173 | +0.080 |
- | **DTS** | +0.003 | +0.004 | +0.002 | +0.028 | +0.009 |
- | **InfoMax** | +0.013 | +0.008 | +0.003 | +0.012 | +0.009 |
- | **MaxMinLCB** | -0.001 | +0.000 | +0.000 | +0.002 | -0.000 |

  ---

@@ -90,31 +96,31 @@ Below are the detailed results across 4 different prompt distributions. For acqu
  | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | | | |
- | Random | +0.443 | +0.209 | +0.156 | +0.133 | +0.417 | +0.310 | +0.278 |
- | UltraFeedback | +0.443 | +0.188 | **+0.213** | +0.114 | +0.481 | +0.284 | +0.287 |
- | MaxMin | +0.377 | **+0.483** | +0.156 | +0.123 | +0.370 | +0.400 | **+0.318** |
- | DeltaQwen | +0.195 | -0.034 | +0.028 | +0.067 | +0.216 | +0.126 | +0.100 |
  | *Ours* | | | | | | | |
- | **DRTS** | +0.439 | +0.386 | +0.151 | +0.064 | +0.415 | **+0.395** | +0.308 |
- | **DeltaUCB** | +0.463 | +0.350 | +0.164 | **+0.092** | +0.469 | +0.213 | +0.292 |
  | **DTS** | +0.419 | +0.087 | +0.186 | +0.083 | +0.411 | +0.297 | +0.247 |
  | **InfoMax** | **+0.476** | +0.383 | +0.153 | +0.042 | **+0.546** | +0.199 | +0.300 |
  | **MaxMinLCB** | +0.439 | +0.048 | +0.159 | +0.030 | +0.435 | +0.201 | +0.219 |

- **DPO Performance** (Best HPs selected)
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |
- | Random | +0.026 | +0.012 | +0.012 | +0.035 | +0.021 |
- | UltraFeedback | +0.032 | -0.007 | +0.011 | +0.052 | +0.022 |
- | MaxMin | +0.074 | +0.025 | +0.052 | +0.222 | +0.092 |
- | DeltaQwen | +0.069 | **+0.030** | **+0.097** | **+0.299** | **+0.123** |
  | *Ours* | | | | | |
- | **DRTS** | +0.065 | +0.028 | +0.090 | +0.238 | +0.105 |
- | **DeltaUCB** | **+0.078** | +0.010 | +0.093 | +0.246 | +0.106 |
- | **DTS** | +0.011 | +0.000 | +0.006 | +0.024 | +0.010 |
- | **InfoMax** | +0.004 | +0.012 | +0.004 | +0.016 | +0.009 |
- | **MaxMinLCB** | -0.006 | +0.000 | +0.003 | +0.004 | -0.000 |

  ---
 
@@ -136,7 +142,7 @@ Below are the detailed results across 4 different prompt distributions. For acqu
  | **InfoMax** | +0.431 | +0.302 | +0.175 | +0.098 | +0.545 | +0.286 | +0.306 |
  | **MaxMinLCB** | +0.448 | +0.168 | +0.140 | +0.101 | +0.531 | +0.196 | +0.264 |

- **DPO Performance** (Best HPs selected)
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |
@@ -166,40 +172,50 @@ Given a batch of prompts, the following steps are executed:

  ---

- ## 🤖 Source Models
-
- To ensure diversity and quality, we utilize a wide range of open-source models for completion generation. Please refer to the specific licenses for each model when using these datasets.
-
- **Qwen Series:**
- * `Qwen/Qwen2.5-0.5B-Instruct`, `Qwen/Qwen2.5-72B-Instruct`
- * `Qwen/Qwen3-0.6B`, `1.7B`, `14B`, `32B`
- * `Qwen/Qwen3-30B-A3B`, `235B-A22B`
-
- **Llama Series:**
- * `meta-llama/Llama-3.1-8B-Instruct`
- * `meta-llama/Llama-3.2-1B-Instruct`, `3B-Instruct`
- * `meta-llama/Llama-3.3-70B-Instruct`
-
- **NVIDIA Nemotron:**
- * `nvidia/Llama-3_3-Nemotron-Super-49B-v1`
- * `nvidia/Llama-3.1-Nemotron-70B-Instruct-HF`
- * `nvidia/Llama-3_1-Nemotron-Ultra-253B-v1`
-
- **Google Gemma:**
- * `google/gemma-3-1b-it`, `4b-it`, `12b-it`, `27b-it`
-
- **Mistral:**
- * `mistralai/Mistral-Small-24B-Instruct-2501`
- * `mistralai/Mistral-Large-Instruct-2411`
-
- **Others:**
- * `microsoft/Phi-4-mini-instruct`, `microsoft/phi-4`
- * `HuggingFaceTB/SmolLM2-1.7B-Instruct`
- * `CohereLabs/c4ai-command-a-03-2025`
- * `deepseek-ai/DeepSeek-V3`
- * `allenai/OLMo-2-0325-32B-Instruct`
- * `allenai/Llama-3.1-Tulu-3-70B`, `405B`
- * `moonshotai/Moonlight-16B-A3B-Instruct`

  ---
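The active selection referenced above ("Given a batch of prompts, the following steps are executed") ranks candidate completion pairs with an acquisition function. As a rough illustration of the UCB family that DeltaUCB and MaxMinLCB belong to, the score below adds an exploration bonus to an estimated reward gap. This is a generic sketch with hypothetical names, not the repository's actual acquisition code:

```python
import math

def ucb_score(mean_gap: float, count: int, total: int, c: float = 1.0) -> float:
    """Optimistic estimate of a pair's reward gap: observed mean plus an
    exploration bonus that shrinks as the pair is sampled more often."""
    bonus = c * math.sqrt(math.log(total + 1) / (count + 1))
    return mean_gap + bonus

# (mean observed gap, times sampled) for two candidate pairs
pairs = {"pair_a": (0.3, 10), "pair_b": (0.1, 1)}
best = max(pairs, key=lambda k: ucb_score(*pairs[k], total=11))
print(best)  # the under-sampled pair wins despite a lower mean gap
```

The point of the bonus term is exploration: rarely-evaluated pairs get an optimistic boost, so the annotator budget is not spent only on pairs that already look informative.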
 
@@ -207,15 +223,15 @@ To ensure diversity and quality, we utilize a wide range of open-source models f

  ### 1. Installation
  Install the package in editable mode:
- ```bash
  pip install -e .
- ```

  ### 2. Running the Pipeline
  Run the main dataset generation script:
- ```bash
  python path/to/main_script.py
- ```

  ### 3. Configuration (Optional)
  To modify the pipeline parameters and steps, edit the configuration files in the `config/` directory.
@@ -226,13 +242,13 @@ To modify the pipeline parameters and steps, edit the configuration files in the

  ### Option 1: Docker/Podman (Recommended)
  Build the container image:
- ```bash
  podman build -t activeuf:latest .
- ```

  ### Option 2: `uv` (For Local Use)
  Create a `uv` environment with all dependencies.
- ```bash
  # Install uv
  curl -LsSf https://astral.sh/uv/install.sh | sh
  source $HOME/.local/bin/env
@@ -240,7 +256,7 @@ source $HOME/.local/bin/env

  # Sync dependencies
  uv sync --dev
  source .venv/bin/activate
- ```

@@ -250,18 +266,18 @@ For contributors and developers:

  ### Pre-commit Hooks
  This project uses `ruff` for linting and formatting.
- ```bash
  pre-commit install
- ```

  ### Manual Linting
- ```bash
  # Format code
  ruff format

  # Lint and auto-fix
  ruff check --fix
- ```

  ---
 
@@ -271,3 +287,12 @@ This project is licensed under the **MIT License**.

  **Note on Data Usage:**
  While the code and curated datasets in this repository are released under MIT, the datasets contain outputs generated by third-party models (listed above). Users are responsible for adhering to the respective licenses of these source models (e.g., Llama Community License, Apache 2.0, Qwen Research License) when using this data for training or commercial purposes.

+ # Active UltraFeedback

  **Active UltraFeedback** is a scalable pipeline for generating high-quality preference datasets to align large language models (LLMs), requiring only a set of prompts as input.
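Concretely, each record such a pipeline emits pairs a prompt with a preferred and a dispreferred completion. A minimal sketch of one record follows; the field names are illustrative assumptions, not the pipeline's exact schema:

```python
from dataclasses import dataclass, asdict

@dataclass
class PreferencePair:
    """One row of a preference dataset (illustrative field names)."""
    prompt: str
    chosen: str    # completion judged better
    rejected: str  # completion judged worse

record = PreferencePair(
    prompt="Explain gravity in one sentence.",
    chosen="Gravity is the mutual attraction between masses.",
    rejected="Gravity is a kind of magnetism.",
)
print(sorted(asdict(record)))  # ['chosen', 'prompt', 'rejected']
```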
 
 
  The datasets generated by our pipeline for **DRTS** and **DeltaUCB** consistently beat the actual `ultrafeedback_binarized_cleaned` and `tulu3` preference mixture datasets on our DPO/RM training setups with LoRA.
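The LoRA setup mentioned here trains only a low-rank factorization of each weight update, ΔW = B·A, rather than the full matrix. The core computation in plain Python (a generic sketch of the idea, not this repo's training code):

```python
def lora_delta(B, A):
    """Low-rank weight update delta_W = B @ A, with B (out x r) and A (r x in)."""
    r = len(A)  # rank of the adapter
    return [[sum(B[i][k] * A[k][j] for k in range(r)) for j in range(len(A[0]))]
            for i in range(len(B))]

# A rank-1 adapter for a 2x3 weight trains 2 + 3 values instead of 6
B = [[1.0], [2.0]]
A = [[0.5, 0.0, -0.5]]
print(lora_delta(B, A))  # [[0.5, 0.0, -0.5], [1.0, 0.0, -1.0]]
```

The saving grows with model size: for a d×d weight and rank r, LoRA trains 2·d·r parameters instead of d².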
 
+ Below are the detailed results across 4 different prompt distributions.

  ---
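For reference, the DPO objective behind the numbers below scores each preference pair by the policy-vs-reference log-ratio margin. This is a minimal sketch with made-up log-probabilities, not the project's training code:

```python
import math

def dpo_loss(lp_pi_chosen, lp_pi_rejected, lp_ref_chosen, lp_ref_rejected, beta=0.1):
    """DPO loss for one pair: -log sigmoid(beta * (margin_chosen - margin_rejected))."""
    margin_chosen = lp_pi_chosen - lp_ref_chosen        # implicit reward of chosen
    margin_rejected = lp_pi_rejected - lp_ref_rejected  # implicit reward of rejected
    logits = beta * (margin_chosen - margin_rejected)
    return math.log1p(math.exp(-logits))  # fine for moderate logits

# Policy already prefers the chosen completion, so the loss is below log(2)
print(round(dpo_loss(-10.0, -14.0, -11.0, -13.0), 4))  # 0.5981
```

Widening the margin between chosen and rejected completions lowers the loss, which is what pushes the policy toward the annotator's preferences.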
 
 
  | **InfoMax** | +0.463 | +0.287 | +0.096 | **+0.129** | +0.509 | +0.296 | +0.297 |
  | **MaxMinLCB** | +0.390 | -0.025 | **+0.244** | +0.070 | +0.453 | +0.250 | +0.230 |

+ **DPO Performance**
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |
 
  | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | | | |
+ | Random | +0.407 | +0.106 | +0.151 | +0.092 | +0.422 | +0.157 | +0.223 |
+ | UltraFeedback | +0.419 | +0.068 | +0.189 | +0.058 | +0.440 | +0.228 | +0.234 |
+ | MaxMin | +0.410 | **+0.462** | +0.172 | +0.055 | **+0.531** | **+0.319** | **+0.325** |
+ | DeltaQwen | +0.238 | -0.023 | +0.011 | **+0.108** | +0.306 | +0.132 | +0.129 |
  | *Ours* | | | | | | | |
+ | **DRTS** | +0.423 | +0.233 | +0.164 | +0.055 | +0.377 | +0.285 | +0.256 |
  | **DeltaUCB** | +0.370 | +0.319 | **+0.194** | +0.033 | +0.346 | +0.310 | +0.262 |
+ | **DTS** | +0.417 | -0.021 | +0.148 | +0.077 | +0.450 | +0.245 | +0.219 |
+ | **InfoMax** | **+0.429** | +0.122 | +0.162 | +0.030 | +0.495 | +0.227 | +0.244 |
  | **MaxMinLCB** | +0.371 | -0.016 | +0.145 | +0.039 | +0.395 | +0.167 | +0.184 |

+ **DPO Performance**
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |
+ | Random | +0.012 | +0.015 | +0.045 | +0.063 | +0.033 |
+ | UltraFeedback | +0.027 | **+0.054** | +0.043 | +0.071 | +0.048 |
+ | MaxMin | +0.049 | -0.011 | +0.128 | +0.270 | +0.108 |
+ | DeltaQwen | **+0.058** | +0.002 | **+0.152** | **+0.384** | **+0.149** |
  | *Ours* | | | | | |
+ | **DRTS** | +0.052 | +0.012 | +0.114 | +0.229 | +0.101 |
+ | **DeltaUCB** | +0.055 | +0.013 | +0.077 | +0.238 | +0.095 |
+ | **DTS** | +0.008 | +0.002 | +0.011 | +0.021 | +0.010 |
+ | **InfoMax** | +0.021 | +0.002 | +0.011 | +0.013 | +0.012 |
+ | **MaxMinLCB** | +0.003 | +0.010 | +0.004 | +0.018 | +0.008 |

  ---

 
  | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | | | |
+ | Random | +0.455 | +0.216 | **+0.205** | +0.077 | +0.466 | +0.193 | +0.269 |
+ | UltraFeedback | +0.407 | +0.114 | +0.175 | +0.064 | +0.433 | +0.247 | +0.240 |
+ | MaxMin | +0.410 | **+0.467** | +0.194 | +0.083 | +0.412 | **+0.380** | **+0.325** |
+ | DeltaQwen | +0.242 | -0.007 | +0.009 | **+0.151** | +0.279 | +0.241 | +0.153 |
  | *Ours* | | | | | | | |
+ | **DRTS** | +0.427 | +0.436 | +0.156 | +0.086 | +0.475 | +0.272 | +0.309 |
+ | **DeltaUCB** | +0.463 | +0.350 | +0.164 | +0.092 | +0.469 | +0.213 | +0.292 |
  | **DTS** | +0.419 | +0.087 | +0.186 | +0.083 | +0.411 | +0.297 | +0.247 |
  | **InfoMax** | **+0.476** | +0.383 | +0.153 | +0.042 | **+0.546** | +0.199 | +0.300 |
  | **MaxMinLCB** | +0.439 | +0.048 | +0.159 | +0.030 | +0.435 | +0.201 | +0.219 |

+ **DPO Performance**
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |
+ | Random | +0.024 | +0.028 | +0.056 | +0.077 | +0.046 |
+ | UltraFeedback | +0.037 | -0.001 | +0.039 | +0.072 | +0.036 |
+ | MaxMin | +0.022 | -0.016 | **+0.150** | +0.289 | +0.111 |
+ | DeltaQwen | +0.055 | **+0.047** | +0.130 | **+0.316** | **+0.137** |
  | *Ours* | | | | | |
+ | **DRTS** | **+0.055** | +0.015 | +0.108 | +0.177 | +0.088 |
+ | **DeltaUCB** | +0.049 | +0.039 | +0.117 | +0.217 | +0.105 |
+ | **DTS** | +0.009 | +0.002 | +0.014 | +0.029 | +0.013 |
+ | **InfoMax** | +0.011 | +0.021 | +0.014 | +0.018 | +0.015 |
+ | **MaxMinLCB** | -0.010 | +0.019 | +0.010 | +0.021 | +0.009 |

  ---

 
  | **InfoMax** | +0.431 | +0.302 | +0.175 | +0.098 | +0.545 | +0.286 | +0.306 |
  | **MaxMinLCB** | +0.448 | +0.168 | +0.140 | +0.101 | +0.531 | +0.196 | +0.264 |

+ **DPO Performance**
  | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
  | :--- | :---: | :---: | :---: | :---: | :---: |
  | *Baselines* | | | | | |

  ---

 
+ ## 🤖 Source Models and Licenses
+
+ To ensure diversity and quality, we utilize a wide range of open-source models for completion generation. Below is the list of models used, along with their parameters and licenses.
+
+ | Model | Parameters (B) | License |
+ | :--- | :---: | :--- |
+ | **Qwen** | | |
+ | `Qwen/Qwen2.5-0.5B-Instruct` | 0.5 | Apache 2.0 |
+ | `Qwen/Qwen2.5-72B-Instruct` | 72 | Qwen |
+ | `Qwen/Qwen3-0.6B` | 0.6 | Apache 2.0 |
+ | `Qwen/Qwen3-1.7B` | 1.7 | Apache 2.0 |
+ | `Qwen/Qwen3-14B` | 14 | Apache 2.0 |
+ | `Qwen/Qwen3-30B-A3B` | 30 | Apache 2.0 |
+ | `Qwen/Qwen3-32B` | 32 | Apache 2.0 |
+ | `Qwen/Qwen3-235B-A22B` | 235 | Apache 2.0 |
+ | **Llama** | | |
+ | `meta-llama/Llama-3.1-8B-Instruct` | 8 | Llama 3 |
+ | `meta-llama/Llama-3.2-1B-Instruct` | 1 | Llama 3 |
+ | `meta-llama/Llama-3.2-3B-Instruct` | 3 | Llama 3 |
+ | `meta-llama/Llama-3.3-70B-Instruct` | 70 | Llama 3 |
+ | **Microsoft** | | |
+ | `microsoft/Phi-4-mini-instruct` | 4 | MIT |
+ | `microsoft/phi-4` | 14 | MIT |
+ | **Mistral** | | |
+ | `mistralai/Mistral-Small-24B-Instruct-2501` | 23 | Apache 2.0 |
+ | `mistralai/Mistral-Large-Instruct-2411` | 123 | MRL |
+ | **NVIDIA** | | |
+ | `nvidia/Llama-3.1-Nemotron-70B-Instruct-HF` | 70 | Llama 3 |
+ | `nvidia/Llama-3_3-Nemotron-Super-49B-v1` | 49 | NVIDIA Open Model |
+ | `nvidia/Llama-3_1-Nemotron-Ultra-253B-v1` | 253 | NVIDIA Open Model |
+ | **Gemma** | | |
+ | `google/gemma-3-1b-it` | 1 | Gemma |
+ | `google/gemma-3-4b-it` | 4 | Gemma |
+ | `google/gemma-3-12b-it` | 12 | Gemma |
+ | `google/gemma-3-27b-it` | 27 | Gemma |
+ | **AllenAI** | | |
+ | `allenai/OLMo-2-0325-32B-Instruct` | 32 | Apache 2.0 |
+ | `allenai/Llama-3.1-Tulu-3-70B` | 70 | Llama 3 |
+ | `allenai/Llama-3.1-Tulu-3-405B` | 405 | Llama 3 |
+ | **Other** | | |
+ | `HuggingFaceTB/SmolLM2-1.7B-Instruct` | 1.7 | Apache 2.0 |
+ | `moonshotai/Moonlight-16B-A3B-Instruct` | 16 | MIT |
+ | `CohereLabs/c4ai-command-a-03-2025` | 111 | CC BY-NC 4.0 |
+ | `deepseek-ai/DeepSeek-V3` | 671 | DeepSeek |

  ---
 
 
  ### 1. Installation
  Install the package in editable mode:
+ ```bash
  pip install -e .
+ ```

  ### 2. Running the Pipeline
  Run the main dataset generation script:
+ ```bash
  python path/to/main_script.py
+ ```

  ### 3. Configuration (Optional)
  To modify the pipeline parameters and steps, edit the configuration files in the `config/` directory.
 
  ### Option 1: Docker/Podman (Recommended)
  Build the container image:
+ ```bash
  podman build -t activeuf:latest .
+ ```

  ### Option 2: `uv` (For Local Use)
  Create a `uv` environment with all dependencies.
+ ```bash
  # Install uv
  curl -LsSf https://astral.sh/uv/install.sh | sh
  source $HOME/.local/bin/env

  # Sync dependencies
  uv sync --dev
  source .venv/bin/activate
+ ```

  ---
 
 
  ### Pre-commit Hooks
  This project uses `ruff` for linting and formatting.
+ ```bash
  pre-commit install
+ ```

  ### Manual Linting
+ ```bash
  # Format code
  ruff format

  # Lint and auto-fix
  ruff check --fix
+ ```

  ---
 
 
  **Note on Data Usage:**
  While the code and curated datasets in this repository are released under MIT, the datasets contain outputs generated by third-party models (listed above). Users are responsible for adhering to the respective licenses of these source models (e.g., Llama Community License, Apache 2.0, Qwen Research License) when using this data for training or commercial purposes.