Update README.md

README.md CHANGED
@@ -1,4 +1,10 @@
-# Active UltraFeedback

 **Active UltraFeedback** is a scalable pipeline for generating high-quality preference datasets to align large language models (LLMs), requiring only a set of prompts as input.

@@ -12,7 +18,7 @@ Our experiments demonstrate that **Active Learning strategies (specifically DRTS

 The datasets generated by our pipeline for **DRTS** and **DeltaUCB** consistently beat the original `ultrafeedback_binarized_cleaned` and `tulu3` preference mixture datasets on our DPO/RM training setups with LoRA.

-Below are the detailed results across 4 different prompt distributions.

 ---

@@ -33,7 +39,7 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | **InfoMax** | +0.463 | +0.287 | +0.096 | **+0.129** | +0.509 | +0.296 | +0.297 |
 | **MaxMinLCB** | +0.390 | -0.025 | **+0.244** | +0.070 | +0.453 | +0.250 | +0.230 |

-**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
@@ -56,31 +62,31 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | | | |
-| Random | +0.
-| UltraFeedback | +0.
-| MaxMin | +0.
-| DeltaQwen | +0.
 | *Ours* | | | | | | | |
-| **DRTS** | +0.
 | **DeltaUCB** | +0.370 | +0.319 | **+0.194** | +0.033 | +0.346 | +0.310 | +0.262 |
-| **DTS** | +0.417 | -0.021 | +0.148 |
-| **InfoMax** | **+0.429** | +0.122 | +0.162 | +0.030 |
 | **MaxMinLCB** | +0.371 | -0.016 | +0.145 | +0.039 | +0.395 | +0.167 | +0.184 |

-**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
-| Random | +0.
-| UltraFeedback | +0.
-| MaxMin | +0.
-| DeltaQwen | +0.
 | *Ours* | | | | | |
-| **DRTS** | +0.
-| **DeltaUCB** |
-| **DTS** | +0.
-| **InfoMax** | +0.
-| **MaxMinLCB** |

 ---

@@ -90,31 +96,31 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | | | |
-| Random | +0.
-| UltraFeedback | +0.
-| MaxMin | +0.
-| DeltaQwen | +0.
 | *Ours* | | | | | | | |
-| **DRTS** | +0.
-| **DeltaUCB** | +0.463 | +0.350 | +0.164 |
 | **DTS** | +0.419 | +0.087 | +0.186 | +0.083 | +0.411 | +0.297 | +0.247 |
 | **InfoMax** | **+0.476** | +0.383 | +0.153 | +0.042 | **+0.546** | +0.199 | +0.300 |
 | **MaxMinLCB** | +0.439 | +0.048 | +0.159 | +0.030 | +0.435 | +0.201 | +0.219 |

-**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
-| Random | +0.
-| UltraFeedback | +0.
-| MaxMin | +0.
-| DeltaQwen | +0.
 | *Ours* | | | | | |
-| **DRTS** | +0.
-| **DeltaUCB** |
-| **DTS** | +0.
-| **InfoMax** | +0.
-| **MaxMinLCB** | -0.

 ---

@@ -136,7 +142,7 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | **InfoMax** | +0.431 | +0.302 | +0.175 | +0.098 | +0.545 | +0.286 | +0.306 |
 | **MaxMinLCB** | +0.448 | +0.168 | +0.140 | +0.101 | +0.531 | +0.196 | +0.264 |

-**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
@@ -166,40 +172,50 @@ Given a batch of prompts, the following steps are executed:

 ---

-## 🤖 Source Models
-
-To ensure diversity and quality, we utilize a wide range of open-source models for completion generation.
-
-*
-
-**
-
-*
-

 ---

@@ -207,15 +223,15 @@ To ensure diversity and quality, we utilize a wide range of open-source models f

 ### 1. Installation
 Install the package in editable mode:
-``
 pip install -e .
-``

 ### 2. Running the Pipeline
 Run the main dataset generation script:
-``
 python path/to/main_script.py
-``

 ### 3. Configuration (Optional)
 To modify the pipeline parameters and steps, edit the configuration files in the `config/` directory.
@@ -226,13 +242,13 @@ To modify the pipeline parameters and steps, edit the configuration files in the

 ### Option 1: Docker/Podman (Recommended)
 Build the container image:
-``
 podman build -t activeuf:latest .
-``

 ### Option 2: `uv` (For Local Use)
 Create a `uv` environment with all dependencies.
-``
 # Install uv
 curl -LsSf https://astral.sh/uv/install.sh | sh
 source $HOME/.local/bin/env

@@ -240,7 +256,7 @@ source $HOME/.local/bin/env
 # Sync dependencies
 uv sync --dev
 source .venv/bin/activate
-``

 ---

@@ -250,18 +266,18 @@ For contributors and developers:

 ### Pre-commit Hooks
 This project uses `ruff` for linting and formatting.
-``
 pre-commit install
-``

 ### Manual Linting
-``
 # Format code
 ruff format

 # Lint and auto-fix
 ruff check --fix
-``

 ---

@@ -271,3 +287,12 @@ This project is licensed under the **MIT License**.

 **Note on Data Usage:**
 While the code and curated datasets in this repository are released under MIT, the datasets contain outputs generated by third-party models (listed above). Users are responsible for adhering to the respective licenses of these source models (e.g., Llama Community License, Apache 2.0, Qwen Research License) when using this data for training or commercial purposes.
@@ -1,4 +1,10 @@
+import os
+
+# TRIPLE BACKTICK FIX:
+# I am using double backticks (``) in this string so the code block doesn't break in the chat.
+# The script automatically converts them to real triple backticks (```) when saving.
+
+readme_text = r"""# Active UltraFeedback

 **Active UltraFeedback** is a scalable pipeline for generating high-quality preference datasets to align large language models (LLMs), requiring only a set of prompts as input.

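The intro above says the pipeline needs only prompts and emits preference data. For orientation, a binarized preference record (the form consumed by DPO/reward-model trainers) pairs each prompt with a chosen and a rejected completion; the field names below follow the common convention and are illustrative, not taken from this repository:

```python
# Illustrative shape of one binarized preference record (field names assumed,
# following the usual prompt/chosen/rejected convention used by DPO trainers).
record = {
    "prompt": "Explain LoRA in one sentence.",
    "chosen": "LoRA fine-tunes a model by learning small low-rank update matrices.",
    "rejected": "LoRA is a kind of database.",
}
# A dataset is then a list of such records, one (or more) per input prompt.
print(sorted(record.keys()))
```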
@@ -12,7 +18,7 @@ Our experiments demonstrate that **Active Learning strategies (specifically DRTS

 The datasets generated by our pipeline for **DRTS** and **DeltaUCB** consistently beat the original `ultrafeedback_binarized_cleaned` and `tulu3` preference mixture datasets on our DPO/RM training setups with LoRA.

+Below are the detailed results across 4 different prompt distributions.

 ---

@@ -33,7 +39,7 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | **InfoMax** | +0.463 | +0.287 | +0.096 | **+0.129** | +0.509 | +0.296 | +0.297 |
 | **MaxMinLCB** | +0.390 | -0.025 | **+0.244** | +0.070 | +0.453 | +0.250 | +0.230 |

+**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
@@ -56,31 +62,31 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | | | |
+| Random | +0.407 | +0.106 | +0.151 | +0.092 | +0.422 | +0.157 | +0.223 |
+| UltraFeedback | +0.419 | +0.068 | +0.189 | +0.058 | +0.440 | +0.228 | +0.234 |
+| MaxMin | +0.410 | **+0.462** | +0.172 | +0.055 | **+0.531** | **+0.319** | **+0.325** |
+| DeltaQwen | +0.238 | -0.023 | +0.011 | **+0.108** | +0.306 | +0.132 | +0.129 |
 | *Ours* | | | | | | | |
+| **DRTS** | +0.423 | +0.233 | +0.164 | +0.055 | +0.377 | +0.285 | +0.256 |
 | **DeltaUCB** | +0.370 | +0.319 | **+0.194** | +0.033 | +0.346 | +0.310 | +0.262 |
+| **DTS** | +0.417 | -0.021 | +0.148 | +0.077 | +0.450 | +0.245 | +0.219 |
+| **InfoMax** | **+0.429** | +0.122 | +0.162 | +0.030 | +0.495 | +0.227 | +0.244 |
 | **MaxMinLCB** | +0.371 | -0.016 | +0.145 | +0.039 | +0.395 | +0.167 | +0.184 |

+**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
+| Random | +0.012 | +0.015 | +0.045 | +0.063 | +0.033 |
+| UltraFeedback | +0.027 | **+0.054** | +0.043 | +0.071 | +0.048 |
+| MaxMin | +0.049 | -0.011 | +0.128 | +0.270 | +0.108 |
+| DeltaQwen | **+0.058** | +0.002 | **+0.152** | **+0.384** | **+0.149** |
 | *Ours* | | | | | |
+| **DRTS** | +0.052 | +0.012 | +0.114 | +0.229 | +0.101 |
+| **DeltaUCB** | +0.055 | +0.013 | +0.077 | +0.238 | +0.095 |
+| **DTS** | +0.008 | +0.002 | +0.011 | +0.021 | +0.010 |
+| **InfoMax** | +0.021 | +0.002 | +0.011 | +0.013 | +0.012 |
+| **MaxMinLCB** | +0.003 | +0.010 | +0.004 | +0.018 | +0.008 |

 ---

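As a quick sanity check on the tables above, the **Mean** column appears to be the simple average of the per-category deltas (six categories for the RM tables), rounded to three decimals. For example, for the Random baseline row of the RM table:

```python
# Recompute the Mean cell of the Random row from the RM table above
# (Factuality, Focus, Math, Precise IF, Safety, Ties).
random_row = [0.407, 0.106, 0.151, 0.092, 0.422, 0.157]
mean = sum(random_row) / len(random_row)
print(f"mean delta ~ {mean:.4f}")  # close to the reported +0.223
```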
@@ -90,31 +96,31 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | Method | Factuality | Focus | Math | Precise IF | Safety | Ties | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | | | |
+| Random | +0.455 | +0.216 | **+0.205** | +0.077 | +0.466 | +0.193 | +0.269 |
+| UltraFeedback | +0.407 | +0.114 | +0.175 | +0.064 | +0.433 | +0.247 | +0.240 |
+| MaxMin | +0.410 | **+0.467** | +0.194 | +0.083 | +0.412 | **+0.380** | **+0.325** |
+| DeltaQwen | +0.242 | -0.007 | +0.009 | **+0.151** | +0.279 | +0.241 | +0.153 |
 | *Ours* | | | | | | | |
+| **DRTS** | +0.427 | +0.436 | +0.156 | +0.086 | +0.475 | +0.272 | +0.309 |
+| **DeltaUCB** | +0.463 | +0.350 | +0.164 | +0.092 | +0.469 | +0.213 | +0.292 |
 | **DTS** | +0.419 | +0.087 | +0.186 | +0.083 | +0.411 | +0.297 | +0.247 |
 | **InfoMax** | **+0.476** | +0.383 | +0.153 | +0.042 | **+0.546** | +0.199 | +0.300 |
 | **MaxMinLCB** | +0.439 | +0.048 | +0.159 | +0.030 | +0.435 | +0.201 | +0.219 |

+**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
+| Random | +0.024 | +0.028 | +0.056 | +0.077 | +0.046 |
+| UltraFeedback | +0.037 | -0.001 | +0.039 | +0.072 | +0.036 |
+| MaxMin | +0.022 | -0.016 | **+0.150** | +0.289 | +0.111 |
+| DeltaQwen | +0.055 | **+0.047** | +0.130 | **+0.316** | **+0.137** |
 | *Ours* | | | | | |
+| **DRTS** | **+0.055** | +0.015 | +0.108 | +0.177 | +0.088 |
+| **DeltaUCB** | +0.049 | +0.039 | +0.117 | +0.217 | +0.105 |
+| **DTS** | +0.009 | +0.002 | +0.014 | +0.029 | +0.013 |
+| **InfoMax** | +0.011 | +0.021 | +0.014 | +0.018 | +0.015 |
+| **MaxMinLCB** | -0.010 | +0.019 | +0.010 | +0.021 | +0.009 |

 ---

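The acquisition functions compared in the tables above (DeltaUCB, MaxMinLCB, DTS, etc.) are variants of confidence-bound or posterior-sampling selection. Purely as a generic illustration — not this repository's implementation — a UCB-style rule scores each candidate completion pair by its estimated reward gap plus an exploration bonus:

```python
import math

def ucb(mean_gap: float, times_scored: int, total_scored: int, c: float = 1.0) -> float:
    # Generic UCB form (illustrative constants): estimated reward gap plus an
    # exploration bonus that shrinks as a pair accumulates judge feedback.
    return mean_gap + c * math.sqrt(math.log(total_scored + 1) / (times_scored + 1))

# Two hypothetical candidate pairs: one well-explored with a large gap,
# one barely explored with a small gap.
candidates = {"well_explored": (0.30, 10), "barely_explored": (0.10, 1)}
total = sum(n for _, n in candidates.values())
pick = max(candidates, key=lambda k: ucb(*candidates[k], total))
print(pick)  # the exploration bonus favors the rarely-scored pair
```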
@@ -136,7 +142,7 @@ Below are the detailed results across 4 different prompt distributions. For acqu
 | **InfoMax** | +0.431 | +0.302 | +0.175 | +0.098 | +0.545 | +0.286 | +0.306 |
 | **MaxMinLCB** | +0.448 | +0.168 | +0.140 | +0.101 | +0.531 | +0.196 | +0.264 |

+**DPO Performance**
 | Method | GSM8K | IF Eval | Truthful QA | Alpaca Eval | **Mean** |
 | :--- | :---: | :---: | :---: | :---: | :---: |
 | *Baselines* | | | | | |
@@ -166,40 +172,50 @@ Given a batch of prompts, the following steps are executed:

 ---

+## 🤖 Source Models and Licenses
+
+To ensure diversity and quality, we utilize a wide range of open-source models for completion generation. Below is the list of models used, along with their parameters and licenses.
+
+| Model | Parameters (B) | License |
+| :--- | :---: | :--- |
+| **Qwen** | | |
+| `Qwen/Qwen2.5-0.5B-Instruct` | 0.5 | Apache 2.0 |
+| `Qwen/Qwen2.5-72B-Instruct` | 72 | Qwen |
+| `Qwen/Qwen3-0.6B` | 0.6 | Apache 2.0 |
+| `Qwen/Qwen3-1.7B` | 1.7 | Apache 2.0 |
+| `Qwen/Qwen3-14B` | 14 | Apache 2.0 |
+| `Qwen/Qwen3-30B-A3B` | 30 | Apache 2.0 |
+| `Qwen/Qwen3-32B` | 32 | Apache 2.0 |
+| `Qwen/Qwen3-235B-A22B` | 234 | Apache 2.0 |
+| **Llama** | | |
+| `meta-llama/Llama-3.1-8B-Instruct` | 8 | Llama 3 |
+| `meta-llama/Llama-3.2-1B-Instruct` | 1 | Llama 3 |
+| `meta-llama/Llama-3.2-3B-Instruct` | 3 | Llama 3 |
+| `meta-llama/Llama-3.3-70B-Instruct` | 70 | Llama 3 |
+| **Microsoft** | | |
+| `microsoft/Phi-4-mini-instruct` | 4 | MIT |
+| `microsoft/phi-4` | 14 | MIT |
+| **Mistral** | | |
+| `mistralai/Mistral-Small-24B-Instruct-2501` | 23 | Apache 2.0 |
+| `mistralai/Mistral-Large-Instruct-2411` | 123 | MRL |
+| **NVIDIA** | | |
+| `nvidia/Llama-3.1-Nemotron-70B-Instruct-HF` | 70 | Llama 3 |
+| `nvidia/Llama-3_3-Nemotron-Super-49B-v1` | 49 | Nvidia Open Model |
+| `nvidia/Llama-3_1-Nemotron-Ultra-253B-v1` | 253 | Nvidia Open Model |
+| **Gemma** | | |
+| `google/gemma-3-1b-it` | 1 | Gemma |
+| `google/gemma-3-4b-it` | 4 | Gemma |
+| `google/gemma-3-12b-it` | 12 | Gemma |
+| `google/gemma-3-27b-it` | 27 | Gemma |
+| **AllenAI** | | |
+| `allenai/OLMo-2-0325-32B-Instruct` | 32 | Apache 2.0 |
+| `allenai/Llama-3.1-Tulu-3-70B` | 70 | Llama 3 |
+| `allenai/Llama-3.1-Tulu-3-405B` | 405 | Llama 3 |
+| **Other** | | |
+| `HuggingFaceTB/SmolLM2-1.7B-Instruct` | 1.7 | Apache 2.0 |
+| `moonshotai/Moonlight-16B-A3B-Instruct` | 16 | MIT |
+| `CohereLabs/c4ai-command-a-03-2025` | 111 | CC BY-NC 4.0 |
+| `deepseek-ai/DeepSeek-V3` | 671 | Deepseek |

 ---

@@ -207,15 +223,15 @@ To ensure diversity and quality, we utilize a wide range of open-source models f

 ### 1. Installation
 Install the package in editable mode:
+``bash
 pip install -e .
+``

 ### 2. Running the Pipeline
 Run the main dataset generation script:
+``bash
 python path/to/main_script.py
+``

 ### 3. Configuration (Optional)
 To modify the pipeline parameters and steps, edit the configuration files in the `config/` directory.
@@ -226,13 +242,13 @@ To modify the pipeline parameters and steps, edit the configuration files in the

 ### Option 1: Docker/Podman (Recommended)
 Build the container image:
+``bash
 podman build -t activeuf:latest .
+``

 ### Option 2: `uv` (For Local Use)
 Create a `uv` environment with all dependencies.
+``bash
 # Install uv
 curl -LsSf https://astral.sh/uv/install.sh | sh
 source $HOME/.local/bin/env
@@ -240,7 +256,7 @@ source $HOME/.local/bin/env
 # Sync dependencies
 uv sync --dev
 source .venv/bin/activate
+``

 ---

@@ -250,18 +266,18 @@ For contributors and developers:

 ### Pre-commit Hooks
 This project uses `ruff` for linting and formatting.
+``bash
 pre-commit install
+``

 ### Manual Linting
+``bash
 # Format code
 ruff format

 # Lint and auto-fix
 ruff check --fix
+``

 ---

@@ -271,3 +287,12 @@ This project is licensed under the **MIT License**.

 **Note on Data Usage:**
 While the code and curated datasets in this repository are released under MIT, the datasets contain outputs generated by third-party models (listed above). Users are responsible for adhering to the respective licenses of these source models (e.g., Llama Community License, Apache 2.0, Qwen Research License) when using this data for training or commercial purposes.
+"""
+
+# Automatically replace double backticks with triple backticks for correct Markdown rendering
+final_content = readme_text.replace("``", "```")
+
+with open("README.md", "w", encoding="utf-8") as f:
+    f.write(final_content)
+
+print("✅ README.md generated successfully with all tables and licenses included!")
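The committed script above stores code fences as double backticks and widens them on save. A minimal standalone sketch of that conversion, using a made-up snippet:

```python
# str.replace substitutes every non-overlapping occurrence left to right,
# so each `` fence marker becomes a real ``` fence.
draft = "``bash\npip install -e .\n``"
print(draft.replace("``", "```"))  # prints a valid ```bash fenced block
```

Note the trade-off: any literal double backticks in the README body would also be widened, so the trick assumes none occur outside fence markers.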