# SWE-Next: Scalable Real-World Software Engineering Tasks for Agents
<p align="left">
<a href="https://arxiv.org/abs/2603.20691"><img alt="Paper" src="https://img.shields.io/badge/Paper-arXiv-b31b1b?style=for-the-badge&logo=arxiv&logoColor=white"></a>
<a href="https://tiger-ai-lab.github.io/SWE-Next/"><img alt="Project Page" src="https://img.shields.io/badge/Project%20Page-Website-4285F4?style=for-the-badge&logo=googlechrome&logoColor=white"></a>
<a href="https://github.com/TIGER-AI-Lab/SWE-Next"><img alt="Code" src="https://img.shields.io/badge/Code-GitHub-181717?style=for-the-badge&logo=github&logoColor=white"></a>
<a href="https://huggingface.co/datasets/TIGER-Lab/SWE-Next-SFT-Trajectories"><img alt="SFT Trajs" src="https://img.shields.io/badge/SFT%20Trajs-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
<a href="https://huggingface.co/datasets/TIGER-Lab/SWE-Next"><img alt="Dataset" src="https://img.shields.io/badge/Dataset-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
<a href="https://huggingface.co/TIGER-Lab/SWE-Next-7B"><img alt="Model 7B" src="https://img.shields.io/badge/Model%207B-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
<a href="https://huggingface.co/TIGER-Lab/SWE-Next-14B"><img alt="Model 14B" src="https://img.shields.io/badge/Model%2014B-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
</p>
## 📰 News
- **2026-04-07**: SWE-Next is now publicly released!
## 📖 Introduction
**SWE-Next** introduces **repo-quarter profiles**: reusable environments shared across temporally nearby commits, while each task run remains isolated and reproducible. Using only **30 hours** and **639GB** of environment storage, SWE-Next processes **3,971** seed repositories and **102,582** candidate commit pairs mined from real merged PRs to construct a dataset of **2,308** self-verifying instances. SWE-Next improves downstream pass@1 on SWE-Bench Verified and SWE-Bench Lite with fewer or comparable training trajectories, making large-scale executable data collection far more practical and accessible for research.
## ✨ Highlights
- **Scaled Environment Generation** – SWE-Next is an execution-grounded framework that turns real merged-PR commits into self-verifying SWE tasks and pairs them with high-signal trajectories.
- **Repo-Quarter Profiles** – A reusable environment mechanism that amortizes build and storage cost across temporally nearby commits, substantially reducing resource requirements and accelerating large-scale executable SWE data collection.
## 🛠️ Setup
### Prerequisites
- Python 3.10+
- Docker (for environment execution)
- [uv](https://github.com/astral-sh/uv) package manager
### Installation
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
git clone https://github.com/TIGER-AI-Lab/SWE-Next.git
cd SWE-Next
uv venv && source .venv/bin/activate
uv sync && uv pip install -e .
```
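Before continuing, it can help to confirm the prerequisites are actually on `PATH`. A minimal preflight sketch (the `have_tool` helper is illustrative, not part of the repo):

```bash
# Minimal preflight check: verify each prerequisite is on PATH.
# `have_tool` is a local helper defined here, not part of SWE-Next.
have_tool() { command -v "$1" >/dev/null 2>&1; }

for tool in python3 docker uv; do
  if have_tool "$tool"; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done
```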
## 🤗 Data & Models
Pre-built artifacts are available on HuggingFace. Download them into `data/` before running the pipeline:
| Artifact | Description | Download |
|----------|-------------|---------|
| `packages_python_filtered` | List of 3,900+ Python packages used as pipeline input | `huggingface-cli download TIGER-Lab/packages_python_filtered --repo-type dataset --local-dir data/packages_python_filtered` |
| `new_commit_better_repos` | Repos with confirmed NEW_COMMIT_BETTER commits | `huggingface-cli download TIGER-Lab/new_commit_better_repos --repo-type dataset --local-dir data/new_commit_better_repos` |
| `SWE-Next` | Final curated dataset (2,308 instances) | `huggingface-cli download TIGER-Lab/SWE-Next --repo-type dataset --local-dir data/SWE-Next` |
| `SWE-Next-SFT-Trajectories` | SFT training trajectories | `huggingface-cli download TIGER-Lab/SWE-Next-SFT-Trajectories --repo-type dataset --local-dir data/SWE-Next-SFT-Trajectories` |
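To fetch all four dataset artifacts in one go, the table's commands can be wrapped in a loop. This sketch only prints the commands (pipe its output to `sh` to execute); repo names and target directories are copied from the table above:

```bash
# Print the download command for each dataset artifact listed above.
# Pipe the output to `sh` to actually run the downloads.
print_download_cmds() {
  for repo in packages_python_filtered new_commit_better_repos SWE-Next SWE-Next-SFT-Trajectories; do
    echo "huggingface-cli download TIGER-Lab/$repo --repo-type dataset --local-dir data/$repo"
  done
}

print_download_cmds          # preview the commands
# print_download_cmds | sh   # uncomment to run the downloads
```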
Pre-trained models:
| Model | Download |
|-------|---------|
| SWE-Next-7B | `huggingface-cli download TIGER-Lab/SWE-Next-7B --repo-type model --local-dir LlamaFactory/saves/SWE_Next_7B` |
| SWE-Next-14B | `huggingface-cli download TIGER-Lab/SWE-Next-14B --repo-type model --local-dir LlamaFactory/saves/SWE_Next_14B` |
## 🐳 Environment Generation
SWE-Next extends environment generation to 3,900+ Python packages.
The supported package list is maintained in [`data/packages_python_filtered/packages_python_filtered.csv`](data/packages_python_filtered/packages_python_filtered.csv) and target repositories in [`data/new_commit_better_repos/new_commit_better_repos.csv`](data/new_commit_better_repos/new_commit_better_repos.csv).
## 🚀 Data Pipeline (One-Click)
`run_pr_pipeline.zsh` automates the full data-collection pipeline. It reads `data/packages_python_filtered/packages_python_filtered.csv`, clones the repos automatically, and processes them end-to-end. If the CSV is not present, it falls back to repos already cloned under `outputs/upstream_repos/`.
**Prerequisites:** copy `.env.template` to `.env` and fill in your credentials:
```
OPENAI_API_KEY=... # required for synthetic issue generation
GITHUB_TOKEN=... # required for fetching PRs
DOCKERHUB_USERNAME=... # required for pushing Docker images
DOCKERHUB_TOKEN=...
DOCKERHUB_NAMESPACE=... # your Docker Hub namespace
```
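A quick sanity check can catch a missing credential before a long pipeline run. This sketch assumes the variables above have been loaded into the current shell (e.g. via `set -a; source .env; set +a`); the `check_pipeline_env` helper is illustrative, not part of the repo:

```bash
# Verify the credentials from .env are set in the current shell.
# `check_pipeline_env` is an illustrative helper, not part of SWE-Next.
check_pipeline_env() {
  missing=0
  for v in OPENAI_API_KEY GITHUB_TOKEN DOCKERHUB_USERNAME DOCKERHUB_TOKEN DOCKERHUB_NAMESPACE; do
    eval "val=\${$v:-}"
    if [ -z "$val" ]; then
      echo "missing: $v" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run `check_pipeline_env` before launching the pipeline; it exits non-zero and names any unset variable.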
**Option 1 – Dataset only** (runs until `outputs/all_new_commit_better_pr.jsonl` is produced; no trajectories):
```bash
PR_GEN_TRAJ=0 zsh run_pr_pipeline.zsh
```
**Option 2 – Dataset + trajectories** (continues to run GPT-5-mini on the collected instances):
```bash
PR_GEN_TRAJ=1 PR_TRAJ_LLM_NAME=gpt-5-mini zsh run_pr_pipeline.zsh
```
To process a specific repo only:
```bash
PR_GEN_TRAJ=0 zsh run_pr_pipeline.zsh owner/repo
```
## 🏋️ Training
### Step 1 – Generate SFT Trajectories
Download the SWE-Next dataset first (see [Data & Models](#data--models)), then collect trajectories using a frontier LLM:
```bash
python src/swenext/agenthub/run/edit.py runagent_multiple \
--dataset "data/SWE-Next/SWE_Next_dataset.jsonl" \
--traj_dir "./traj/swe_next_sft" \
--max_workers 8 \
--k -1 \
--llm_name "gpt-5-mini" \
--use_fn_calling True \
--temperature 0.2 \
--max_steps 40 \
--backend "docker"
```
Or skip this step and use the pre-collected trajectories from HuggingFace (download `SWE-Next-SFT-Trajectories` above).
### Step 2 – SFT Training
Clone [LlamaFactory](https://github.com/hiyouga/LLaMA-Factory) into the project root first:
```bash
git clone https://github.com/hiyouga/LLaMA-Factory.git LlamaFactory
```
Install LlamaFactory dependencies, then train (run from the project root):
```bash
cd LlamaFactory && pip install -e ".[torch,metrics]" && cd ..
# Train 7B agent
llamafactory-cli train train/swe_next_7B.yaml
# Train 14B agent
llamafactory-cli train train/swe_next_14B.yaml
```
Trained model checkpoints will be saved to `LlamaFactory/saves/SWE_Next_7B` and `LlamaFactory/saves/SWE_Next_14B`.
### Step 3 – Evaluate on SWE-Bench Verified
Start a vLLM server with the trained model, then run evaluation:
```bash
# Start vLLM server (in a separate terminal)
vllm serve LlamaFactory/saves/SWE_Next_7B \
--served-model-name SWE-Next-7B \
--port 8000
# Run evaluation on SWE-Bench Verified (8 parallel workers)
export LLM_BASE_URL="http://127.0.0.1:8000/v1"
python src/swenext/agenthub/run/edit.py runagent_multiple \
--dataset "R2E-Gym/SWE-Bench-Verified" \
--split "test" \
--traj_dir "./traj/swe_bench_verified" \
--max_workers 8 \
--k -1 \
--llm_name "openai/SWE-Next-7B" \
--use_fn_calling False \
--temperature 1 \
--max_steps 40 \
--backend "docker"
```
> Use the official [SWE-Bench evaluation harness](https://github.com/SWE-bench/SWE-bench) for final reported scores.
## 📜 Citation
```bibtex
@misc{liang2026swenextscalablerealworldsoftware,
title={SWE-Next: Scalable Real-World Software Engineering Tasks for Agents},
author={Jiarong Liang and Zhiheng Lyu and Zijie Liu and Xiangchao Chen and Ping Nie and Kai Zou and Wenhu Chen},
year={2026},
eprint={2603.20691},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2603.20691},
}
```