	# SWE-Next: Scalable Real-World Software Engineering Tasks for Agents

	<p align="left">
	<a href="https://arxiv.org/abs/2603.20691"><img alt="Paper" src="https://img.shields.io/badge/Paper-arXiv-b31b1b?style=for-the-badge&logo=arxiv&logoColor=white"></a>
	<a href="https://tiger-ai-lab.github.io/SWE-Next/"><img alt="Project Page" src="https://img.shields.io/badge/Project%20Page-Website-4285F4?style=for-the-badge&logo=googlechrome&logoColor=white"></a>
	<a href="https://github.com/TIGER-AI-Lab/SWE-Next"><img alt="Code" src="https://img.shields.io/badge/Code-GitHub-181717?style=for-the-badge&logo=github&logoColor=white"></a>
	<a href="https://huggingface.co/datasets/TIGER-Lab/SWE-Next-SFT-Trajectories"><img alt="SFT Trajs" src="https://img.shields.io/badge/SFT%20Trajs-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
	<a href="https://huggingface.co/datasets/TIGER-Lab/SWE-Next"><img alt="Dataset" src="https://img.shields.io/badge/Dataset-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
	<a href="https://huggingface.co/TIGER-Lab/SWE-Next-7B"><img alt="Model 7B" src="https://img.shields.io/badge/Model%207B-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
	<a href="https://huggingface.co/TIGER-Lab/SWE-Next-14B"><img alt="Model 14B" src="https://img.shields.io/badge/Model%2014B-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000"></a>
	</p>

	## 📰 News

	- 2026-04-07: SWE-Next is now publicly released!

	## 📖 Introduction

	SWE-Next introduces reusable repo-quarter profiles, which share one environment across temporally nearby commits while keeping each task run separate and reproducible. Using only 30 hours and 639 GB of environment storage, SWE-Next processes 3,971 seed repositories and 102,582 candidate commit pairs mined from real merged PRs to construct a dataset of 2,308 self-verifying instances. Training on SWE-Next improves downstream pass@1 on SWE-Bench Verified and SWE-Bench Lite with fewer or comparable training trajectories, making large-scale executable data collection far more practical and accessible for research.



	## ✨ Highlights

	- Scaled Environment Generation: SWE-Next is an execution-grounded framework that turns real merged-PR commits into self-verifying SWE tasks and pairs them with high-signal trajectories.

	- Repo-quarter Profiles: A reusable environment mechanism that amortizes build and storage cost across temporally nearby commits, substantially reducing resource requirements and accelerating large-scale executable SWE data collection.
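
To make the repo-quarter idea concrete, here is a minimal sketch of how commits might be bucketed so that nearby commits share one environment. This README does not specify how a profile is keyed, so the calendar-quarter rule, the `repo_quarter_key` and `group_by_profile` helpers, and the sample commits are all illustrative assumptions rather than the project's implementation.

```python
from collections import defaultdict
from datetime import datetime, timezone


def repo_quarter_key(repo: str, commit_time: datetime) -> str:
    """Return a hypothetical profile key such as 'owner/repo@2024Q3'.

    Assumption: a "quarter" is a calendar quarter of the commit timestamp.
    """
    quarter = (commit_time.month - 1) // 3 + 1
    return f"{repo}@{commit_time.year}Q{quarter}"


def group_by_profile(commits):
    """Group (repo, sha, commit_time) tuples by their repo-quarter key."""
    groups = defaultdict(list)
    for repo, sha, commit_time in commits:
        groups[repo_quarter_key(repo, commit_time)].append(sha)
    return dict(groups)


# Made-up commits, for illustration only.
commits = [
    ("owner/repo", "a1b2c3", datetime(2024, 7, 2, tzinfo=timezone.utc)),
    ("owner/repo", "d4e5f6", datetime(2024, 9, 28, tzinfo=timezone.utc)),
    ("owner/repo", "0a1b2c", datetime(2024, 10, 1, tzinfo=timezone.utc)),
]

# The two Q3 commits land in one profile; the Q4 commit gets its own.
profiles = group_by_profile(commits)
```

Under this assumption, an environment would be built once per key and reused for every commit in that bucket, which is where the build and storage savings would come from.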


	## 🛠️ Setup

	### Prerequisites

	- Python 3.10+
	- Docker (for environment execution)
	- [uv](https://github.com/astral-sh/uv) package manager

	### Installation

	```bash
	curl -LsSf https://astral.sh/uv/install.sh | sh
	source $HOME/.local/bin/env

	git clone https://github.com/TIGER-AI-Lab/SWE-Next.git
	cd SWE-Next
	uv venv && source .venv/bin/activate
	uv sync && uv pip install -e .
	```

	## 🤗 Data & Models

	Pre-built artifacts are available on HuggingFace. Download them into `data/` before running the pipeline:

	| Artifact | Description | Download |
	|----------|-------------|----------|
	| `packages_python_filtered` | 3,900+ Python package list used as pipeline input | `huggingface-cli download TIGER-Lab/packages_python_filtered --repo-type dataset --local-dir data/packages_python_filtered` |
	| `new_commit_better_repos` | Repos with confirmed NEW_COMMIT_BETTER commits | `huggingface-cli download TIGER-Lab/new_commit_better_repos --repo-type dataset --local-dir data/new_commit_better_repos` |
	| `SWE-Next` | Final curated dataset (2,308 instances) | `huggingface-cli download TIGER-Lab/SWE-Next --repo-type dataset --local-dir data/SWE-Next` |
	| `SWE-Next-SFT-Trajectories` | SFT training trajectories | `huggingface-cli download TIGER-Lab/SWE-Next-SFT-Trajectories --repo-type dataset --local-dir data/SWE-Next-SFT-Trajectories` |

	Pre-trained models:

	| Model | Download |
	|-------|----------|
	| SWE-Next-7B | `huggingface-cli download TIGER-Lab/SWE-Next-7B --repo-type model --local-dir LlamaFactory/saves/SWE_Next_7B` |
	| SWE-Next-14B | `huggingface-cli download TIGER-Lab/SWE-Next-14B --repo-type model --local-dir LlamaFactory/saves/SWE_Next_14B` |
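
If you would rather script the downloads than run each command by hand, the following sketch does the same thing from Python. It assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the `ARTIFACTS` list and the `download_all` helper are names introduced here for illustration, with repo IDs and target directories copied from the tables above.

```python
from pathlib import Path

# (repo_id, repo_type, local_dir) triples copied from the tables above.
ARTIFACTS = [
    ("TIGER-Lab/packages_python_filtered", "dataset", "data/packages_python_filtered"),
    ("TIGER-Lab/new_commit_better_repos", "dataset", "data/new_commit_better_repos"),
    ("TIGER-Lab/SWE-Next", "dataset", "data/SWE-Next"),
    ("TIGER-Lab/SWE-Next-SFT-Trajectories", "dataset", "data/SWE-Next-SFT-Trajectories"),
    ("TIGER-Lab/SWE-Next-7B", "model", "LlamaFactory/saves/SWE_Next_7B"),
    ("TIGER-Lab/SWE-Next-14B", "model", "LlamaFactory/saves/SWE_Next_14B"),
]


def download_all(artifacts=ARTIFACTS):
    """Download every listed artifact into its target directory."""
    # Imported lazily so this module can be loaded without huggingface_hub.
    from huggingface_hub import snapshot_download

    for repo_id, repo_type, local_dir in artifacts:
        Path(local_dir).mkdir(parents=True, exist_ok=True)
        snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=local_dir)
```

Call `download_all()` from the project root to fetch everything, or pass a shorter list (for example `ARTIFACTS[2:3]`) to fetch only the curated dataset. The model checkpoints are large, so trimming the list is usually worthwhile.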

	## 🐳 Environment Generation

	SWE-Next extends environment generation to 3,900+ Python packages.

	The supported package list is maintained in [`data/packages_python_filtered/packages_python_filtered.csv`](data/packages_python_filtered/packages_python_filtered.csv), and the target repositories are listed in [`data/new_commit_better_repos/new_commit_better_repos.csv`](data/new_commit_better_repos/new_commit_better_repos.csv).

	## 🚀 Data Pipeline (One-Click)

	`run_pr_pipeline.zsh` automates the full data collection pipeline. It reads `data/packages_python_filtered/packages_python_filtered.csv`, clones the repos automatically, and processes them end-to-end. If the CSV is not present, it falls back to repos already cloned under `outputs/upstream_repos/`.

	Prerequisites: copy `.env.template` to `.env` and fill in your credentials:
	```
	OPENAI_API_KEY=... # required for synthetic issue generation
	GITHUB_TOKEN=... # required for fetching PRs
	DOCKERHUB_USERNAME=... # required for pushing Docker images
	DOCKERHUB_TOKEN=...
	DOCKERHUB_NAMESPACE=... # your Docker Hub namespace
	```
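
Before starting a long run, it can help to confirm that `.env` is actually filled in. The sketch below is a small stand-alone checker, not part of the repository: `parse_env` and `missing_keys` are hypothetical helpers, and treating all five variables as required is an assumption based on the template above.

```python
from pathlib import Path

# Variable names taken from the .env template above.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "GITHUB_TOKEN",
    "DOCKERHUB_USERNAME",
    "DOCKERHUB_TOKEN",
    "DOCKERHUB_NAMESPACE",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment and any surrounding quotes.
        value = value.split(" #", 1)[0].strip().strip("\"'")
        values[key.strip()] = value
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent, empty, or still set to '...'."""
    values = parse_env(text)
    return [key for key in required if not values.get(key) or values[key] == "..."]
```

A typical use is `missing_keys(Path(".env").read_text())`, which returns an empty list when every variable has a real value.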

	Option 1 — Dataset only (runs until `outputs/all_new_commit_better_pr.jsonl` is produced, no trajectories):
	```bash
	PR_GEN_TRAJ=0 zsh run_pr_pipeline.zsh
	```

	Option 2 — Dataset + trajectories (continues to run GPT-5-mini on the collected instances):
	```bash
	PR_GEN_TRAJ=1 PR_TRAJ_LLM_NAME=gpt-5-mini zsh run_pr_pipeline.zsh
	```

	To process a specific repo only:
	```bash
	PR_GEN_TRAJ=0 zsh run_pr_pipeline.zsh owner/repo
	```

	## 🏋️ Training

	### Step 1 — Generate SFT Trajectories

	Download the SWE-Next dataset first (see [Data & Models](#data--models)), then collect trajectories using a frontier LLM:

	```bash
	python src/swenext/agenthub/run/edit.py runagent_multiple \
	--dataset "data/SWE-Next/SWE_Next_dataset.jsonl" \
	--traj_dir "./traj/swe_next_sft" \
	--max_workers 8 \
	--k -1 \
	--llm_name "gpt-5-mini" \
	--use_fn_calling True \
	--temperature 0.2 \
	--max_steps 40 \
	--backend "docker"
	```

	Alternatively, skip this step and use the pre-collected `SWE-Next-SFT-Trajectories` from HuggingFace (see the download command under Data & Models).
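
A quick sanity check on the downloaded dataset file can catch a partial download before you spend API budget on trajectories. The sketch below counts records and tallies top-level keys without assuming any schema; `summarize_jsonl` is a hypothetical helper, and the only thing taken from this README is the dataset path used in the command above.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Return (record count, Counter of top-level keys) for a JSONL file."""
    count = 0
    key_counts = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            count += 1
            if isinstance(record, dict):
                key_counts.update(record.keys())
    return count, key_counts
```

Running `summarize_jsonl("data/SWE-Next/SWE_Next_dataset.jsonl")` should report a record count in line with the 2,308 instances listed under Data & Models, assuming the file stores one instance per line.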

	### Step 2 — SFT Training

	Clone [LlamaFactory](https://github.com/hiyouga/LLaMA-Factory) into the project root first:

	```bash
	git clone https://github.com/hiyouga/LLaMA-Factory.git LlamaFactory
	```

	Install LlamaFactory dependencies, then train (run from the project root):

	```bash
	cd LlamaFactory && pip install -e ".[torch,metrics]" && cd ..

	# Train 7B agent
	llamafactory-cli train train/swe_next_7B.yaml

	# Train 14B agent
	llamafactory-cli train train/swe_next_14B.yaml
	```

	Trained model checkpoints will be saved to `LlamaFactory/saves/SWE_Next_7B` and `LlamaFactory/saves/SWE_Next_14B`.

	### Step 3 — Evaluate on SWE-Bench Verified

	Start a vLLM server with the trained model, then run evaluation:

	```bash
	# Start vLLM server (in a separate terminal)
	vllm serve LlamaFactory/saves/SWE_Next_7B \
	--served-model-name SWE-Next-7B \
	--port 8000

	# Run evaluation on SWE-Bench Verified (8 parallel workers)
	export LLM_BASE_URL="http://127.0.0.1:8000/v1"

	python src/swenext/agenthub/run/edit.py runagent_multiple \
	--dataset "R2E-Gym/SWE-Bench-Verified" \
	--split "test" \
	--traj_dir "./traj/swe_bench_verified" \
	--max_workers 8 \
	--k -1 \
	--llm_name "openai/SWE-Next-7B" \
	--use_fn_calling False \
	--temperature 1 \
	--max_steps 40 \
	--backend "docker"
	```

	> Use the official [SWE-Bench evaluation harness](https://github.com/SWE-bench/SWE-bench) for final reported scores.

	## 📝 Citation

	```bibtex
	@misc{liang2026swenextscalablerealworldsoftware,
	title={SWE-Next: Scalable Real-World Software Engineering Tasks for Agents},
	author={Jiarong Liang and Zhiheng Lyu and Zijie Liu and Xiangchao Chen and Ping Nie and Kai Zou and Wenhu Chen},
	year={2026},
	eprint={2603.20691},
	archivePrefix={arXiv},
	primaryClass={cs.SE},
	url={https://arxiv.org/abs/2603.20691},
	}
	```