Instructions to use aiXcoder/aiXapply-4B-SFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

- Libraries
  - Transformers

How to use aiXcoder/aiXapply-4B-SFT with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="aiXcoder/aiXapply-4B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("aiXcoder/aiXapply-4B-SFT")
model = AutoModelForCausalLM.from_pretrained("aiXcoder/aiXapply-4B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use aiXcoder/aiXapply-4B-SFT with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "aiXcoder/aiXapply-4B-SFT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aiXcoder/aiXapply-4B-SFT",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

Use Docker:

```shell
docker model run hf.co/aiXcoder/aiXapply-4B-SFT
```
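The curl call above can also be mirrored from Python using only the standard library. This is a sketch against the same endpoint, port, and model name as the example above; the request is only constructed here, not sent, so it runs without a live server.

```python
import json
from urllib import request

def build_chat_request(base_url: str, model: str, user_content: str) -> request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request
    mirroring the curl example (the request is not sent here)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(
    "http://localhost:8000",
    "aiXcoder/aiXapply-4B-SFT",
    "What is the capital of France?",
)
# With the vLLM server running, send it with:
#   request.urlopen(req)
```

The same construction works against the SGLang server below by swapping the base URL, since both expose the OpenAI-compatible API.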
- SGLang
How to use aiXcoder/aiXapply-4B-SFT with SGLang:
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "aiXcoder/aiXapply-4B-SFT" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aiXcoder/aiXapply-4B-SFT",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "aiXcoder/aiXapply-4B-SFT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aiXcoder/aiXapply-4B-SFT",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

- Docker Model Runner
How to use aiXcoder/aiXapply-4B-SFT with Docker Model Runner:
```shell
docker model run hf.co/aiXcoder/aiXapply-4B-SFT
```
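According to the repository docs referenced in this model card, the Continue proxy strips `<update_file>...</update_file>` tags from the model's output before returning the applied file. A minimal sketch of that post-processing step, assuming this tag format (the proxy's actual implementation may differ):

```python
import re

# Capture everything between the tags; optional surrounding newlines
# are dropped so the extracted file content starts and ends cleanly.
UPDATE_FILE_RE = re.compile(r"<update_file>\n?(.*?)\n?</update_file>", re.DOTALL)

def strip_update_file_tags(model_output: str) -> str:
    """Return the file content inside <update_file>...</update_file>,
    or the raw output unchanged when no tags are present."""
    m = UPDATE_FILE_RE.search(model_output)
    return m.group(1) if m else model_output

print(strip_update_file_tags("<update_file>\nprint('hi')\n</update_file>"))
# → print('hi')
```

Falling back to the raw output when the tags are missing keeps the proxy usable even if the model omits the wrapper.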
Upload folder using huggingface_hub

README.md CHANGED

```diff
@@ -1,4 +1,40 @@
-
+---
+license: apache-2.0
+base_model: Qwen/Qwen3-4B
+base_model_relation: finetune
+library_name: transformers
+pipeline_tag: text-generation
+language:
+- code
+tags:
+- qwen3
+- code
+- code-generation
+- full-file-apply
+- apply-model
+- openai-compatible
+- ide
+datasets:
+- aiXcoder/aiXapply_test_data
+metrics:
+- accuracy
+model-index:
+- name: aiXapply-4B-SFT
+  results:
+  - task:
+      type: text-generation
+      name: Full-File Apply
+    dataset:
+      type: aiXcoder/aiXapply_test_data
+      name: aiXapply main benchmark
+      split: main_test_data
+    metrics:
+    - type: accuracy
+      name: Average equivalence accuracy
+      value: 0.944
+---
+
+# aiXapply-4B-SFT
 
 <p align="center">
 <a href="#overview">Overview</a> |
@@ -13,17 +49,19 @@
 </p>
 
 <p align="center">
-<a href="LICENSE"><img src="https://img.shields.io/badge/License-Apache--2.0-blue.svg" alt="Apache-2.0 license"></a>
-<img src="https://img.shields.io/badge/GitHub-
+<a href="https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-Apache--2.0-blue.svg" alt="Apache-2.0 license"></a>
+<img src="https://img.shields.io/badge/GitHub-aiXapply--4B-black.svg" alt="GitHub repository">
 <img src="https://img.shields.io/badge/HuggingFace-Test%20Data-yellow.svg" alt="Hugging Face test dataset">
 <img src="https://img.shields.io/badge/Task-Full--File%20Apply-green.svg" alt="Full-file Apply task">
 <img src="https://img.shields.io/badge/Model-4B-orange.svg" alt="4B model">
 <img src="https://img.shields.io/badge/Endpoint-OpenAI--Compatible-lightgrey.svg" alt="OpenAI-compatible endpoint">
 </p>
 
-**aiXapply** is
+**aiXapply-4B-SFT** is the supervised fine-tuned aiXapply model for **Full-File Apply**. Given an original file and a localized update snippet, it generates the complete updated file while preserving everything outside the requested edit.
 
-
+Use this SFT model as the default choice for high full-file Apply accuracy and long-context fidelity. It reaches **94.4%** average equivalence accuracy on the 1,637-sample main benchmark and shows stronger long-context structural preservation in the reported generalization experiments. For the RL-aligned variant used in the latency/accuracy frontier and cross-format experiments, also see [`aiXcoder/aiXapply-4B-RL`](https://huggingface.co/aiXcoder/aiXapply-4B-RL).
+
+This model is part of the official artifact release for paper:
 
 > **AiXapply: Fast and Reliable Full-File Code Integration with Specialized Small Models for IDE Workflows**
 
@@ -31,9 +69,7 @@ This repository is the official artifact repository for:
 
 Modern coding assistants often produce a local edit snippet first. The hard downstream step is applying that snippet to the original file without changing unrelated code. Unified diffs are compact but brittle, and search-and-replace is easy to generate but depends on exact string matching. aiXapply treats this downstream step as a standalone code-integration task.
 
-
-
-*Figure 1: aiXapply in an IDE workflow. An upstream coding assistant proposes an update snippet, aiXapply expands it into a complete updated file, and the IDE presents the resulting diff for review.*
+In an IDE workflow, an upstream coding assistant proposes an update snippet, aiXapply expands it into a complete updated file, and the IDE presents the resulting diff for review. See the [code repository](https://github.com/aixcoder-plugin/aiXapply-4B) for figures, scripts, and full experiment details.
 
 The repository includes:
 
@@ -204,7 +240,7 @@ python3 continue_apply_proxy.py
 
 Then merge the `apply` model block from `continue_config/continue.config.yaml.example` into your Continue config. The proxy strips `<update_file>...</update_file>` tags before returning the result to Continue and supports streaming responses.
 
-See [continue_config/README.md](continue_config/README.md) for configuration details and troubleshooting.
+See [`continue_config/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/continue_config/README.md) for configuration details and troubleshooting.
 
 ## Dataset
 
@@ -237,7 +273,7 @@ Dataset scale:
 
 The test set covers C, C++, Dockerfile, Go, HTML, INI, Java, JavaScript, JSON, Makefile, Markdown, Python, reStructuredText, Rust, Shell, SQL, Text, TypeScript, XML, and YAML.
 
-See [data_generation/README.md](data_generation/README.md) for scripts, configs, and reconstruction steps.
+See [`data_generation/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/data_generation/README.md) for scripts, configs, and reconstruction steps.
 
 ## Training
 
@@ -341,13 +377,11 @@ The primary metric is **equivalence accuracy**:
 - Structured formats such as JSON, YAML, XML, and INI are parsed or classified as invalid when parsing fails.
 - Errors can be grouped into `OUTPUT_INVALID`, `PATCH_NOT_APPLIED`, `PATCH_INCOMPLETE`, `PATCH_INCORRECT`, `WRONG_POSITION`, and `OUT_OF_PATCH_SIDE_EFFECT`.
 
-See [experiments/README.md](experiments/README.md) and [experiments/evaluation/README.md](experiments/evaluation/README.md) for the full experiment layout.
+See [`experiments/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/experiments/README.md) and [`experiments/evaluation/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/experiments/evaluation/README.md) for the full experiment layout.
 
 ## Results
 
-
-
-*Figure 3: Accuracy-latency comparison across unified diff, search-and-replace, and full-file Apply. aiXapply-RL keeps full-file Apply accuracy while reducing latency to an interactive range.*
+aiXapply-RL keeps full-file Apply accuracy while reducing latency to an interactive range in the latency/accuracy frontier experiments, while aiXapply-SFT provides the strongest reported main-benchmark accuracy and long-context result.
 
 ### Main Benchmark
 
@@ -404,13 +438,13 @@ In the aiXcoder IDE plugin, aiXapply is deployed as a dedicated Apply service af
 
 ## Contributing
 
-Contributions are welcome. Please read [CONTRIBUTING.md](CONTRIBUTING.md) before opening issues or pull requests.
+Contributions are welcome. Please read [`CONTRIBUTING.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/CONTRIBUTING.md) before opening issues or pull requests.
 
 For useful bug reports, include the script or endpoint you ran, the command/configuration, the observed output or traceback, and enough model/provider context to reproduce the problem.
 
 ## License
 
-This
+This model is licensed under the Apache License 2.0. See the [code repository LICENSE](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/LICENSE) for details.
 
 ## Citation
 
```
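The evaluation described in this model card scores **equivalence accuracy**, parsing structured formats such as JSON and classifying unparseable output as invalid. The following is a simplified illustration of that idea, not the repository's actual evaluator; the function name and the JSON-only dispatch are assumptions for the sketch.

```python
import json

def equivalent(predicted: str, reference: str, fmt: str = "text") -> bool:
    """Illustrative equivalence check: structured formats are parsed and
    compared as data, so formatting-only differences still count as
    equivalent; a parse failure on the prediction counts as invalid
    output (False). Plain text falls back to exact string comparison."""
    if fmt == "json":
        try:
            return json.loads(predicted) == json.loads(reference)
        except json.JSONDecodeError:
            return False  # unparseable prediction, treated as invalid
    return predicted == reference

# Whitespace-only JSON differences are still equivalent:
print(equivalent('{"a": 1}', '{ "a" : 1 }', fmt="json"))  # → True
```

The repository's evaluator additionally handles YAML, XML, and INI and assigns the finer-grained error categories listed above.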