Instructions to use aiXcoder/aiXapply-4B-SFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

- Libraries
  - Transformers

How to use aiXcoder/aiXapply-4B-SFT with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="aiXcoder/aiXapply-4B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("aiXcoder/aiXapply-4B-SFT")
model = AutoModelForCausalLM.from_pretrained("aiXcoder/aiXapply-4B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use aiXcoder/aiXapply-4B-SFT with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "aiXcoder/aiXapply-4B-SFT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aiXcoder/aiXapply-4B-SFT",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

Use Docker:

```shell
docker model run hf.co/aiXcoder/aiXapply-4B-SFT
```
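The curl call above can also be mirrored from Python using only the standard library. This is a sketch against the same endpoint, port, and model name as the example above; the request is only constructed here, not sent, so it runs without a live server.

```python
import json
from urllib import request

def build_chat_request(base_url: str, model: str, user_content: str) -> request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request
    mirroring the curl example (the request is not sent here)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(
    "http://localhost:8000",
    "aiXcoder/aiXapply-4B-SFT",
    "What is the capital of France?",
)
# With the vLLM server running, send it with:
#   request.urlopen(req)
```

The same construction works against the SGLang server below by swapping the base URL, since both expose the OpenAI-compatible API.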
- SGLang
How to use aiXcoder/aiXapply-4B-SFT with SGLang:
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "aiXcoder/aiXapply-4B-SFT" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aiXcoder/aiXapply-4B-SFT",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "aiXcoder/aiXapply-4B-SFT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aiXcoder/aiXapply-4B-SFT",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

- Docker Model Runner
How to use aiXcoder/aiXapply-4B-SFT with Docker Model Runner:
```shell
docker model run hf.co/aiXcoder/aiXapply-4B-SFT
```
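According to the repository docs referenced in this model card, the Continue proxy strips `<update_file>...</update_file>` tags from the model's output before returning the applied file. A minimal sketch of that post-processing step, assuming this tag format (the proxy's actual implementation may differ):

```python
import re

# Capture everything between the tags; optional surrounding newlines
# are dropped so the extracted file content starts and ends cleanly.
UPDATE_FILE_RE = re.compile(r"<update_file>\n?(.*?)\n?</update_file>", re.DOTALL)

def strip_update_file_tags(model_output: str) -> str:
    """Return the file content inside <update_file>...</update_file>,
    or the raw output unchanged when no tags are present."""
    m = UPDATE_FILE_RE.search(model_output)
    return m.group(1) if m else model_output

print(strip_update_file_tags("<update_file>\nprint('hi')\n</update_file>"))
# → print('hi')
```

Falling back to the raw output when the tags are missing keeps the proxy usable even if the model omits the wrapper.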
Upload folder using huggingface_hub

README.md CHANGED

```diff
@@ -1,4 +1,40 @@
-
+---
+license: apache-2.0
+base_model: Qwen/Qwen3-4B
+base_model_relation: finetune
+library_name: transformers
+pipeline_tag: text-generation
+language:
+- code
+tags:
+- qwen3
+- code
+- code-generation
+- full-file-apply
+- apply-model
+- openai-compatible
+- ide
+datasets:
+- aiXcoder/aiXapply_test_data
+metrics:
+- accuracy
+model-index:
+- name: aiXapply-4B-SFT
+  results:
+  - task:
+      type: text-generation
+      name: Full-File Apply
+    dataset:
+      type: aiXcoder/aiXapply_test_data
+      name: aiXapply main benchmark
+      split: main_test_data
+    metrics:
+    - type: accuracy
+      name: Average equivalence accuracy
+      value: 0.944
+---
+
+# aiXapply-4B-SFT
 
 <p align="center">
 <a href="#overview">Overview</a> |
@@ -13,17 +49,19 @@
 </p>
 
 <p align="center">
-<a href="LICENSE"><img src="https://img.shields.io/badge/License-Apache--2.0-blue.svg" alt="Apache-2.0 license"></a>
-<img src="https://img.shields.io/badge/GitHub-
+<a href="https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-Apache--2.0-blue.svg" alt="Apache-2.0 license"></a>
+<img src="https://img.shields.io/badge/GitHub-aiXapply--4B-black.svg" alt="GitHub repository">
 <img src="https://img.shields.io/badge/HuggingFace-Test%20Data-yellow.svg" alt="Hugging Face test dataset">
 <img src="https://img.shields.io/badge/Task-Full--File%20Apply-green.svg" alt="Full-file Apply task">
 <img src="https://img.shields.io/badge/Model-4B-orange.svg" alt="4B model">
 <img src="https://img.shields.io/badge/Endpoint-OpenAI--Compatible-lightgrey.svg" alt="OpenAI-compatible endpoint">
 </p>
 
-**aiXapply** is
+**aiXapply-4B-SFT** is the supervised fine-tuned aiXapply model for **Full-File Apply**. Given an original file and a localized update snippet, it generates the complete updated file while preserving everything outside the requested edit.
 
-
+Use this SFT model as the default choice for high full-file Apply accuracy and long-context fidelity. It reaches **94.4%** average equivalence accuracy on the 1,637-sample main benchmark and shows stronger long-context structural preservation in the reported generalization experiments. For the RL-aligned variant used in the latency/accuracy frontier and cross-format experiments, also see [`aiXcoder/aiXapply-4B-RL`](https://huggingface.co/aiXcoder/aiXapply-4B-RL).
+
+This model is part of the official artifact release for paper:
 
 > **AiXapply: Fast and Reliable Full-File Code Integration with Specialized Small Models for IDE Workflows**
 
@@ -31,9 +69,7 @@ This repository is the official artifact repository for:
 
 Modern coding assistants often produce a local edit snippet first. The hard downstream step is applying that snippet to the original file without changing unrelated code. Unified diffs are compact but brittle, and search-and-replace is easy to generate but depends on exact string matching. aiXapply treats this downstream step as a standalone code-integration task.
 
-
-
-*Figure 1: aiXapply in an IDE workflow. An upstream coding assistant proposes an update snippet, aiXapply expands it into a complete updated file, and the IDE presents the resulting diff for review.*
+In an IDE workflow, an upstream coding assistant proposes an update snippet, aiXapply expands it into a complete updated file, and the IDE presents the resulting diff for review. See the [code repository](https://github.com/aixcoder-plugin/aiXapply-4B) for figures, scripts, and full experiment details.
 
 The repository includes:
 
@@ -204,7 +240,7 @@ python3 continue_apply_proxy.py
 
 Then merge the `apply` model block from `continue_config/continue.config.yaml.example` into your Continue config. The proxy strips `<update_file>...</update_file>` tags before returning the result to Continue and supports streaming responses.
 
-See [continue_config/README.md](continue_config/README.md) for configuration details and troubleshooting.
+See [`continue_config/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/continue_config/README.md) for configuration details and troubleshooting.
 
 ## Dataset
 
@@ -237,7 +273,7 @@ Dataset scale:
 
 The test set covers C, C++, Dockerfile, Go, HTML, INI, Java, JavaScript, JSON, Makefile, Markdown, Python, reStructuredText, Rust, Shell, SQL, Text, TypeScript, XML, and YAML.
 
-See [data_generation/README.md](data_generation/README.md) for scripts, configs, and reconstruction steps.
+See [`data_generation/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/data_generation/README.md) for scripts, configs, and reconstruction steps.
 
 ## Training
 
@@ -341,13 +377,11 @@ The primary metric is **equivalence accuracy**:
 - Structured formats such as JSON, YAML, XML, and INI are parsed or classified as invalid when parsing fails.
 - Errors can be grouped into `OUTPUT_INVALID`, `PATCH_NOT_APPLIED`, `PATCH_INCOMPLETE`, `PATCH_INCORRECT`, `WRONG_POSITION`, and `OUT_OF_PATCH_SIDE_EFFECT`.
 
-See [experiments/README.md](experiments/README.md) and [experiments/evaluation/README.md](experiments/evaluation/README.md) for the full experiment layout.
+See [`experiments/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/experiments/README.md) and [`experiments/evaluation/README.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/experiments/evaluation/README.md) for the full experiment layout.
 
 ## Results
 
-
-
-*Figure 3: Accuracy-latency comparison across unified diff, search-and-replace, and full-file Apply. aiXapply-RL keeps full-file Apply accuracy while reducing latency to an interactive range.*
+aiXapply-RL keeps full-file Apply accuracy while reducing latency to an interactive range in the latency/accuracy frontier experiments, while aiXapply-SFT provides the strongest reported main-benchmark accuracy and long-context result.
 
 ### Main Benchmark
 
@@ -404,13 +438,13 @@ In the aiXcoder IDE plugin, aiXapply is deployed as a dedicated Apply service af
 
 ## Contributing
 
-Contributions are welcome. Please read [CONTRIBUTING.md](CONTRIBUTING.md) before opening issues or pull requests.
+Contributions are welcome. Please read [`CONTRIBUTING.md`](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/CONTRIBUTING.md) before opening issues or pull requests.
 
 For useful bug reports, include the script or endpoint you ran, the command/configuration, the observed output or traceback, and enough model/provider context to reproduce the problem.
 
 ## License
 
-This
+This model is licensed under the Apache License 2.0. See the [code repository LICENSE](https://github.com/aixcoder-plugin/aiXapply-4B/blob/main/LICENSE) for details.
 
 ## Citation
 
```
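The evaluation described in this model card scores **equivalence accuracy**, parsing structured formats such as JSON and classifying unparseable output as invalid. The following is a simplified illustration of that idea, not the repository's actual evaluator; the function name and the JSON-only dispatch are assumptions for the sketch.

```python
import json

def equivalent(predicted: str, reference: str, fmt: str = "text") -> bool:
    """Illustrative equivalence check: structured formats are parsed and
    compared as data, so formatting-only differences still count as
    equivalent; a parse failure on the prediction counts as invalid
    output (False). Plain text falls back to exact string comparison."""
    if fmt == "json":
        try:
            return json.loads(predicted) == json.loads(reference)
        except json.JSONDecodeError:
            return False  # unparseable prediction, treated as invalid
    return predicted == reference

# Whitespace-only JSON differences are still equivalent:
print(equivalent('{"a": 1}', '{ "a" : 1 }', fmt="json"))  # → True
```

The repository's evaluator additionally handles YAML, XML, and INI and assigns the finer-grained error categories listed above.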