---
license: llama2
base_model:
- codellama/CodeLlama-13b-Instruct-hf
pipeline_tag: text-generation
---

CodeLlama-13b-MORepair is a program repair model fine-tuned from CodeLlama-13b-Instruct with MORepair, a multi-objective fine-tuning framework. It is designed to improve automated program repair by learning both the code transformation itself and the reasoning behind the repair.

[Paper](https://arxiv.org/abs/2404.12636) | [Code](https://github.com/buaabarty/morepair) | [Colab](https://colab.research.google.com/drive/1vlabdN5Oucm-5kVtMHuEw-kvqDOtB5hg)

## Citation

If you use this model in your research, please cite:
```bibtex
@article{10.1145/3735129,
  author = {Yang, Boyang and Tian, Haoye and Ren, Jiadong and Zhang, Hongyu and Klein, Jacques and Bissyande, Tegawende and Le Goues, Claire and Jin, Shunfu},
  title = {MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning},
  year = {2025},
  publisher = {Association for Computing Machinery},
  issn = {1049-331X},
  url = {https://doi.org/10.1145/3735129},
  doi = {10.1145/3735129},
  journal = {ACM Trans. Softw. Eng. Methodol.},
}
```

## Model Description

- **Base Model**: CodeLlama-13b-Instruct
- **Training Technique**: Multi-objective fine-tuning with the MORepair framework
- **Supported Languages**: Evaluated primarily on C++ and Java; other languages may work but are less tested
- **Primary Use**: Automated program repair
- **License**: Llama 2 Community License
- **Evaluation Benchmarks**: [EvalRepair-Java](https://huggingface.co/datasets/barty/EvalRepair-Java) | [EvalRepair-C++](https://huggingface.co/datasets/barty/EvalRepair-Cpp) | [D4J-Repair](https://huggingface.co/datasets/barty/D4J-Repair) | [SWE-Repair](https://huggingface.co/datasets/barty/SWE-Repair)

## Training Details

### Training Data
- **Dataset**: [TutorLLMCode](https://tutorcode.org/docs/)
- **Size**: 1,535 pairs of buggy and repaired code
- **Nature**: Programming task corrections with LLM-generated repair guidance

### Training Approach
The model was trained with MORepair, which combines:
- Multi-objective learning with two objectives:
  1. Generating repaired code
  2. Generating repaired code together with explanatory guidance
- QLoRA fine-tuning (only 1.84% of parameters are trained)
- NEFTune for improved generalization
- LLM-generated guidance that teaches the repair logic

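To make the NEFTune item above concrete, here is a minimal sketch of the noise it adds to token embeddings during training: uniform noise in [-1, 1] scaled by `alpha / sqrt(L * d)`, where `L` is the sequence length and `d` the embedding dimension. The function name `neftune_noise` and the value `alpha=5.0` are assumptions for illustration; this card does not state the noise scale used for MORepair, and the sketch is plain Python rather than the actual training code.

```python
import math
import random

def neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of a [seq_len][dim] embedding matrix (NEFTune-style).

    alpha is a hypothetical default, not a value taken from this model card.
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # The scale shrinks as the sequence gets longer or the embedding wider.
    scale = alpha / math.sqrt(seq_len * dim)
    return [
        [value + rng.uniform(-1.0, 1.0) * scale for value in row]
        for row in embeddings
    ]

# Toy example: 4 tokens with 8-dimensional embeddings, all zeros.
toy = [[0.0] * 8 for _ in range(4)]
noisy = neftune_noise(toy, alpha=5.0, rng=random.Random(0))
max_shift = max(abs(v) for row in noisy for v in row)
print(max_shift <= 5.0 / math.sqrt(4 * 8))  # every entry stays within the scale
```

The noise is applied only during fine-tuning; inference uses the unmodified embeddings.
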
## Usage

Here's how to use the model with the Hugging Face Transformers library:

### Installation
`accelerate` and `bitsandbytes` are needed for the `device_map="auto"` and `load_in_8bit=True` options used below.
```bash
pip install transformers torch accelerate bitsandbytes
```

### Basic Usage
````python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load model and tokenizer
model_name = "barty/CodeLlama-13B-MORepair"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,
    torch_dtype=torch.float16
)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

def repair_code(buggy_code, filename="example.java"):
    # Construct the prompt in the format the model expects
    prompt = f"""[INST] This is an incorrect code({filename}):
```java
{buggy_code}
```
You are a software engineer. Can you repair the incorrect code?
[/INST]
```java
"""

    # Calculate token count for length control.
    # Prompt plus generation is capped at 500 tokens in total, so prompts
    # longer than about 436 tokens need a larger budget.
    prompt_tokens = len(tokenizer.tokenize(prompt))
    max_new_tokens = 500 - prompt_tokens

    # Generate repair
    output = pipe(
        prompt,
        min_length=prompt_tokens + 64,
        max_length=prompt_tokens + max_new_tokens,
        temperature=1.0,
        do_sample=True
    )

    # Extract the generated code (everything after the instruction block)
    full_text = output[0]['generated_text']
    fixed_code = full_text.split('[/INST]')[1].strip()

    return full_text, fixed_code

# Example usage
buggy_code = """
public static int findMinRotated(int[] arr) {
    int left = 0;
    int right = arr.length - 1;

    while (left < right) {
        int mid = (left + right) / 2;
        if (arr[mid] > arr[right])
            left = mid;  // Bug: should be mid + 1
        else
            right = mid;
    }
    return arr[left];
}
"""

full_response, fixed_code = repair_code(buggy_code)
print("Fixed code:")
print(fixed_code)
````

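The `fixed_code` string returned above still starts with the opening `java` code fence from the prompt and may include whatever the model emits after the closing fence. A minimal sketch of pulling out just the first fenced block follows; `extract_first_code_block` is a hypothetical helper, not part of the model's repository, and it assumes the response follows the fenced format used in the prompt.

```python
import re

FENCE = "`" * 3  # built programmatically to avoid nesting fences in this example

def extract_first_code_block(response):
    """Return the body of the first fenced block in `response`.

    Falls back to the stripped response when no closed fence is found,
    for example when generation was cut off by the length limit.
    """
    pattern = re.compile(
        re.escape(FENCE) + r"[\w+#-]*\n(.*?)" + re.escape(FENCE),
        re.DOTALL,
    )
    match = pattern.search(response)
    if match is None:
        # Unterminated fence: drop the opening fence line if present.
        if response.lstrip().startswith(FENCE):
            return response.lstrip().split("\n", 1)[-1].strip()
        return response.strip()
    return match.group(1).strip()

sample = FENCE + "java\nint x = 1;\n" + FENCE + "\nThe bug was an off-by-one error."
print(extract_first_code_block(sample))  # int x = 1;
```
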
### Important Parameters
- `load_in_8bit=True`: Loads the weights with 8-bit quantization to reduce memory use during inference
- `temperature=1.0`: Sampling temperature; higher values make generation more random
- `do_sample=True`: Enables sampling-based generation instead of greedy decoding
- `min_length`: Minimum total length in tokens, counting the prompt plus the generated text
- `max_length`: Maximum total length in tokens, counting the prompt plus the generated text

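Because `min_length` and `max_length` both count the prompt tokens, a long prompt can push `min_length` above `max_length` in the usage example. The following sketch shows one way to guard against that; `length_budget`, its parameter names, and the choice to extend the cap for long prompts are assumptions for illustration, with the 500 and 64 defaults taken from the usage example.

```python
def length_budget(prompt_tokens, total_budget=500, min_new_tokens=64):
    """Compute min_length/max_length values that are always consistent."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    min_length = prompt_tokens + min_new_tokens
    # Never let the cap fall below the minimum required length.
    max_length = max(total_budget, min_length)
    return {"min_length": min_length, "max_length": max_length}

print(length_budget(150))  # {'min_length': 214, 'max_length': 500}
print(length_budget(480))  # {'min_length': 544, 'max_length': 544}
```

The returned dictionary can be unpacked into the generation call, for example `pipe(prompt, **length_budget(prompt_tokens), do_sample=True)`.
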
## Limitations

- Performance varies across programming languages
- May require multiple attempts to generate a correct fix
- Should be used with appropriate test cases to validate repairs
- May not handle very complex or multi-file program repairs

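Since a single sample may not be a correct fix, one common pattern is to sample several candidates and keep the first one that passes the tests. A minimal sketch of that loop is below; `generate_fix` and `passes_tests` are hypothetical callables supplied by the caller (for instance, a wrapper around `repair_code` and a test-suite runner), and the attempt limit of 10 is an arbitrary assumption.

```python
def repair_with_validation(buggy_code, generate_fix, passes_tests, max_attempts=10):
    """Sample candidate fixes until one passes the tests or attempts run out.

    Returns (fixed_code, attempts_used), or (None, max_attempts) on failure.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = generate_fix(buggy_code)
        if passes_tests(candidate):
            return candidate, attempt
    return None, max_attempts

# Stand-in callables so the sketch runs without loading the model.
candidates = iter(["left = mid;", "left = mid - 1;", "left = mid + 1;"])
fixed, attempts = repair_with_validation(
    "left = mid;",
    generate_fix=lambda code: next(candidates),
    passes_tests=lambda code: code == "left = mid + 1;",
)
print(fixed, attempts)  # left = mid + 1; 3
```
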
## Technical Specifications

- **Architecture**: Based on CodeLlama-13b-Instruct
- **Parameters**: Same as base model (13B)
- **Fine-tuning Method**: QLoRA + NEFTune
- **Context Window**: Same as CodeLlama-13b-Instruct
- **Input Format**: Code snippets with optional repair guidance