---
license: mit
tags:
- codellama
- linux
- bugfix
- lora
- qlora
- git-diff
base_model: codellama/CodeLlama-7b-Instruct-hf
model_type: LlamaForCausalLM
library_name: peft
pipeline_tag: text-generation
---
# CodeLLaMA-Linux-BugFix
A fine-tuned version of `CodeLlama-7B-Instruct`, designed specifically for Linux kernel bug fixing using QLoRA (Quantized Low-Rank Adaptation). The model learns to generate Git diff patches from buggy C code and commit messages.
---
## Overview
This project targets automated Linux kernel bug fixing by:
- Mining real commit data from the kernel's Git history
- Training a QLoRA adapter to generate Git-style fixes
- Evaluating performance using BLEU and ROUGE
- Supporting integration into code review pipelines
---
## Performance Results
**BLEU score**: 33.87
**ROUGE scores**:
- ROUGE-1: P=0.3775, R=0.7306, F1=0.4355
- ROUGE-2: P=0.2898, R=0.6096, F1=0.3457
- ROUGE-L: P=0.3023, R=0.6333, F1=0.3612
These results indicate substantial overlap between generated diffs and the ground-truth patches. Recall is markedly higher than precision, so generations tend to cover the reference content while also emitting extra tokens.
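For intuition, ROUGE-1 precision, recall, and F1 reduce to unigram overlap between a generated diff and its reference patch. A minimal stdlib sketch of that computation (illustrative only, not the exact scorer used to produce the numbers above):

```python
from collections import Counter

def rouge1(candidate: str, reference: str):
    """Unigram-overlap ROUGE-1: returns (precision, recall, F1)."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    p = overlap / max(sum(cand.values()), 1)
    r = overlap / max(sum(ref.values()), 1)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Toy example: two one-line diffs sharing the first two tokens
p, r, f1 = rouge1("+ if (!file || !file->filter)", "+ if (!file->filter)")
```

A high-recall/lower-precision profile like the one reported falls out naturally when candidates are longer than references, since the denominator of precision grows with candidate length.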
---
## Model Configuration
- **Base model**: `CodeLlama-7B-Instruct`
- **Fine-tuning**: QLoRA (LoRA r=64, α=16, dropout=0.1)
- **Quantization**: 4-bit NF4
- **Training**: 3 epochs, batch size 64, learning rate 2e-4
- **Precision**: bfloat16 with gradient checkpointing
- **Hardware**: 1× NVIDIA H200 (144 GB VRAM)
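The hyperparameters above map onto a `BitsAndBytesConfig` plus a PEFT `LoraConfig` roughly as follows. This is a sketch assuming the `transformers`/`peft`/`bitsandbytes` stack; the `target_modules` list is a typical choice for Llama-family models and is not confirmed by this card:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with bfloat16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA r=64, alpha=16, dropout=0.1
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not stated in this card
)
```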
---
## Dataset
- 100,000 samples mined from Linux kernel Git commits
- Format: JSONL with `"prompt"` and `"completion"` fields
- Content: C code segments + commit messages → Git diffs
- Source: bug-fix commits filtered by keywords such as `fix`, `null`, `race`, `panic`
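A single training record in that JSONL format might look like the following. This is an illustrative sketch: only the `"prompt"`/`"completion"` field names come from this card; the file path, code, and commit message are hypothetical:

```python
import json

# Hypothetical example record; real contents come from mined kernel commits
record = {
    "prompt": (
        "Given the following original C code:\n"
        "if (!file->filter)\n"
        "    return;\n"
        "Commit message: fix null pointer dereference in filter check\n"
        "Return the diff that fixes it:"
    ),
    "completion": (
        "--- a/kernel/trace/trace_events.c\n"
        "+++ b/kernel/trace/trace_events.c\n"
        "@@ -1,2 +1,2 @@\n"
        "-if (!file->filter)\n"
        "+if (!file || !file->filter)\n"
        "     return;"
    ),
}

# One JSON object per line, as in the dataset files
line = json.dumps(record)
```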
---
## Usage
````python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = "codellama/CodeLlama-7b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(base)
model = PeftModel.from_pretrained(model, "train/output/qlora-codellama-bugfix")
tokenizer = AutoTokenizer.from_pretrained(base)

prompt = '''
Given the following original C code:
```c
if (!file->filter)
    return;
```
Instruction: Fix the null pointer dereference
Return the diff that fixes it:
'''

inputs = tokenizer(prompt, return_tensors="pt")
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_length=512, do_sample=True, temperature=0.1)
fix = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(fix)
````
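When wiring generations into a review pipeline, it can help to reject outputs that are not even shaped like a unified diff before attempting to apply them. A minimal stdlib sanity check (illustrative only, not part of this repository):

```python
def looks_like_unified_diff(text: str) -> bool:
    """Cheap shape check: file headers plus at least one hunk and one change line."""
    lines = text.splitlines()
    has_old = any(l.startswith("--- ") for l in lines)
    has_new = any(l.startswith("+++ ") for l in lines)
    has_hunk = any(l.startswith("@@") for l in lines)
    has_change = any(
        l[:1] in ("+", "-") and not l.startswith(("---", "+++")) for l in lines
    )
    return has_old and has_new and has_hunk and has_change

patch = """--- a/f.c
+++ b/f.c
@@ -1 +1 @@
-if (!file->filter)
+if (!file || !file->filter)
"""
```

Passing this check does not guarantee the patch applies cleanly; `git apply --check` on a real tree remains the authoritative test.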
---
## Structure
```
CodeLLaMA-Linux-BugFix/
├── dataset/            # Raw and processed JSONL files
├── dataset_builder/    # Scripts for mining & formatting commits
├── train/              # Training scripts & checkpoints
├── evaluate/           # Evaluation scripts & results
└── requirements.txt    # Dependencies
```
---
## Metrics
| Metric  | Score  |
|---------|--------|
| BLEU    | 33.87  |
| ROUGE-1 | 0.4355 |
| ROUGE-2 | 0.3457 |
| ROUGE-L | 0.3612 |

BLEU is reported on its usual 0-100 scale; the ROUGE entries are F1 scores.
---
## Use Cases
- Kernel patch suggestion tools
- Code review assistants
- Bug localization and repair research
- Automated program repair (APR) benchmarks for kernel code
---
## License
MIT License
---
## References
- [Code Llama](https://arxiv.org/abs/2308.12950)
- [QLoRA](https://arxiv.org/abs/2305.14314)
- [LoRA](https://arxiv.org/abs/2106.09685)