| license: mit | |
| base_model: microsoft/phi-3-mini-4k-instruct | |
| tags: | |
| - llm | |
| - code-generation | |
| - bug-fixing | |
| - lora | |
| - peft | |
| - python | |
| datasets: | |
| - mbpp | |
| metrics: | |
| - exact_match | |
| - similarity | |
| # DebugGPT LoRA Adapter for Phi-3 Mini | |
| A lightweight LoRA adapter for Phi-3 Mini, fine-tuned on synthetic Python bug-fixing tasks derived from the MBPP dataset. The adapter is intended to improve Phi-3 Mini's ability to detect and correct common Python syntax errors while preserving the base model's general language capabilities. | |
| --- | |
| ## Model Description | |
| - **Base Model:** microsoft/phi-3-mini-4k-instruct | |
| - **Fine-Tuning Method:** QLoRA (LoRA adapters trained on top of a 4-bit quantized base model) | |
| - **Task:** Automated Python bug fixing | |
| The model takes buggy Python code as input and generates the corrected version. | |
| --- | |
| ## Intended Use | |
| This model is designed for: | |
| - Python debugging assistance | |
| - Educational coding tools | |
| - AI-assisted code correction | |
| - Research experiments in code repair | |
| ### Out-of-Scope Use | |
| - Production-critical systems | |
| - Security-sensitive applications | |
| - Complex multi-file debugging | |
| --- | |
| ## Dataset | |
| We use the **MBPP (Mostly Basic Python Problems)** dataset. Because MBPP contains only correct reference code, we build the bug-fixing dataset by injecting synthetic bugs into that code and using the original code as the target output. | |
| ### Data Format | |
| Each example follows an instruction-tuning format: | |
| ```json | |
| { | |
| "instruction": "Fix the bug in the following Python code", | |
| "input": "<buggy code>", | |
| "output": "<correct code>" | |
| } | |
| ``` | |
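As an illustration of the format above, the sketch below builds one record and renders it as a single training string. The `make_record` and `to_prompt` helpers and the `### Instruction` / `### Input` / `### Response` template are assumptions made for this example; the card does not state which prompt template was used during training.

```python
import json

INSTRUCTION = "Fix the bug in the following Python code"


def make_record(buggy_code: str, correct_code: str) -> dict:
    """Build one example in the instruction/input/output layout."""
    return {
        "instruction": INSTRUCTION,
        "input": buggy_code,
        "output": correct_code,
    }


def to_prompt(record: dict) -> str:
    """Render a record as one training string (hypothetical template)."""
    return (
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Input:\n{record['input']}\n\n"
        f"### Response:\n{record['output']}"
    )


record = make_record(
    "def add(a, b):\n    return a - b",
    "def add(a, b):\n    return a + b",
)
line = json.dumps(record)  # one JSON object per line, i.e. JSONL
print(line)
print(to_prompt(record))
```

Storing records as JSONL keeps each example on one line, which most dataset loaders can read directly.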
| ### Bug Injection Strategy | |
| We introduce controlled bugs such as: | |
| - Operator replacement (`+` → `-`) | |
| - Comparison changes (`>` → `<`) | |
| - Removal of return statements | |
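The three bug types listed above could be injected along the following lines. This is a minimal sketch, not the project's actual injection script: the function names are hypothetical, and the regex-based approach is an assumption that would also alter operators inside string literals or comments.

```python
import random
import re
from typing import Optional


def replace_operator(code: str) -> str:
    """Swap the first `+` for `-`, leaving `+=` untouched."""
    return re.sub(r"\+(?!=)", "-", code, count=1)


def flip_comparison(code: str) -> str:
    """Turn the first strict `>` into `<`, skipping `>=`, `>>` and `->`."""
    return re.sub(r"(?<![-=>])>(?![=>])", "<", code, count=1)


def drop_return(code: str) -> str:
    """Delete the first line that is a `return` statement."""
    lines = code.splitlines()
    for i, line in enumerate(lines):
        if re.match(r"\s*return\b", line):
            del lines[i]
            break
    return "\n".join(lines)


INJECTORS = [replace_operator, flip_comparison, drop_return]


def inject_bug(code: str, rng: random.Random) -> Optional[str]:
    """Apply one randomly chosen injector that actually changes the code."""
    candidates = [f for f in INJECTORS if f(code) != code]
    if not candidates:
        return None  # nothing to mutate in this snippet
    return rng.choice(candidates)(code)


clean = "def total(a, b):\n    result = a + b\n    return result"
print(inject_bug(clean, random.Random(0)))
```

Filtering to injectors that change the code avoids emitting "buggy" examples that are identical to the reference solution.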
| ### Dataset Size | |
| | Split | Samples | | |
| |------------|---------| | |
| | Train | ~374 | | |
| | Validation | ~90 | | |
| | Test | ~500 | | |
| --- | |
| ## Training Procedure | |
| ### Method: QLoRA | |
| To enable efficient training on limited hardware: | |
| - Base model loaded in 4-bit precision (NF4) | |
| - Base weights frozen | |
| - Only LoRA adapters trained | |
| ### LoRA Configuration | |
| | Parameter | Value | | |
| |-----------------|------------------------------------| | |
| | Rank (r) | 16 | | |
| | Alpha | 32 | | |
| | Dropout | 0.05 | | |
| | Target Modules | q_proj, k_proj, v_proj, o_proj | | |
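The QLoRA setup and the LoRA table above might translate into configuration objects roughly as follows. This is a sketch under stated assumptions: `bias`, `task_type`, and the FP16 compute dtype are not given in this card, and `build_configs` is a hypothetical helper. The imports are kept inside the function so that the constants can be inspected without installing `torch`, `transformers`, `peft`, or `bitsandbytes`.

```python
# Values taken from the LoRA Configuration table in this card.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    # Worth verifying: some Transformers implementations of Phi-3 fuse the
    # query/key/value projections into a single `qkv_proj` module, so check
    # `model.named_modules()` to see which of these names actually match.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "bias": "none",            # assumption: not stated in the card
    "task_type": "CAUSAL_LM",  # assumption: usual choice for decoder-only models
}

# Standard LoRA scales the low-rank update by alpha / r.
LORA_SCALING = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]


def build_configs():
    """Create the 4-bit quantization config and the LoRA adapter config."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,  # assumption: matches FP16 training
    )
    return bnb_config, LoraConfig(**LORA_KWARGS)


print(f"LoRA scaling factor: {LORA_SCALING}")
```

With rank 16 and alpha 32, the adapter update is scaled by a factor of 2.0 before being added to the frozen weights.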
| ### Training Configuration | |
| | Parameter | Value | | |
| |------------------------|---------| | |
| | Epochs | 3 | | |
| | Learning Rate | 2e-4 | | |
| | Batch Size | 1 | | |
| | Gradient Accumulation | 8 | | |
| | Precision | FP16 | | |
| | Optimizer | AdamW | | |
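To make the interaction between batch size and gradient accumulation concrete, the arithmetic below estimates the effective batch size and the number of optimizer steps. It assumes a single GPU and a training split of 374 samples (the card gives this figure only approximately); the exact step count also depends on how the trainer handles the final partial batch.

```python
import math

train_samples = 374        # approximate training split size from this card
per_device_batch_size = 1
grad_accum_steps = 8
epochs = 3

# Gradients from 8 micro-batches of size 1 are accumulated per optimizer step.
effective_batch = per_device_batch_size * grad_accum_steps

# Counting the last, partially filled accumulation window as one step.
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, steps_per_epoch, total_steps)  # 8 47 141
```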
| --- | |
| ## Hardware & Frameworks | |
| - **GPU:** NVIDIA Tesla T4 | |
| - **Frameworks:** Hugging Face Transformers, PEFT (LoRA), TRL (SFTTrainer), Weights & Biases | |
| --- | |
| ## Evaluation Results | |
| ### Performance Summary | |
| | Metric | Base Model | Fine-Tuned Model | | |
| |-------------------------|---------------|--------------------| | |
| | Syntax Fix Accuracy | Low | Noticeably Higher | | |
| | Indentation Correction | Inconsistent | Reliable | | |
| | Variable Error Fixing | Occasional | Improved | | |
| | Complex Logic Bugs | Limited | Limited (unchanged)| | |
| | Instruction Adherence | Moderate | High | | |
| > **Note:** The ratings above are qualitative. Quantitative metrics (e.g., exact-match accuracy, CodeBLEU) were not computed due to dataset and tooling constraints. | |
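The card's metadata lists `exact_match` and `similarity` as metrics, although no scores are reported. One way such metrics could be computed with the standard library alone is sketched below. The helper names, the whitespace normalization, and the use of `difflib` as the similarity measure are all assumptions, not the project's evaluation code.

```python
import ast
from difflib import SequenceMatcher


def normalize(code: str) -> str:
    """Drop blank lines and trailing whitespace so formatting noise is ignored."""
    return "\n".join(line.rstrip() for line in code.splitlines() if line.strip())


def exact_match(prediction: str, reference: str) -> bool:
    return normalize(prediction) == normalize(reference)


def similarity(prediction: str, reference: str) -> float:
    """Character-level similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, normalize(prediction), normalize(reference)).ratio()


def is_valid_python(code: str) -> bool:
    """Check that the code parses, without executing it."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


def evaluate(pairs):
    """Average each metric over (prediction, reference) pairs."""
    n = len(pairs)
    return {
        "exact_match": sum(exact_match(p, r) for p, r in pairs) / n,
        "similarity": sum(similarity(p, r) for p, r in pairs) / n,
        "valid_syntax": sum(is_valid_python(p) for p, _ in pairs) / n,
    }


reference = "for i in range(5):\n    print(i)"
pairs = [
    ("for i in range(5):\n    print(i)\n", reference),  # correct fix
    ("for i in range(5)\n    print(i)", reference),     # colon still missing
]
scores = evaluate(pairs)
print(scores)
```

Exact match is strict and penalizes harmless rewrites, so pairing it with a softer similarity score and a parse check gives a more complete picture.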
| --- | |
| ## Example | |
| ### Input — Buggy Code | |
| ```python | |
| for i in range(5) | |
|     print(i) | |
| ``` | |
| ### Output — Fixed Code | |
| ```python | |
| for i in range(5): | |
|     print(i) | |
| ``` | |
| --- | |
| ## Limitations | |
| - Small dataset size limits generalization | |
| - Focused primarily on syntax-level bugs | |
| - Limited performance on complex logical errors | |
| - Not evaluated on large-scale real-world codebases | |
| --- | |
| ## Discussion | |
| ### What Worked Well | |
| - QLoRA enabled efficient fine-tuning on limited hardware | |
| - Qualitatively noticeable improvement on syntax correction tasks | |
| - Strong adherence to instruction format | |
| ### Challenges | |
| - Limited dataset size | |
| - Lack of quantitative evaluation metrics | |
| - Difficulty handling complex multi-line logic bugs | |
| ### Ethical Considerations | |
| - The model may generate incorrect fixes for complex bugs | |
| - Should be used as an assistive tool, not a final authority | |
| - Users should validate outputs before deployment | |
| --- | |
| ## How to Use | |
| ```python | |
| from peft import PeftModel | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| # Load the base model and its tokenizer. | |
| base_id = "microsoft/phi-3-mini-4k-instruct" | |
| base_model = AutoModelForCausalLM.from_pretrained(base_id) | |
| tokenizer = AutoTokenizer.from_pretrained(base_id) | |
| # Attach the LoRA adapter weights on top of the base model. | |
| model = PeftModel.from_pretrained(base_model, "Sud1212/phi3-debug-llm-lora") | |
| model.eval() | |
| # Build a prompt containing the buggy snippet and move it to the model's device. | |
| prompt = "Fix the bug:\nfor i in range(5)\n    print(i)" | |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) | |
| # Generate the fix; the decoded text includes the prompt followed by the completion. | |
| outputs = model.generate(**inputs, max_new_tokens=100) | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |
| ``` | |
| --- | |
| ## Resources | |
| - **GitHub Repository:** [Phi3-debugLLM-LoRA](https://github.com/suddhumaddi/Phi3-debugLLM-LoRA) | |
| - **Weights & Biases Dashboard:** [W&B Project](https://wandb.ai/suddhumaddi-woxsen-university/huggingface) | |
| - **Dataset (MBPP):** [Hugging Face Datasets](https://huggingface.co/datasets/mbpp) | |
| --- | |
| ## Author | |
| **Sudarshan Maddi** | |
| Woxsen University | |
| --- | |
| ## License | |
| MIT License |