---
language:
  - en
  - code
license: apache-2.0
tags:
  - code
  - java
  - bug-fixing
  - code-repair
  - qwen2.5
  - supervised-fine-tuning
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
model_type: qwen2
pipeline_tag: text-generation
---

HaiJava-Surgeon-Qwen2.5-Coder-7B-SFT-v1

Model Description

HaiJava-Surgeon-Qwen2.5-Coder-7B-SFT-v1 is a specialized code-repair model fine-tuned on Java bug-fixing tasks. It is based on Qwen2.5-Coder-7B-Instruct and was trained with supervised fine-tuning (SFT) using LoRA adapters, which were then merged into the base model so it loads as a standalone checkpoint.

Model Details

  • Model Name: HaiJava-Surgeon-Qwen2.5-Coder-7B-SFT-v1
  • Version: v1.0
  • Base Model: Qwen/Qwen2.5-Coder-7B-Instruct
  • Model Type: Causal Language Model (Decoder-only Transformer)
  • Architecture: Qwen2ForCausalLM
  • Parameters: 7.61B
  • Precision: FP16
  • Context Length: 32,768 tokens
  • Fine-tuning Method: LoRA (Low-Rank Adaptation) merged into base
  • Training Steps: 1,000
  • Release Date: 2026-01-02

Intended Use

This model is designed for:

  • Java bug detection and repair
  • Syntax error correction
  • Logic bug fixing
  • Code quality improvement
  • Automated code review assistance
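A consistent prompt format helps across these tasks. A minimal prompt builder in the style of the Quick Start example below; note that the exact prompt wording used during training is not published, so this phrasing is an assumption:

```python
def build_repair_prompt(code: str, language: str = "Java") -> str:
    """Format a bug-fixing request in the style of the Quick Start
    example. The training prompt template is not published; this
    wording is an assumption."""
    return f"Fix the bug in the following {language} code:\n\n{code}"
```

The same builder can target the other listed tasks by varying the instruction sentence.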

Performance

Evaluated on a 50-sample diverse test set:

| Metric           | Base Model | HaiJava-Surgeon v1 | Relative Improvement |
|------------------|------------|--------------------|----------------------|
| Overall Accuracy | 18%        | 28%                | +55.6%               |
| Syntax Errors    | 60%        | 90%                | +50%                 |
| Logic Bugs       | 30%        | 40%                | +33%                 |

Statistical Significance: p = 0.0238 (paired t-test; significant at α = 0.05)
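The rightmost column above reports the gain relative to the base model, not the absolute percentage-point difference. For overall accuracy:

```python
base_acc, tuned_acc = 0.18, 0.28

# Relative improvement: (new - old) / old
relative_gain = (tuned_acc - base_acc) / base_acc
print(f"{relative_gain:.1%}")  # 55.6%
```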

Strengths

  • Excellent at syntax error detection and repair (90% accuracy)
  • Good at logic bug fixing (40% accuracy)
  • Shows some generalization to JavaScript (50% accuracy, though on only 2 OOD samples)

Limitations

  • ⚠️ Struggles with API misuse detection (0% accuracy)
  • ⚠️ Limited edge case handling (0% accuracy)
  • ⚠️ Needs improvement on null pointer exception fixes (0% accuracy)
  • ⚠️ Limited Python support (0% accuracy on 3 OOD samples)

Usage

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "HaiJava-Surgeon-Qwen2.5-Coder-7B-SFT-v1",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(
    "HaiJava-Surgeon-Qwen2.5-Coder-7B-SFT-v1",
    trust_remote_code=True
)

# Prepare input
buggy_code = """
public class Example {
    public static void main(String[] args) {
        int x = 10
        System.out.println(x);
    }
}
"""

prompt = f"Fix the bug in the following Java code:\n\n{buggy_code}"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate fix
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)

# Decode response
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
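The model typically replies with explanatory prose plus a markdown-fenced code block. A small helper to pull out just the repaired code, assuming the reply uses a fenced `java` block (adjust the pattern if your outputs differ):

```python
import re

FENCE = "`" * 3  # a literal triple backtick

def extract_java(response: str) -> str:
    """Return the contents of the first fenced code block in a reply,
    or the raw text when no fence is present."""
    pattern = FENCE + r"(?:java)?\s*\n(.*?)" + FENCE
    match = re.search(pattern, response, re.DOTALL)
    return match.group(1).strip() if match else response.strip()
```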

Recommended Generation Parameters

For Maximum Accuracy:

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=False,      # Greedy decoding
    num_beams=5,          # Beam search
)

For Speed:

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,
)

Training Details

Training Data

  • Domain: Java bug-fixing
  • Categories: Syntax errors, logic bugs, API misuse, edge cases, null handling
  • Training Steps: 1,000

Training Configuration

  • Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • Target Modules: q_proj, v_proj
  • Dropout: 0.05
  • Optimizer: AdamW
  • Learning Rate: 5e-5

Hardware

  • GPU: NVIDIA GPU with CUDA support
  • Training Time: ~2-3 hours
  • Framework: LLaMA-Factory + PyTorch

Evaluation

Evaluated on 50 diverse samples covering:

  • Syntax errors (10 samples)
  • Logic bugs (10 samples)
  • API misuse (10 samples)
  • Edge cases (10 samples)
  • Null handling (5 samples)
  • Out-of-distribution: JavaScript (2 samples)
  • Out-of-distribution: Python (3 samples)

Evaluation Metrics:

  • Exact match accuracy
  • Normalized edit distance
  • Statistical significance testing (paired t-test)
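The exact evaluation script is not published; one plausible implementation of the normalized edit distance metric (Levenshtein distance scaled by the longer string's length, so 0.0 means identical and 1.0 means maximally different):

```python
def normalized_edit_distance(a: str, b: str) -> float:
    """Levenshtein distance between a and b, divided by the length of
    the longer string. Uses the standard two-row dynamic program."""
    if not a and not b:
        return 0.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1] / max(len(a), len(b))
```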

License

This model inherits the Apache 2.0 license from the base model: Qwen/Qwen2.5-Coder-7B-Instruct

Citation

If you use this model, please cite:

@misc{haijava-surgeon-v1,
  title={HaiJava-Surgeon-Qwen2.5-Coder-7B-SFT-v1: A Specialized Java Bug-Fixing Model},
  author={Your Name/Organization},
  year={2026},
  url={https://huggingface.co/your-username/HaiJava-Surgeon-Qwen2.5-Coder-7B-SFT-v1}
}

Acknowledgments

  • Base Model: Qwen Team (Alibaba Cloud)
  • Fine-tuning Framework: LLaMA-Factory
  • Evaluation: Custom 50-sample test suite