---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
  - abap
  - sap
  - code
  - orpo
  - fine-tuned
  - qwen2
language:
  - en
pipeline_tag: text-generation
library_name: transformers
---

# Qwen-Coder-ABAP

Fine-tuned [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) for **modern ABAP 7.4+ code generation**.

Trained with **ORPO (Odds Ratio Preference Optimization)** on a dataset of 280 ABAP preference pairs to promote modern syntax and reduce legacy patterns.

## Model Details

| Attribute | Value |
|-----------|-------|
| Base Model | Qwen2.5-Coder-7B-Instruct |
| Fine-tuning Method | ORPO |
| Training Examples | 280 preference pairs |
| LoRA Rank | 32 |
| LoRA Alpha | 64 |
| Training Epochs | 3 |
| Hardware | NVIDIA RTX 4060 Ti 16GB |

## Performance

Benchmarked on 12 ABAP coding tasks (modernization, basic coding, completion). The net score is the number of modern patterns minus the number of legacy patterns:

| Metric | Base Model | Fine-tuned | Improvement |
|--------|------------|------------|-------------|
| Modern ABAP patterns | 18 | 23 | +28% |
| Legacy patterns | 7 | 2 | -71% |
| Net score | +11 | +21 | +91% |
| Inference time | 74.7s | 23.5s | 3.2x faster |
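
The improvement column can be reproduced from the raw counts. A quick check of the arithmetic (the variable and function names are illustrative and not part of the benchmark code):

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / abs(before) * 100


# Counts taken from the table above
modern = {"base": 18, "tuned": 23}
legacy = {"base": 7, "tuned": 2}

# Net score = modern patterns minus legacy patterns
net = {key: modern[key] - legacy[key] for key in modern}

modern_gain = pct_change(modern["base"], modern["tuned"])
legacy_drop = pct_change(legacy["base"], legacy["tuned"])
net_gain = pct_change(net["base"], net["tuned"])
speedup = 74.7 / 23.5  # base inference time / fine-tuned inference time

print(f"modern {modern_gain:+.0f}%, legacy {legacy_drop:+.0f}%, "
      f"net {net_gain:+.0f}%, speedup {speedup:.1f}x")
```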

## Usage

### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "oisee/qwen-coder-abap"
# torch_dtype="auto" loads the weights in their saved precision instead of float32
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are an ABAP programming assistant specialized in modern ABAP 7.4+ syntax."},
    {"role": "user", "content": "Convert this to modern ABAP: READ TABLE lt_data INTO ls_row WITH KEY id = 1."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

### Ollama

```bash
ollama run oisee/qwen-coder-abap "Convert READ TABLE to modern ABAP"
```

Also available as quantized GGUF: [ollama.com/oisee/qwen-coder-abap](https://ollama.com/oisee/qwen-coder-abap)
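
For programmatic access, a locally running Ollama server can also be called over its REST API. The sketch below assumes the default local endpoint `http://localhost:11434/api/generate` and that the model has already been pulled; the helper names `build_payload` and `generate` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_payload(prompt: str, model: str = "oisee/qwen-coder-abap") -> dict:
    """Build a non-streaming generate request for the Ollama REST API."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Example (requires a running Ollama server):
# print(generate("Convert READ TABLE to modern ABAP"))
```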

## Modern ABAP Patterns (Promoted)

The model is trained to prefer these modern ABAP 7.4+ patterns:

```abap
" Inline declarations
DATA(lv_result) = calculate_total( ).
ASSIGN lt_customers[ 1 ] TO FIELD-SYMBOL(<ls_first>).

" Table expressions (instead of READ TABLE)
DATA(ls_customer) = lt_customers[ id = '12345' ].

" NEW operator (instead of CREATE OBJECT)
DATA(lo_handler) = NEW zcl_handler( iv_config = 'DEFAULT' ).

" String templates (instead of CONCATENATE)
DATA(lv_msg) = |Customer { lv_id } has { lv_count } orders|.

" VALUE constructor (an inline declaration needs an explicit type; ty_t_data is illustrative)
DATA(lt_data) = VALUE ty_t_data( ( id = 1 name = 'A' ) ( id = 2 name = 'B' ) ).

" REDUCE for aggregation
DATA(lv_sum) = REDUCE #( INIT s = 0 FOR row IN lt_data NEXT s = s + row-amount ).

" FILTER for table filtering (lt_data needs a sorted or hashed key on status)
DATA(lt_active) = FILTER #( lt_data WHERE status = 'A' ).

" Modern LOOP with inline field-symbol
LOOP AT lt_data ASSIGNING FIELD-SYMBOL(<ls_row>).
  <ls_row>-processed = abap_true.
ENDLOOP.
```

## Legacy Patterns (Avoided)

The model learns to avoid these legacy patterns:

```abap
" Legacy - model avoids these
READ TABLE lt_data INTO ls_row WITH KEY id = 1.
CREATE OBJECT lo_handler.
CALL METHOD lo_handler->process.
CONCATENATE lv_a lv_b INTO lv_result.
MOVE lv_source TO lv_target.
MOVE-CORRESPONDING ls_source TO ls_target.
DATA: lv_var TYPE string.  " Colon syntax
```
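
One way to check generated code against the two lists is a simple regex scan. The sketch below is an assumption about how such a check could look, not the scoring used in the benchmark above; the pattern dictionaries are deliberately short, illustrative subsets.

```python
import re

# Illustrative subsets of the patterns listed above (not the benchmark's rules)
LEGACY = {
    "READ TABLE": r"\bREAD\s+TABLE\b",
    "CREATE OBJECT": r"\bCREATE\s+OBJECT\b",
    "CALL METHOD": r"\bCALL\s+METHOD\b",
    "CONCATENATE": r"\bCONCATENATE\b",
    "MOVE": r"\bMOVE(?:-CORRESPONDING)?\b",
}
MODERN = {
    "inline DATA": r"\bDATA\(",
    "inline FIELD-SYMBOL": r"\bFIELD-SYMBOL\(",
    "NEW": r"\bNEW\s+[\w#]+\(",
    "VALUE": r"\bVALUE\s+[\w#]+\(",
    "string template": r"\|[^|\n]*\{[^}\n]*\}[^|\n]*\|",
}


def count_patterns(code: str, patterns: dict) -> int:
    """Count case-insensitive matches of every pattern in the given code."""
    return sum(len(re.findall(p, code, flags=re.IGNORECASE)) for p in patterns.values())


def net_score(code: str) -> int:
    """Modern matches minus legacy matches; higher means more modern code."""
    return count_patterns(code, MODERN) - count_patterns(code, LEGACY)


legacy_snippet = "CREATE OBJECT lo_handler.\nCONCATENATE lv_a lv_b INTO lv_result."
modern_snippet = "DATA(lo_handler) = NEW zcl_handler( ).\nDATA(lv_msg) = |{ lv_a }{ lv_b }|."
print(net_score(legacy_snippet), net_score(modern_snippet))
```

A scan like this does not skip ABAP comments or string literals, so it is only a rough signal and not a substitute for a syntax check.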

## Training Dataset

The ORPO training dataset contains **280 high-quality preference pairs** covering:

| Category | Examples | Patterns |
|----------|----------|----------|
| Constructor Expressions | 45 | VALUE #, NEW #, CORRESPONDING #, COND #, SWITCH #, REDUCE |
| Inline Declarations | 30 | DATA(), FIELD-SYMBOL(), @DATA for SELECT |
| String Templates | 25 | \|text { var }\| with formatting |
| Table Expressions | 35 | lt_table[ key = value ], OPTIONAL, DEFAULT |
| Modern SELECT | 25 | @DATA, INTO TABLE @, host variables |
| Exception Handling | 15 | TRY/CATCH with cx_root |
| AMDP/HANA | 12 | AMDP procedures, table functions |
| RAP/BDEF | 10 | Behavior definitions, draft handling |
| ALV/SALV | 15 | CL_SALV_TABLE patterns |
| Unit Testing | 18 | cl_abap_unit_assert patterns |
| Other | 50 | JSON, HTTP, File operations, BAL logging |

Each example contains:
- `prompt`: The coding task
- `chosen`: Modern ABAP solution (preferred)
- `rejected`: Legacy ABAP equivalent (discouraged)
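
A single record might look like the following. The contents are a hypothetical example written for illustration, not an entry from the actual dataset; only the three field names come from the list above, and `is_valid_pair` is an assumed helper.

```python
import json

# Hypothetical preference pair; only the keys mirror the dataset schema
example = {
    "prompt": "Read the customer with id '12345' from lt_customers.",
    "chosen": "DATA(ls_customer) = lt_customers[ id = '12345' ].",
    "rejected": "READ TABLE lt_customers INTO ls_customer WITH KEY id = '12345'.",
}

REQUIRED_FIELDS = ("prompt", "chosen", "rejected")


def is_valid_pair(record: dict) -> bool:
    """Check that all three fields are non-empty strings and the pair differs."""
    has_fields = all(
        isinstance(record.get(field), str) and bool(record[field].strip())
        for field in REQUIRED_FIELDS
    )
    return has_fields and record["chosen"] != record["rejected"]


print(json.dumps(example, indent=2))
print(is_valid_pair(example))
```

The `prompt` / `chosen` / `rejected` layout is the preference format that TRL's ORPO trainer consumes, so records of this shape can be loaded without further mapping.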

## Training Configuration

```python
from trl import ORPOConfig

# ORPO config
orpo_config = ORPOConfig(
    max_length=1536,
    beta=0.1,  # weight of the odds-ratio penalty term
    learning_rate=8e-6,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    optim="adamw_8bit",
)

# LoRA settings, written as keyword arguments for the adapter setup
lora_kwargs = dict(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```
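
These settings imply a small number of optimizer steps. A back-of-the-envelope calculation, assuming a single GPU and that all 280 pairs are used for training with no held-out split (the second point is not stated in this card):

```python
import math

num_examples = 280          # preference pairs in the dataset
per_device_batch_size = 1
grad_accum_steps = 8
num_gpus = 1                # assumption: one RTX 4060 Ti
epochs = 3

# Examples consumed per optimizer update
effective_batch = per_device_batch_size * grad_accum_steps * num_gpus
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs

# LoRA scales the adapter update by alpha / r
lora_scaling = 64 / 32

print(effective_batch, steps_per_epoch, total_steps, lora_scaling)
```

Under those assumptions the run performs 35 updates per epoch and 105 in total, with a LoRA scaling factor of 2.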

## Limitations

- Focused on ABAP 7.4+ syntax; may not cover all SAP-specific APIs
- Training data is synthetic, so real-world code may contain edge cases the model has not seen
- Best for code modernization and generation tasks
- 7B parameter model; larger models may produce higher quality for complex tasks

## License

Apache 2.0 (inherited from [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct))

## Citation

```bibtex
@misc{qwen-coder-abap,
  author = {oisee},
  title = {Qwen-Coder-ABAP: Fine-tuned Qwen2.5-Coder for Modern ABAP},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/oisee/qwen-coder-abap}
}
```

## Acknowledgments

- [Qwen Team](https://github.com/QwenLM) for Qwen2.5-Coder
- [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning
- [TRL](https://github.com/huggingface/trl) for ORPO implementation