---
language:
- code
license: apache-2.0
tags:
- differential-privacy
- code-generation
- continued-pretraining
- lora
- dp-sgd
- opacus
- privacy
datasets:
- melihcatal/codedp-cpt
base_model:
- ibm-granite/granite-4.0-h-tiny
- bigcode/starcoder2-7b
- Qwen/Qwen3-4B-Instruct-2507
library_name: peft
pipeline_tag: text-generation
---

# CodeDP-CPT: Differentially Private Continued Pre-Training for Code Models

This repository contains LoRA adapters for code language models trained with **Continued Pre-Training (CPT)** under **Differential Privacy (DP-SGD)**. The models demonstrate that formal privacy guarantees can be applied to code generation models while preserving utility.

## Models

Nine adapter checkpoints are provided (three base models × three privacy configurations):

| Base Model | Variant | DP | Target ε | Achieved ε | Adapter Path |
|---|---|---|---|---|---|
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | base | No | — | — | `granite-4.0-h-tiny/base/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp3 | Yes | 3.0 | 2.99 | `granite-4.0-h-tiny/dp3/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp8 | Yes | 8.0 | 8.00 | `granite-4.0-h-tiny/dp8/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | base | No | — | — | `starcoder2-7b/base/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp3 | Yes | 3.0 | 3.00 | `starcoder2-7b/dp3/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp8 | Yes | 8.0 | 8.00 | `starcoder2-7b/dp8/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | base | No | — | — | `qwen3-4b-instruct/base/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp3 | Yes | 3.0 | 2.99 | `qwen3-4b-instruct/dp3/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp8 | Yes | 8.0 | 8.00 | `qwen3-4b-instruct/dp8/adapter/` |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "ibm-granite/granite-4.0-h-tiny"
adapter_path = "melihcatal/codedp-cpt-models"
subfolder = "granite-4.0-h-tiny/dp8/adapter"

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base_model_name, trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter_path, subfolder=subfolder)
```
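
Once the adapter is loaded, generation works as with any causal LM. A minimal smoke test (the prompt and decoding settings below are illustrative, not the evaluation setup):

```python
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```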

## Training Details

### Dataset

- **Dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt) – code mined from GitHub repositories with quality filtering and decontamination (file-level, Type-1, and Type-2 clone detection against evaluation benchmarks)
- **Mode:** Causal language modeling (continued pre-training)
- **Validation split:** 5% held out
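
The exact split procedure is not documented in this card; a minimal sketch of reproducing a 5% hold-out from the CPT corpus, assuming a single `train` split and a seeded random split:

```python
from datasets import load_dataset

# Load the CPT corpus and hold out 5% for validation (the seed is an assumption).
ds = load_dataset("melihcatal/codedp-cpt", split="train")
split = ds.train_test_split(test_size=0.05, seed=42)
train_ds, val_ds = split["train"], split["test"]
```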

### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha (α) | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Modules to save | lm_head |
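
The table above corresponds roughly to the following `peft` `LoraConfig`; the `task_type` value is an assumption, and the authoritative settings live in each variant's `resolved_config.yaml`:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    modules_to_save=["lm_head"],
    task_type="CAUSAL_LM",
)
```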

### Training Hyperparameters

| Parameter | No-DP (base) | DP variants |
|---|---|---|
| Epochs | 2 | 2 |
| Micro-batch size (per GPU) | 8 | 8 |
| Learning rate | 1e-4 | 2e-4 |
| Optimizer | AdamW | AdamW |
| LR scheduler | Cosine | Cosine |
| Warmup ratio | 5% | 5% |
| Max gradient norm | 1.0 | 1.0 |
| Sequence length | 1024 | 1024 |
| Precision | bfloat16 | bfloat16 |
| Seed | 42 | 42 |

**Effective batch sizes** (micro-batch × gradient accumulation steps × GPUs):

| Model | GPUs | No-DP | DP ε=3 / ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 4 | 256 (8×8×4) | 512 (8×16×4) |
| StarCoder2-7B | 4 | 256 (8×8×4) | 512 (8×16×4) |
| Qwen3-4B-Instruct | 8 | 256 (8×4×8) | 512 (8×8×8) |
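
The effective batch size is simply the product of the three factors in the parentheses; a quick sanity check:

```python
def effective_batch_size(micro_batch: int, grad_accum: int, gpus: int) -> int:
    return micro_batch * grad_accum * gpus

assert effective_batch_size(8, 8, 4) == 256    # Granite / StarCoder2, no-DP
assert effective_batch_size(8, 16, 4) == 512   # Granite / StarCoder2, DP
assert effective_batch_size(8, 4, 8) == 256    # Qwen3, no-DP
assert effective_batch_size(8, 8, 8) == 512    # Qwen3, DP
```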

### Differential Privacy

| Parameter | Value |
|---|---|
| Engine | Opacus PrivacyEngine |
| Mechanism | Gaussian (DP-SGD) |
| Per-sample gradients | Hook-based |
| Clipping | Flat (global) |
| Target δ | 1e-5 |
| Target ε | 3.0 or 8.0 |
| Privacy accounting | RDP (Rényi Differential Privacy) |
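
The settings above map onto Opacus roughly as follows. This is a minimal, self-contained sketch rather than the actual training script: the placeholder model and data stand in for the LoRA-adapted LM and the tokenized CPT corpus, and the DDP-specific plumbing of the real multi-GPU runs is omitted.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Placeholder model, optimizer, and data so the sketch runs end to end.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,))), batch_size=8
)

# Gaussian-mechanism DP-SGD with flat (global) clipping and RDP accounting.
privacy_engine = PrivacyEngine(accountant="rdp")
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    target_epsilon=8.0,   # 3.0 for the dp3 variants
    target_delta=1e-5,
    epochs=2,
    max_grad_norm=1.0,    # flat per-sample gradient clipping threshold
)

# After training, the spent budget can be queried with
# privacy_engine.get_epsilon(delta=1e-5).
```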

### Infrastructure

- **GPUs:** NVIDIA H200 (140 GB VRAM each) – 4 GPUs for Granite and StarCoder2, 8 GPUs for Qwen
- **CUDA:** 13.0
- **Distributed strategy:** DDP (Distributed Data Parallel) with NCCL backend

## Evaluation Results

### Functional Correctness – CodeDP-FC (Granite-4.0-H-Tiny)

Evaluated on 103 code generation tasks with 10 samples per task at temperature 0.8.

| Variant | pass@1 | pass@5 | pass@10 |
|---|---|---|---|
| No fine-tuning | 13.5% | 18.4% | 20.4% |
| CPT (no DP) | 10.1% | 16.6% | 18.4% |
| CPT + DP (ε=3) | 13.7% | 19.1% | 21.4% |
| CPT + DP (ε=8) | **14.5%** | **21.1%** | **23.3%** |
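
With 10 samples per task, pass@k is presumably computed with the standard unbiased estimator used for HumanEval-style evaluation; a reference sketch:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples drawn from n generations
    is correct, given that c of the n generations pass the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: a task with 2 passing generations out of 10 samples.
print(pass_at_k(n=10, c=2, k=1), pass_at_k(n=10, c=2, k=5))
```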

### Training Loss (Eval Set)

| Model | No-DP | DP ε=3 | DP ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 0.946 | 1.044 | 1.038 |
| StarCoder2-7B | 0.745 | 0.843 | 0.841 |
| Qwen3-4B-Instruct | 0.808 | 0.941 | 0.925 |

### Privacy Audit

A new-token canary audit with 500 member and 500 non-member canaries and 49-token random prefixes. Higher AUC indicates more memorization; lower AUC indicates better privacy.

| Model | Variant | Loss AUC | Embedding AUC | Empirical ε (p=0.01) |
|---|---|---|---|---|
| Granite-4.0-H-Tiny | base | 1.000 | 1.000 | 3.02 |
| Granite-4.0-H-Tiny | dp3 | 0.543 | 0.513 | 0.00 |
| Granite-4.0-H-Tiny | dp8 | 0.564 | 0.508 | 0.16 |
| StarCoder2-7B | base | 1.000 | 0.916 | 3.02 |
| StarCoder2-7B | dp3 | 0.526 | 0.521 | 0.00 |
| StarCoder2-7B | dp8 | 0.520 | 0.523 | 0.00 |
| Qwen3-4B-Instruct | base | 0.969 | 0.884 | 3.02 |
| Qwen3-4B-Instruct | dp3 | 0.505 | 0.515 | 0.00 |
| Qwen3-4B-Instruct | dp8 | 0.515 | 0.516 | 0.00 |

**Key finding:** DP training reduces canary audit AUC to near-random (0.5), with empirical ε dropping to 0 in most cases, confirming that the formal privacy guarantees hold in practice.
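
The AUC values above compare per-sample scores between canary members and non-members; a generic sketch of how a loss-based AUC could be computed (the arrays below are synthetic placeholders, not the actual contents of `audit_scores.npz`):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Placeholder per-sample losses for 500 member and 500 non-member canaries.
member_losses = rng.normal(loc=1.0, scale=0.3, size=500)
nonmember_losses = rng.normal(loc=1.2, scale=0.3, size=500)

# Lower loss on a canary suggests memorization, so negate losses so that a
# higher score means "more likely a member".
scores = np.concatenate([-member_losses, -nonmember_losses])
labels = np.concatenate([np.ones(500), np.zeros(500)])
print("Loss AUC:", roc_auc_score(labels, scores))
```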

### MIA Benchmark Validation – BoW Distribution Shift

The canary MIA benchmark ([melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)) uses a targeted design where member and non-member samples share the same code prefix and differ only in the PII secret. A bag-of-words Random Forest classifier (5-fold CV) confirms no distribution shift:

| PII Type | BoW AUC | ± std | n |
|---|---|---|---|
| Overall | 0.099 | 0.018 | 400 |
| api_key | 0.033 | 0.047 | 80 |
| db_url | 0.311 | 0.105 | 80 |
| email | 0.078 | 0.099 | 80 |
| internal_ip | 0.028 | 0.021 | 80 |
| password | 0.055 | 0.048 | 80 |

All BoW AUC values are well below 0.5, confirming that MIA signal must come from the model's knowledge of the secret, not surface-level text features.

<details>
<summary>BoW shift test code</summary>

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score
import numpy as np
from datasets import load_dataset

ds = load_dataset("melihcatal/codedp-bench-canary-mia", split="train")
records = list(ds)

def bow_shift(texts, labels, n_folds=5):
    X = CountVectorizer(max_features=5000, stop_words="english").fit_transform(texts)
    y = np.array(labels)
    aucs = []
    for tr, te in StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=42).split(X, y):
        clf = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
        clf.fit(X[tr], y[tr])
        aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
    return np.mean(aucs), np.std(aucs)

# Overall
texts = [r["input"] for r in records]
labels = [r["label"] for r in records]
print("Overall:", bow_shift(texts, labels))

# Per PII category
for pii_type in sorted(set(r["pii_type"] for r in records)):
    cat = [r for r in records if r["pii_type"] == pii_type]
    print(f"{pii_type}:", bow_shift([r["input"] for r in cat], [r["label"] for r in cat]))
```
</details>

## Repository Structure

```
├── granite-4.0-h-tiny/
│   ├── base/                    # No-DP baseline
│   ├── dp3/                     # DP ε=3
│   └── dp8/                     # DP ε=8
├── starcoder2-7b/
│   ├── base/
│   ├── dp3/
│   └── dp8/
└── qwen3-4b-instruct/
    ├── base/
    ├── dp3/
    └── dp8/
```

Each variant directory contains:
- `adapter/` – LoRA adapter weights (PEFT-compatible)
- `tokenizer/` – Tokenizer with any added audit tokens
- `resolved_config.yaml` – Full training configuration
- `summary.json` – Training and audit metrics
- `audit_results.json`, `audit_scores.npz` – Privacy audit artifacts
- `metrics.jsonl`, `scalars.csv` – Training logs
- `tensorboard/` – TensorBoard events
- `codecarbon.csv` – Carbon emissions tracking
- `epochs/` – Per-epoch checkpoints and audit results
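
Individual artifacts can be fetched without cloning the whole repository; for example, a sketch of downloading one variant's `summary.json` (its exact schema is not documented here):

```python
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="melihcatal/codedp-cpt-models",
    filename="granite-4.0-h-tiny/dp8/summary.json",
)
with open(path) as f:
    summary = json.load(f)
print(summary)
```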

## Limitations

- These are **LoRA adapters**, not standalone models. They require the corresponding base model for inference.
- The adapters include additional tokenizer tokens added during the privacy audit process (canary tokens). These do not affect normal generation.
- Evaluation results are on the CodeDP-FC benchmark; performance may vary on other code generation tasks.
- DP training with tight privacy budgets (ε=3) incurs a utility cost, particularly visible in validation loss.

## Related Resources

- **Training dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt)
- **MIA benchmark (general):** [melihcatal/codedp-bench-mia-cpt](https://huggingface.co/datasets/melihcatal/codedp-bench-mia-cpt)
- **MIA benchmark (canary):** [melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)