---
language:
- code
license: apache-2.0
tags:
- differential-privacy
- code-generation
- continued-pretraining
- lora
- dp-sgd
- opacus
- privacy
datasets:
- melihcatal/codedp-cpt
base_model:
- ibm-granite/granite-4.0-h-tiny
- bigcode/starcoder2-7b
- Qwen/Qwen3-4B-Instruct-2507
library_name: peft
pipeline_tag: text-generation
---
# CodeDP-CPT: Differentially Private Continued Pre-Training for Code Models
This repository contains LoRA adapters for code language models trained with **Continued Pre-Training (CPT)** under **Differential Privacy (DP-SGD)**. The models demonstrate that formal privacy guarantees can be applied to code generation models while preserving utility.
## Models
Nine adapter checkpoints are provided: three base models × three privacy configurations.
| Base Model | Variant | DP | Target ε | Achieved ε | Adapter Path |
|---|---|---|---|---|---|
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | base | No | — | — | `granite-4.0-h-tiny/base/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp3 | Yes | 3.0 | 2.99 | `granite-4.0-h-tiny/dp3/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp8 | Yes | 8.0 | 8.00 | `granite-4.0-h-tiny/dp8/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | base | No | — | — | `starcoder2-7b/base/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp3 | Yes | 3.0 | 3.00 | `starcoder2-7b/dp3/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp8 | Yes | 8.0 | 8.00 | `starcoder2-7b/dp8/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | base | No | — | — | `qwen3-4b-instruct/base/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp3 | Yes | 3.0 | 2.99 | `qwen3-4b-instruct/dp3/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp8 | Yes | 8.0 | 8.00 | `qwen3-4b-instruct/dp8/adapter/` |
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "ibm-granite/granite-4.0-h-tiny"
adapter_path = "melihcatal/codedp-cpt-models"
subfolder = "granite-4.0-h-tiny/dp8/adapter"  # pick any variant from the table above

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base_model_name, trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter_path, subfolder=subfolder)

# Generate as usual with the adapted model
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Details
### Dataset
- **Dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt) – code mined from GitHub repositories with quality filtering and decontamination (file-level, Type-1, and Type-2 clone detection against evaluation benchmarks)
- **Mode:** Causal language modeling (continued pre-training)
- **Validation split:** 5% held out
### LoRA Configuration
| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha (α) | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Modules to save | lm_head |
### Training Hyperparameters
| Parameter | No-DP (base) | DP variants |
|---|---|---|
| Epochs | 2 | 2 |
| Micro-batch size (per GPU) | 8 | 8 |
| Learning rate | 1e-4 | 2e-4 |
| Optimizer | AdamW | AdamW |
| LR scheduler | Cosine | Cosine |
| Warmup ratio | 5% | 5% |
| Max gradient norm | 1.0 | 1.0 |
| Sequence length | 1024 | 1024 |
| Precision | bfloat16 | bfloat16 |
| Seed | 42 | 42 |
**Effective batch sizes** (micro-batch × gradient accumulation steps × GPUs):
| Model | GPUs | No-DP | DP ε=3 / ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 4 | 256 (8×8×4) | 512 (8×16×4) |
| StarCoder2-7B | 4 | 256 (8×8×4) | 512 (8×16×4) |
| Qwen3-4B-Instruct | 8 | 256 (8×4×8) | 512 (8×8×8) |
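As a sanity check, the effective batch sizes above follow directly from the formula:

```python
# Effective batch size = micro-batch per GPU x grad-accumulation steps x GPU count.
def effective_batch(micro: int, accum: int, gpus: int) -> int:
    return micro * accum * gpus

# No-DP runs: 8x8x4 (Granite, StarCoder2) and 8x4x8 (Qwen) -> 256
assert effective_batch(8, 8, 4) == 256
assert effective_batch(8, 4, 8) == 256
# DP runs double the accumulation steps -> 512
assert effective_batch(8, 16, 4) == 512
assert effective_batch(8, 8, 8) == 512
```

The DP variants keep the micro-batch fixed and double accumulation, which is the usual way to grow the sampling batch (and thus improve the privacy/utility trade-off under DP-SGD) without extra memory.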
### Differential Privacy
| Parameter | Value |
|---|---|
| Engine | Opacus PrivacyEngine |
| Mechanism | Gaussian (DP-SGD) |
| Per-sample gradients | Hook-based |
| Clipping | Flat (global) |
| Target δ | 1e-5 |
| Target ε | 3.0 or 8.0 |
| Privacy accounting | RDP (Rényi Differential Privacy) |
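Mechanically, each DP-SGD step clips every per-sample gradient to the max gradient norm and adds Gaussian noise before averaging. A minimal NumPy sketch of that aggregation (the noise multiplier here is illustrative; Opacus calibrates it from the target ε, δ, sampling rate, and step count):

```python
import numpy as np

def dp_sgd_gradient(per_sample_grads, max_grad_norm=1.0, noise_multiplier=1.0, seed=0):
    """Sketch of one DP-SGD aggregation: flat (global) clipping + Gaussian noise."""
    rng = np.random.default_rng(seed)
    clipped = []
    for g in per_sample_grads:
        # Scale each per-sample gradient so its L2 norm is at most C = max_grad_norm.
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, max_grad_norm / (norm + 1e-12)))
    total = np.sum(clipped, axis=0)
    # Gaussian mechanism: noise standard deviation is sigma * C.
    noise = rng.normal(0.0, noise_multiplier * max_grad_norm, size=total.shape)
    return (total + noise) / len(per_sample_grads)

# Three toy per-sample gradients with very different magnitudes
grads = [np.full(10, 0.1), np.full(10, 1.0), np.full(10, 10.0)]
noisy_grad = dp_sgd_gradient(grads)
```

With `noise_multiplier=0` the averaged output norm is bounded by `max_grad_norm`, which is exactly what bounds each sample's influence on the update.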
### Infrastructure
- **GPUs:** NVIDIA H200 (140 GB VRAM each); 4 GPUs for Granite and StarCoder2, 8 GPUs for Qwen
- **CUDA:** 13.0
- **Distributed strategy:** DDP (Distributed Data Parallel) with NCCL backend
## Evaluation Results
### Functional Correctness – CodeDP-FC (Granite-4.0-H-Tiny)
103 code generation tasks, 10 samples per task, temperature 0.8.
| Variant | pass@1 | pass@5 | pass@10 |
|---|---|---|---|
| No fine-tuning | 13.5% | 18.4% | 20.4% |
| CPT (no DP) | 10.1% | 16.6% | 18.4% |
| CPT + DP (ε=3) | 13.7% | 19.1% | 21.4% |
| CPT + DP (ε=8) | **14.5%** | **21.1%** | **23.3%** |
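pass@k with 10 samples per task is presumably computed with the standard unbiased estimator, pass@k = 1 − C(n−c, k)/C(n, k) for n samples with c correct; a sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples drawn
    without replacement from n generations (c of them correct) passes."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 2 correct out of n=10 samples gives pass@1 = 2/10
print(pass_at_k(10, 2, 1))  # 0.2
```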
### Training Loss (Eval Set)
| Model | No-DP | DP ε=3 | DP ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 0.946 | 1.044 | 1.038 |
| StarCoder2-7B | 0.745 | 0.843 | 0.841 |
| Qwen3-4B-Instruct | 0.808 | 0.941 | 0.925 |
### Privacy Audit
New-token canary audit (500 members, 500 non-members, 49-token random prefixes). Higher AUC = more memorization; lower = better privacy.
| Model | Variant | Loss AUC | Embedding AUC | Empirical ε (p=0.01) |
|---|---|---|---|---|
| Granite-4.0-H-Tiny | base | 1.000 | 1.000 | 3.02 |
| Granite-4.0-H-Tiny | dp3 | 0.543 | 0.513 | 0.00 |
| Granite-4.0-H-Tiny | dp8 | 0.564 | 0.508 | 0.16 |
| StarCoder2-7B | base | 1.000 | 0.916 | 3.02 |
| StarCoder2-7B | dp3 | 0.526 | 0.521 | 0.00 |
| StarCoder2-7B | dp8 | 0.520 | 0.523 | 0.00 |
| Qwen3-4B-Instruct | base | 0.969 | 0.884 | 3.02 |
| Qwen3-4B-Instruct | dp3 | 0.505 | 0.515 | 0.00 |
| Qwen3-4B-Instruct | dp8 | 0.515 | 0.516 | 0.00 |
**Key finding:** DP training reduces canary audit AUC to near-random (0.5), with empirical ε dropping to 0 in most cases, confirming that the formal privacy guarantees hold in practice.
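For reference, the Loss AUC column is a ROC-AUC over per-sample attack scores with member/non-member labels; a self-contained sketch on synthetic losses (the distributions here are illustrative, not the actual audit data):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Synthetic per-sample losses: a memorizing model assigns members lower loss.
member_loss = rng.normal(1.0, 0.3, 500)
nonmember_loss = rng.normal(2.0, 0.3, 500)

labels = np.r_[np.ones(500), np.zeros(500)]    # 1 = member canary
scores = np.r_[-member_loss, -nonmember_loss]  # negate so higher score = more member-like
auc = roc_auc_score(labels, scores)            # near 1.0 for this well-separated toy data
```

Under effective DP training the two loss distributions overlap and the AUC collapses toward 0.5, which is what the dp3/dp8 rows show.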
### MIA Benchmark Validation – BoW Distribution Shift
The canary MIA benchmark ([melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)) uses a targeted design where member and non-member samples share the same code prefix and differ only in the PII secret. A bag-of-words Random Forest classifier (5-fold CV) confirms no distribution shift:
| PII Type | BoW AUC | ± std | n |
|---|---|---|---|
| Overall | 0.099 | 0.018 | 400 |
| api_key | 0.033 | 0.047 | 80 |
| db_url | 0.311 | 0.105 | 80 |
| email | 0.078 | 0.099 | 80 |
| internal_ip | 0.028 | 0.021 | 80 |
| password | 0.055 | 0.048 | 80 |
All BoW AUC values are well below 0.5, confirming that MIA signal must come from the model's knowledge of the secret, not surface-level text features.
<details>
<summary>BoW shift test code</summary>
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score
import numpy as np
from datasets import load_dataset

ds = load_dataset("melihcatal/codedp-bench-canary-mia", split="train")
records = list(ds)

def bow_shift(texts, labels, n_folds=5):
    """Cross-validated AUC of a bag-of-words member/non-member classifier."""
    X = CountVectorizer(max_features=5000, stop_words="english").fit_transform(texts)
    y = np.array(labels)
    aucs = []
    for tr, te in StratifiedKFold(n_folds, shuffle=True, random_state=42).split(X, y):
        clf = RandomForestClassifier(100, random_state=42, n_jobs=-1)
        clf.fit(X[tr], y[tr])
        aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
    return np.mean(aucs), np.std(aucs)

# Overall
texts = [r["input"] for r in records]
labels = [r["label"] for r in records]
print("Overall:", bow_shift(texts, labels))

# Per PII category
for pii_type in sorted(set(r["pii_type"] for r in records)):
    cat = [r for r in records if r["pii_type"] == pii_type]
    print(f"{pii_type}:", bow_shift([r["input"] for r in cat], [r["label"] for r in cat]))
```
</details>
## Repository Structure
```
├── granite-4.0-h-tiny/
│   ├── base/   # No-DP baseline
│   ├── dp3/    # DP ε=3
│   └── dp8/    # DP ε=8
├── starcoder2-7b/
│   ├── base/
│   ├── dp3/
│   └── dp8/
└── qwen3-4b-instruct/
    ├── base/
    ├── dp3/
    └── dp8/
```
Each variant directory contains:
- `adapter/` – LoRA adapter weights (PEFT-compatible)
- `tokenizer/` – tokenizer with any added audit tokens
- `resolved_config.yaml` – full training configuration
- `summary.json` – training and audit metrics
- `audit_results.json`, `audit_scores.npz` – privacy audit artifacts
- `metrics.jsonl`, `scalars.csv` – training logs
- `tensorboard/` – TensorBoard events
- `codecarbon.csv` – carbon emissions tracking
- `epochs/` – per-epoch checkpoints and audit results
## Limitations
- These are **LoRA adapters**, not standalone models. They require the corresponding base model for inference.
- The adapters include additional tokenizer tokens added during the privacy audit process (canary tokens). These do not affect normal generation.
- Evaluation results are on the CodeDP-FC benchmark; performance may vary on other code generation tasks.
- DP training with tight privacy budgets (ε=3) incurs a utility cost, particularly visible in validation loss.
## Related Resources
- **Training dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt)
- **MIA benchmark (general):** [melihcatal/codedp-bench-mia-cpt](https://huggingface.co/datasets/melihcatal/codedp-bench-mia-cpt)
- **MIA benchmark (canary):** [melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)