---
language:
- code
license: apache-2.0
tags:
- differential-privacy
- code-generation
- continued-pretraining
- lora
- dp-sgd
- opacus
- privacy
datasets:
- melihcatal/codedp-cpt
base_model:
- ibm-granite/granite-4.0-h-tiny
- bigcode/starcoder2-7b
- Qwen/Qwen3-4B-Instruct-2507
library_name: peft
pipeline_tag: text-generation
---

# CodeDP-CPT: Differentially Private Continued Pre-Training for Code Models

This repository contains LoRA adapters for code language models trained with **Continued Pre-Training (CPT)** under **Differential Privacy (DP-SGD)**. The models demonstrate that formal privacy guarantees can be applied to code generation models while preserving utility.

## Models

Nine adapter checkpoints are provided, covering three base models × three privacy configurations:

| Base Model | Variant | DP | Target ε | Achieved ε | Adapter Path |
|---|---|---|---|---|---|
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | base | No | – | – | `granite-4.0-h-tiny/base/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp3 | Yes | 3.0 | 2.99 | `granite-4.0-h-tiny/dp3/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp8 | Yes | 8.0 | 8.00 | `granite-4.0-h-tiny/dp8/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | base | No | – | – | `starcoder2-7b/base/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp3 | Yes | 3.0 | 3.00 | `starcoder2-7b/dp3/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp8 | Yes | 8.0 | 8.00 | `starcoder2-7b/dp8/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | base | No | – | – | `qwen3-4b-instruct/base/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp3 | Yes | 3.0 | 2.99 | `qwen3-4b-instruct/dp3/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp8 | Yes | 8.0 | 8.00 | `qwen3-4b-instruct/dp8/adapter/` |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "ibm-granite/granite-4.0-h-tiny"
adapter_path = "melihcatal/codedp-cpt-models"
subfolder = "granite-4.0-h-tiny/dp8/adapter"

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base_model_name, trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter_path, subfolder=subfolder)

# Generate a completion with the adapted model
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Dataset

- **Dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt) – code mined from GitHub repositories with quality filtering and decontamination (file-level, Type-1, and Type-2 clone detection against evaluation benchmarks)
- **Mode:** Causal language modeling (continued pre-training)
- **Validation split:** 5% held out

### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha (α) | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Modules to save | lm_head |

### Training Hyperparameters

| Parameter | No-DP (base) | DP variants |
|---|---|---|
| Epochs | 2 | 2 |
| Micro-batch size (per GPU) | 8 | 8 |
| Learning rate | 1e-4 | 2e-4 |
| Optimizer | AdamW | AdamW |
| LR scheduler | Cosine | Cosine |
| Warmup ratio | 5% | 5% |
| Max gradient norm | 1.0 | 1.0 |
| Sequence length | 1024 | 1024 |
| Precision | bfloat16 | bfloat16 |
| Seed | 42 | 42 |

**Effective batch sizes** (micro-batch × gradient accumulation steps × GPUs):

| Model | GPUs | No-DP | DP ε=3 / ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 4 | 256 (8×8×4) | 512 (8×16×4) |
| StarCoder2-7B | 4 | 256 (8×8×4) | 512 (8×16×4) |
| Qwen3-4B-Instruct | 8 | 256 (8×4×8) | 512 (8×8×8) |

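The caption's formula can be checked mechanically; a trivial sketch reproducing the table's numbers:

```python
def effective_batch(micro_batch: int, grad_accum: int, gpus: int) -> int:
    """Effective batch size = micro-batch x gradient-accumulation steps x GPUs."""
    return micro_batch * grad_accum * gpus

# Values from the table above
assert effective_batch(8, 8, 4) == 256    # Granite / StarCoder2, no-DP
assert effective_batch(8, 16, 4) == 512   # Granite / StarCoder2, DP
assert effective_batch(8, 4, 8) == 256    # Qwen3, no-DP
assert effective_batch(8, 8, 8) == 512    # Qwen3, DP
```
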
### Differential Privacy

| Parameter | Value |
|---|---|
| Engine | Opacus PrivacyEngine |
| Mechanism | Gaussian (DP-SGD) |
| Per-sample gradients | Hook-based |
| Clipping | Flat (global) |
| Target δ | 1e-5 |
| Target ε | 3.0 or 8.0 |
| Privacy accounting | RDP (Rényi Differential Privacy) |

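For intuition, the mechanism in the table (flat per-sample clipping plus Gaussian noise) can be sketched in plain Python. The `clip_norm` and `noise_multiplier` values below are illustrative only; in the actual runs Opacus calibrates the noise multiplier from the target (ε, δ) via RDP accounting, and applies the operation per parameter tensor using hook-captured per-sample gradients.

```python
import math
import random

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=0.5, rng=None):
    """One DP-SGD aggregation with flat (global) clipping: each per-sample
    gradient is clipped to L2 norm clip_norm, the clipped gradients are summed,
    Gaussian noise with std noise_multiplier * clip_norm is added, and the
    result is averaged over the batch."""
    rng = rng or random.Random(0)
    n, dim = len(per_sample_grads), len(per_sample_grads[0])
    summed = [0.0] * dim
    for g in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0  # flat clipping
        for i in range(dim):
            summed[i] += scale * g[i]
    return [(s + rng.gauss(0.0, noise_multiplier * clip_norm)) / n for s in summed]

noisy_grad = dp_sgd_step([[3.0, 4.0], [0.1, -0.2]])
```
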
### Infrastructure

- **GPUs:** NVIDIA H200 (140 GB VRAM each); 4 GPUs for Granite and StarCoder2, 8 GPUs for Qwen
- **CUDA:** 13.0
- **Distributed strategy:** DDP (Distributed Data Parallel) with NCCL backend

## Evaluation Results

### Functional Correctness – CodeDP-FC (Granite-4.0-H-Tiny)

103 code generation tasks, 10 samples per task, temperature 0.8.

| Variant | pass@1 | pass@5 | pass@10 |
|---|---|---|---|
| No fine-tuning | 13.5% | 18.4% | 20.4% |
| CPT (no DP) | 10.1% | 16.6% | 18.4% |
| CPT + DP (ε=3) | 13.7% | 19.1% | 21.4% |
| CPT + DP (ε=8) | **14.5%** | **21.1%** | **23.3%** |

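For reference, pass@k figures like these are conventionally computed with the unbiased estimator from the Codex paper (n samples per task, c of them correct); a minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: 1 - C(n-c, k) / C(n, k), i.e. the probability that at
    least one of k samples drawn without replacement from n is correct."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: every draw of k hits a correct one
    return 1.0 - comb(n - c, k) / comb(n, k)

# With n=10 samples per task, a task with 2 passing samples contributes:
pass_at_k(10, 2, 1)   # 0.2
pass_at_k(10, 2, 10)  # 1.0
```

Per-task values are then averaged over the 103 tasks.
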
### Training Loss (Eval Set)

| Model | No-DP | DP ε=3 | DP ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 0.946 | 1.044 | 1.038 |
| StarCoder2-7B | 0.745 | 0.843 | 0.841 |
| Qwen3-4B-Instruct | 0.808 | 0.941 | 0.925 |

### Privacy Audit

New-token canary audit (500 members, 500 non-members, 49-token random prefixes). Higher AUC indicates more memorization; lower is better for privacy.

| Model | Variant | Loss AUC | Embedding AUC | Empirical ε (p=0.01) |
|---|---|---|---|---|
| Granite-4.0-H-Tiny | base | 1.000 | 1.000 | 3.02 |
| Granite-4.0-H-Tiny | dp3 | 0.543 | 0.513 | 0.00 |
| Granite-4.0-H-Tiny | dp8 | 0.564 | 0.508 | 0.16 |
| StarCoder2-7B | base | 1.000 | 0.916 | 3.02 |
| StarCoder2-7B | dp3 | 0.526 | 0.521 | 0.00 |
| StarCoder2-7B | dp8 | 0.520 | 0.523 | 0.00 |
| Qwen3-4B-Instruct | base | 0.969 | 0.884 | 3.02 |
| Qwen3-4B-Instruct | dp3 | 0.505 | 0.515 | 0.00 |
| Qwen3-4B-Instruct | dp8 | 0.515 | 0.516 | 0.00 |

**Key finding:** DP training reduces the canary audit AUC to near-random (0.5), with empirical ε dropping to 0 in most cases, confirming that the formal privacy guarantees hold in practice.

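The exact procedure behind the empirical-ε column is not spelled out in this card; a common point estimate derives ε from the attack's true/false positive rates as ε ≥ max(ln(TPR/FPR), ln((1−FPR)/(1−TPR))), with the p=0.01 presumably entering through confidence bounds on those rates. A hedged sketch of the point estimate only:

```python
import math

def empirical_epsilon(tpr: float, fpr: float) -> float:
    """Point estimate of the privacy loss implied by a membership attack:
    eps >= max(ln(TPR/FPR), ln((1-FPR)/(1-TPR))). Confidence bounds on the
    rates (which the audit's p=0.01 presumably supplies) are omitted here."""
    eps = 0.0
    if fpr > 0 and tpr > fpr:
        eps = max(eps, math.log(tpr / fpr))
    if tpr < 1 and (1 - fpr) > (1 - tpr):
        eps = max(eps, math.log((1 - fpr) / (1 - tpr)))
    return eps

empirical_epsilon(0.5, 0.5)  # 0.0: attack no better than chance
empirical_epsilon(0.9, 0.1)  # ~2.197
```

An attack at chance level (TPR = FPR) yields ε = 0, matching the dp3/dp8 rows above.
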
### MIA Benchmark Validation – BoW Distribution Shift

The canary MIA benchmark ([melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)) uses a targeted design in which member and non-member samples share the same code prefix and differ only in the PII secret. A bag-of-words Random Forest classifier (5-fold CV) confirms the absence of a distribution shift:

| PII Type | BoW AUC | ± std | n |
|---|---|---|---|
| Overall | 0.099 | 0.018 | 400 |
| api_key | 0.033 | 0.047 | 80 |
| db_url | 0.311 | 0.105 | 80 |
| email | 0.078 | 0.099 | 80 |
| internal_ip | 0.028 | 0.021 | 80 |
| password | 0.055 | 0.048 | 80 |

All BoW AUC values are well below 0.5, confirming that any MIA signal must come from the model's knowledge of the secret, not from surface-level text features.


<details>
<summary>BoW shift test code</summary>

```python
import numpy as np
from datasets import load_dataset
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

ds = load_dataset("melihcatal/codedp-bench-canary-mia", split="train")
records = list(ds)

def bow_shift(texts, labels, n_folds=5):
    # Bag-of-words features; a high AUC here would mean members and
    # non-members are separable from surface text alone.
    X = CountVectorizer(max_features=5000, stop_words="english").fit_transform(texts)
    y = np.array(labels)
    aucs = []
    for tr, te in StratifiedKFold(n_folds, shuffle=True, random_state=42).split(X, y):
        clf = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
        clf.fit(X[tr], y[tr])
        aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
    return np.mean(aucs), np.std(aucs)

# Overall
texts = [r["input"] for r in records]
labels = [r["label"] for r in records]
print("Overall:", bow_shift(texts, labels))

# Per PII category
for pii_type in sorted(set(r["pii_type"] for r in records)):
    cat = [r for r in records if r["pii_type"] == pii_type]
    print(f"{pii_type}:", bow_shift([r["input"] for r in cat], [r["label"] for r in cat]))
```
</details>


## Repository Structure

```
├── granite-4.0-h-tiny/
│   ├── base/   # No-DP baseline
│   ├── dp3/    # DP ε=3
│   └── dp8/    # DP ε=8
├── starcoder2-7b/
│   ├── base/
│   ├── dp3/
│   └── dp8/
└── qwen3-4b-instruct/
    ├── base/
    ├── dp3/
    └── dp8/
```

Each variant directory contains:
- `adapter/` – LoRA adapter weights (PEFT-compatible)
- `tokenizer/` – Tokenizer with any added audit tokens
- `resolved_config.yaml` – Full training configuration
- `summary.json` – Training and audit metrics
- `audit_results.json`, `audit_scores.npz` – Privacy audit artifacts
- `metrics.jsonl`, `scalars.csv` – Training logs
- `tensorboard/` – TensorBoard events
- `codecarbon.csv` – Carbon emissions tracking
- `epochs/` – Per-epoch checkpoints and audit results


## Limitations

- These are **LoRA adapters**, not standalone models. They require the corresponding base model for inference.
- The adapters include additional tokenizer tokens added during the privacy audit process (canary tokens). These do not affect normal generation.
- Evaluation results are on the CodeDP-FC benchmark; performance may vary on other code generation tasks.
- DP training with tight privacy budgets (ε=3) incurs a utility cost, particularly visible in validation loss.

## Related Resources

- **Training dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt)
- **MIA benchmark (general):** [melihcatal/codedp-bench-mia-cpt](https://huggingface.co/datasets/melihcatal/codedp-bench-mia-cpt)
- **MIA benchmark (canary):** [melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)