---
language:
- code
license: apache-2.0
tags:
- differential-privacy
- code-generation
- continued-pretraining
- lora
- dp-sgd
- opacus
- privacy
datasets:
- melihcatal/codedp-cpt
base_model:
- ibm-granite/granite-4.0-h-tiny
- bigcode/starcoder2-7b
- Qwen/Qwen3-4B-Instruct-2507
library_name: peft
pipeline_tag: text-generation
---

# CodeDP-CPT: Differentially Private Continued Pre-Training for Code Models

This repository contains LoRA adapters for code language models trained with **Continued Pre-Training (CPT)** under **Differential Privacy (DP-SGD)**. The models demonstrate that formal privacy guarantees can be applied to code generation models while preserving utility.

## Models

Nine adapter checkpoints are provided — three base models × three privacy configurations:

| Base Model | Variant | DP | Target ε | Achieved ε | Adapter Path |
|---|---|---|---|---|---|
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | base | No | — | — | `granite-4.0-h-tiny/base/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp3 | Yes | 3.0 | 2.99 | `granite-4.0-h-tiny/dp3/adapter/` |
| [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp8 | Yes | 8.0 | 8.00 | `granite-4.0-h-tiny/dp8/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | base | No | — | — | `starcoder2-7b/base/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp3 | Yes | 3.0 | 3.00 | `starcoder2-7b/dp3/adapter/` |
| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp8 | Yes | 8.0 | 8.00 | `starcoder2-7b/dp8/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | base | No | — | — | `qwen3-4b-instruct/base/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp3 | Yes | 3.0 | 2.99 | `qwen3-4b-instruct/dp3/adapter/` |
| [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp8 | Yes | 8.0 | 8.00 | `qwen3-4b-instruct/dp8/adapter/` |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "ibm-granite/granite-4.0-h-tiny"
adapter_path = "melihcatal/codedp-cpt-models"
subfolder = "granite-4.0-h-tiny/dp8/adapter"

# Load the base model, then attach the LoRA adapter from the chosen subfolder.
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base_model_name, trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter_path, subfolder=subfolder)
```

## Training Details

### Dataset

- **Dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt) — code mined from GitHub repositories with quality filtering and decontamination (file-level, Type-1, and Type-2 clone detection against evaluation benchmarks)
- **Mode:** Causal language modeling (continued pre-training)
- **Validation split:** 5% held out

### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha (α) | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Modules to save | lm_head |

### Training Hyperparameters

| Parameter | No-DP (base) | DP variants |
|---|---|---|
| Epochs | 2 | 2 |
| Micro-batch size (per GPU) | 8 | 8 |
| Learning rate | 1e-4 | 2e-4 |
| Optimizer | AdamW | AdamW |
| LR scheduler | Cosine | Cosine |
| Warmup ratio | 5% | 5% |
| Max gradient norm | 1.0 | 1.0 |
| Sequence length | 1024 | 1024 |
| Precision | bfloat16 | bfloat16 |
| Seed | 42 | 42 |

**Effective batch sizes** (micro-batch × gradient accumulation steps × GPUs):

| Model | GPUs | No-DP | DP ε=3 / ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 4 | 256 (8×8×4) | 512 (8×16×4) |
| StarCoder2-7B | 4 | 256 (8×8×4) | 512 (8×16×4) |
| Qwen3-4B-Instruct | 8 | 256 (8×4×8) | 512 (8×8×8) |

### Differential Privacy

| Parameter | Value |
|---|---|
| Engine | Opacus PrivacyEngine |
| Mechanism | Gaussian (DP-SGD) |
| Per-sample gradients | Hook-based |
| Clipping | Flat (global) |
| Target δ | 1e-5 |
| Target ε | 3.0 or 8.0 |
| Privacy accounting | RDP (Rényi Differential Privacy) |

### Infrastructure

- **GPUs:** NVIDIA H200 (140 GB VRAM each) — 4 GPUs for Granite and StarCoder2, 8 GPUs for Qwen
- **CUDA:** 13.0
- **Distributed strategy:** DDP (Distributed Data Parallel) with NCCL backend

## Evaluation Results

### Functional Correctness — CodeDP-FC (Granite-4.0-H-Tiny)

103 code generation tasks, 10 samples per task, temperature 0.8.

| Variant | pass@1 | pass@5 | pass@10 |
|---|---|---|---|
| No fine-tuning | 13.5% | 18.4% | 20.4% |
| CPT (no DP) | 10.1% | 16.6% | 18.4% |
| CPT + DP (ε=3) | 13.7% | 19.1% | 21.4% |
| CPT + DP (ε=8) | **14.5%** | **21.1%** | **23.3%** |

### Validation Loss

Cross-entropy loss on the held-out 5% validation split:

| Model | No-DP | DP ε=3 | DP ε=8 |
|---|---|---|---|
| Granite-4.0-H-Tiny | 0.946 | 1.044 | 1.038 |
| StarCoder2-7B | 0.745 | 0.843 | 0.841 |
| Qwen3-4B-Instruct | 0.808 | 0.941 | 0.925 |

### Privacy Audit

New-token canary audit (500 members, 500 non-members, 49-token random prefixes). Higher AUC = more memorization; lower = better privacy.

| Model | Variant | Loss AUC | Embedding AUC | Empirical ε (p=0.01) |
|---|---|---|---|---|
| Granite-4.0-H-Tiny | base | 1.000 | 1.000 | 3.02 |
| Granite-4.0-H-Tiny | dp3 | 0.543 | 0.513 | 0.00 |
| Granite-4.0-H-Tiny | dp8 | 0.564 | 0.508 | 0.16 |
| StarCoder2-7B | base | 1.000 | 0.916 | 3.02 |
| StarCoder2-7B | dp3 | 0.526 | 0.521 | 0.00 |
| StarCoder2-7B | dp8 | 0.520 | 0.523 | 0.00 |
| Qwen3-4B-Instruct | base | 0.969 | 0.884 | 3.02 |
| Qwen3-4B-Instruct | dp3 | 0.505 | 0.515 | 0.00 |
| Qwen3-4B-Instruct | dp8 | 0.515 | 0.516 | 0.00 |

**Key finding:** DP training reduces canary audit AUC to near-random (0.5), with empirical ε dropping to 0 in most cases — confirming that the formal privacy guarantees hold in practice.
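For intuition on the empirical ε column: a membership attack's true/false positive rates yield a standard lower bound on ε, since (ε, δ)-DP with δ ≈ 0 implies TPR ≤ e^ε · FPR and (1 − FPR) ≤ e^ε · (1 − TPR). The sketch below is a simplified, illustrative version of that bound — the p=0.01 in the table presumably reflects a statistical confidence correction on the rates, which this sketch omits — and is not the repository's audit code:

```python
import math

def empirical_epsilon(tpr: float, fpr: float) -> float:
    """Attack-based lower bound on epsilon (taking delta ~ 0):
    eps >= max(ln(TPR/FPR), ln((1-FPR)/(1-TPR)))."""
    if fpr == 0.0 or tpr == 1.0:
        return float("inf")  # a perfect attack is inconsistent with any finite eps
    bound1 = math.log(tpr / fpr) if tpr > 0 else 0.0
    bound2 = math.log((1 - fpr) / (1 - tpr))
    return max(0.0, bound1, bound2)

# A near-random attack (TPR ~ FPR, like the dp3/dp8 rows) gives eps near 0.
print(empirical_epsilon(0.52, 0.50))  # ≈ 0.04
# A strong attack separating members from non-members gives a large bound.
print(empirical_epsilon(0.90, 0.10))  # ≈ 2.20
```

This matches the qualitative pattern in the table: base variants with AUC near 1.0 support a nonzero empirical ε, while DP variants with near-random AUC collapse the bound to 0.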
### MIA Benchmark Validation — BoW Distribution Shift

The canary MIA benchmark ([melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)) uses a targeted design where member and non-member samples share the same code prefix and differ only in the PII secret. A bag-of-words Random Forest classifier (5-fold CV) confirms no distribution shift:

| PII Type | BoW AUC | ± std | n |
|---|---|---|---|
| Overall | 0.099 | 0.018 | 400 |
| api_key | 0.033 | 0.047 | 80 |
| db_url | 0.311 | 0.105 | 80 |
| email | 0.078 | 0.099 | 80 |
| internal_ip | 0.028 | 0.021 | 80 |
| password | 0.055 | 0.048 | 80 |

All BoW AUC values are well below 0.5, confirming that MIA signal must come from the model's knowledge of the secret, not surface-level text features.
BoW shift test code:

```python
import numpy as np
from datasets import load_dataset
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

ds = load_dataset("melihcatal/codedp-bench-canary-mia", split="train")
records = list(ds)

def bow_shift(texts, labels, n_folds=5):
    """Mean/std AUC of a bag-of-words classifier separating members from non-members."""
    X = CountVectorizer(max_features=5000, stop_words="english").fit_transform(texts)
    y = np.array(labels)
    aucs = []
    for tr, te in StratifiedKFold(n_folds, shuffle=True, random_state=42).split(X, y):
        clf = RandomForestClassifier(100, random_state=42, n_jobs=-1)
        clf.fit(X[tr], y[tr])
        aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
    return np.mean(aucs), np.std(aucs)

# Overall
texts = [r["input"] for r in records]
labels = [r["label"] for r in records]
print("Overall:", bow_shift(texts, labels))

# Per PII category
for pii_type in sorted(set(r["pii_type"] for r in records)):
    cat = [r for r in records if r["pii_type"] == pii_type]
    print(f"{pii_type}:", bow_shift([r["input"] for r in cat], [r["label"] for r in cat]))
```
## Repository Structure

```
├── granite-4.0-h-tiny/
│   ├── base/   # No-DP baseline
│   ├── dp3/    # DP ε=3
│   └── dp8/    # DP ε=8
├── starcoder2-7b/
│   ├── base/
│   ├── dp3/
│   └── dp8/
└── qwen3-4b-instruct/
    ├── base/
    ├── dp3/
    └── dp8/
```

Each variant directory contains:

- `adapter/` — LoRA adapter weights (PEFT-compatible)
- `tokenizer/` — Tokenizer with any added audit tokens
- `resolved_config.yaml` — Full training configuration
- `summary.json` — Training and audit metrics
- `audit_results.json`, `audit_scores.npz` — Privacy audit artifacts
- `metrics.jsonl`, `scalars.csv` — Training logs
- `tensorboard/` — TensorBoard events
- `codecarbon.csv` — Carbon emissions tracking
- `epochs/` — Per-epoch checkpoints and audit results

## Limitations

- These are **LoRA adapters**, not standalone models. They require the corresponding base model for inference.
- The adapters include additional tokenizer tokens added during the privacy audit process (canary tokens). These do not affect normal generation.
- Evaluation results are on the CodeDP-FC benchmark; performance may vary on other code generation tasks.
- DP training with tight privacy budgets (ε=3) incurs a utility cost, particularly visible in validation loss.

## Related Resources

- **Training dataset:** [melihcatal/codedp-cpt](https://huggingface.co/datasets/melihcatal/codedp-cpt)
- **MIA benchmark (general):** [melihcatal/codedp-bench-mia-cpt](https://huggingface.co/datasets/melihcatal/codedp-bench-mia-cpt)
- **MIA benchmark (canary):** [melihcatal/codedp-bench-canary-mia](https://huggingface.co/datasets/melihcatal/codedp-bench-canary-mia)
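Since these are adapters rather than standalone weights, it can help to see what attaching (or merging) a LoRA adapter does numerically: each targeted weight becomes W + (α/r) · B·A, with r=16 and α=32 from the LoRA configuration above. A minimal numeric sketch with random matrices and illustrative dimensions (not the actual model weights — PEFT performs this fold-in when merging an adapter):

```python
import numpy as np

# LoRA merge: W_merged = W + (alpha / r) * B @ A, using this repo's r=16, alpha=32.
d_out, d_in, r, alpha = 64, 64, 16, 32
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))       # frozen base weight (illustrative)
A = rng.standard_normal((r, d_in)) * 0.01    # LoRA down-projection
B = rng.standard_normal((d_out, r)) * 0.01   # LoRA up-projection

W_merged = W + (alpha / r) * (B @ A)

# The merged weight reproduces the adapter-at-inference computation exactly.
x = rng.standard_normal(d_in)
assert np.allclose(W @ x + (alpha / r) * (B @ (A @ x)), W_merged @ x)
```

The same identity is why a merged model needs no PEFT at inference time: the low-rank update is folded into the dense weights once, and the forward pass is unchanged.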