melihcatal/codedp-cpt-models

@@ -14,7 +14,7 @@ datasets:
 - melihcatal/codedp-cpt
 base_model:
 - ibm-granite/granite-4.0-h-tiny
-- deepseek-ai/deepseek-coder-6.7b-instruct
 - Qwen/Qwen3-4B-Instruct-2507
 library_name: peft
 pipeline_tag: text-generation
@@ -33,9 +33,9 @@ Nine adapter checkpoints are provided — three base models × three privacy con
 | [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | base | No | — | — | `granite-4.0-h-tiny/base/adapter/` |
 | [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp3 | Yes | 3.0 | 2.99 | `granite-4.0-h-tiny/dp3/adapter/` |
 | [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp8 | Yes | 8.0 | 8.00 | `granite-4.0-h-tiny/dp8/adapter/` |
-| [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) | base | No | — | — | `deepseek-coder-6.7b/base/adapter/` |
-| [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) | dp3 | Yes | 3.0 | 3.00 | `deepseek-coder-6.7b/dp3/adapter/` |
-| [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct) | dp8 | Yes | 8.0 | 8.00 | `deepseek-coder-6.7b/dp8/adapter/` |
 | [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | base | No | — | — | `qwen3-4b-instruct/base/adapter/` |
 | [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp3 | Yes | 3.0 | 2.99 | `qwen3-4b-instruct/dp3/adapter/` |
 | [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp8 | Yes | 8.0 | 8.00 | `qwen3-4b-instruct/dp8/adapter/` |
@@ -93,7 +93,7 @@ model = PeftModel.from_pretrained(model, adapter_path, subfolder=subfolder)
 | Model | GPUs | No-DP | DP ε=3 / ε=8 |
 |---|---|---|---|
 | Granite-4.0-H-Tiny | 4 | 256 (8×8×4) | 512 (8×16×4) |
-| DeepSeek-Coder-6.7B | 8 | 256 (8×4×8) | 512 (8×8×8) |
 | Qwen3-4B-Instruct | 8 | 256 (8×4×8) | 512 (8×8×8) |
 ### Differential Privacy
@@ -110,7 +110,7 @@ model = PeftModel.from_pretrained(model, adapter_path, subfolder=subfolder)
 ### Infrastructure
-- **GPUs:** NVIDIA H200 (140 GB VRAM each) — 4 GPUs for Granite, 8 GPUs for DeepSeek and Qwen
 - **CUDA:** 13.0
 - **Distributed strategy:** DDP (Distributed Data Parallel) with NCCL backend
@@ -132,7 +132,7 @@ model = PeftModel.from_pretrained(model, adapter_path, subfolder=subfolder)
 | Model | No-DP | DP ε=3 | DP ε=8 |
 |---|---|---|---|
 | Granite-4.0-H-Tiny | 0.946 | 1.044 | 1.038 |
-| DeepSeek-Coder-6.7B | 4.840 | 10.326 | 7.523 |
 | Qwen3-4B-Instruct | 0.808 | 0.941 | 0.925 |
 ### Privacy Audit
@@ -144,9 +144,9 @@ New-token canary audit (500 members, 500 non-members, 49-token random prefixes).
 | Granite-4.0-H-Tiny | base | 1.000 | 1.000 | 3.02 |
 | Granite-4.0-H-Tiny | dp3 | 0.543 | 0.513 | 0.00 |
 | Granite-4.0-H-Tiny | dp8 | 0.564 | 0.508 | 0.16 |
-| DeepSeek-Coder-6.7B | base | 0.957 | 0.968 | 3.02 |
-| DeepSeek-Coder-6.7B | dp3 | 0.522 | 0.543 | 0.00 |
-| DeepSeek-Coder-6.7B | dp8 | 0.533 | 0.545 | 0.00 |
 | Qwen3-4B-Instruct | base | 0.969 | 0.884 | 3.02 |
 | Qwen3-4B-Instruct | dp3 | 0.505 | 0.515 | 0.00 |
 | Qwen3-4B-Instruct | dp8 | 0.515 | 0.516 | 0.00 |
@@ -160,7 +160,7 @@ New-token canary audit (500 members, 500 non-members, 49-token random prefixes).
 │   ├── base/                    # No-DP baseline
 │   ├── dp3/                     # DP ε=3
 │   └── dp8/                     # DP ε=8
-├── deepseek-coder-6.7b/
 │   ├── base/
 │   ├── dp3/
 │   └── dp8/

 - melihcatal/codedp-cpt
 base_model:
 - ibm-granite/granite-4.0-h-tiny
+- bigcode/starcoder2-7b
 - Qwen/Qwen3-4B-Instruct-2507
 library_name: peft
 pipeline_tag: text-generation
 | [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | base | No | — | — | `granite-4.0-h-tiny/base/adapter/` |
 | [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp3 | Yes | 3.0 | 2.99 | `granite-4.0-h-tiny/dp3/adapter/` |
 | [ibm-granite/granite-4.0-h-tiny](https://huggingface.co/ibm-granite/granite-4.0-h-tiny) | dp8 | Yes | 8.0 | 8.00 | `granite-4.0-h-tiny/dp8/adapter/` |
+| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | base | No | — | — | `starcoder2-7b/base/adapter/` |
+| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp3 | Yes | 3.0 | 3.00 | `starcoder2-7b/dp3/adapter/` |
+| [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b) | dp8 | Yes | 8.0 | 8.00 | `starcoder2-7b/dp8/adapter/` |
 | [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | base | No | — | — | `qwen3-4b-instruct/base/adapter/` |
 | [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp3 | Yes | 3.0 | 2.99 | `qwen3-4b-instruct/dp3/adapter/` |
 | [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | dp8 | Yes | 8.0 | 8.00 | `qwen3-4b-instruct/dp8/adapter/` |
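
The adapter paths in the table above follow a `<model-dir>/<variant>/adapter/` pattern, which can be passed as `subfolder` to the `PeftModel.from_pretrained` call shown in the hunk context. The following is a minimal sketch, not the card's own loading code: `REPO_ID` is assumed to be this repository's Hub ID, and `MODEL_DIRS`, `adapter_subfolder`, and `load_adapter` are hypothetical helper names introduced for illustration.

```python
# Sketch only: REPO_ID is an assumption; directory names are read off the table.
REPO_ID = "melihcatal/codedp-cpt-models"

# Base model ID on the Hub -> adapter directory listed in the table.
MODEL_DIRS = {
    "ibm-granite/granite-4.0-h-tiny": "granite-4.0-h-tiny",
    "bigcode/starcoder2-7b": "starcoder2-7b",
    "Qwen/Qwen3-4B-Instruct-2507": "qwen3-4b-instruct",
}
VARIANTS = ("base", "dp3", "dp8")


def adapter_subfolder(base_model_id: str, variant: str) -> str:
    """Build the repo-relative adapter path, without a trailing slash."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}, expected one of {VARIANTS}")
    return f"{MODEL_DIRS[base_model_id]}/{variant}/adapter"


def load_adapter(base_model_id: str, variant: str):
    """Load the base model, then attach the selected adapter from the repo."""
    # Imported lazily so the path helper works without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(base_model_id)
    return PeftModel.from_pretrained(
        model, REPO_ID, subfolder=adapter_subfolder(base_model_id, variant)
    )
```

For example, `load_adapter("bigcode/starcoder2-7b", "dp3")` would request the adapter under `starcoder2-7b/dp3/adapter`; calling it downloads the base model weights, so memory and dtype settings may need adjusting for a given machine.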
 | Model | GPUs | No-DP | DP ε=3 / ε=8 |
 |---|---|---|---|
 | Granite-4.0-H-Tiny | 4 | 256 (8×8×4) | 512 (8×16×4) |
+| StarCoder2-7B | 4 | 256 (8×8×4) | 512 (8×16×4) |
 | Qwen3-4B-Instruct | 8 | 256 (8×4×8) | 512 (8×8×8) |
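
The parenthesised factors in the table above multiply out to the batch size in front of them. A short check is sketched below; reading the first two factors as per-device micro-batch size and gradient-accumulation steps is an assumption (the card does not label them here), while the third factor matches the GPUs column. `BATCH_FACTORS` and `effective_batch_size` are hypothetical names.

```python
# Factors copied from the table; the meaning of the first two is assumed.
BATCH_FACTORS = {
    ("Granite-4.0-H-Tiny", "no_dp"): (8, 8, 4),
    ("Granite-4.0-H-Tiny", "dp"): (8, 16, 4),
    ("StarCoder2-7B", "no_dp"): (8, 8, 4),
    ("StarCoder2-7B", "dp"): (8, 16, 4),
    ("Qwen3-4B-Instruct", "no_dp"): (8, 4, 8),
    ("Qwen3-4B-Instruct", "dp"): (8, 8, 8),
}


def effective_batch_size(per_device: int, accumulation_steps: int, num_gpus: int) -> int:
    """Sequences contributing to one optimizer step across all GPUs."""
    return per_device * accumulation_steps * num_gpus


for (model, regime), factors in BATCH_FACTORS.items():
    print(f"{model:20s} {regime:6s} {effective_batch_size(*factors)}")
```

Under this reading, every model uses 256 sequences per step without DP and 512 with DP, with the 4-GPU runs doubling the second factor to reach the same totals as the 8-GPU runs.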
 ### Differential Privacy
 ### Infrastructure
+- **GPUs:** NVIDIA H200 (140 GB VRAM each) — 4 GPUs for Granite and StarCoder2, 8 GPUs for Qwen
 - **CUDA:** 13.0
 - **Distributed strategy:** DDP (Distributed Data Parallel) with NCCL backend
 | Model | No-DP | DP ε=3 | DP ε=8 |
 |---|---|---|---|
 | Granite-4.0-H-Tiny | 0.946 | 1.044 | 1.038 |
+| StarCoder2-7B | 0.745 | 0.843 | 0.841 |
 | Qwen3-4B-Instruct | 0.808 | 0.941 | 0.925 |
 ### Privacy Audit
 | Granite-4.0-H-Tiny | base | 1.000 | 1.000 | 3.02 |
 | Granite-4.0-H-Tiny | dp3 | 0.543 | 0.513 | 0.00 |
 | Granite-4.0-H-Tiny | dp8 | 0.564 | 0.508 | 0.16 |
+| StarCoder2-7B | base | 1.000 | 0.916 | 3.02 |
+| StarCoder2-7B | dp3 | 0.526 | 0.521 | 0.00 |
+| StarCoder2-7B | dp8 | 0.520 | 0.523 | 0.00 |
 | Qwen3-4B-Instruct | base | 0.969 | 0.884 | 3.02 |
 | Qwen3-4B-Instruct | dp3 | 0.505 | 0.515 | 0.00 |
 | Qwen3-4B-Instruct | dp8 | 0.515 | 0.516 | 0.00 |
 │   ├── base/                    # No-DP baseline
 │   ├── dp3/                     # DP ε=3
 │   └── dp8/                     # DP ε=8
+├── starcoder2-7b/
 │   ├── base/
 │   ├── dp3/
 │   └── dp8/