Yash1005 committed on
Commit
f715ac6
·
verified ·
1 Parent(s): dfa9607

docs: add model card with eval metrics on held-out test set

Files changed (1)
  1. README.md +201 -199
README.md CHANGED
@@ -1,207 +1,209 @@
1
  ---
2
  base_model: Qwen/Qwen3.5-2B
3
  library_name: peft
4
  pipeline_tag: text-generation
5
  tags:
6
- - base_model:adapter:Qwen/Qwen3.5-2B
7
- - lora
8
- - transformers
9
  ---
10
-
11
- # Model Card for Model ID
12
-
13
- <!-- Provide a quick summary of what the model is/does. -->
14
-
15
-
16
-
17
- ## Model Details
18
-
19
- ### Model Description
20
-
21
- <!-- Provide a longer summary of what this model is. -->
22
-
23
-
24
-
25
- - **Developed by:** [More Information Needed]
26
- - **Funded by [optional]:** [More Information Needed]
27
- - **Shared by [optional]:** [More Information Needed]
28
- - **Model type:** [More Information Needed]
29
- - **Language(s) (NLP):** [More Information Needed]
30
- - **License:** [More Information Needed]
31
- - **Finetuned from model [optional]:** [More Information Needed]
32
-
33
- ### Model Sources [optional]
34
-
35
- <!-- Provide the basic links for the model. -->
36
-
37
- - **Repository:** [More Information Needed]
38
- - **Paper [optional]:** [More Information Needed]
39
- - **Demo [optional]:** [More Information Needed]
40
-
41
- ## Uses
42
-
43
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
44
-
45
- ### Direct Use
46
-
47
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
48
-
49
- [More Information Needed]
50
-
51
- ### Downstream Use [optional]
52
-
53
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
54
-
55
- [More Information Needed]
56
-
57
- ### Out-of-Scope Use
58
-
59
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
60
-
61
- [More Information Needed]
62
-
63
- ## Bias, Risks, and Limitations
64
-
65
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
66
-
67
- [More Information Needed]
68
-
69
- ### Recommendations
70
-
71
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
72
-
73
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
74
-
75
- ## How to Get Started with the Model
76
-
77
- Use the code below to get started with the model.
78
-
79
- [More Information Needed]
80
-
81
- ## Training Details
82
-
83
- ### Training Data
84
-
85
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
86
-
87
- [More Information Needed]
88
-
89
- ### Training Procedure
90
-
91
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
92
-
93
- #### Preprocessing [optional]
94
-
95
- [More Information Needed]
96
-
97
-
98
- #### Training Hyperparameters
99
-
100
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
101
-
102
- #### Speeds, Sizes, Times [optional]
103
-
104
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
105
-
106
- [More Information Needed]
107
-
108
  ## Evaluation
109
 
110
- <!-- This section describes the evaluation protocols and provides the results. -->
111
-
112
- ### Testing Data, Factors & Metrics
113
-
114
- #### Testing Data
115
-
116
- <!-- This should link to a Dataset Card if possible. -->
117
-
118
- [More Information Needed]
119
-
120
- #### Factors
121
-
122
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
123
-
124
- [More Information Needed]
125
-
126
- #### Metrics
127
-
128
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
129
-
130
- [More Information Needed]
131
-
132
- ### Results
133
-
134
- [More Information Needed]
135
-
136
- #### Summary
137
-
138
-
139
-
140
- ## Model Examination [optional]
141
-
142
- <!-- Relevant interpretability work for the model goes here -->
143
-
144
- [More Information Needed]
145
-
146
- ## Environmental Impact
147
-
148
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
149
-
150
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
151
-
152
- - **Hardware Type:** [More Information Needed]
153
- - **Hours used:** [More Information Needed]
154
- - **Cloud Provider:** [More Information Needed]
155
- - **Compute Region:** [More Information Needed]
156
- - **Carbon Emitted:** [More Information Needed]
157
-
158
- ## Technical Specifications [optional]
159
-
160
- ### Model Architecture and Objective
161
-
162
- [More Information Needed]
163
-
164
- ### Compute Infrastructure
165
-
166
- [More Information Needed]
167
-
168
- #### Hardware
169
-
170
- [More Information Needed]
171
-
172
- #### Software
173
-
174
- [More Information Needed]
175
-
176
- ## Citation [optional]
177
-
178
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
179
-
180
- **BibTeX:**
181
-
182
- [More Information Needed]
183
-
184
- **APA:**
185
-
186
- [More Information Needed]
187
-
188
- ## Glossary [optional]
189
-
190
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
191
-
192
- [More Information Needed]
193
-
194
- ## More Information [optional]
195
-
196
- [More Information Needed]
197
-
198
- ## Model Card Authors [optional]
199
-
200
- [More Information Needed]
201
-
202
- ## Model Card Contact
203
-
204
- [More Information Needed]
205
- ### Framework versions
206
-
207
- - PEFT 0.19.1
 
1
  ---
2
+ license: apache-2.0
3
  base_model: Qwen/Qwen3.5-2B
4
  library_name: peft
5
  pipeline_tag: text-generation
6
+ language:
7
+ - en
8
  tags:
9
+ - lora
10
+ - peft
11
+ - qwen
12
+ - guardrails
13
+ - code-detection
14
+ - language-identification
15
+ - multi-label-classification
16
+ - quantization
17
+ - 8-bit
18
+ metrics:
19
+ - accuracy
20
+ - f1
21
+ - precision
22
+ - recall
23
+ model-index:
24
+ - name: PromptInjection-Qwen3.5-2B-v5
25
+ results:
26
+ - task:
27
+ type: text-classification
28
+ name: Multi-label Programming Language Identification
29
+ dataset:
30
+ name: LangID Guard Held-out Test Set
31
+ type: custom
32
+ metrics:
33
+ - type: accuracy
34
+ name: is_valid accuracy
35
+ value: 1.0000
36
+ - type: accuracy
37
+ name: language-set exact match
38
+ value: 0.9600
39
+ - type: f1
40
+ name: binary F1 (positive=contains code)
41
+ value: 1.0000
42
+ - type: f1
43
+ name: macro F1 over languages
44
+ value: 0.9696
45
+ - type: precision
46
+ name: binary precision (positive=contains code)
47
+ value: 1.0000
48
+ - type: recall
49
+ name: binary recall (positive=contains code)
50
+ value: 1.0000
51
  ---
52
+ # PromptInjection-Qwen3.5-2B-v5
53
+ LoRA adapter for **Qwen/Qwen3.5-2B** that identifies which programming languages are embedded in a user prompt across **25 languages and configuration formats**. Trained on a combined dataset of Rosetta Code snippets and curated config-language samples (Dockerfile, YAML, Terraform, Makefile, SQL).
54
+ The model is fine-tuned to emit a strict JSON object describing the languages found:
55
+
56
+ ```json
57
+ {"is_valid": true, "category": {"Python": true, "Bash": true}}
58
+ ```
59
+
60
+ `is_valid` is `true` when at least one code/config snippet is present and `false` for natural-language-only prompts. `category` contains only the detected languages, each mapped to `true`; if no code is present, `category` is `{}`.
61
+ ## Quick start
62
+ ```python
63
+ from peft import PeftModel
64
+ from transformers import AutoModelForCausalLM, AutoTokenizer
65
+ import torch, json, re
66
+
67
+ BASE = "Qwen/Qwen3.5-2B"
68
+ ADAPTER = "Accuknoxtechnologies/PromptInjection-Qwen3.5-2B-v5"
69
+
70
+ SYSTEM_MSG = """You are a code language identifier. For the given user prompt, decide whether it contains any embedded source code (program source or recognizable code-like configuration). Output exactly one JSON object and nothing else: {"is_valid": <true|false>, "category": {"<Lang>": true, ...}}.
71
+ No preamble. No explanation. No <think> tags. No markdown code fences. No trailing prose.
72
+ Rules:
73
+ - is_valid is TRUE when the prompt contains at least one code/config snippet, FALSE when the prompt is plain natural-language only.
74
+ - category contains ONLY the languages that appear, each mapped to true. If no code is present, category is the empty object {}.
75
+ - When multiple languages appear, list every distinct one (still only true).
76
+ Allowed language keys (use these exact spellings):
77
+ Python, JavaScript, Java, C, C++, C#, Go, Rust, Kotlin, Swift, Ruby, R, Scala, Perl, Lua, Bash, PowerShell, Batch, SQL, Dockerfile, YAML, Makefile, Terraform, AWK, jq
78
+
79
+ Examples:
80
+
81
+ Input: What's the weather forecast today?
82
+ Output: {"is_valid": false, "category": {}}
83
+
84
+ Input: Run this for me: print('hello world')
85
+ Output: {"is_valid": true, "category": {"Python": true}}
86
+
87
+ Input: Compare these — SELECT * FROM users vs the snippet: console.log(users)
88
+ Output: {"is_valid": true, "category": {"SQL": true, "JavaScript": true}}"""
89
+
90
+ tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
91
+ model = AutoModelForCausalLM.from_pretrained(
92
+     BASE, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
93
+ )
94
+ model = PeftModel.from_pretrained(model, ADAPTER); model.eval()
95
+
96
+ def langid(prompt: str) -> dict:
97
+     chat = tokenizer.apply_chat_template(
98
+         [{"role": "system", "content": SYSTEM_MSG},
99
+          {"role": "user", "content": prompt}],
100
+         tokenize=False, add_generation_prompt=True, enable_thinking=False)
101
+     inputs = tokenizer(chat, return_tensors="pt").to(model.device)
102
+     out = model.generate(**inputs, max_new_tokens=220, do_sample=False)
103
+     text = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
104
+     return json.loads(re.search(r'\{.*\}', text, re.DOTALL).group(0))
105
+ ```
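
The last line of `langid` assumes the generated text always contains a well-formed JSON object; if the regex finds nothing, `re.search` returns `None` and the call fails with an `AttributeError`. A minimal sketch of a stricter parser follows. The helper name `parse_langid_output` and its validation rules are assumptions for illustration (they are not part of the adapter or of `eval_and_push_card.py`); the allowed keys are copied from the system prompt.

```python
import json
import re

# Allowed keys, copied from the system prompt in this model card.
ALLOWED_LANGS = {
    "Python", "JavaScript", "Java", "C", "C++", "C#", "Go", "Rust", "Kotlin",
    "Swift", "Ruby", "R", "Scala", "Perl", "Lua", "Bash", "PowerShell",
    "Batch", "SQL", "Dockerfile", "YAML", "Makefile", "Terraform", "AWK", "jq",
}


def parse_langid_output(text: str) -> dict:
    """Extract the JSON object from raw generated text and check its schema.

    Hypothetical helper: raises ValueError instead of failing on a None match.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc}") from exc

    is_valid = obj.get("is_valid")
    category = obj.get("category")
    if not isinstance(is_valid, bool) or not isinstance(category, dict):
        raise ValueError("expected keys 'is_valid' (bool) and 'category' (object)")

    unknown = set(category) - ALLOWED_LANGS
    if unknown:
        raise ValueError(f"unknown language keys: {sorted(unknown)}")
    if any(value is not True for value in category.values()):
        raise ValueError("every category value must be true")
    # The card states that category is {} when no code is present.
    if not is_valid and category:
        raise ValueError("category must be empty when is_valid is false")
    return obj
```

In `langid`, the final `return` could then become `return parse_langid_output(text)`, so a guardrail pipeline can catch `ValueError` and decide how to treat an unparseable response.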
106
+
107
+ ## System prompt
108
+ The model was trained with the exact system prompt below. Pass it verbatim at inference time — the output schema depends on this prompt.
109
+
110
+ ```text
111
+ You are a code language identifier. For the given user prompt, decide whether it contains any embedded source code (program source or recognizable code-like configuration). Output exactly one JSON object and nothing else: {"is_valid": <true|false>, "category": {"<Lang>": true, ...}}.
112
+ No preamble. No explanation. No <think> tags. No markdown code fences. No trailing prose.
113
+ Rules:
114
+ - is_valid is TRUE when the prompt contains at least one code/config snippet, FALSE when the prompt is plain natural-language only.
115
+ - category contains ONLY the languages that appear, each mapped to true. If no code is present, category is the empty object {}.
116
+ - When multiple languages appear, list every distinct one (still only true).
117
+ Allowed language keys (use these exact spellings):
118
+ Python, JavaScript, Java, C, C++, C#, Go, Rust, Kotlin, Swift, Ruby, R, Scala, Perl, Lua, Bash, PowerShell, Batch, SQL, Dockerfile, YAML, Makefile, Terraform, AWK, jq
119
+
120
+ Examples:
121
+
122
+ Input: What's the weather forecast today?
123
+ Output: {"is_valid": false, "category": {}}
124
+
125
+ Input: Run this for me: print('hello world')
126
+ Output: {"is_valid": true, "category": {"Python": true}}
127
+
128
+ Input: Compare these — SELECT * FROM users vs the snippet: console.log(users)
129
+ Output: {"is_valid": true, "category": {"SQL": true, "JavaScript": true}}
130
+ ```
131
  ## Evaluation
132
+ Evaluated on **200 held-out prompts** drawn from `test_dataset_langid.csv` (same single + multi + benign composition as training).
133
+
134
+ - Evaluation timestamp: `2026-05-22 00:42 UTC`
135
+ - GPU: `NVIDIA A10G`
136
+ - Source adapter: `Accuknoxtechnologies/PromptInjection-Qwen3.5-2B-v5`
137
+ - JSON parse errors: `0/200` (`0.0%`)
138
+ ### Top-level metrics
139
+ | Metric | Value |
140
+ |---|---:|
141
+ | `is_valid` accuracy | **1.0000** |
142
+ | Language-set exact match | **0.9600** |
143
+ | Binary F1 (positive = contains code) | **1.0000** |
144
+ | Binary precision | 1.0000 |
145
+ | Binary recall | 1.0000 |
146
+ | Macro F1 across languages | **0.9696** |
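
The card does not show how `eval_and_push_card.py` computes these values, so the following is only a sketch of one plausible definition: `is_valid` accuracy compares the boolean flag, and language-set exact match counts a prompt as correct only when the predicted set of `category` keys equals the gold set. The function name `score` and the toy records are hypothetical.

```python
def score(golds: list, preds: list) -> dict:
    """Compute is_valid accuracy and language-set exact match.

    Each record is a dict shaped like the model output:
    {"is_valid": bool, "category": {"<Lang>": True, ...}}.
    """
    if len(golds) != len(preds) or not golds:
        raise ValueError("golds and preds must be non-empty and the same length")
    n = len(golds)
    valid_hits = sum(g["is_valid"] == p["is_valid"] for g, p in zip(golds, preds))
    # Exact match: the predicted key set must equal the gold key set.
    exact_hits = sum(
        set(g["category"]) == set(p["category"]) for g, p in zip(golds, preds)
    )
    return {"is_valid_accuracy": valid_hits / n, "exact_match": exact_hits / n}


# Toy data (hypothetical, not drawn from the test set).
golds = [
    {"is_valid": True, "category": {"Python": True}},
    {"is_valid": False, "category": {}},
    {"is_valid": True, "category": {"SQL": True, "JavaScript": True}},
    {"is_valid": True, "category": {"Go": True}},
]
preds = [
    {"is_valid": True, "category": {"Python": True}},
    {"is_valid": False, "category": {}},
    {"is_valid": True, "category": {"SQL": True}},  # misses JavaScript
    {"is_valid": True, "category": {"Go": True}},
]
result = score(golds, preds)
```

Under this definition, an exact-match score of 0.9600 on 200 prompts corresponds to 192 prompts with a fully correct language set.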
147
+ ### Confusion matrix — binary `is_valid` decision
148
+ Positive class = the prompt **contains code** (`is_valid=True`).
149
+
150
+ | | predicted contains-code | predicted no-code |
151
+ |---|---:|---:|
152
+ | **actual contains-code** | TP = 181 | FN = 0 |
153
+ | **actual no-code** | FP = 0 | TN = 19 |
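
The binary metrics in the top-level table follow directly from these four counts. As a worked check, using the standard definitions of precision, recall, F1, and accuracy (the function name `binary_metrics` is illustrative):

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Counts taken from the confusion matrix above.
m = binary_metrics(tp=181, fp=0, fn=0, tn=19)
```

With no false positives and no false negatives, all four values equal 1.0, matching the reported `is_valid` accuracy, precision, recall, and F1.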
154
+ ### Per-language metrics
155
+ Only languages that appear in either the actual or predicted labels are listed.
156
+
157
+ | Language | support | precision | recall | F1 |
158
+ |---|---:|---:|---:|---:|
159
+ | `Python` | 14 | 1.000 | 1.000 | 1.000 |
160
+ | `Terraform` | 14 | 1.000 | 1.000 | 1.000 |
161
+ | `Java` | 12 | 1.000 | 1.000 | 1.000 |
162
+ | `C` | 12 | 1.000 | 1.000 | 1.000 |
163
+ | `Rust` | 12 | 1.000 | 1.000 | 1.000 |
164
+ | `AWK` | 12 | 1.000 | 0.917 | 0.957 |
165
+ | `Ruby` | 11 | 0.917 | 1.000 | 0.957 |
166
+ | `R` | 11 | 1.000 | 1.000 | 1.000 |
167
+ | `Go` | 10 | 1.000 | 0.900 | 0.947 |
168
+ | `Swift` | 10 | 1.000 | 0.900 | 0.947 |
169
+ | `Scala` | 10 | 1.000 | 0.800 | 0.889 |
170
+ | `SQL` | 10 | 1.000 | 1.000 | 1.000 |
171
+ | `jq` | 10 | 0.909 | 1.000 | 0.952 |
172
+ | `JavaScript` | 9 | 0.900 | 1.000 | 0.947 |
173
+ | `Kotlin` | 9 | 1.000 | 1.000 | 1.000 |
174
+ | `Perl` | 9 | 1.000 | 1.000 | 1.000 |
175
+ | `PowerShell` | 9 | 1.000 | 1.000 | 1.000 |
176
+ | `Batch` | 9 | 1.000 | 1.000 | 1.000 |
177
+ | `YAML` | 9 | 1.000 | 0.889 | 0.941 |
178
+ | `C++` | 7 | 1.000 | 0.857 | 0.923 |
179
+ | `C#` | 7 | 0.875 | 1.000 | 0.933 |
180
+ | `Lua` | 7 | 1.000 | 0.857 | 0.923 |
181
+ | `Bash` | 7 | 1.000 | 1.000 | 1.000 |
182
+ | `Dockerfile` | 6 | 0.857 | 1.000 | 0.923 |
183
+ | `Makefile` | 6 | 1.000 | 1.000 | 1.000 |
184
+
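
The macro F1 in the top-level table can be cross-checked against this per-language table. The sketch below recomputes each F1 from the precision and recall columns and takes the unweighted mean over the 25 languages. Because the table values are rounded to three decimals, the result is only expected to agree with the reported 0.9696 to within rounding error.

```python
# (precision, recall) pairs copied from the per-language table.
PER_LANGUAGE = {
    "Python": (1.000, 1.000), "Terraform": (1.000, 1.000),
    "Java": (1.000, 1.000), "C": (1.000, 1.000), "Rust": (1.000, 1.000),
    "AWK": (1.000, 0.917), "Ruby": (0.917, 1.000), "R": (1.000, 1.000),
    "Go": (1.000, 0.900), "Swift": (1.000, 0.900), "Scala": (1.000, 0.800),
    "SQL": (1.000, 1.000), "jq": (0.909, 1.000), "JavaScript": (0.900, 1.000),
    "Kotlin": (1.000, 1.000), "Perl": (1.000, 1.000),
    "PowerShell": (1.000, 1.000), "Batch": (1.000, 1.000),
    "YAML": (1.000, 0.889), "C++": (1.000, 0.857), "C#": (0.875, 1.000),
    "Lua": (1.000, 0.857), "Bash": (1.000, 1.000),
    "Dockerfile": (0.857, 1.000), "Makefile": (1.000, 1.000),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro average: every language counts equally, regardless of support.
macro_f1 = sum(f1(p, r) for p, r in PER_LANGUAGE.values()) / len(PER_LANGUAGE)
```

Macro averaging weights `Dockerfile` (support 6) the same as `Python` (support 14), so the low-support languages have a visible effect on the headline number.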
185
+ ### Inference latency
186
+ - Mean: **0.99 s/prompt**
187
+ - Median: 0.94 s/prompt
188
+ - p95: 1.35 s/prompt
189
+ - Max: 1.63 s/prompt
190
+
191
+ ## Training setup
192
+ - Base model: `Qwen/Qwen3.5-2B`, loaded without quantization in 16-bit precision (bf16 or fp16; no `bitsandbytes`)
193
+ - LoRA: r=16, alpha=32, dropout=0.05, target modules = {q,k,v,o,gate,up,down}_proj
194
+ - Optimizer: adamw_torch, lr=1e-4, cosine schedule, warmup 5%
195
+ - Precision: bf16 if available, else fp16
196
+ - Effective batch size: 8 (per-device batch size 1 × 8 gradient-accumulation steps), gradient checkpointing on
197
+ - Max sequence length: 3200 tokens
198
+ - Training data: 10,000 rows (7,000 single-language + 2,000 multi-language + 1,000 benign)
199
+ - Languages: 25 (programming + config formats)
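
As a rough worked example of what these settings imply for the schedule, the sketch below derives the effective batch size, optimizer steps per epoch, and warmup length. The card does not state the number of devices or epochs, so `num_devices=1` and `num_epochs=1` are placeholder assumptions, and rounding the warmup up is likewise an assumption rather than a documented detail of the training script.

```python
import math


def schedule(
    num_rows: int,
    per_device_batch: int,
    grad_accum: int,
    num_devices: int = 1,      # assumption: not stated in the card
    num_epochs: int = 1,       # assumption: not stated in the card
    warmup_ratio: float = 0.05,
) -> dict:
    """Derive optimizer-step counts from batch and dataset sizes."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_rows / effective_batch)
    total_steps = steps_per_epoch * num_epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# 10,000 training rows, per-device batch 1, gradient accumulation 8.
plan = schedule(num_rows=10_000, per_device_batch=1, grad_accum=8)
```

Under these assumptions the effective batch size is 8 and one pass over the data takes 1,250 optimizer steps; the totals scale linearly with whatever epoch count was actually used.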
200
+
201
+ ## Supported languages
202
+ The model emits one or more of these keys in the `category` map of its JSON output:
203
+
204
+ ```
205
+ Python, JavaScript, Java, C, C++, C#, Go, Rust, Kotlin, Swift, Ruby, R, Scala, Perl, Lua, Bash, PowerShell, Batch, SQL, Dockerfile, YAML, Makefile, Terraform, AWK, jq
206
+ ```
207
 
208
+ ---
209
+ *Model card generated automatically by `eval_and_push_card.py` on 2026-05-22 00:42 UTC.*