anicka committed on
Commit fcaf6ce · verified · 1 Parent(s): 958a1f8

Document safety posture, prompt echo regression, and test generation limits

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -160,12 +160,25 @@ teapot train configs/cve-backport.config --backend qlora-hf
  teapot eval configs/cve-backport.config
  ```
 
- All versioned datasets available at the dataset repo (`train-v1.jsonl` through `train-v4.jsonl`).
 
  ## Intended Use
 
  This model assists with security patch backporting in Linux distribution maintenance. It is a research tool: all generated patches must be reviewed by a maintainer before application.
 
  ## License
 
  Apache-2.0 (inherited from Qwen2.5-Coder-32B-Instruct).
 
  teapot eval configs/cve-backport.config
  ```
 
+ Dataset: [anicka/cve-backport-codegen-dataset](https://huggingface.co/datasets/anicka/cve-backport-codegen-dataset) (`train.jsonl` + `eval.jsonl`).
 
  ## Intended Use
 
  This model assists with security patch backporting in Linux distribution maintenance. It is a research tool: all generated patches must be reviewed by a maintainer before application.
 
+ **Important:** This model was fine-tuned for code generation accuracy, not for safety alignment. It inherits the base model's safety training but has no additional guardrails. In particular:
+
+ - The model follows fix descriptions literally. If a fix description contains malicious instructions (e.g., "add a backdoor"), the model will comply. **Fix descriptions must come from trusted sources**, typically upstream patches, not user input.
+ - The tool is designed for trusted inputs (upstream CVE patches, OBS source packages). It should not be exposed as a public API without input validation.
+ - Generated patches and test cases must always be reviewed by a maintainer before application.
+
+ Adding safety training to the fine-tuning was considered but deliberately deferred: our evaluation showed that domain precision (98% in v3) is sensitive to training data composition, and mixing in safety examples risks degrading the model's core capability. The correct mitigation is input validation in the tool, not model-level refusal.
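
The input validation mentioned above could take many forms; the following is only a minimal sketch, not the tool's actual code. It checks where a fix description came from rather than trying to judge its text. The names `FixDescription`, `validate_fix_description`, `TRUSTED_SOURCES`, the source identifiers, and the length limit are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical allowlist. The README names upstream CVE patches and OBS
# source packages as trusted inputs; these identifiers are invented here.
TRUSTED_SOURCES = frozenset({"upstream-patch", "obs-source-package"})
MAX_DESCRIPTION_CHARS = 4000  # arbitrary illustrative limit


@dataclass(frozen=True)
class FixDescription:
    text: str
    source: str  # provenance label attached by the caller


def validate_fix_description(desc: FixDescription) -> list[str]:
    """Return a list of problems; an empty list means the input is accepted."""
    problems = []
    if desc.source not in TRUSTED_SOURCES:
        problems.append(f"untrusted source: {desc.source!r}")
    if not desc.text.strip():
        problems.append("empty description")
    if len(desc.text) > MAX_DESCRIPTION_CHARS:
        problems.append("description too long")
    # Reject control characters other than newline and tab.
    if any(ord(ch) < 32 and ch not in "\n\t" for ch in desc.text):
        problems.append("control characters in description")
    return problems
```

A caller would refuse to run generation whenever the returned list is non-empty. The sketch validates provenance and basic shape only; it does not attempt to detect malicious intent in the description itself.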
+
+ ## Known Issues
+
+ - **Prompt echo (v4):** The v4 model occasionally echoes prompt structure (`## File:`, markdown fences) into its code output, likely an artifact of the 5-turn test generation training data. The CLI tool strips these automatically. This is a minor regression from v3.
+ - **Test generation quality varies:** Test cases for simple vulnerability patterns (null dereference, bounds check, injection) are useful. For complex multi-file patches with adapted context, the model may produce generic placeholder tests.
+
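
To illustrate the prompt echo issue, here is a rough sketch of how echoed structure might be stripped from model output. It is not the CLI tool's implementation: the function name `strip_prompt_echo` is hypothetical, and it assumes the echo appears only as whole lines starting with `## File:` or with a markdown fence marker, the two shapes named above.

```python
FENCE = "`" * 3           # markdown fence marker
FILE_HEADER = "## File:"  # echoed prompt header named in the README


def strip_prompt_echo(output: str) -> str:
    """Drop echoed header and fence lines; keep every other line unchanged."""
    kept = []
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith(FILE_HEADER) or stripped.startswith(FENCE):
            continue  # assumed to be echoed prompt structure, not code
        kept.append(line)
    return "\n".join(kept)
```

A line-based filter like this would also remove legitimate source lines that happen to start with the same markers (for example in a patched Markdown file), so a real implementation may need to be more selective.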
  ## License
 
  Apache-2.0 (inherited from Qwen2.5-Coder-32B-Instruct).