btl-2-coder-7B / README.md
affableiq's picture
Release btl-2-coder-7B api4k-template1k LoRA adapter
3f0df2e verified
|
Raw
History Blame Contribute Delete
2.73 kB
---
license: apache-2.0
base_model: unsloth/Qwen2.5-Coder-7B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- code
- code-review
- security
- qwen2.5-coder
- lora
- bad-theory-labs
model_name: btl-2-coder-7B
---
# BTL-2 Coder 7B
BTL-2 Coder 7B is a LoRA adapter for `unsloth/Qwen2.5-Coder-7B-Instruct`, trained for structured code-review findings.
## Intended Use
Use this model for local-first code review:
- SQL injection
- path traversal
- authorization bypass
- missing error handling
- boundary/off-by-one logic
- related security and correctness bugs
It is not yet a general autonomous coding agent and should not be marketed as a SWE-Bench repair model.
## Training
- Base: `unsloth/Qwen2.5-Coder-7B-Instruct`
- Trainer: Unsloth LoRA SFT
- Data: `4,000` API teacher traces + `1,000` template traces
- Split: `4,500` train / `500` eval
- Epochs: `2`
- Max length: `4096`
Only redacted, opt-in traces should be used for future training.
## Prompt
Use strict schema prompting:
```text
Return only a JSON array. No markdown and no wrapper object.
Each finding must include: severity, file, line, title, evidence, recommendation, confidence.
severity must be exactly one of: critical, high, medium, low.
Never put a category in severity.
confidence must be a number from 0 to 1, never a string label.
Every finding must include concrete evidence and a non-empty recommendation.
```
Example output:
```json
[
{
"severity": "critical",
"file": "src/users.ts",
"line": 42,
"title": "SQL injection through string-built query",
"evidence": "The user id is concatenated directly into the SQL string.",
"recommendation": "Use a parameterized query.",
"confidence": 0.96
}
]
```
## Evaluation
| Eval | JSON parse | Schema valid | Numeric confidence | Category hit | File hit | Precision | Recall | Weighted severity recall |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Heldout 100 strict | 1.000 | 0.952 | 1.000 | 0.783 | 0.840 | - | - | - |
| Heldout 30 strict v2 | 1.000 | 0.975 | 1.000 | 0.867 | 0.867 | - | - | - |
| Seeded 15 strict | 1.000 | 1.000 | 1.000 | 0.933 | 1.000 | 0.933 | 0.933 | 0.956 |
## Limitations
- Strict schema prompting is required for best results.
- The model may miss subtle multi-file issues.
- The model can produce plausible but incorrect findings; keep human review in the loop.
- Do not use on private repositories unless you control the inference environment and data policy.
## Release Artifacts
This Hugging Face repo should include:
- `adapter_model.safetensors`
- `adapter_config.json`
- `tokenizer.json`
- `tokenizer_config.json`
- `chat_template.jinja`
- `training_args.bin`
- this `README.md`