---
license: apache-2.0
base_model: unsloth/Qwen2.5-Coder-7B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- code
- code-review
- security
- qwen2.5-coder
- lora
- bad-theory-labs
model_name: btl-2-coder-7B
---

# BTL-2 Coder 7B

BTL-2 Coder 7B is a LoRA adapter for `unsloth/Qwen2.5-Coder-7B-Instruct`, trained for structured code-review findings.

## Intended Use

Use this model for local-first code review:

- SQL injection
- path traversal
- authorization bypass
- missing error handling
- boundary/off-by-one logic
- related security and correctness bugs

It is not yet a general autonomous coding agent and should not be marketed as a SWE-Bench repair model.

## Training

- Base: `unsloth/Qwen2.5-Coder-7B-Instruct`
- Trainer: Unsloth LoRA SFT
- Data: `4,000` API teacher traces + `1,000` template traces
- Split: `4,500` train / `500` eval
- Epochs: `2`
- Max length: `4096`

Only redacted, opt-in traces should be used for future training.

## Prompt

Use strict schema prompting:

```text
Return only a JSON array. No markdown and no wrapper object.
Each finding must include: severity, file, line, title, evidence, recommendation, confidence.
severity must be exactly one of: critical, high, medium, low. Never put a category in severity.
confidence must be a number from 0 to 1, never a string label.
Every finding must include concrete evidence and a non-empty recommendation.
```

Example output:

```json
[
  {
    "severity": "critical",
    "file": "src/users.ts",
    "line": 42,
    "title": "SQL injection through string-built query",
    "evidence": "The user id is concatenated directly into the SQL string.",
    "recommendation": "Use a parameterized query.",
    "confidence": 0.96
  }
]
```

## Evaluation

| Eval | JSON parse | Schema valid | Numeric confidence | Category hit | File hit | Precision | Recall | Weighted severity recall |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Heldout 100 strict | 1.000 | 0.952 | 1.000 | 0.783 | 0.840 | - | - | - |
| Heldout 30 strict v2 | 1.000 | 0.975 | 1.000 | 0.867 | 0.867 | - | - | - |
| Seeded 15 strict | 1.000 | 1.000 | 1.000 | 0.933 | 1.000 | 0.933 | 0.933 | 0.956 |

## Limitations

- Strict schema prompting is required for best results.
- The model may miss subtle multi-file issues.
- The model can produce plausible but incorrect findings; keep human review in the loop.
- Do not use on private repositories unless you control the inference environment and data policy.

## Release Artifacts

This Hugging Face repo should include:

- `adapter_model.safetensors`
- `adapter_config.json`
- `tokenizer.json`
- `tokenizer_config.json`
- `chat_template.jinja`
- `training_args.bin`
- this `README.md`
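
The schema rules from the Prompt section can be checked mechanically before findings reach a reviewer. The following is a minimal sketch using only the Python standard library. The names `validate_findings`, `REQUIRED_FIELDS`, and `ALLOWED_SEVERITIES` are hypothetical and are not part of this repo, and this is not the harness that produced the "Schema valid" column in the Evaluation table.

```python
import json

# Field names and severity values are taken from the strict schema prompt.
REQUIRED_FIELDS = (
    "severity", "file", "line", "title",
    "evidence", "recommendation", "confidence",
)
ALLOWED_SEVERITIES = {"critical", "high", "medium", "low"}


def validate_findings(raw_output):
    """Return a list of schema problems; an empty list means the output passed.

    Hypothetical helper: an illustration of the prompt's rules, not an official API.
    """
    try:
        findings = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(findings, list):
        return ["top-level value must be a JSON array, not a wrapper object"]

    problems = []
    for index, finding in enumerate(findings):
        if not isinstance(finding, dict):
            problems.append(f"finding {index}: must be a JSON object")
            continue

        missing = [name for name in REQUIRED_FIELDS if name not in finding]
        if missing:
            problems.append(f"finding {index}: missing fields {missing}")

        severity = finding.get("severity")
        if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
            problems.append(
                f"finding {index}: severity must be one of {sorted(ALLOWED_SEVERITIES)}"
            )

        confidence = finding.get("confidence")
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
        if not is_number or not 0 <= confidence <= 1:
            problems.append(f"finding {index}: confidence must be a number from 0 to 1")

        for name in ("evidence", "recommendation"):
            value = finding.get(name)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"finding {index}: {name} must be a non-empty string")
    return problems


if __name__ == "__main__":
    # A category in severity, a string confidence label, and an empty
    # recommendation are all things the prompt forbids.
    sample = json.dumps([{
        "severity": "security",
        "file": "src/users.ts",
        "line": 42,
        "title": "SQL injection through string-built query",
        "evidence": "The user id is concatenated directly into the SQL string.",
        "recommendation": "",
        "confidence": "high",
    }])
    for problem in validate_findings(sample):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a caller log every violation in a response, or retry generation when the list is non-empty.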