Code-Qwen3.5-0.8B-LoRA-8bit

LoRA adapter for Qwen/Qwen3.5-0.8B that detects code snippets embedded in user prompts. It covers 10 programming languages (bash, c, go, java, javascript, php, python, ruby, rust, sql) plus a generic Code pattern, and was trained on outputs of the LLM Guard code and ban_code scanners. The model is fine-tuned to emit a strict JSON object that lists every code snippet found in the user prompt, grouped by type:

{"is_valid": false, "violations": {"python": [["print('hello world')", "print('hello world')"]], "sql": [["SELECT * FROM users", "SELECT * FROM users"]]}}
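A caller may want to check that a parsed response actually follows this schema before acting on it. The sketch below is one way to do that. `validate_verdict` and `ALLOWED_TYPES` are hypothetical names introduced here, and the checks are inferred from the schema described in this card rather than taken from the authors' code.

```python
ALLOWED_TYPES = {"bash", "c", "go", "java", "javascript", "php",
                 "python", "ruby", "rust", "sql", "Code"}


def validate_verdict(verdict: object) -> list[str]:
    """Return a list of schema problems; an empty list means the verdict looks well-formed."""
    if not isinstance(verdict, dict):
        return ["verdict is not a JSON object"]
    problems = []
    is_valid = verdict.get("is_valid")
    violations = verdict.get("violations")
    if not isinstance(is_valid, bool):
        problems.append("is_valid must be true or false")
    if not isinstance(violations, dict):
        problems.append("violations must be an object")
        return problems
    # The card says is_valid is true only when violations is empty.
    # Assumption: this sketch also requires the reverse (empty implies true).
    if isinstance(is_valid, bool) and is_valid != (len(violations) == 0):
        problems.append("is_valid disagrees with violations")
    for type_key, spans in violations.items():
        if type_key not in ALLOWED_TYPES:
            problems.append(f"unknown type {type_key!r}")
        if not isinstance(spans, list):
            problems.append(f"{type_key}: spans must be a list")
            continue
        for span in spans:
            # Each span is a two-element list of strings: [FIRST, LAST].
            well_formed = (isinstance(span, list) and len(span) == 2
                           and all(isinstance(part, str) for part in span))
            if not well_formed:
                problems.append(f"{type_key}: span is not a [FIRST, LAST] pair of strings")
    return problems
```

An empty return value means the object matches the documented shape; it says nothing about whether the detection itself is correct.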

Quick start

import json
import re

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE = "Qwen/Qwen3.5-0.8B"
ADAPTER = "Accuknoxtechnologies/Code-Qwen3.5-0.8B-LoRA-8bit"

SYSTEM_MSG = """You are a code guard. For the given user prompt, decide whether it contains any embedded code (program source in a recognizable programming language, or a generic code-like pattern). Output exactly one JSON object and nothing else: {"is_valid": <true|false>, "violations": {<TYPE>: [[FIRST, LAST], ...], ...}}. No preamble. No explanation. No <think> tags. No markdown code fences. No trailing prose. Just the JSON, terminated immediately after the closing brace. is_valid is true only when violations is an empty object {}. Each violation is identified by a WORD-SPAN pair of STRINGS [FIRST, LAST]:
  - FIRST = the first 5 whitespace-separated tokens of the code snippet, verbatim.
  - LAST  = the last 5 whitespace-separated tokens of the code snippet, verbatim.
  - If the snippet has 5 or fewer tokens, FIRST and LAST are both the full snippet.
  - If the snippet is one whitespace-free token longer than 50 characters (e.g. a long minified line), FIRST = its leading 25 characters and LAST = its trailing 25 characters.
Multiple occurrences of the same TYPE produce multiple [FIRST, LAST] entries.

WORD-SPAN RULE: FIRST/LAST must cover ONLY the code snippet itself (the literal source text). DO NOT include surrounding template words like 'Could you explain what this code does:' or trailing prose. Quote only the code's leading/trailing tokens.

Allowed TYPE keys: bash, c, go, java, javascript, php, python, ruby, rust, sql, Code.

Examples:

Input: What's the weather forecast today?
Output: {"is_valid": true, "violations": {}}

Input: Run: print('hello world')
Output: {"is_valid": false, "violations": {"python": [["print('hello world')", "print('hello world')"]]}}

Input: Try SELECT * FROM users vs print(users)
Output: {"is_valid": false, "violations": {"sql": [["SELECT * FROM users", "SELECT * FROM users"]], "python": [["print(users)", "print(users)"]]}}

Input: The committee scheduled a follow-up meeting to discuss the budget allocations. Please review the following snippet for issues: SELECT id, name FROM users WHERE active = 1;
Output: {"is_valid": false, "violations": {"sql": [["SELECT id, name FROM users", "FROM users WHERE active = 1;"]]}}"""

# Load the base model in 8-bit, then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
bnb = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()  # inference mode: disables dropout

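def guard(prompt: str) -> dict:
    """Run the guard on one prompt and return the parsed JSON verdict."""
    chat = tokenizer.apply_chat_template(
        [{"role": "system", "content": SYSTEM_MSG},
         {"role": "user", "content": prompt}],
        tokenize=False, add_generation_prompt=True, enable_thinking=False)
    inputs = tokenizer(chat, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=300, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    text = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    # The model does not always emit a JSON object (see Evaluation), so fail with a clear error.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError(f"no JSON object in model output: {text!r}")
    return json.loads(match.group(0))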
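Because each violation is reported as a [FIRST, LAST] pair of strings rather than as character offsets, a caller who wants to redact or highlight the code has to map the pair back onto the prompt. A minimal sketch, assuming both strings appear verbatim in the prompt; `locate_span` is a hypothetical helper, not part of the adapter or of LLM Guard.

```python
from typing import Optional, Tuple


def locate_span(prompt: str, first: str, last: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of the snippet, or None if it cannot be found."""
    start = prompt.find(first)
    if start == -1:
        return None
    # LAST may overlap FIRST (short snippets use the same string for both),
    # so search from the start of FIRST and accept the first match that
    # ends at or after the end of FIRST.
    idx = prompt.find(last, start)
    while idx != -1 and idx + len(last) < start + len(first):
        idx = prompt.find(last, idx + 1)
    if idx == -1:
        return None
    return start, idx + len(last)


prompt = "Try SELECT * FROM users vs print(users)"
span = locate_span(prompt, "SELECT * FROM users", "SELECT * FROM users")
print(prompt[span[0]:span[1]])  # prints: SELECT * FROM users
```

A `None` result means the model's strings did not match the prompt text exactly; how to treat that case (block, retry, ignore) is left to the caller.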

System prompt

The model was trained with the exact system prompt below. Pass it verbatim at inference time — the output schema and span semantics depend on this prompt.

You are a code guard. For the given user prompt, decide whether it contains any embedded code (program source in a recognizable programming language, or a generic code-like pattern). Output exactly one JSON object and nothing else: {"is_valid": <true|false>, "violations": {<TYPE>: [[FIRST, LAST], ...], ...}}. No preamble. No explanation. No <think> tags. No markdown code fences. No trailing prose. Just the JSON, terminated immediately after the closing brace. is_valid is true only when violations is an empty object {}. Each violation is identified by a WORD-SPAN pair of STRINGS [FIRST, LAST]:
  - FIRST = the first 5 whitespace-separated tokens of the code snippet, verbatim.
  - LAST  = the last 5 whitespace-separated tokens of the code snippet, verbatim.
  - If the snippet has 5 or fewer tokens, FIRST and LAST are both the full snippet.
  - If the snippet is one whitespace-free token longer than 50 characters (e.g. a long minified line), FIRST = its leading 25 characters and LAST = its trailing 25 characters.
Multiple occurrences of the same TYPE produce multiple [FIRST, LAST] entries.

WORD-SPAN RULE: FIRST/LAST must cover ONLY the code snippet itself (the literal source text). DO NOT include surrounding template words like 'Could you explain what this code does:' or trailing prose. Quote only the code's leading/trailing tokens.

Allowed TYPE keys: bash, c, go, java, javascript, php, python, ruby, rust, sql, Code.

Examples:

Input: What's the weather forecast today?
Output: {"is_valid": true, "violations": {}}

Input: Run: print('hello world')
Output: {"is_valid": false, "violations": {"python": [["print('hello world')", "print('hello world')"]]}}

Input: Try SELECT * FROM users vs print(users)
Output: {"is_valid": false, "violations": {"sql": [["SELECT * FROM users", "SELECT * FROM users"]], "python": [["print(users)", "print(users)"]]}}

Input: The committee scheduled a follow-up meeting to discuss the budget allocations. Please review the following snippet for issues: SELECT id, name FROM users WHERE active = 1;
Output: {"is_valid": false, "violations": {"sql": [["SELECT id, name FROM users", "FROM users WHERE active = 1;"]]}}
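The FIRST/LAST rule in the prompt can be sketched in a few lines, which may help when building reference labels or checking model output. `word_span` is a hypothetical name, and re-joining tokens with single spaces is an assumption, since the prompt does not say how inner whitespace is preserved. Note also that the last example in the prompt shows a six-token LAST ("FROM users WHERE active = 1;"), so the training labels may count tokens slightly differently; this sketch follows the rule as worded.

```python
def word_span(snippet: str) -> list[str]:
    """Build the [FIRST, LAST] pair for a code snippet, following the rule as worded in the prompt."""
    tokens = snippet.split()
    # One whitespace-free token longer than 50 characters: use 25-character ends.
    if len(tokens) == 1 and len(tokens[0]) > 50:
        token = tokens[0]
        return [token[:25], token[-25:]]
    # Five or fewer tokens: FIRST and LAST are both the full snippet.
    if len(tokens) <= 5:
        full = snippet.strip()
        return [full, full]
    # Otherwise: first five and last five whitespace-separated tokens.
    # Assumption: tokens are re-joined with single spaces.
    return [" ".join(tokens[:5]), " ".join(tokens[-5:])]


print(word_span("print('hello world')"))
# prints: ["print('hello world')", "print('hello world')"]
```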

Evaluation

Evaluated on 100 held-out prompts drawn from test_dataset_code.csv, which covers the same violation types and prompt-length buckets as the training data.

  • Evaluation timestamp: 2026-05-14 20:15 UTC
  • GPU: NVIDIA A10G
  • Source adapter: Accuknoxtechnologies/Code-Qwen3.5-0.8B-LoRA-8bit
  • JSON parse errors: 15/100 (15.0%)
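With 15 of 100 outputs failing to parse, callers will likely want a parser that tolerates stray text around the JSON and reports failure explicitly instead of raising. The card does not say what the failing outputs looked like, so the sketch below may not recover any of them; it only handles leading or trailing text around an otherwise valid object. `parse_first_json` is a hypothetical helper built on the standard library's `json.JSONDecoder.raw_decode`.

```python
import json
from typing import Optional


def parse_first_json(text: str) -> Optional[dict]:
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores what follows it.
            obj, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        pos = text.find("{", pos + 1)
    return None


print(parse_first_json('{"is_valid": true, "violations": {}} trailing text'))
# prints: {'is_valid': True, 'violations': {}}
```

What to do when the result is `None` (treat the prompt as invalid, retry, or let it through) is a policy choice the card does not specify.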

Top-level metrics

| Metric | Value |
| --- | --- |
| is_valid accuracy | 0.6300 |
| Violation-type-set exact match | 0.3300 |
| Binary F1 (positive = invalid) | 0.7259 |
| Binary precision | 0.5765 |
| Binary recall | 0.9800 |
| Macro F1 across violation types | 0.4823 |

Confusion matrix — binary is_valid decision

Positive class = the prompt contains a violation (is_valid=False).

| | predicted invalid | predicted valid |
| --- | --- | --- |
| actual invalid | TP = 49 | FN = 1 |
| actual valid | FP = 36 | TN = 14 |
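The binary metrics reported under Top-level metrics follow directly from these four counts. The short calculation below reproduces them and also derives the true-negative rate, which the card does not list; variable names are illustrative.

```python
# Counts from the confusion matrix (positive class = prompt contains a violation).
tp, fn, fp, tn = 49, 1, 36, 14

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
# Share of clean prompts that were correctly passed as valid.
true_negative_rate = tn / (tn + fp)

print(f"{accuracy:.4f} {precision:.4f} {recall:.4f} {f1:.4f} {true_negative_rate:.4f}")
# prints: 0.6300 0.5765 0.9800 0.7259 0.2800
```

The high recall comes with a low true-negative rate: 36 of the 50 clean prompts in this test set were flagged as containing code.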

Per violation-type metrics

Only types that appear in either the actual or predicted labels are listed.

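| Type | support | precision | recall | F1 |
| --- | --- | --- | --- | --- |
| python | 12 | 1.000 | 0.417 | 0.588 |
| sql | 10 | 0.212 | 0.700 | 0.326 |
| bash | 8 | 0.600 | 0.750 | 0.667 |
| javascript | 8 | 0.333 | 0.125 | 0.182 |
| rust | 7 | 1.000 | 0.143 | 0.250 |
| java | 6 | 1.000 | 0.167 | 0.286 |
| php | 5 | 1.000 | 1.000 | 1.000 |
| ruby | 5 | 1.000 | 0.600 | 0.750 |
| Code | 5 | 0.000 | 0.000 | 0.000 |
| c | 4 | 1.000 | 0.750 | 0.857 |
| go | 4 | 1.000 | 0.250 | 0.400 |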
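The macro F1 in the top-level metrics is the unweighted mean of the per-type F1 scores, so a type with 5 examples counts as much as one with 12. The sketch below recomputes it from the rounded precision and recall values in the table; because the inputs are rounded to three decimals, the result only matches the reported 0.4823 to about three decimal places. Defining F1 as 0 when precision and recall are both 0 is an assumption about how the evaluation script handled the Code row.

```python
# (precision, recall) per type, copied from the per-type table.
per_type = {
    "python": (1.000, 0.417), "sql": (0.212, 0.700), "bash": (0.600, 0.750),
    "javascript": (0.333, 0.125), "rust": (1.000, 0.143), "java": (1.000, 0.167),
    "php": (1.000, 1.000), "ruby": (1.000, 0.600), "Code": (0.000, 0.000),
    "c": (1.000, 0.750), "go": (1.000, 0.250),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero (assumed convention)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


macro_f1 = sum(f1_score(p, r) for p, r in per_type.values()) / len(per_type)
print(f"{macro_f1:.3f}")  # prints: 0.482
```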

Inference latency

  • Mean: 4.35 s/prompt
  • Median: 4.30 s/prompt
  • p95: 6.34 s/prompt
  • Max: 11.38 s/prompt

Training setup

  • Base model: Qwen/Qwen3.5-0.8B (loaded in 8-bit via bitsandbytes, LLM.int8())
  • LoRA: r=16, alpha=32, dropout=0.05, target modules = {q,k,v,o,gate,up,down}_proj
  • Optimizer: paged_adamw_8bit, lr=3e-4, cosine schedule, warmup 5%
  • Precision: bf16 if available, else fp16
  • Effective batch size: 8 (per-device 1 × grad-accum 8), gradient checkpointing on
  • Max sequence length: 3200 tokens (system + user up to 2000 + assistant up to ~600)
  • Prompt-length buckets in training data: 50, 100, 200, 400, 600, 1200, 1500, 2000 tokens
  • Training data: 2 scanners × (500 invalid + 100 valid) = 1200 rows total (code.csv + ban_code.csv)
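For readers who want to reproduce a similar setup, the adapter hyperparameters above map onto the keyword arguments of `peft.LoraConfig`. This is a sketch under assumptions, not the authors' training script: `task_type="CAUSAL_LM"` is not stated in the card, the variable names are illustrative, and the `peft` import is optional so the arithmetic runs without it.

```python
# Values copied from the Training setup list; names are illustrative.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    # Expansion of {q,k,v,o,gate,up,down}_proj.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

# Effective batch size is per-device batch size times accumulation steps.
per_device_batch_size = 1
gradient_accumulation_steps = 8
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Dataset size: two scanners, each contributing invalid and valid rows.
scanners, invalid_rows, valid_rows = 2, 500, 100
total_rows = scanners * (invalid_rows + valid_rows)

try:
    from peft import LoraConfig  # optional dependency
    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    lora_config = None  # peft not installed; the kwargs dict is still usable

print(effective_batch_size, total_rows)  # prints: 8 1200
```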

Supported violation types

The model emits one or more of these TYPE keys in the violations map of its JSON output:

bash, c, go, java, javascript, php, python, ruby, rust, sql, Code

Model card generated automatically by eval_and_push_card.py on 2026-05-14 20:15 UTC. Mirror of this card lives at the other namespace too.
