TentaFlow
/

TentaGuard-NVFP4

Text Classification

text-generation

compressed-tensors

prompt-injection

Model card Files Files and versions

TentaGuard-NVFP4 / README.md

TentaFlow's picture

Update README.md

46ff4df verified about 17 hours ago

|

history blame contribute delete

2.24 kB

	---
	license: apache-2.0
	base_model:
	- Qwen/Qwen3.5-0.8B
	pipeline_tag: text-classification
	library_name: transformers
	language:
	- en
	- pl
	tags:
	- nvfp4
	- fp4
	- compressed-tensors
	- vllm
	- quantized
	- tentaguard
	- guard
	- security
	- prompt-injection
	- tentaflow
	---

	# TentaGuard — NVFP4 (W4A4, vLLM)

	TentaGuard is a lightweight security classifier (guard) — a fine-tune of
	[`Qwen/Qwen3.5-0.8B`](https://huggingface.co/Qwen/Qwen3.5-0.8B). It is used **mainly inside the
	[TentaFlow](https://github.com/Slyb00ts/TentaFlow) application** to scan external content — messages, documents,
	web-search results, etc. — for hidden attacks (prompt injection / jailbreak) before it
	reaches the main LLM.

	The model does NOT generate user-facing replies — it returns a single digit:

	\| Label \| Meaning \|
	\|-------\|---------\|
	\| `0` \| benign (safe content) \|
	\| `1` \| prompt injection / tool abuse (technical attack) \|
	\| `2` \| jailbreak (behavioural manipulation) \|

	If the text contains BOTH injection and jailbreak → `1`.

	## Input format

	A classifier system prompt + a user message `<\|guard\|>\n{text}`. **Build the prompt with the
	model tokenizer (`apply_chat_template`)** — do not rely on a generic chat template.

	## Accuracy (guard test set)

	- Exact (0/1/2): ~96.6% (full precision) / ~94.8% (Q5_K_M)
	- Safe / Unsafe: ~98.3%

	## Authors

	Trained by: Katarzyna Nowak, Piotr Jarocki, Damian Pala, Jakub Rurański.

	## License & attribution

	Apache-2.0, inherited from the base model [`Qwen/Qwen3.5-0.8B`](https://huggingface.co/Qwen/Qwen3.5-0.8B).
	This checkpoint is a fine-tune for attack detection, built for the [TentaFlow](https://github.com/Slyb00ts/TentaFlow) application.

	## Usage (vLLM)

	`compressed-tensors` format (`nvfp4-pack-quantized`): 4-bit weights (FP4 E2M1, groups of 16,
	FP8 E4M3 block scales + a global FP32 scale), 4-bit activations (W4A4), `lm_head` kept in full
	precision. PTQ calibration via [`llm-compressor`](https://github.com/vllm-project/llm-compressor)
	on real guard prompts.

	NVFP4 is hardware-accelerated on Blackwell (sm_100+); on older GPUs vLLM loads it as
	weight-only (smaller VRAM, no FP4 acceleration).

	```bash
	vllm serve TentaFlow/TentaGuard-NVFP4
	```