
exploitintel committed on Jun 5

Commit 780995b (verified) · 1 parent: f26fb88

Upload blog.md with huggingface_hub

Files changed (1)

blog.md +15 -1

@@ -4,6 +4,20 @@
 ---
+## The problem: a flood of bugs, and nobody to file them
+Every year the world logs tens of thousands of new software vulnerabilities. Each one gets a **CVE** — a public record stating that one specific thing is broken in one specific product — and the count climbs relentlessly. In 2024 it climbed straight into a wall: the U.S. National Vulnerability Database, the clearinghouse that *enriches* those raw records with structured metadata, fell dramatically behind, leaving a mountain of CVEs with a description and little else attached.
+One of the most valuable pieces of that "little else" is the **CWE** — the weakness category, the answer to *what kind of mistake was this?* (Full definition in a moment.) Without it, a vulnerability is an isolated anecdote. With it, you can do the work that actually matters: notice a vendor shipping the same class of bug over and over, prioritize the weaknesses that genuinely threaten you, measure whether your defenses are improving, and tie a brand-new CVE to everything you already know about that failure mode.
+But assigning a CWE is slow, manual, and inconsistent. Different analysts reach for different categories; many CVEs never receive a dependable mapping at all; and the boundaries between categories are subtle enough to start arguments among experts. Meanwhile the backlog only grows.
+This is exactly our problem at **[exploit-intel.com](https://exploit-intel.com)**. We take in the entire CVE firehose and turn it into usable intelligence — and that depends on every incoming vulnerability being tagged with its weakness category *consistently, immediately, and without waiting in anyone else's queue.* At this volume, by hand, that's hopeless. We needed a classifier of our own: something that reads a raw CVE description and reliably names the CWE(s), at scale, on our own infrastructure.
+The obvious move in 2026 is to point a capable language model at the job. So we did — and immediately met an old friend.
+---
 ## Everyone has met this intern
 You know the type. Razor-sharp, eager, genuinely knows the material — and constitutionally incapable of answering a yes-or-no question without a TED talk. Ask where the printer is and you get a history of toner.
@@ -159,6 +173,6 @@ None of these were hard once spotted. Every one of them was invisible until we t
 - **Macro-F1 is the honest metric for long-tail problems.** It's where v1 failed (0.067), where v2 won (0.500), and where quantization quietly bills you. Report only accuracy and you'd miss all three stories. Pick the metric that can embarrass you.
 - **Verify the unglamorous things.** Our scariest near-miss wasn't a subtle modeling error; it was a 640 MB file wearing a 12 GB model's name tag. Half of doing this honestly is making sure the thing under test is the thing you think it is — and that one flattering average isn't hiding a dead tail.
-The result is a small, fast, refreshingly tight-lipped tool: feed it a messy CVE description, get back a clean set of CWE IDs, in eleven tokens, more reliably than the far chattier model it was carved from. It's not a replacement for a human analyst — it still fumbles the deepest multi-label chains, and it's a triage aid, not an oracle. But for anyone with more bug reports to map than hours to map them, it's a real extra pair of hands. A tireless triage nurse who, at long last, has learned to stop explaining.
+The result is a small, fast, refreshingly tight-lipped tool: feed it a messy CVE description, get back a clean set of CWE IDs, in eleven tokens, more reliably than the far chattier model it was carved from. It's not a replacement for a human analyst — it still fumbles the deepest multi-label chains, and it's a triage aid, not an oracle. But for us at exploit-intel.com it closes the exact gap we started with: every CVE that lands in the pipeline gets its weakness category the moment it arrives — on our own hardware, in eleven tokens, with no backlog and nobody else's queue to wait on. And for anyone else with more bug reports to map than hours to map them, it's a real extra pair of hands. A tireless triage nurse who, at long last, has learned to stop explaining.
 *Grab it: [full model](https://huggingface.co/exploitintel/cve-cwe-gemma4-12b) · [quantized for laptops](https://huggingface.co/exploitintel/cve-cwe-gemma4-12b-GGUF) · [the dataset we trained on](https://huggingface.co/datasets/exploitintel/cve-cwe-consensus)*
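
The macro-F1 takeaway in the diff above is easier to see with numbers. The sketch below is not the authors' evaluation code: it is a simplified single-label toy (the real task is multi-label), the three CWE labels and the 90/5/5 class split are made up for illustration, and `accuracy` and `macro_f1` are hypothetical helper names. It shows a classifier that always predicts the most common class scoring high accuracy while macro-F1 exposes the dead tail.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy long-tail data (illustrative only): one common class, two rare ones.
y_true = ["CWE-79"] * 90 + ["CWE-611"] * 5 + ["CWE-918"] * 5
# A degenerate classifier that always predicts the head class.
y_pred = ["CWE-79"] * 100

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")  # accuracy=0.900 macro_f1=0.316
```

Because macro-F1 averages per-class scores without weighting by class frequency, the two rare classes with zero recall pull the score down to about a third, even though nine predictions in ten are correct.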