How to use ratnam1510/wardstral-8b with libraries and local apps (Transformers, vLLM, SGLang, Docker Model Runner).

Libraries

How to use ratnam1510/wardstral-8b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ratnam1510/wardstral-8b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ratnam1510/wardstral-8b")
model = AutoModelForCausalLM.from_pretrained("ratnam1510/wardstral-8b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use ratnam1510/wardstral-8b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ratnam1510/wardstral-8b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ratnam1510/wardstral-8b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
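The same request can be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on localhost:8000. `build_chat_request` and `chat` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "ratnam1510/wardstral-8b"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000"):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    # OpenAI-style responses put the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("What is the capital of France?")` should return the reply text. For the SGLang server described further down, pass `base_url="http://localhost:30000"` instead.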

SGLang

How to use ratnam1510/wardstral-8b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ratnam1510/wardstral-8b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ratnam1510/wardstral-8b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ratnam1510/wardstral-8b" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use ratnam1510/wardstral-8b with Docker Model Runner:
```
docker model run hf.co/ratnam1510/wardstral-8b
```

Ministral-8B Security Vulnerability Scanner

A fine-tuned version of Ministral-8B-Instruct-2410 specialized for security vulnerability detection in code.

Model Details

Base model: mistralai/Ministral-8B-Instruct-2410
Fine-tuning method: QLoRA (r=16, alpha=32)
Target modules: q_proj, k_proj, v_proj, o_proj
Training: 3 epochs, 156 steps on A10G (~47 min)
Training data: ~864 security vulnerability examples (ratnam1510/security-vuln-dataset)
LoRA adapter: ratnam1510/mistral-small-secure-scan
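
To give a sense of how small the adapter is relative to the 8B base model, the following sketch counts the trainable parameters implied by r=16 on the four attention projections. The layer dimensions (hidden size 4096, 32 query heads, 8 key/value heads, head dimension 128, 36 layers) are assumptions about the base model's configuration and are not stated in this card; check them against the base model's `config.json` before relying on the result.

```python
# Rank and target modules come from the model details above.
LORA_RANK = 16
TARGET_MODULES = ("q_proj", "k_proj", "v_proj", "o_proj")

# ASSUMED base-model dimensions (not stated in this card).
HIDDEN_SIZE = 4096
NUM_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
NUM_LAYERS = 36


def projection_shapes():
    """Return (in_features, out_features) for each attention projection."""
    q_out = NUM_HEADS * HEAD_DIM
    kv_out = NUM_KV_HEADS * HEAD_DIM
    return {
        "q_proj": (HIDDEN_SIZE, q_out),
        "k_proj": (HIDDEN_SIZE, kv_out),
        "v_proj": (HIDDEN_SIZE, kv_out),
        "o_proj": (q_out, HIDDEN_SIZE),
    }


def lora_params_per_layer(rank=LORA_RANK):
    """LoRA adds A (rank x in) and B (out x rank) for each adapted matrix."""
    shapes = projection_shapes()
    return sum(rank * (shapes[m][0] + shapes[m][1]) for m in TARGET_MODULES)


def lora_params_total(rank=LORA_RANK):
    """Total adapter parameters across all transformer layers."""
    return NUM_LAYERS * lora_params_per_layer(rank)
```

Under these assumptions the adapter has roughly 15 million trainable parameters, well under 1% of the 8B base weights.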

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ID as given in this card; this page hosts the model as ratnam1510/wardstral-8b.
model = AutoModelForCausalLM.from_pretrained("ratnam1510/ministral-8b-security-scanner")
tokenizer = AutoTokenizer.from_pretrained("ratnam1510/ministral-8b-security-scanner")

messages = [
    {"role": "system", "content": "You are an expert security vulnerability analyst."},
    {"role": "user", "content": "Analyze this code for vulnerabilities:\n```python\nimport os\nos.system(user_input)\n```"},
]

# Tokenize the chat prompt and move the tensors to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
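
When scanning many snippets it can help to build the chat messages with a small helper. The sketch below reuses the system prompt and the user-prompt wording from the example above. `build_messages` is a hypothetical helper name, and the `language` argument is an assumption about how the fenced code should be labeled.

```python
SYSTEM_PROMPT = "You are an expert security vulnerability analyst."
FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting fences


def build_messages(code, language="python"):
    """Return chat messages asking the model to analyze one code snippet."""
    user_prompt = (
        "Analyze this code for vulnerabilities:\n"
        f"{FENCE}{language}\n{code.strip()}\n{FENCE}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hard-coded `messages`.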

Evaluation

The model was evaluated with an LLM-as-judge approach (GLM-4.5-Air via OpenRouter) on 20 test samples. The judge received the code snippet, the ground-truth analysis, and the model's response, then scored each response on five dimensions, each on a 1-5 scale.

What Each Metric Measures

| Metric | What it measures | 1 (worst) | 5 (best) |
| --- | --- | --- | --- |
| Vulnerability ID | Did the model correctly identify the vulnerability type? | Wrong or none identified | Precise CWE classification |
| Severity Accuracy | Is the severity rating reasonable? | Wildly off | Matches ground truth |
| Explanation Quality | Is the explanation clear and actionable? | Vague hand-waving | Cites specific lines, root cause |
| Fix Suggestion | Does it suggest correct remediation? | No fix or wrong fix | Production-ready fix |
| Relevance | Does the response address the actual code shown? | Completely unrelated | Directly analyzes the snippet |
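
The aggregation behind the summary scores can be sketched as follows. This assumes the judge returns one integer score per dimension for each sample and that the overall score is the unweighted mean of the five dimension means; the card does not state the exact aggregation, and the dimension keys and sample scores below are made up for illustration.

```python
from statistics import mean

DIMENSIONS = (
    "vulnerability_id",
    "severity_accuracy",
    "explanation_quality",
    "fix_suggestion",
    "relevance",
)


def validate(sample):
    """Check that a judge result has every dimension scored from 1 to 5."""
    for dim in DIMENSIONS:
        if not 1 <= sample[dim] <= 5:
            raise ValueError(f"{dim} score out of range: {sample[dim]}")


def aggregate(samples):
    """Return per-dimension means and the overall mean across samples."""
    for sample in samples:
        validate(sample)
    per_dimension = {d: mean(s[d] for s in samples) for d in DIMENSIONS}
    overall = mean(per_dimension.values())
    return per_dimension, overall


# Illustrative (made-up) judge outputs for two responses.
example = [
    {"vulnerability_id": 2, "severity_accuracy": 3, "explanation_quality": 1,
     "fix_suggestion": 1, "relevance": 2},
    {"vulnerability_id": 4, "severity_accuracy": 3, "explanation_quality": 3,
     "fix_suggestion": 1, "relevance": 4},
]
per_dimension, overall = aggregate(example)
```

Running the same function once over the base model's judge outputs and once over the fine-tuned model's outputs would give the two columns of a comparison table.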

Score Comparison: Base vs Fine-tuned

| Dimension | Base Model (/5) | Fine-tuned (/5) | Delta | Winner |
| --- | --- | --- | --- | --- |
| Overall | 1.55 | 1.81 | +0.26 | Fine-tuned |
| Vulnerability Identification | 1.90 | 2.31 | +0.41 | Fine-tuned |
| Severity Accuracy | 1.60 | 2.62 | +1.02 | Fine-tuned |
| Fix Suggestion | 1.40 | 1.44 | +0.04 | Fine-tuned |
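
The Delta column is the fine-tuned score minus the base score. A quick way to recompute it from the values in the table, along with the relative change (which the table does not report):

```python
# (base, fine-tuned) mean scores copied from the table above.
SCORES = {
    "Overall": (1.55, 1.81),
    "Vulnerability Identification": (1.90, 2.31),
    "Severity Accuracy": (1.60, 2.62),
    "Fix Suggestion": (1.40, 1.44),
}


def deltas(scores):
    """Return (absolute change, relative change in %) per dimension."""
    result = {}
    for name, (base, tuned) in scores.items():
        absolute = round(tuned - base, 2)
        relative_pct = round(100 * (tuned - base) / base, 1)
        result[name] = (absolute, relative_pct)
    return result


for name, (absolute, relative_pct) in deltas(SCORES).items():
    print(f"{name}: {absolute:+.2f} ({relative_pct:+.1f}%)")
```

Severity accuracy shows the largest gain, while fix suggestions are essentially unchanged.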

Quality Distribution

| Metric | Base | Fine-tuned |
| --- | --- | --- |
| % Good (>=4/5) | 0.0% | 6.2% |
| % Poor (<=2/5) | 90.0% | 81.2% |
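
The two percentages can be computed from per-sample scores as below. This sketch assumes each sample has a single overall score on the 1-5 scale and that the thresholds are inclusive, as the >= and <= in the table suggest; the sample scores are invented for illustration and are not the card's evaluation data.

```python
def quality_distribution(scores, good_at=4.0, poor_at=2.0):
    """Return (% good, % poor) for a list of per-sample overall scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    good = sum(1 for s in scores if s >= good_at)
    poor = sum(1 for s in scores if s <= poor_at)
    total = len(scores)
    return 100 * good / total, 100 * poor / total


# Invented scores for ten samples; not the card's evaluation data.
sample_scores = [1.2, 1.8, 2.0, 1.4, 4.2, 2.6, 1.0, 1.6, 3.0, 2.0]
pct_good, pct_poor = quality_distribution(sample_scores)
```

Scores strictly between the two thresholds count as neither good nor poor, which is why the two percentages need not sum to 100.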

Model size: 8B params (Safetensors)
Tensor type: BF16
