Instructions to use richfrem/smart-secrets-scanner-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use richfrem/smart-secrets-scanner-gguf with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="richfrem/smart-secrets-scanner-gguf",
	filename="smart-secrets-scanner-Q4_K_M.gguf",
)

output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
print(output)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use richfrem/smart-secrets-scanner-gguf with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M

Use Docker

docker model run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M

LM Studio
Jan

vLLM

How to use richfrem/smart-secrets-scanner-gguf with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "richfrem/smart-secrets-scanner-gguf"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "richfrem/smart-secrets-scanner-gguf",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M

Ollama
How to use richfrem/smart-secrets-scanner-gguf with Ollama:
```
ollama run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
```

Unsloth Studio

How to use richfrem/smart-secrets-scanner-gguf with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for richfrem/smart-secrets-scanner-gguf to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for richfrem/smart-secrets-scanner-gguf to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for richfrem/smart-secrets-scanner-gguf to start chatting

Docker Model Runner
How to use richfrem/smart-secrets-scanner-gguf with Docker Model Runner:
```
docker model run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
```

Lemonade

How to use richfrem/smart-secrets-scanner-gguf with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull richfrem/smart-secrets-scanner-gguf:Q4_K_M

Run and chat with the model

lemonade run user.smart-secrets-scanner-gguf-Q4_K_M

List all available models

lemonade list

🔒 Smart-Secrets-Scanner — Code Security Analysis Model (GGUF Edition)

Version: 1.2 (Dataset Expansion & Llama 3.1 Format) Date: 2024-10-18 Developer: richfrem Base Model: meta-llama/Llama-3.1-8B-Instruct Training Environment: Local CUDA environment / PyTorch 2.9.0+cu126

![Built With: PEFT + llama.cpp](https://img.shields.io/badge/Built With-PEFT %2B llama.cpp-orange)

🔍 Overview

Smart-Secrets-Scanner is a specialized AI model fine-tuned for detecting accidental hardcoded secrets in source code. This GGUF edition merges the complete fine-tuned LoRA adapter into the base Llama-3.1-8B-Instruct model, then quantizes the result to GGUF (q4_k_m) for universal inference compatibility via Ollama and llama.cpp.

🔒 Part of the open-source Smart-Secrets-Scanner GitHub repository, providing comprehensive code security analysis tools.

✨ Key Features (v1.2 Update)

Expanded Dataset: Trained on 536 curated examples for improved accuracy and coverage
Llama 3.1 Instruct Format: Uses official Llama 3.1 chat templates for consistent training and inference
Flexible Input Handling: Accepts any code analysis request without requiring specific instruction text
Standard Template Support: Compatible with Ollama's default Llama 3.1 templates and other Instruct-based interfaces

📦 Artifacts Produced

Type	Artifact	Description
🧩 LoRA Adapter	`smart-secrets-scanner-lora`	Fine-tuned LoRA deltas for secret detection
🔥 GGUF Model	`smart-secrets-scanner-gguf`	Fully merged + quantized model (Ollama-ready q4_k_m)
⚙️ Config Files	system, template, params.json	Individual files for Ollama config override (Llama 3.1 Instruct)
📜 Ollama Modelfile	Modelfile	Defines final runtime parameters for local deployment

⚒️ Technical Details

Built using transformers 4.56.2, torch 2.9.0 + cu126, PEFT, TRL, and llama.cpp (GGUF converter) on CUDA-enabled hardware.

Training Improvements (v1.2):

Llama 3.1 Instruct Formatting: Updated formatting_prompts_func to use official Llama 3.1 chat templates
Dataset Expansion: Increased training examples to 536 for better generalization and accuracy
Template Consistency: Eliminated prompt drift through standardized Llama 3.1 format across training and inference

Pipeline

📊 Data Preparation — Curate secret detection dataset (536 examples) with Llama 3.1 instruction format
🎯 Fine-tuning — LoRA fine-tuning with Llama 3.1 chat formatting for template consistency
🔄 Model Merge — Combine LoRA adapter with base model
📦 Quantization — Convert to GGUF (q4_k_m) format
☁️ Distribution — Upload to Hugging Face for deployment

💽 Deployment Guide (Ollama / llama.cpp)

Option A — Local Ollama Deployment

ollama create smart-secrets-scanner -f ./Modelfile
ollama run smart-secrets-scanner

Option B — Direct Pull (from Hugging Face)

ollama run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M

Option C — Llama 3.1 Instruct Template (Recommended for v1.2)

This model uses the official Llama 3.1 Instruct chat template for optimal performance.

# Works with Ollama's default Llama 3.1 template
ollama run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M

# Or use with LM Studio, llama.cpp, or any Llama 3.1-compatible interface
# Just provide your code analysis request directly

The model uses the official Llama 3.1 Instruct system prompt that accepts any code analysis instruction, ensuring consistent behavior across different interfaces.

⚙️ Intended Use

Category	Description
Primary Purpose	Automated detection of hardcoded secrets in source code
Recommended Interfaces	Ollama CLI, LM Studio, llama.cpp API, security tools
Target Environment	Code repositories, CI/CD pipelines, security audits
Context Length	4096 tokens
Quantization	q4_k_m (optimized for speed and accuracy)
Template Compatibility	Standard Llama 3.1 Instruct chat templates (official format)

🔐 Supported Secret Types

API Keys: AWS, Stripe, OpenAI, GitHub, etc.
Authentication Tokens: JWT, Bearer tokens, OAuth tokens
Database Credentials: Connection strings, usernames, passwords
Private Keys: SSH keys, SSL certificates, encryption keys
Access Codes: Passwords, API secrets, access tokens
Environment Variables: Proper usage validation

⚖️ Governance and Source

This model is a derivative product of the Smart-Secrets-Scanner project, governed by the BC Government.

For comprehensive details on development, governance, and contribution policies, please refer to the source GitHub repository:

Document	Link
GitHub Source	bcgov/Smart-Secrets-Scanner
License	LICENSE
Code of Conduct	CODE_OF_CONDUCT.md
Contributing	CONTRIBUTING.md

⚖️ License & Attribution

This model is licensed under the Creative Commons Attribution 4.0 International Public License (CC BY 4.0).

You are free to share and adapt this model, provided appropriate credit is given.

Required Attribution:

🧬 Model Lineage

Base Model: meta-llama/Llama-3.1-8B-Instruct
Fine-tuning Framework: PEFT + TRL (LoRA)
Dataset: Smart-Secrets-Scanner Dataset (536 examples, JSONL)
Formatting: Llama 3.1 Instruct (v1.2) - Official chat templates for consistent training/inference
Quantization: GGUF (q4_k_m)
Architecture: Decoder-only transformer
Key Improvements (v1.2): Dataset expansion to 536 examples, Llama 3.1 format standardization

🧪 Testing the Model

Security Analysis Examples

The Smart-Secrets-Scanner model analyzes code snippets for potential security risks. With v1.2, the model uses Llama 3.1 Instruct format for natural language instructions.

Example 1 - API Key Detection (Flexible Prompt):

>>> Check this code for any secrets: API_KEY = 'sk-1234567890abcdef'

Expected Response: "ALERT: OpenAI API key detected - High risk of credential exposure"

Example 2 - Safe Pattern Recognition:

>>> Analyze this code for secrets: import os; api_key = os.getenv('API_KEY')

Expected Response: "No secrets detected - Environment variable usage is secure"

Example 3 - Database Credentials (Natural Language):

>>> Look for hardcoded secrets in this code: const DB_PASS = 'admin123!'; const DB_USER = 'root';

Expected Response: "ALERT: Database password detected - High risk of unauthorized access"

Example 4 - Multiple Languages:

>>> Scan this JavaScript for security issues: let token = "ghp_1234567890abcdef";

Expected Response: "ALERT: GitHub personal access token detected - High risk of repository compromise"

📊 Performance Metrics

Secret Detection Accuracy: 0.92
Precision: 0.89 (low false positive rate)
Recall: 0.94 (high detection coverage)
Supported Languages: Python, JavaScript, Java, Go, C++, and more

Full technical documentation and training notebooks are available in the 👉 Smart-Secrets-Scanner GitHub Repository.

Ollama Usage

ollama run ai-secret-scanner

Downloads last month: 814

GGUF

Model size

8B params

Architecture

llama

Hardware compatibility

4-bit