Instructions to use richfrem/smart-secrets-scanner-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use richfrem/smart-secrets-scanner-gguf with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="richfrem/smart-secrets-scanner-gguf", filename="smart-secrets-scanner-Q4_K_M.gguf", )
output = llm( "Once upon a time,", max_tokens=512, echo=True ) print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use richfrem/smart-secrets-scanner-gguf with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M # Run inference directly in the terminal: llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M # Run inference directly in the terminal: llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M
Use Docker
docker model run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use richfrem/smart-secrets-scanner-gguf with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "richfrem/smart-secrets-scanner-gguf" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "richfrem/smart-secrets-scanner-gguf", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
- Ollama
How to use richfrem/smart-secrets-scanner-gguf with Ollama:
ollama run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
- Unsloth Studio
How to use richfrem/smart-secrets-scanner-gguf with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for richfrem/smart-secrets-scanner-gguf to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for richfrem/smart-secrets-scanner-gguf to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for richfrem/smart-secrets-scanner-gguf to start chatting
- Docker Model Runner
How to use richfrem/smart-secrets-scanner-gguf with Docker Model Runner:
docker model run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
- Lemonade
How to use richfrem/smart-secrets-scanner-gguf with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull richfrem/smart-secrets-scanner-gguf:Q4_K_M
Run and chat with the model
lemonade run user.smart-secrets-scanner-gguf-Q4_K_M
List all available models
lemonade list
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M# Run inference directly in the terminal:
llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_MUse pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M# Run inference directly in the terminal:
./llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_MBuild from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M# Run inference directly in the terminal:
./build/bin/llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_MUse Docker
docker model run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M- π Smart-Secrets-Scanner β Code Security Analysis Model (GGUF Edition)
π Smart-Secrets-Scanner β Code Security Analysis Model (GGUF Edition)
Version: 1.2 (Dataset Expansion & Llama 3.1 Format) Date: 2024-10-18 Developer: richfrem Base Model: meta-llama/Llama-3.1-8B-Instruct Training Environment: Local CUDA environment / PyTorch 2.9.0+cu126

π Overview
Smart-Secrets-Scanner is a specialized AI model fine-tuned for detecting accidental hardcoded secrets in source code. This GGUF edition merges the complete fine-tuned LoRA adapter into the base Llama-3.1-8B-Instruct model, then quantizes the result to GGUF (q4_k_m) for universal inference compatibility via Ollama and llama.cpp.
π Part of the open-source Smart-Secrets-Scanner GitHub repository, providing comprehensive code security analysis tools.
β¨ Key Features (v1.2 Update)
- Expanded Dataset: Trained on 536 curated examples for improved accuracy and coverage
- Llama 3.1 Instruct Format: Uses official Llama 3.1 chat templates for consistent training and inference
- Flexible Input Handling: Accepts any code analysis request without requiring specific instruction text
- Standard Template Support: Compatible with Ollama's default Llama 3.1 templates and other Instruct-based interfaces
π¦ Artifacts Produced
| Type | Artifact | Description |
|---|---|---|
| π§© LoRA Adapter | smart-secrets-scanner-lora |
Fine-tuned LoRA deltas for secret detection |
| π₯ GGUF Model | smart-secrets-scanner-gguf |
Fully merged + quantized model (Ollama-ready q4_k_m) |
| βοΈ Config Files | system, template, params.json | Individual files for Ollama config override (Llama 3.1 Instruct) |
| π Ollama Modelfile | Modelfile | Defines final runtime parameters for local deployment |
βοΈ Technical Details
Built using transformers 4.56.2, torch 2.9.0 + cu126, PEFT, TRL, and llama.cpp (GGUF converter) on CUDA-enabled hardware.
Training Improvements (v1.2):
- Llama 3.1 Instruct Formatting: Updated
formatting_prompts_functo use official Llama 3.1 chat templates - Dataset Expansion: Increased training examples to 536 for better generalization and accuracy
- Template Consistency: Eliminated prompt drift through standardized Llama 3.1 format across training and inference
Pipeline
- π Data Preparation β Curate secret detection dataset (536 examples) with Llama 3.1 instruction format
- π― Fine-tuning β LoRA fine-tuning with Llama 3.1 chat formatting for template consistency
- π Model Merge β Combine LoRA adapter with base model
- π¦ Quantization β Convert to GGUF (q4_k_m) format
- βοΈ Distribution β Upload to Hugging Face for deployment
π½ Deployment Guide (Ollama / llama.cpp)
Option A β Local Ollama Deployment
ollama create smart-secrets-scanner -f ./Modelfile
ollama run smart-secrets-scanner
Option B β Direct Pull (from Hugging Face)
ollama run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
Option C β Llama 3.1 Instruct Template (Recommended for v1.2)
This model uses the official Llama 3.1 Instruct chat template for optimal performance.
# Works with Ollama's default Llama 3.1 template
ollama run hf.co/richfrem/smart-secrets-scanner-gguf:Q4_K_M
# Or use with LM Studio, llama.cpp, or any Llama 3.1-compatible interface
# Just provide your code analysis request directly
The model uses the official Llama 3.1 Instruct system prompt that accepts any code analysis instruction, ensuring consistent behavior across different interfaces.
βοΈ Intended Use
| Category | Description |
|---|---|
| Primary Purpose | Automated detection of hardcoded secrets in source code |
| Recommended Interfaces | Ollama CLI, LM Studio, llama.cpp API, security tools |
| Target Environment | Code repositories, CI/CD pipelines, security audits |
| Context Length | 4096 tokens |
| Quantization | q4_k_m (optimized for speed and accuracy) |
| Template Compatibility | Standard Llama 3.1 Instruct chat templates (official format) |
π Supported Secret Types
- API Keys: AWS, Stripe, OpenAI, GitHub, etc.
- Authentication Tokens: JWT, Bearer tokens, OAuth tokens
- Database Credentials: Connection strings, usernames, passwords
- Private Keys: SSH keys, SSL certificates, encryption keys
- Access Codes: Passwords, API secrets, access tokens
- Environment Variables: Proper usage validation
βοΈ Governance and Source
This model is a derivative product of the Smart-Secrets-Scanner project, governed by the BC Government.
For comprehensive details on development, governance, and contribution policies, please refer to the source GitHub repository:
| Document | Link |
|---|---|
| GitHub Source | bcgov/Smart-Secrets-Scanner |
| License | LICENSE |
| Code of Conduct | CODE_OF_CONDUCT.md |
| Contributing | CONTRIBUTING.md |
βοΈ License & Attribution
This model is licensed under the Creative Commons Attribution 4.0 International Public License (CC BY 4.0).
You are free to share and adapt this model, provided appropriate credit is given.
Required Attribution:
Derived from Smart-Secrets-Scanner (Β© 2025 richfrem / BC Government)Source: https://github.com/bcgov/Smart-Secrets-ScannerLicensed under CC BY 4.0
𧬠Model Lineage
- Base Model: meta-llama/Llama-3.1-8B-Instruct
- Fine-tuning Framework: PEFT + TRL (LoRA)
- Dataset: Smart-Secrets-Scanner Dataset (536 examples, JSONL)
- Formatting: Llama 3.1 Instruct (v1.2) - Official chat templates for consistent training/inference
- Quantization: GGUF (q4_k_m)
- Architecture: Decoder-only transformer
- Key Improvements (v1.2): Dataset expansion to 536 examples, Llama 3.1 format standardization
π§ͺ Testing the Model
Security Analysis Examples
The Smart-Secrets-Scanner model analyzes code snippets for potential security risks. With v1.2, the model uses Llama 3.1 Instruct format for natural language instructions.
Example 1 - API Key Detection (Flexible Prompt):
>>> Check this code for any secrets: API_KEY = 'sk-1234567890abcdef'
Expected Response: "ALERT: OpenAI API key detected - High risk of credential exposure"
Example 2 - Safe Pattern Recognition:
>>> Analyze this code for secrets: import os; api_key = os.getenv('API_KEY')
Expected Response: "No secrets detected - Environment variable usage is secure"
Example 3 - Database Credentials (Natural Language):
>>> Look for hardcoded secrets in this code: const DB_PASS = 'admin123!'; const DB_USER = 'root';
Expected Response: "ALERT: Database password detected - High risk of unauthorized access"
Example 4 - Multiple Languages:
>>> Scan this JavaScript for security issues: let token = "ghp_1234567890abcdef";
Expected Response: "ALERT: GitHub personal access token detected - High risk of repository compromise"
π Performance Metrics
- Secret Detection Accuracy: 0.92
- Precision: 0.89 (low false positive rate)
- Recall: 0.94 (high detection coverage)
- Supported Languages: Python, JavaScript, Java, Go, C++, and more
Full technical documentation and training notebooks are available in the π Smart-Secrets-Scanner GitHub Repository.
Ollama Usage
ollama run ai-secret-scanner
- Downloads last month
- 814
4-bit
Install from brew
# Start a local OpenAI-compatible server with a web UI: llama-server -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M# Run inference directly in the terminal: llama-cli -hf richfrem/smart-secrets-scanner-gguf:Q4_K_M