
API Reference: rules/

rules/__init__.py

Semgrep rule pack registry. All constants are Path objects pointing to YAML files in the project root. The ALL_* lists are consumed by core/scanner.py's scan_repo() to build the parallel task list.


Individual path constants

from rules import CORE, WEB, CRYPTO, ML, SECRETS, PERF, LLM
| Constant | File | Description |
|----------|------|-------------|
| CORE | core.yaml | Core Python security: subprocess injection, eval, pickle deserialization, unsafe YAML loading |
| WEB | web.yaml | Web security: XSS, SSRF, open redirect, path traversal |
| CRYPTO | crypto.yaml | Cryptographic failures: weak ciphers, hardcoded keys, insecure RNG |
| ML | ml.yaml | ML-specific: unsafe pickle.load, torch.load without weights_only, model-path injection |
| SECRETS | secrets.yaml | Secret patterns: API keys, tokens, credentials in code |
| PERF | perf.yaml | Performance anti-patterns: list building in loops, try/except in loops |
| LLM | llm.yaml | LLM/agent security: prompt injection (LLM01), insecure output handling (LLM02), PII in prompts (LLM06) |
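As a rough illustration of what "Path objects pointing to YAML files in the project root" means, the sketch below builds the same seven name-to-file mappings from a given root directory. The helper `pack_paths` and the `PACK_FILES` dict are hypothetical and exist only for this example; the real module exposes the paths as module-level constants.

```python
from pathlib import Path
from typing import Dict

# File names taken from the table above.
PACK_FILES: Dict[str, str] = {
    "CORE": "core.yaml",
    "WEB": "web.yaml",
    "CRYPTO": "crypto.yaml",
    "ML": "ml.yaml",
    "SECRETS": "secrets.yaml",
    "PERF": "perf.yaml",
    "LLM": "llm.yaml",
}


def pack_paths(root: Path) -> Dict[str, Path]:
    """Hypothetical helper: map each constant name to a Path under `root`."""
    return {name: root / filename for name, filename in PACK_FILES.items()}
```

Calling `pack_paths(Path("/proj"))["CORE"]` yields `Path("/proj/core.yaml")`, which mirrors the shape of the `CORE` constant.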

Aggregated list constants

ALL_SECURITY

ALL_SECURITY: List[Tuple[str, Path, str]] = [
    ("Semgrep:Core",    CORE,    "security"),
    ("Semgrep:Web",     WEB,     "security"),
    ("Semgrep:Crypto",  CRYPTO,  "security"),
    ("Semgrep:ML",      ML,      "security"),
    ("Semgrep:Secrets", SECRETS, "security"),
]

Iterated in scan_repo() when run_security=True. Each (label, path, category) tuple produces one semgrep_pack() call.

ALL_PERFORMANCE

ALL_PERFORMANCE: List[Tuple[str, Path, str]] = [
    ("Semgrep:Perf", PERF, "performance"),
]

Iterated when run_performance=True.

ALL_LLM

ALL_LLM: List[Tuple[str, Path, str]] = [
    ("Semgrep:LLM", LLM, "security"),
]

Iterated when run_llm=True.
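The selection step described above might look roughly like the sketch below. The body of scan_repo() is not shown on this page, so `select_packs` is a hypothetical stand-in, and the lists are redefined locally with a placeholder root so the example runs on its own.

```python
from pathlib import Path
from typing import List, Tuple

Pack = Tuple[str, Path, str]  # (label, path, category)

# Local stand-ins for the constants in rules/__init__.py (placeholder root).
ROOT = Path(".")
ALL_SECURITY: List[Pack] = [
    ("Semgrep:Core", ROOT / "core.yaml", "security"),
    ("Semgrep:Web", ROOT / "web.yaml", "security"),
    ("Semgrep:Crypto", ROOT / "crypto.yaml", "security"),
    ("Semgrep:ML", ROOT / "ml.yaml", "security"),
    ("Semgrep:Secrets", ROOT / "secrets.yaml", "security"),
]
ALL_PERFORMANCE: List[Pack] = [("Semgrep:Perf", ROOT / "perf.yaml", "performance")]
ALL_LLM: List[Pack] = [("Semgrep:LLM", ROOT / "llm.yaml", "security")]


def select_packs(run_security: bool, run_performance: bool, run_llm: bool) -> List[Pack]:
    """Hypothetical: collect the packs enabled by the three flags, in order."""
    packs: List[Pack] = []
    if run_security:
        packs.extend(ALL_SECURITY)
    if run_performance:
        packs.extend(ALL_PERFORMANCE)
    if run_llm:
        packs.extend(ALL_LLM)
    return packs
```

Each returned tuple would then correspond to one semgrep_pack() task. Note that the LLM pack carries the "security" category even though it is toggled by its own flag.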


Semgrep YAML rule format

Each .yaml file follows the Semgrep rule schema. The metadata block controls how findings are categorized:

rules:
  - id: my-rule-id
    patterns:
      - pattern: |
          dangerous_call($X, ...)
    message: |
      Dangerous call detected. $X may be user-controlled.
    severity: ERROR           # ERROR | WARNING | INFO
    languages: [python]
    metadata:
      owasp:
        - A03:2021-Injection
      confidence: confirmed   # confirmed | likely | possible
      category: security

metadata fields used by autoscan:

| Field | Usage |
|-------|-------|
| owasp | Stored in finding["owasp"]; drives SARIF help URIs and HTML badges |
| confidence | Stored in finding["confidence"] |
| category | Used as finding["category"] ("security" or "performance") |
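To make the field mapping concrete, here is a minimal sketch of copying these three metadata fields from one Semgrep JSON result into a finding dict. It relies on Semgrep's JSON output placing rule metadata under `extra.metadata`. The function name `finding_from_result` and the fallback to the pack's category are assumptions for illustration, not autoscan's actual code.

```python
from typing import Any, Dict


def finding_from_result(result: Dict[str, Any], pack_category: str) -> Dict[str, Any]:
    """Hypothetical mapper from one Semgrep JSON result to finding fields."""
    metadata = result.get("extra", {}).get("metadata", {})

    # A rule may declare owasp as a single string or a list; normalize to a list.
    owasp = metadata.get("owasp", [])
    if isinstance(owasp, str):
        owasp = [owasp]

    return {
        "owasp": owasp,
        "confidence": metadata.get("confidence"),
        # Assumption: fall back to the pack-level category when the rule omits one.
        "category": metadata.get("category", pack_category),
    }
```

For the example rule above, the resulting finding would carry `owasp == ["A03:2021-Injection"]`, `confidence == "confirmed"`, and `category == "security"`.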

Adding a new rule pack

  1. Create myrules.yaml in the project root.
  2. Add a constant and list entry in rules/__init__.py:
MYRULES = _ROOT / "myrules.yaml"

ALL_SECURITY = [
    ...
    ("Semgrep:MyRules", MYRULES, "security"),
]

scan_repo() automatically picks it up; no changes to core/scanner.py are needed.
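After registering a pack, a quick sanity check can catch a misspelled file name or category before a scan runs. The sketch below is one possible check. `check_pack_list` and `VALID_CATEGORIES` are hypothetical names that are not part of autoscan, and the allowed categories are assumed to be the two values documented for finding["category"].

```python
from pathlib import Path
from typing import List, Sequence, Tuple

# Assumption: only the two documented category values are valid.
VALID_CATEGORIES = {"security", "performance"}


def check_pack_list(packs: Sequence[Tuple[str, Path, str]]) -> List[str]:
    """Hypothetical validator: return a list of problems, empty if all entries look fine."""
    problems: List[str] = []
    for entry in packs:
        if len(entry) != 3:
            problems.append(f"{entry!r}: expected (label, path, category)")
            continue
        label, path, category = entry
        if category not in VALID_CATEGORIES:
            problems.append(f"{label}: unknown category {category!r}")
        if not Path(path).is_file():
            problems.append(f"{label}: rule file not found: {path}")
    return problems
```

Running `check_pack_list(ALL_SECURITY)` after importing the list from `rules` should return an empty list when every registered YAML file exists.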

See How to Extend for the full walkthrough.