Psy v0.1 / 6.9M dev bundle
Psy 6.9M is a byte-level, zero-language (non-linguistic, byte-level-only; it never tokenizes or generates natural-language text) defensive cyber-artifact encoder.
It does not chat. It does not generate code. It does not patch systems. It does not execute artifacts.
It consumes already-structured cyber artifacts (a sanitized CVE record, a
detection-rule AST, a network-flow header) and emits one strict-JSON
status/verdict frame. The full architecture, class semantics, probe metrics, and
integrity hashes are in MODEL_CARD.md.
License: Apache-2.0 (see LICENSE).
Supported demo families
| Family | Probe head in v0.1 | End-to-end result |
|---|---|---|
| CVE_RECORD | Yes | encoder + probe head |
| RULE_AST | Yes | encoder + probe head |
| NETWORK_FLOW | No (encoder-only) | encoder-only, always returns PSY_UNCERTAIN |
Unsupported families, and any family whose probe head is absent, return
PSY_UNCERTAIN (mode: encoder_only).
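The fallback rule above can be sketched as a lookup on head-file presence. This is only an illustration, not the bundle's loader: `HEAD_FILES` and `select_mode` are hypothetical names, and the family-to-file mapping is inferred from the checkpoint file names mentioned in this README.

```python
from pathlib import Path

# Assumed mapping, inferred from the file names in this README
# (the real loader may resolve heads differently).
HEAD_FILES = {
    "CVE_RECORD": "checkpoints/heads/cve_sanitized_head.pt",
    "RULE_AST": "checkpoints/heads/rule_ast_head.pt",
    "NETWORK_FLOW": "checkpoints/heads/network_flow_head.pt",  # not shipped in v0.1
}


def select_mode(family: str, bundle_root: Path) -> str:
    """Return the `mode` value a frame would carry for this family."""
    rel = HEAD_FILES.get(family)
    if rel is not None and (bundle_root / rel).is_file():
        return "encoder_plus_probe_head"
    # Unknown family, or known family without a head file on disk.
    return "encoder_only"
```

Under these assumptions, `select_mode("NETWORK_FLOW", Path("."))` yields `encoder_only` in a v0.1 bundle, because the head file is absent.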
Contents
This bundle ships only the minimal Psy runtime surface:
- 6.9M-parameter byte-level encoder backbone checkpoint (checkpoints/psy_6.9m_encoder.pt)
- Two trained probe heads (checkpoints/heads/cve_sanitized_head.pt, checkpoints/heads/rule_ast_head.pt), each Linear(256->3)
- Psy contact-protocol utilities
- Psy structured-memory utilities (standalone/illustrative; not exercised by the demo): a schema for recording structured artifact events (verdict opcodes, confidence, evidence offsets, feedback), not chat/prompt memory and not a live weight-mutation system; see docs/PSY_MEMORY_ARCHITECTURE.md
- Probe-head loader code
- Label maps for the demo families
- Tiny sanitized demo artifacts
- Sanitized probe-result JSON summaries
- Demo command + bundle validation script
Probe-head status (corrected)
Earlier pre-export notes said no probe-head .pt files were included. That is no
longer accurate. The CVE_RECORD and RULE_AST heads ARE included and load
cleanly, and they drive the demo verdicts. Only the NETWORK_FLOW head is not
shipped in v0.1, so NETWORK_FLOW runs encoder-only and always returns
PSY_UNCERTAIN. See RELEASE_NOTES.md for why NETWORK_FLOW is held back.
Requirements
- Python >= 3.10 (runtime uses 3.10+ syntax and weights_only=True)
- torch >= 2.0 (CPU-only is sufficient; the runtime forces torch.device("cpu"))
Only torch is required directly; everything else is Python stdlib. See
requirements.txt. The runtime is CPU-only by design and never touches a GPU,
so a plain pip install -r requirements.txt will pull a much larger
CUDA-enabled torch wheel than needed; installing the CPU-only build instead
(e.g. pip install torch --index-url https://download.pytorch.org/whl/cpu) is
faster and lighter with no functional difference here. If you install torch
without numpy present, torch itself may print a harmless
Failed to initialize NumPy: No module named 'numpy' warning to stderr on
startup; this does not affect the strict-JSON stdout output or exit code.
Demo
From the extracted bundle root:
# Decisive anomaly path (probe head runs, returns BLOCK):
python3 scripts/run_demo.py --artifact demo_artifacts/rule_ast_sample.jsonl
# CVE probe head runs; this sample lands on the "investigate" class -> ABSTAIN:
python3 scripts/run_demo.py --artifact demo_artifacts/cve_record_sample.jsonl
# NETWORK_FLOW has no head in v0.1 -> encoder-only, PSY_UNCERTAIN:
python3 scripts/run_demo.py --artifact demo_artifacts/network_flow_sample.jsonl
Each command prints one strict-JSON Psy status/verdict frame to stdout (exit 0).
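A caller that consumes the frame programmatically might wrap the command as follows. This is a sketch under assumptions: `parse_frame` and `run_demo` are hypothetical helper names, and "strict JSON" is interpreted here as "one JSON object, with no NaN/Infinity constants", which the bundle may define differently.

```python
import json
import subprocess
import sys


def _reject_constant(name: str):
    # json.loads accepts NaN/Infinity by default; strict JSON does not.
    raise ValueError(f"non-strict JSON constant: {name}")


def parse_frame(stdout_text: str) -> dict:
    """Parse stdout as exactly one JSON object, or raise ValueError."""
    frame = json.loads(stdout_text, parse_constant=_reject_constant)
    if not isinstance(frame, dict):
        raise ValueError("expected a single JSON object")
    return frame


def run_demo(artifact: str) -> dict:
    """Run the demo from the bundle root and return the parsed frame."""
    proc = subprocess.run(
        [sys.executable, "scripts/run_demo.py", "--artifact", artifact],
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit raises CalledProcessError
    )
    return parse_frame(proc.stdout)
```

Only stdout is parsed, so warnings that torch writes to stderr do not interfere with the frame.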
Note:
--artifact takes a local filesystem path with no path-confinement checks. That is fine for this shipped offline single-user CLI (the invoking user already has whatever access the path implies), but anyone wrapping this runtime in a service that accepts a path/id from a remote caller must add their own root-confinement/canonicalization; none is provided here to copy.
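For anyone building such a wrapper, root confinement could look roughly like the sketch below. It is not part of the bundle: `confine` and `allowed_root` are hypothetical names, and a production service would also need to consider races between the check and the later file open.

```python
from pathlib import Path


def confine(user_path: str, allowed_root: str) -> Path:
    """Resolve user_path under allowed_root; raise if it escapes the root."""
    root = Path(allowed_root).resolve()
    # resolve() collapses ".." and follows symlinks; joining with an
    # absolute user_path discards root, which the check below catches.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):  # Path.is_relative_to: Python 3.9+
        raise ValueError(f"path escapes allowed root: {user_path!r}")
    return candidate
```

The resolved path, not the caller's original string, is what should then be passed on to `--artifact`.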
Verified outputs:
- rule_ast_sample -> status: PSY_ANOMALY_FOUND, action: BLOCK, mode: encoder_plus_probe_head, label: 2, confidence: 0.8381
- cve_record_sample -> status: PSY_UNCERTAIN, action: ABSTAIN, mode: encoder_plus_probe_head, label: 1, confidence: 0.9996 (a confident middle-class prediction, not a fallback)
- network_flow_sample -> status: PSY_UNCERTAIN, action: ABSTAIN, mode: encoder_only, head_status: probe_head_not_present, confidence: 0.0
Note:
PSY_UNCERTAIN means two different things: a confident label-1 (MEDIUM/INVESTIGATE) prediction, and "no head available." Disambiguate with mode/head_status/confidence, not the status string alone. See the class -> opcode table in MODEL_CARD.md.
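A consumer could separate the two meanings along these lines. The sketch assumes the fields listed under "Verified outputs" are top-level keys of the frame; `explain_uncertain` and its return strings are hypothetical, not part of the Psy protocol.

```python
def explain_uncertain(frame: dict) -> str:
    """Classify why a frame carries PSY_UNCERTAIN (illustrative only)."""
    if frame.get("status") != "PSY_UNCERTAIN":
        return "not_uncertain"
    # No probe head ran: the encoder-only fallback.
    if (frame.get("mode") == "encoder_only"
            or frame.get("head_status") == "probe_head_not_present"):
        return "no_head_available"
    # A head ran and chose the middle class (label 1).
    if frame.get("mode") == "encoder_plus_probe_head" and frame.get("label") == 1:
        return "head_predicted_investigate"
    return "unrecognized"
```

With the verified outputs above, cve_record_sample would map to `head_predicted_investigate` and network_flow_sample to `no_head_available`.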
Validate
python3 scripts/validate_bundle.py
python3 scripts/run_demo.py --artifact demo_artifacts/rule_ast_sample.jsonl
validate_bundle.py checks required files, re-derives the encoder parameter
count (must equal 6,904,064), verifies JSON/JSONL parse, and scans for secrets
and non-loopback IPs. On a clean release it prints status: PASSED (exit 0)
with a single expected warning: optional probe head not present: checkpoints/heads/network_flow_head.pt.
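To give a sense of what a non-loopback IP scan involves, here is a minimal IPv4-only sketch. It is an assumption about the general technique, not the logic in validate_bundle.py: `IPV4_RE` and `non_loopback_ips` are hypothetical names, and dotted version strings such as 1.2.3.4 would be flagged as false positives.

```python
import ipaddress
import re

# Candidate dotted quads; validity is checked by ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def non_loopback_ips(text: str) -> list[str]:
    """Return IPv4 literals in text that are not in 127.0.0.0/8."""
    found = []
    for candidate in IPV4_RE.findall(text):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # e.g. an octet above 255
        if not addr.is_loopback:
            found.append(candidate)
    return found
```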
You can also verify integrity directly:
sha256sum checkpoints/psy_6.9m_encoder.pt
# expect: 2d0a15792bcdfbebfbf689ca0dddd39f9259525bbe9bee9d491289af4e590dbc
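On systems without sha256sum, the same check can be done with the Python standard library. A minimal sketch: `sha256_file` and the constant name are hypothetical, and the expected digest is the one quoted above.

```python
import hashlib

# Digest quoted in this README for checkpoints/psy_6.9m_encoder.pt.
EXPECTED_ENCODER_SHA256 = (
    "2d0a15792bcdfbebfbf689ca0dddd39f9259525bbe9bee9d491289af4e590dbc"
)


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

From the bundle root, comparing `sha256_file("checkpoints/psy_6.9m_encoder.pt")` against `EXPECTED_ENCODER_SHA256` mirrors the shell check.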
Scope Boundary
This is only Psy as a small artifact-encoder runtime/demo bundle. It is not a larger system and includes no other components or agents.
This bundle intentionally excludes:
- raw corpus
- training shards
- full training logs
- SSH/IP material
- private env files
- full threat-report prose
- poison payload dumps
- malware samples
- unrelated internal code