	# Training & Evaluation Policy

	This policy clarifies how to apply the BLUX Doctrine during dataset-driven training and evaluation.

	## Dataset mix (recommended)
	- Core: 60–70% (identity, core clarity, reasoning).
	- Safety: 15–20% (refusals, boundary enforcement, privacy/consent).
	- Governance / Doctrine: 10–15% (power limits, accountability, auditability, doctrine-specific probes).
	- Other domains: the remainder (at most 15%, given the ranges above), kept small until stability is proven.

	Core packs remain frozen per version; new adapters should add domains only after doctrine-gated evaluation passes.
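As a rough illustration of the recommended mix, the sketch below checks per-category example counts against the ranges listed above. The category keys (`core`, `safety`, `governance`, `other`) and the `check_mix` helper are hypothetical names chosen for this example; they are not part of the BLUX tooling.

```python
# Recommended share ranges from the policy, as fractions of the full dataset.
# The dictionary keys are illustrative labels, not names used by the project.
RECOMMENDED_MIX = {
    "core": (0.60, 0.70),
    "safety": (0.15, 0.20),
    "governance": (0.10, 0.15),
}


def check_mix(counts):
    """Return a list of readable violations; an empty list means the mix is in range."""
    total = sum(counts.values())
    if total == 0:
        return ["dataset is empty"]
    problems = []
    for name, (low, high) in RECOMMENDED_MIX.items():
        share = counts.get(name, 0) / total
        if not low <= share <= high:
            problems.append(f"{name}: {share:.1%} outside {low:.0%}-{high:.0%}")
    return problems


# Illustrative counts only: 65% core, 18% safety, 12% governance, 5% other.
example = {"core": 650, "safety": 180, "governance": 120, "other": 50}
print(check_mix(example))
```

A check like this could run before training so that a drifting mix is caught early rather than at evaluation time.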

	## Doctrine in training
	- Doctrine is encoded through behavior: refusals, consent checks, anti-deepfake safeguards, and transparent limits.
	- High-stakes examples include `## Audit Notes` blocks to keep reasoning auditable.
	- Keep sampling deterministic (fixed seeds) and record the config used for every training job.
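One way to keep sampling deterministic and leave a record of the config is sketched below. The function names, config keys, and values are assumptions made for illustration; the policy does not prescribe a particular format.

```python
import hashlib
import json
import random


def sample_examples(examples, k, seed):
    """Draw k examples reproducibly using a dedicated, seeded generator."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    return rng.sample(examples, k)


def config_record(config):
    """Pair a config with a short hash of its key-sorted JSON form."""
    payload = json.dumps(config, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return {"config": config, "config_sha256_12": digest}


# Illustrative values only (hypothetical config keys).
config = {"seed": 1234, "sample_size": 3, "dataset": "core-pack"}
pool = [f"example-{i}" for i in range(10)]

first = sample_examples(pool, config["sample_size"], config["seed"])
second = sample_examples(pool, config["sample_size"], config["seed"])
print(first == second)  # same seed and same input give the same draw
print(json.dumps(config_record(config), sort_keys=True))
```

Hashing the key-sorted JSON gives a compact identifier that can be quoted in run logs, so two jobs can be compared without diffing full config files.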

	## Evaluation gates
	- Always run `ca.py eval --dataset-dir <DATASET_DIR> --suite doctrine` plus the other suites before publishing.
	- Treat any doctrine probe failure as a release blocker.
	- Publish only when refusals are firm, the model makes no power claims over humans, privacy/consent is explicit, and high-stakes answers stay auditable.
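The gate above can be sketched as a small decision function. The suite names come from the release checklist, and the command mirrors the one in the policy. Invoking it through `python` is an assumption, as is the `gate` helper itself. The policy names only doctrine failures as blockers; treating every suite as blocking is a conservative choice made for this sketch. Because the policy does not say how `ca.py` reports failures, the sketch takes pass/fail results as input instead of running the command.

```python
# Suite names as listed in the release checklist.
SUITES = ["identity", "red_team", "capability", "doctrine"]


def eval_command(dataset_dir, suite):
    """Argument list for one eval run (launching via `python` is an assumption)."""
    return ["python", "ca.py", "eval", "--dataset-dir", dataset_dir, "--suite", suite]


def gate(results):
    """Decide whether publishing may proceed.

    `results` maps suite name -> True (all probes passed) or False.
    A suite that is absent from `results` counts as not run and blocks release.
    """
    blockers = []
    for suite in SUITES:
        if suite not in results:
            blockers.append(f"{suite}: not run")
        elif not results[suite]:
            blockers.append(f"{suite}: probe failure")
    return {"publish": not blockers, "blockers": blockers}


for suite in SUITES:
    print(" ".join(eval_command("<DATASET_DIR>", suite)))

print(gate({"identity": True, "red_team": True, "capability": True, "doctrine": False}))
```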

	## Release checklist
	- Dataset validation (`python tools/validate_jsonl.py`) and summaries recorded.
	- Probe suites (identity, red_team, capability, doctrine) recorded with timestamps in `runs/`.
	- Model card / release notes mention probe status and doctrine adherence.
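Recording probe results with timestamps could look like the sketch below. The `record_run` helper, the JSON fields, and the file naming scheme are hypothetical; only the `runs/` directory and the suite names come from the checklist.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def record_run(runs_dir, suite, passed, details=None):
    """Write one probe-suite result as a timestamped JSON file and return its path."""
    now = datetime.now(timezone.utc)
    runs_dir = Path(runs_dir)
    runs_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "suite": suite,
        "passed": passed,
        "timestamp": now.isoformat(),
        "details": details or {},
    }
    # Hypothetical naming scheme: <suite>-<UTC timestamp>.json
    path = runs_dir / f"{suite}-{now.strftime('%Y%m%dT%H%M%SZ')}.json"
    path.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
    return path


# Demo in a temporary directory so the example does not touch a real `runs/`.
with tempfile.TemporaryDirectory() as tmp:
    written = record_run(Path(tmp) / "runs", "doctrine", True, {"note": "illustrative"})
    print(written.name)
    print(json.loads(written.read_text(encoding="utf-8"))["suite"])
```

Writing one file per suite run keeps earlier results intact, which makes it easier to cite probe status in the model card or release notes.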