	# STEM BIO-AI Architecture

	This document describes the implemented repository structure and runtime boundaries in `v1.8.3`.

	## Purpose

	STEM BIO-AI scans a local repository and classifies its observable evidence surface.

	It does not execute target training runs, call an LLM in the default scoring path, or assert clinical safety.

	## Operating Model

	The default path is local and deterministic:

	1. read repository files from a local clone
	2. extract README, docs, code, CI, and package metadata signals
	3. score those signals across Stage 1, Stage 2R, Stage 3, and Stage 4 lanes
	4. apply code-integrity penalties and policy caps
	5. emit traceable JSON, Markdown, HTML, and PDF artifacts

	Optional advisory flows exist as a separate trust boundary and are documented in:

	- [ADVISORY_RUNTIME.md](ADVISORY_RUNTIME.md)
	- [ADVISORY_SECRET_HANDLING.md](ADVISORY_SECRET_HANDLING.md)
	- [API_CONTRACT.md](API_CONTRACT.md)

	## High-Level Flow

	```text
	Target repository
	-> scanner.py
	-> detector_surface.py / detectors.py
	-> detector_ast.py / detector_bio.py / detector_contract.py / detector_stage4.py
	-> evidence.py + policy_intent.py + calibration_profile.py
	-> render.py / render_html.py
	-> JSON / Markdown / HTML / PDF evidence packet
	```

	## Core Modules

	| Module | Responsibility |
	|---|---|
	| `stem_ai/cli.py` | CLI entry points for `scan`, `gate`, `policy`, and advisory commands |
	| `stem_ai/scanner.py` | Repository walk, signal orchestration, score assembly |
	| `stem_ai/detectors.py` | Shared detector entry surfaces and signal collection glue |
	| `stem_ai/detector_surface.py` | README/docs/package/CI evidence extraction |
	| `stem_ai/detector_ast.py` | AST-based code integrity and contract checks |
	| `stem_ai/detector_bio.py` | Bio-adjacent deterministic diagnostics and parser-guard lanes |
	| `stem_ai/detector_contract.py` | Contract and mismatch detection across advertised and implemented surfaces |
	| `stem_ai/detector_stage4.py` | Replication and reproducibility evidence lane |
	| `stem_ai/evidence.py` | Trace object construction and proof ledger materialization |
	| `stem_ai/render.py` / `stem_ai/render_html.py` | Markdown, PDF, and HTML report generation |
	| `stem_ai/calibration_profile.py` | Versioned policy profile loading and preview/simulation support |
	| `stem_ai/policy_intent.py` | Governed profile-derivation and policy-intent handling |
	| `stem_ai/advisory_*` | Provider-neutral advisory packet/export/validation boundary |

	## Score Construction

	The canonical score path is documented in [SCORING_RATIONALE.md](SCORING_RATIONALE.md). At a high level:

	- Stage 1 measures README evidence and hype/responsibility signals
	- Stage 2R measures repo-local consistency across docs, package metadata, CI, and tests
	- Stage 3 measures code/bio responsibility and integrity-adjacent evidence
	- Stage 4 reports replication evidence as a separate lane
	- C1-C6 penalties and caps apply bounded deductions or floors

	The architecture preserves a strict rule: preview policy simulations must not silently rewrite the authoritative deterministic score.
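One way to enforce the no-silent-rewrite rule is to make the authoritative result immutable and have previews return a flagged copy. The sketch below assumes a single penalty and a single cap for brevity; `ScoreResult`, `preview`, and the field names are hypothetical and do not reflect the actual C1-C6 model or the real `calibration_profile.py` API.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ScoreResult:
    """Immutable score record; attribute assignment raises FrozenInstanceError."""
    raw: int
    penalty: int
    cap: int
    authoritative: bool = True

    @property
    def final(self) -> int:
        # Apply the deduction first, then clamp to the range [0, cap].
        return max(0, min(self.raw - self.penalty, self.cap))


def preview(result: ScoreResult, *, cap: int) -> ScoreResult:
    """Return a non-authoritative copy under an alternate cap; the input is untouched."""
    return replace(result, cap=cap, authoritative=False)


canonical = ScoreResult(raw=82, penalty=10, cap=60)
simulated = preview(canonical, cap=80)

print(canonical.final, canonical.authoritative)  # 60 True
print(simulated.final, simulated.authoritative)  # 72 False
```

The preview carries `authoritative=False`, so a renderer can refuse to label it as the canonical score, and the frozen dataclass makes an accidental in-place edit fail loudly instead of passing unnoticed.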

	## Trust Boundaries

	### Default deterministic boundary

	- local repository only
	- no required network access
	- no LLM in the scoring loop
	- evidence must point to a concrete file, line, pattern, or artifact
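The last rule in this list can be checked at construction time. The following is a minimal sketch, assuming a hypothetical `EvidenceRef` record (the real trace objects live in `stem_ai/evidence.py` and are not shown here): a reference with no locator at all is rejected, and a line number is only accepted together with the file it belongs to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvidenceRef:
    """Hypothetical evidence pointer: at least one concrete locator is required."""
    file: Optional[str] = None
    line: Optional[int] = None
    pattern: Optional[str] = None
    artifact: Optional[str] = None

    def __post_init__(self) -> None:
        locators = (self.file, self.line, self.pattern, self.artifact)
        if all(value is None for value in locators):
            raise ValueError("evidence must point to a file, line, pattern, or artifact")
        if self.line is not None and (self.file is None or self.line < 1):
            raise ValueError("a line number needs a file and must be 1 or greater")


ok = EvidenceRef(file="README.md", line=12, pattern="license")
print(ok)

try:
    EvidenceRef()  # no locator at all
except ValueError as exc:
    print("rejected:", exc)
```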

	### Advisory boundary

	- entered only through explicit advisory commands
	- packet export, provider call intent, and response validation are separated
	- secret handling and claim filters are governed independently of the deterministic score path

	### Output boundary

	- reports are review aids
	- tiers classify observable evidence posture
	- reports are not clinical validation, regulatory approval, or deployment certification

	## Failure And Fallback Behavior

	- missing optional extras reduce output capabilities rather than rewriting the score semantics
	- PDF generation depends on the `pdf` extra; JSON/Markdown paths remain available without it
	- advisory provider execution is intentionally separate from deterministic scoring
	- policy simulation can preview alternate postures without mutating the canonical scoring profile
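The optional-extra behavior above follows a common pattern: probe for the backend once, then narrow the list of output formats instead of touching the score. The sketch below is an assumption-laden illustration; `available_formats`, `require_format`, and the `pdf_backend` parameter are hypothetical, and the module the `pdf` extra actually installs is not named in this document, so the caller passes it in.

```python
import importlib.util

# Formats stated above to work without any optional extra.
BASE_FORMATS = ("json", "markdown")


def available_formats(pdf_backend: str) -> tuple[str, ...]:
    """Return the base formats, plus "pdf" when the named top-level module is importable."""
    has_pdf = importlib.util.find_spec(pdf_backend) is not None
    return BASE_FORMATS + (("pdf",) if has_pdf else ())


def require_format(fmt: str, pdf_backend: str) -> str:
    """Fail with an actionable message instead of silently skipping a format."""
    if fmt not in available_formats(pdf_backend):
        raise RuntimeError(f"format {fmt!r} is unavailable; install the 'pdf' extra")
    return fmt


# "json" stands in for an installed backend, and the second name for a missing one.
print(available_formats("json"))
print(available_formats("hypothetical_missing_pdf_backend"))
```

The score never passes through this code path, so a missing extra can only shrink the set of emitted artifacts.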

	## Verification Surface

	The repository exposes these concrete verification paths:

	```bash
	pip install -e ".[pdf]"
	python -m py_compile stem_ai/cli.py stem_ai/scanner.py stem_ai/render.py stem_ai/app.py
	stem --help
	python -m stem_ai --help
	python -m pytest -q
	python -m build
	```

	## Related Documents

	- [README.md](../README.md)
	- [CLI_REFERENCE.md](CLI_REFERENCE.md)
	- [SCORING_RATIONALE.md](SCORING_RATIONALE.md)
	- [DETERMINISTIC_DIAGNOSTICS.md](DETERMINISTIC_DIAGNOSTICS.md)
	- [API_CONTRACT.md](API_CONTRACT.md)
	- [ADVISORY_RUNTIME.md](ADVISORY_RUNTIME.md)