Observations, Under-Observation, and Repair Loops
From “No Jump Without Observations” to Practical Design Patterns
Draft v0.1 — Non-normative supplement to SI-Core / SI-NOS / SIM/SIS / SCP / Semantic Compression
This document is non-normative. It explains how to design and operate the observation side of an SI-Core system:
- what counts as an “observation unit,”
- what “under-observation” actually means, and
- how to build repair loops instead of silently jumping in the dark.
Normative contracts live in the SI-Core / SI-NOS core specs, SIM/SIS design, SCP spec, and the relevant evaluation packs.
1. Why OBS needs a cookbook
SI-Core has a simple ethos on paper:
No effectful Jump without PARSED observations.
Two clarifications (to avoid misreading):
- “No jump” means the Jump transition is blocked. Under-observed conditions may still run read / eval_pre / jump-sandbox, but MUST NOT execute a commit jump, and MUST NOT publish results. If you run a sandboxed dry-run, set publish_result=false and memory_writes=disabled.
- Even when observations are PARSED, effectful commits SHOULD be gated by coverage/confidence minima (declared by the implementer). If PARSED but below minima, the system should request observation extension or enter a conservative/safe path.
This doc is a cookbook for making that ethos operational: explicit obs units, explicit under-observation detection, and explicit repair loops.
Without these, the de facto pipeline in many systems looks like:
Raw logs → half-parsed events → LLM / heuristics → side-effects (!)
Common failure modes:
- “We had logs, but they weren’t structured, so the system thought it had context.”
- “Sensor was down, but we just reused stale values.”
- “We compressed too aggressively and didn’t notice whole regions becoming invisible.”
This doc is about:
- Observation units — how to shape them so they’re usable and auditable.
- Under-observation — how to detect that you don’t know enough.
- Repair loops — what the system does when it’s under-observed, short-term and long-term.
- Integration with semantic compression (art-60-007) and SIM/SIS.
You can read it as the “OBS cookbook” that pairs with:
- Goal-Native Algorithms (GCS, trade-offs)
- Semantic Compression (SCE, SIM/SIS, SCP)
- ETH overlays and operational runbooks.
2. What is an “observation unit” in SI-Core?
We assume an explicit observation unit abstraction, not “random JSON blobs.”
Very roughly:
obs_unit:
id: "obs-2028-04-15T10:23:42Z-1234"
sem_type: "city.flood_risk_state/v1"
scope:
city_id: "city-01"
sector_id: 12
horizon_min: 60
payload:
risk_score: 0.73
expected_damage_eur: 1.9e6
observation_status: "PARSED" # see below
confidence: 0.87 # 0.0–1.0
source:
channel: "sensor_grid"
semantic_path: "sce://flood/v1"
backing_refs:
- "raw://sensors/canal-12@2028-04-15T10:20:00Z/10:23:00Z"
ethics_tags:
jurisdictions: ["EU", "DE"]
gdpr_basis: ["art_6_1_e"]
created_at: "2028-04-15T10:23:42Z"
Key properties:
- sem_type — matches the semantic types catalog (SIM/SIS).
- scope — where/when/who this observation applies to.
- payload — goal-relevant content, syntactically validated.
- observation_status — how trustworthy / usable this unit is.
- confidence — continuous measure, separate from status.
- backing_refs — path back to raw data when allowed.
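As a concrete, non-normative sketch, the key properties above can be captured in a small typed structure; the `ObsUnit` class name and `usable()` helper are illustrative, not spec-mandated:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class ObsUnit:
    """Minimal observation unit; field names follow the sketch above."""
    id: str
    sem_type: str
    scope: dict[str, Any]
    payload: dict[str, Any]
    observation_status: str  # e.g. "PARSED", "DEGRADED", "MISSING", ...
    confidence: float        # 0.0-1.0, separate from status
    source: dict[str, Any] = field(default_factory=dict)
    backing_refs: list[str] = field(default_factory=list)

    def usable(self) -> bool:
        # INVALID units must never feed a decision.
        return self.observation_status != "INVALID"

unit = ObsUnit(
    id="obs-2028-04-15T10:23:42Z-1234",
    sem_type="city.flood_risk_state/v1",
    scope={"city_id": "city-01", "sector_id": 12},
    payload={"risk_score": 0.73},
    observation_status="PARSED",
    confidence=0.87,
)
```

The point is that anything reaching a decision goes through a typed unit like this, not a random JSON blob.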
2.1 Observation status taxonomy
We’ll use a small, explicit enumeration:
PARSED — syntactically valid, semantically well-typed, within spec.
DEGRADED — usable, but below normal quality (e.g., partial data, lower res).
STUB — placeholder: syntactically valid, but missing key payload.
ESTIMATED — forward-filled / model-filled; not directly observed.
MISSING — we know we wanted this, but we don’t have it.
REDACTED — intentionally removed (privacy, policy).
INVALID — failed parsing / integrity; must not be used.
SI-Core policy (non-normative but recommended):
Jumps that materially depend on a goal must not proceed if their required observation bundle has:
- any INVALID,
- any MISSING where no fallback is defined, or
- any REDACTED without appropriate ETH escalation.
DEGRADED / ESTIMATED units are allowed only under explicit degradation policies (see §5).
2.2 Compatibility note: mapping to SI-Core core status
Core SI-Core interfaces often expose a smaller status set (e.g. PARSED | PARTIAL | PENDING).
This cookbook uses a more detailed taxonomy for operational clarity.
A practical mapping is:
- PARSED → core PARSED
- DEGRADED | ESTIMATED | STUB → core PARTIAL (or PARSED_BELOW_MINIMA as a label)
- MISSING | REDACTED → core PENDING (decision must not assume presence)
- INVALID → core PENDING + integrity-fail flag (must not be used)
Recommendation: keep both fields:
- observation_status (detailed, cookbook)
- status_core (minimal, spec-facing)
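The mapping table above can be made executable with a small helper; `to_core_status` and `CORE_STATUS_MAP` are hypothetical names for illustration:

```python
# Hypothetical helper: map the cookbook's detailed status onto the
# minimal core status set (PARSED | PARTIAL | PENDING).
CORE_STATUS_MAP = {
    "PARSED": "PARSED",
    "DEGRADED": "PARTIAL",
    "ESTIMATED": "PARTIAL",
    "STUB": "PARTIAL",
    "MISSING": "PENDING",
    "REDACTED": "PENDING",
}

def to_core_status(observation_status: str) -> tuple[str, bool]:
    """Return (status_core, integrity_fail).

    INVALID maps to core PENDING with the integrity-fail flag set,
    per the mapping above; it must never be used in a decision.
    """
    if observation_status == "INVALID":
        return "PENDING", True
    return CORE_STATUS_MAP[observation_status], False
```

Keeping the mapping in one place makes it auditable when the core status set or the cookbook taxonomy evolves.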
3. What does “under-observation” mean?
Under-observation is not just “sensor offline.” It is any condition where the observation layer is structurally insufficient for the goals at stake.
We can classify three main forms:
Missing observations
- No obs unit at all where the goal contract says one is required.
- Example: no recent canal_segment_state for a sector in a high-risk flood area.
Coarse or degraded observations
We have something, but:
- resolution is too low,
- sampling is too sparse,
- semantics were over-compressed (semantic compression too aggressive).
Example: only city-wide average water level, no per-sector breakdown, during a storm.
Biased or structurally skewed observations
- Observations systematically leave out certain regions / groups / states.
- Example: traffic sensors mostly in wealthy districts; ND learners less observed because they avoid the system.
In SI-Core terms, “under-observation” typically shows up via:
- Coverage metrics (SCover_obs) dropping.
- Integrity metrics (SInt_obs) reporting semantic violations.
- Observation-status maps showing DEGRADED / ESTIMATED / STUB / MISSING / REDACTED where contracts require PARSED.
4. Coverage and integrity: SCover / SInt for OBS
You can think in terms of two basic metrics families:
- Coverage — “have we observed enough of the world we care about?”
- Integrity — “are those observations structurally consistent?”
4.1 Observation coverage (SCover_obs)
Non-normative sketch:
scover_obs:
window: "2028-04-15T10:00:00Z/11:00:00Z"
domain: "city.flood"
required_scopes:
- {city_id: "city-01", sector_id: 1}
- {city_id: "city-01", sector_id: 2}
- ...
observed:
parsed_pct: 0.86 # observation_status=PARSED
degraded_pct: 0.06 # DEGRADED
estimated_pct: 0.02 # ESTIMATED
stub_pct: 0.01 # STUB
missing_pct: 0.03 # MISSING (wanted but not present)
redacted_pct: 0.01 # REDACTED (policy/consent; ETH-significant)
invalid_pct: 0.01 # INVALID (must not be used)
thresholds:
parsed_min: 0.90
invalid_max: 0.00
missing_max: 0.02
redacted_max: 0.02 # optional; depends on domain + ETH posture
status: "warning"
notes:
- "Do not treat REDACTED as 'sensor outage'—it is a policy/consent state."
- "INVALID is an integrity failure; requires parser/pipeline repair."
You can compute scover_obs at different levels:
- per goal (e.g., flood risk vs traffic vs learning),
- per region or population,
- per time window.
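A toy computation of those status percentages and the traffic-light status, under the thresholds from the sketch above (the function name and threshold defaults are illustrative):

```python
from collections import Counter

def scover_obs(units, parsed_min=0.90, invalid_max=0.0, missing_max=0.02):
    """Toy SCover_obs: fraction of units in each observation_status,
    plus an ok/warning/critical status against declared thresholds."""
    total = len(units) or 1
    counts = Counter(u["observation_status"] for u in units)
    pct = {s.lower() + "_pct": counts.get(s, 0) / total
           for s in ("PARSED", "DEGRADED", "ESTIMATED", "STUB",
                     "MISSING", "REDACTED", "INVALID")}
    # Hard failures (INVALID / MISSING beyond budget) trump soft ones.
    status = "ok"
    if pct["invalid_pct"] > invalid_max or pct["missing_pct"] > missing_max:
        status = "critical"
    elif pct["parsed_pct"] < parsed_min:
        status = "warning"
    return {**pct, "status": status}
```

In a real deployment this would be computed per goal / region / window as described above, with thresholds taken from the jump contracts rather than hard-coded defaults.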
4.2 Observation integrity (SInt)
We also want a sense of contract violations:
- sem_type / payload mismatch,
- impossible values (negative water depth),
- inconsistent scopes (two obs units claiming exclusive states for same scope/time).
Sketch:
sint_obs:
window: "2028-04-15T10:00:00Z/11:00:00Z"
violations:
type_mismatch: 3 # bad payload vs sem_type schema
impossible_values: 1 # negative water level
duplicate_conflicts: 2 # two mutually exclusive states
violations_per_1e4_units: 1.7
status: "ok" # or "warning"/"critical"
Together, SCover_obs + SInt serve as:
- “OBS health” metrics,
- early warning for under-observation,
- gates for safe mode / degraded mode decisions (§5).
5. Degradation and safe-mode patterns when under-observed
When scover_obs drops or observation_status degrades, what does the system actually do? You want explicit patterns, not ad-hoc improvisation.
5.1 Intra-jump behavior: per-decision handling
For each jump type, define an observation contract:
jump_contract:
name: "city.adjust_flood_gates"
# Bundle-level minima (align with spec-style "declared minimums"):
bundle_minima:
coverage_min: 0.50
confidence_min: 0.65
required_obs:
- sem_type: "city.flood_risk_state/v1"
scope: ["city_id", "sector_id"]
status_allowed: ["PARSED"]
min_confidence: 0.80
max_age_sec: 300
- sem_type: "city.traffic_state/v1"
scope: ["city_id", "sector_id"]
# Define “degraded band” explicitly instead of a single label.
status_allowed: ["PARSED", "DEGRADED", "ESTIMATED"]
min_confidence: 0.50
max_age_sec: 900
effect_gating:
# If contract not satisfied, allow reasoning but block effectful ops.
# Escalation produces an operator-facing ticket/request; it does not execute external effects.
on_contract_fail: "no_effectful_ops_and_escalate"
fallback_policy:
if_missing_or_invalid_flood_risk_state:
action: "safe_mode" # no gate moves, raise alert
if_degraded_traffic_state:
action: "conservative" # assume worst plausible congestion
if_redacted_required_obs:
action: "eth_escalation" # treat as policy/consent state, not outage
Pseudo-code:
def prepare_jump(request):
obs_bundle = load_obs_bundle(request.scope, request.time)
if not obs_bundle.satisfies_contract(jump_contract):
return handle_under_observation(obs_bundle, jump_contract)
return proceed_with_normal_decision(obs_bundle)
Typical fallback actions:
Safe-mode
- No effectful ops.
- Only diagnostics / alerts / logging.
- “We don’t know enough to move actuators.”
Conservative default
- Choose action that dominates on safety / ETH goals under uncertainty.
- Example: hold floodgates in safer position; slow traffic lights.
Sandbox-only
- Run the jump through evaluation / simulation.
- publish_result=false and memory_writes=disabled.
- No effectful ops (no external effects). Output is evaluation-only.
Human-in-loop escalation
- Forward to operator dashboard with “under-observation” banner.
- Require manual override to proceed.
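The contract check that gates these fallbacks can be sketched as follows; field names mirror the jump_contract example above, and `observed_at_sec` is an assumed timestamp field on each unit:

```python
def satisfies_contract(obs_bundle, contract, now_sec):
    """Check a bundle against per-obs requirements; returns (ok, reason).

    Field names follow the jump_contract sketch; this is illustrative,
    not the normative contract-checking algorithm.
    """
    for req in contract["required_obs"]:
        unit = obs_bundle.get(req["sem_type"])
        if unit is None:
            return False, f"missing {req['sem_type']}"
        if unit["observation_status"] not in req["status_allowed"]:
            return False, f"status {unit['observation_status']} not allowed"
        if unit["confidence"] < req["min_confidence"]:
            return False, "below min_confidence"
        if now_sec - unit["observed_at_sec"] > req["max_age_sec"]:
            return False, "stale"
    return True, "ok"
```

On a False result, the caller routes to the applicable fallback (safe-mode, conservative default, sandbox-only, or human-in-loop) rather than proceeding.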
5.2 Inter-jump behavior: repair loops over time
In addition to per-jump fallback, SI-Core should repair its observation regime over time.
Repair loop outline:
under_observation_event:
detected_at: "2028-04-15T10:23:42Z"
domain: "city.flood"
scope: {city_id: "city-01"}
scover_obs_delta: -0.12
root_causes:
- "sensor_cluster_central_offline"
- "compression_policy_too_aggressive"
immediate_actions:
- "enable_safe_mode_for_flood_jumps"
- "switch_SCE_to_high_fidelity_profile"
long_term_actions:
- "PLB proposal: add backup sensors"
- "update_jump_contracts to handle degraded obs"
This is where PLB / self-modification comes in:
PLB can propose changes to:
- observation contracts (what is required),
- SCE policies (how aggressively to compress),
- SIM indexing strategies,
- sampling periods.
Those proposals run through:
- simulation / sandbox,
- ETH + governance review,
- staged rollout.
6. How this fits semantic compression (art-60-007)
Observation and semantic compression are deeply intertwined:
- SCE decides what to emit as obs units.
- OBS decides whether that’s enough to jump.
A few alignment patterns:
6.1 OBS-aware SCE policies
In high-risk domains, SCE should be goal- and OBS-aware:
sce_policy:
domain: "city.flood"
risk_profile: "HIGH"
target_semantic_ratio_Rs: 8.0
epsilon_max: 0.05 # max goal utility loss
obs_requirements:
require_parsed_for:
- "city.flood_risk_state/v1"
allow_degraded_for:
- "city.traffic_state/v1"
The SCE:
- must ensure that required obs types hit their parsed_min targets,
- can be more aggressive on low-stakes channels.
6.2 Fallback to raw when OBS says “not enough”
Pattern:
If semantic view is too uncertain / degraded, request raw for this scope.
Implementation sketch:
def ensure_obs_for_jump(scope, time):
obs_bundle = fetch_semantic_obs(scope, time)
if obs_bundle.confidence < CONF_THRESHOLD:
raw_window = fetch_raw(scope, time_window="5m")
sem_units = sce.recompress(raw_window, context=HIGH_FIDELITY_CTX)
write_to_sim(sem_units)
obs_bundle = fetch_semantic_obs(scope, time)
return obs_bundle
This creates a closed loop:
- OBS detects low confidence / under-observation.
- SCE temporarily raises fidelity for the affected region.
- SIM/SIS gets higher-quality obs units.
6.3 Semantic compression vs under-observation
Under-observation can come from:
- missing raw (sensor down), or
- over-aggressive semantic compression (SCE dropped too much).
You want metrics to distinguish these:
under_observation_breakdown:
window: "2028-04-15T10:00:00Z/11:00:00Z"
missing_raw_pct: 0.03
compression_loss_pct: 0.07
compression_policies:
- policy_id: "flood_low_power_v2"
epsilon_est: 0.09
epsilon_max: 0.05
status: "review_required"
PLB proposals can then be targeted:
- add sensors vs adjust SCE vs adjust goals.
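One way to produce such a breakdown is to compare required scopes against what raw ingestion and semantic compression each delivered; representing the three inputs as sets of scope keys is an assumption for illustration:

```python
def under_observation_breakdown(required_scopes, raw_available, semantic_parsed):
    """Attribute under-observation to missing raw vs compression loss.

    Inputs are sets of scope keys (an illustrative simplification):
    - raw_available: scopes for which raw data arrived,
    - semantic_parsed: scopes with a PARSED semantic obs unit.
    """
    missing_raw = {s for s in required_scopes if s not in raw_available}
    # Raw existed, but no PARSED unit came out of SCE: compression loss.
    compression_loss = {s for s in required_scopes
                        if s in raw_available and s not in semantic_parsed}
    n = len(required_scopes) or 1
    return {
        "missing_raw_pct": len(missing_raw) / n,
        "compression_loss_pct": len(compression_loss) / n,
    }
```

The split tells PLB whether to propose sensors (missing raw) or SCE policy changes (compression loss).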
7. Under-observation patterns by domain
Concrete patterns help keep this grounded.
7.1 CityOS: flood + traffic
Obs units:
- city.flood_risk_state/v1 (per sector, horizon).
- city.traffic_state/v1 (per segment, time of day).
Typical under-observation:
- upstream sensor cluster offline → entire canal invisible.
- radar feed down → rainfall nowcast missing.
- semantic compression dropped per-segment detail; only city-level average remains.
Repair loops:
- safe-mode for floodgates; hold positions.
- emergency rule: assume worst plausible flood risk where unobserved.
- PLB proposal: reallocate sensors; adjust SCE hierarchy; add redundancy.
7.2 Learning / developmental support
Obs units:
- learner.session_event/v1 (exercise attempts, hints).
- learner.self_report.simple/v1 (I liked this / too hard / too easy).
- learner.affect.estimate/v1 (stress proxies; see ETH gating).
Typical under-observation:
- learner rarely uses the system → very sparse session_event units.
- some ND learners avoid self-reports → affect mostly inferred, low confidence.
- consent withdrawn for affect monitoring → entire affect channel REDACTED.
Repair loops:
- system must not over-interpret absence; treat as “unknown”, not “fine”.
- teacher dashboards flag “low observation coverage” learner-by-learner.
- PLB proposals: adjust cadence of short check-ins; non-intrusive self-report patterns; redesign for better accessibility.
7.3 OSS / CI pipelines
Obs units:
- ci.pipeline_run/v1 (tests run, duration, outcome).
- ci.commit_diff_summary/v1 (files, subsystems touched).
Typical under-observation:
- test logs missing for certain branches.
- CI skipped on some PRs (“trivial change” heuristics too aggressive).
- flaky network results in partial logs, marked DEGRADED.
Repair loops:
- mark coverage holes as explicit under-observation; do not over-trust success rates.
- PLB proposals: adjust “trivial change” rules; enforce periodic full runs; better log pipelines.
8. Testing the observation layer
OBS needs its own test discipline, not just “if code compiles, we’re fine.”
8.1 Property tests for observation contracts
Sketch:
from hypothesis import given
from pytest import raises

@given(jump_request=random_jump_request())
def test_no_jump_without_required_obs(jump_request):
    obs_bundle = synthesize_obs_bundle(jump_request, missing_required=True)
    with raises(UnderObservationError):
        si_core.process_jump(jump_request, obs_bundle)
Other properties:
- “If obs units violate schema → jump rejected, not silently coerced.”
- “If obs bundle has INVALID units → no jump, ETH alert raised.”
- “Degraded obs only used when contract explicitly allows it.”
8.2 Chaos for observations: sensor & SCE failure drills
Randomly drop sensor streams in a sandbox:
- verify scover_obs signals under-observation,
- verify safe-mode / fallback behavior.
Randomly break semantic compression:
- produce STUB / DEGRADED units,
- verify system doesn’t treat them as PARSED.
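A chaos drill can be as simple as a helper that randomly replaces units with MISSING markers before feeding the bundle to the pipeline under test; the function and field names are illustrative:

```python
import random

def chaos_drop_streams(obs_bundle, drop_prob=0.3, rng=None):
    """Chaos drill helper: randomly knock out observation streams.

    Dropped units become explicit MISSING markers (empty payload,
    zero confidence), so tests can verify the pipeline flags
    under-observation instead of silently reusing stale values.
    """
    rng = rng or random.Random(0)  # seeded for reproducible drills
    drilled = {}
    for sem_type, unit in obs_bundle.items():
        if rng.random() < drop_prob:
            drilled[sem_type] = {**unit,
                                 "observation_status": "MISSING",
                                 "confidence": 0.0,
                                 "payload": {}}
        else:
            drilled[sem_type] = unit
    return drilled
```

Running the same drill with a fixed seed makes fallback behavior reproducible across refactors, which pairs well with the golden-diff discipline in §8.3.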
8.3 sirrev / golden-diff for observation code
Capture golden observation traces for known scenarios.
When you refactor SCE or observation contracts:
- re-run scenarios; compare semantic obs streams (si-golden-diff).
- any difference beyond tolerance must be justified (better semantics or deliberate design).
9. Repair loops as first-class citizens
Under-observation repair should be visible in SI-Core, not hidden implementation detail.
9.1 OBS repair events in the Effect Ledger
Every significant under-observation episode should produce effect ledger entries like:
effect:
type: "obs.repair_request/v1"
scope: {city_id: "city-01", sector_id: 12}
reason: "scover_obs_below_threshold"
requested_actions:
- "increase_sampling_rate"
- "enable_redundant_sensor_path"
- "lower_compression_Rs_target"
created_by: "[OBS]"
jump_id: "jump-042"
This gives PLB, EVAL, and governance something concrete to work with.
9.2 PLB proposals: observation-side improvements
PLB can:
mine the effect ledger for recurring under-observation patterns,
propose:
- new sem_types (better abstractions),
- better feature engineering for SCE,
- updated observation contracts for jumps,
- instrumented logging in under-observed domains.
Governance should treat “change to observation contract” as significant:
- it effectively changes where SI-Core is allowed to act confidently.
10. Anti-patterns to avoid
A few recurring traps:
“Logs are observations, right?”
- Treating arbitrary text logs as structured obs without sem_types, scope, status, or schemas.
- Fix: always wrap into obs units with explicit sem_type and contracts.
“If we don’t see a problem, there is no problem.”
- Absence of evidence ≠ evidence of absence.
- Fix: treat lack of obs as a state (MISSING), not as “OK.”
“Just measure more things.”
- Adding sensors / events without thinking about goal relevance or semantics.
- Fix: start from goal surface, derive required obs surface.
“Semantic compression is invisible to OBS.”
- Letting SCE drop detail without OBS adjusting contracts or SCover thresholds.
- Fix: tie SCE policies and OBS contracts together; monitor ε and SCover.
“Degraded obs treated as full fidelity.”
- Using DEGRADED obs as if they were PARSED.
- Fix: enforce status-aware logic in jump contracts; ETH should monitor misuse.
11. Summary checklist
When you say “we implemented [OBS],” you ideally mean:
Observation units are explicit (sem_type, scope, payload, observation_status, confidence).
Observation status taxonomy is implemented (PARSED, DEGRADED, STUB, ESTIMATED, MISSING, REDACTED, INVALID).
Each jump type has a written observation contract:
- required sem_types,
- allowed statuses,
- fallback / safe-mode / escalation behavior.
You compute SCover_obs and SInt per goal / region / window.
Under-observation triggers repair loops:
- immediate: safe-mode, conservative decisions, sandbox only.
- long-term: PLB proposals to improve observation and compression design.
OBS and SCE / semantic compression are linked:
- SCE policies respect OBS requirements,
- OBS can trigger high-fidelity recompression on demand.
The observation layer has its own tests:
- property tests (“no jump without required obs”),
- chaos drills (sensor failures, compression failures),
- sirrev / golden-diff for observation code.
Under-observation episodes are logged as first-class effects, visible to PLB, ETH, and governance.
If you can tick most of these boxes, you are no longer “just reading logs.” You have a structured observation regime that knows when it doesn’t know enough, and can repair itself over time instead of quietly failing in the dark.
12. Observation quality ranking and repair prioritization
Not all under-observation is equally critical. We need a principled way to:
- rank observations by criticality, and
- allocate limited repair effort to the things that actually matter.
12.1 Observation criticality tiers
A simple non-normative pattern is to define tiers per semantic type:
safety_critical_obs:
city.flood_risk_state/v1:
criticality: "CRITICAL"
max_missing_time_sec: 300 # 5 min
repair_priority: 1
fallback: "safe_mode_immediately"
learner.stress_indicators/v1:
criticality: "HIGH"
max_missing_time_sec: 1800 # 30 min
repair_priority: 2
fallback: "conservative_load"
efficiency_obs:
city.traffic_state/v1:
criticality: "MEDIUM"
max_degraded_time_sec: 3600
repair_priority: 3
fallback: "use_historical_average"
auxiliary_obs:
city.parking_availability/v1:
criticality: "LOW"
max_missing_time_sec: 7200
repair_priority: 4
fallback: "skip_feature"
Criticality drives:
- online fallback behavior (what to do now), and
- offline repair priority (what to fix first).
12.2 Repair prioritization
Given a set of under-observation events, SI-Core needs to decide what to repair first.
Illustrative prioritizer:
import numpy as np
from datetime import datetime, timezone
def now():
return datetime.now(timezone.utc)
class ObservationRepairPrioritizer:
def prioritize_repairs(self, under_obs_events):
"""Rank repair actions by impact, urgency, and feasibility."""
scored = []
for event in under_obs_events:
impact = self._compute_impact(event)
feasibility = self._compute_feasibility(event)
urgency = self._compute_urgency(event)
score = (
0.5 * impact +
0.3 * urgency +
0.2 * feasibility
)
scored.append((event, score))
return sorted(scored, key=lambda x: x[1], reverse=True)
def _compute_impact(self, event):
"""How many goals / jumps / people are affected?"""
affected_goals = self._get_affected_goals(event.sem_type)
goal_weights = sum(g.weight for g in affected_goals)
population_affected = event.scope.get("population_size", 1)
return goal_weights * np.log1p(population_affected)
def _compute_feasibility(self, event):
"""Can we fix this quickly and cheaply?"""
rc = event.root_cause
if rc == "sensor_offline":
return 0.8 # usually fixable
if rc == "compression_too_aggressive":
return 0.9 # policy change
if rc == "structural_bias":
return 0.3 # hard, redesign
return 0.5
def _compute_urgency(self, event):
"""How close we are to critical missing-time thresholds."""
criticality = self.obs_catalog.get_criticality(event.sem_type)
time_missing = (now() - event.last_good_obs_time).total_seconds()
max_allowed = self.obs_catalog.get_max_missing_time(event.sem_type)
if criticality == "CRITICAL" and time_missing > max_allowed:
return 1.0 # already beyond allowed window
ratio = time_missing / max_allowed if max_allowed > 0 else 1.0
return min(1.0, ratio ** 2) # super-linear urgency
This is illustrative; real systems will tune weights and heuristics.
12.3 Repair budgets and multi-objective planning
You can treat observation repair as a budgeted optimization problem:
repair_budget:
daily_budget_hours: 40
allocations:
tier_1_safety_critical: "50% of budget"
tier_2_efficiency: "30% of budget"
tier_3_auxiliary: "20% of budget"
escalation_rules:
- "If tier_1 needs > 50%, borrow from tier_3"
- "Never reduce tier_1 below 40%"
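The allocation and escalation rules above can be sketched as a small function; the tier keys and the `demand_hours` input are illustrative assumptions:

```python
def allocate_repair_budget(daily_budget_hours, demand_hours):
    """Apply the tiered split with the escalation rule above:
    tier_1 (safety-critical) may borrow from tier_3 (auxiliary)
    when its demand exceeds its share, and never drops below 40%.

    demand_hours: hypothetical mapping of tier -> requested hours.
    """
    shares = {"tier_1": 0.50, "tier_2": 0.30, "tier_3": 0.20}
    alloc = {t: s * daily_budget_hours for t, s in shares.items()}
    shortfall = demand_hours.get("tier_1", 0.0) - alloc["tier_1"]
    if shortfall > 0:
        borrow = min(shortfall, alloc["tier_3"])
        alloc["tier_1"] += borrow
        alloc["tier_3"] -= borrow
    # Floor: tier_1 never below 40% of the daily budget.
    alloc["tier_1"] = max(alloc["tier_1"], 0.40 * daily_budget_hours)
    return alloc
```

A real planner would also respect per-event feasibility and the prioritizer scores from §12.2; this only encodes the budget split itself.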
Non-normative multi-objective sketch:
def optimize_repair_plan(under_obs_events, budget_hours):
"""
Solve a simple multi-objective repair plan:
- Maximize safety goal coverage
- Maximize efficiency goal coverage
- Minimize total repair cost
Subject to: total effort <= budget_hours
"""
problem = RepairOptimizationProblem()
for event in under_obs_events:
problem.add_repair_action(
event=event,
safety_impact=event.safety_impact,
efficiency_impact=event.efficiency_impact,
cost_hours=event.estimated_repair_hours,
)
solution = problem.solve(budget=budget_hours)
return solution.selected_repairs
Example result:
repair_queue_2028_04_15:
rank_1:
event: "flood_risk_state missing, sector_12"
impact: 0.95
urgency: 0.98
feasibility: 0.80
action: "Deploy backup sensor immediately"
rank_2:
event: "learner_affect degraded, 15 students"
impact: 0.70
urgency: 0.65
feasibility: 0.90
action: "Switch to high-fidelity affect model"
rank_3:
event: "traffic_state coarse, district_A"
impact: 0.45
urgency: 0.30
feasibility: 0.85
action: "Adjust SCE compression policy"
13. Observation cost-benefit analysis
More observation has real costs:
- sensors and deployment,
- bandwidth and storage,
- compute for SCE / analytics,
- governance overhead.
We want a simple way to ask:
“Is this extra observation worth it, in terms of reduced risk or improved goals?”
13.1 Cost model
Non-normative example:
observation_costs:
city.flood_risk_state/v1:
sensor_capex_per_unit: 5000 # EUR
sensor_opex_per_year: 500
bandwidth_gb_per_day: 2.5
storage_tb_per_year: 0.9
compute_for_sce_cpu_hours_per_day: 0.5
total_annual_cost_per_sensor: 1200
learner.session_event/v1:
storage_cost_per_learner_per_year: 10
compute_for_analytics_per_learner: 5
total_annual_cost_per_learner: 15
13.2 Benefit model (risk reduction)
We can approximate the benefit of improving observation coverage using historical incidents linked to under-observation:
import numpy as np

class ObservationBenefitEstimator:
def estimate_benefit(self, obs_type, coverage_increase):
"""Estimate annual risk reduction from better observation (very approximate)."""
incidents = self.get_incidents_with_root_cause("under_observation")
relevant = [i for i in incidents if i.missing_obs_type == obs_type]
if not relevant:
return 0.0
avg_incident_cost = np.mean([i.cost_eur for i in relevant])
incident_rate_per_year = len(relevant) / self.years_of_history
# Assume linear reduction with coverage (illustrative only)
risk_reduction = coverage_increase * incident_rate_per_year * avg_incident_cost
return risk_reduction
13.3 ROI calculation
def compute_observation_roi(obs_type, proposed_coverage_increase):
"""
Compute a rough ROI for increasing observation coverage.
ROI = (Benefit - Cost) / Cost
"""
current_sensors = get_current_sensor_count(obs_type)
needed_sensors = estimate_sensors_for_coverage(
obs_type, proposed_coverage_increase
)
additional_sensors = max(0, needed_sensors - current_sensors)
annual_cost = (
additional_sensors * SENSOR_CAPEX_AMORTIZED +
additional_sensors * SENSOR_OPEX_PER_YEAR +
additional_sensors * STORAGE_COST_PER_YEAR +
additional_sensors * COMPUTE_COST_PER_YEAR
)
risk_reduction_eur = estimate_benefit(obs_type, proposed_coverage_increase)
if annual_cost <= 0:
return {
"roi": None,
"annual_cost": annual_cost,
"risk_reduction": risk_reduction_eur,
"payback_period_years": None,
}
roi = (risk_reduction_eur - annual_cost) / annual_cost
payback_period = (
annual_cost / risk_reduction_eur if risk_reduction_eur > 0 else float("inf")
)
return {
"roi": roi,
"annual_cost": annual_cost,
"risk_reduction": risk_reduction_eur,
"payback_period_years": payback_period,
}
Example:
observation_roi_analysis_2028:
proposal_1:
obs_type: "city.flood_risk_state/v1"
current_coverage: 0.75
proposed_coverage: 0.95
additional_sensors: 8
annual_cost_eur: 9600
risk_reduction_eur: 45000
roi: 3.69
payback_period_years: 0.21
recommendation: "APPROVE — high ROI"
proposal_2:
obs_type: "city.parking_availability/v1"
current_coverage: 0.60
proposed_coverage: 0.90
additional_sensors: 20
annual_cost_eur: 24000
risk_reduction_eur: 5000
roi: -0.79
payback_period_years: 4.8
recommendation: "DEFER — low ROI"
A simple cost-benefit dashboard can visualize proposals:
cost_benefit_dashboard:
x_axis: "Annual cost (EUR)"
y_axis: "Estimated risk reduction (EUR)"
points: "Observation proposals"
quadrants:
high_benefit_low_cost: "Approve immediately"
high_benefit_high_cost: "Evaluate carefully"
low_benefit_low_cost: "Nice to have"
low_benefit_high_cost: "Reject"
Again, all numbers are illustrative; real deployments need domain-specific calibration.
14. Multi-modal observation fusion and conflict resolution
Real systems rarely have a single “truth sensor.” Instead, they have multiple, imperfect views:
- ground sensors + radar + model forecasts for flood,
- several affect estimators for learners,
- redundant power grid telemetry, etc.
We need clear patterns for:
- fusing multiple observations into one semantic unit, and
- handling conflicts between sources.
14.1 Fusion patterns
Pattern 1: Confidence-weighted averaging
STATUS_SEVERITY = {
"INVALID": 0,
"MISSING": 1,
"REDACTED": 2,
"STUB": 3,
"ESTIMATED": 4,
"DEGRADED": 5,
"PARSED": 6,
}
def worst_status(statuses):
return min(statuses, key=lambda s: STATUS_SEVERITY.get(s, 0))
class ConfidenceWeightedFusion:
def fuse(self, obs_units):
"""Combine observations weighted by confidence (illustrative)."""
if not obs_units:
raise ValueError("No observations to fuse")
# If any INVALID is present, fused unit must not be promoted.
input_status = worst_status([o.observation_status for o in obs_units])
if input_status == "INVALID":
return ObsUnit(
sem_type=obs_units[0].sem_type,
scope=obs_units[0].scope,
payload={},
confidence=0.0,
observation_status="INVALID",
source={"fusion": [o.source for o in obs_units], "kind": "fused"},
)
total_weight = sum(o.confidence for o in obs_units) or 1.0
fused_payload = {}
keys = obs_units[0].payload.keys()
for key in keys:
weighted_sum = sum(o.payload[key] * o.confidence for o in obs_units)
fused_payload[key] = weighted_sum / total_weight
fused_confidence = max(o.confidence for o in obs_units) * 0.9
# Promotion rule: only call it PARSED if all inputs are PARSED and confidence is healthy.
promoted = (
all(o.observation_status == "PARSED" for o in obs_units)
and fused_confidence >= 0.65
)
fused_status = "PARSED" if promoted else "DEGRADED"
return ObsUnit(
sem_type=obs_units[0].sem_type,
scope=obs_units[0].scope,
payload=fused_payload,
confidence=fused_confidence,
observation_status=fused_status,
source={"fusion": [o.source for o in obs_units], "kind": "fused"},
)
Pattern 2: Kalman-style fusion
Toy 1D example:
class KalmanFusion:
def fuse(self, prior_obs, new_obs):
"""
Bayesian-style update:
posterior = (prior_precision * prior + new_precision * new)
/ (prior_precision + new_precision)
"""
prior_var = max(1e-6, 1.0 - prior_obs.confidence)
new_var = max(1e-6, 1.0 - new_obs.confidence)
prior_precision = 1.0 / prior_var
new_precision = 1.0 / new_var
posterior_precision = prior_precision + new_precision
posterior_mean = (
prior_precision * prior_obs.payload["value"] +
new_precision * new_obs.payload["value"]
) / posterior_precision
posterior_var = 1.0 / posterior_precision
posterior_confidence = 1.0 - min(0.99, posterior_var)
return ObsUnit(
sem_type=prior_obs.sem_type,
scope=prior_obs.scope,
payload={"value": posterior_mean},
confidence=posterior_confidence,
observation_status="PARSED",
source={"fusion": [prior_obs.source, new_obs.source]},
)
Pattern 3: Ensemble with outlier detection
import numpy as np

class EnsembleFusionWithOutlierDetection:
def fuse(self, obs_units):
"""Remove outliers before fusion."""
values = np.array([o.payload["value"] for o in obs_units])
median = np.median(values)
mad = np.median(np.abs(values - median)) or 1e-6
outliers = [
i for i, v in enumerate(values)
if np.abs(v - median) > 3 * mad
]
filtered_obs = [o for i, o in enumerate(obs_units) if i not in outliers]
if not filtered_obs:
# Everything looks inconsistent; mark as degraded
return ObsUnit(
sem_type=obs_units[0].sem_type,
scope=obs_units[0].scope,
payload={"value": float(median)},
confidence=0.3,
observation_status="DEGRADED",
source={"fusion": [o.source for o in obs_units]},
)
return ConfidenceWeightedFusion().fuse(filtered_obs)
14.2 Conflict resolution strategies
Strategy 1: Hierarchical source priority
source_priority:
city.flood_risk_state/v1:
priority_order:
1: "ground_sensors" # most trusted
2: "radar_nowcast"
3: "physics_model"
4: "ml_model"
conflict_resolution: "use_highest_priority_available"
Strategy 2: Temporal precedence
temporal_rules:
fresh_data_window_sec: 300
resolution:
if_multiple_within_window: "fuse_with_confidence_weighting"
if_one_fresh_one_stale: "use_fresh"
if_all_stale: "mark_DEGRADED"
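A resolver for these temporal rules might look like the following; `observed_at_sec` is an assumed timestamp field, and the returned action strings mirror the YAML above:

```python
def resolve_temporal(obs_units, now_sec, fresh_window_sec=300):
    """Temporal-precedence resolution (illustrative).

    Returns (action, units_to_use) per the rules above:
    multiple fresh -> fuse; one fresh -> use it; all stale -> the
    caller must mark the result DEGRADED, not treat it as current.
    """
    fresh = [o for o in obs_units
             if now_sec - o["observed_at_sec"] <= fresh_window_sec]
    if len(fresh) >= 2:
        return "fuse_with_confidence_weighting", fresh
    if len(fresh) == 1:
        return "use_fresh", fresh
    return "mark_DEGRADED", obs_units
```

Returning the action as data (rather than acting directly) keeps the resolution auditable in the effect ledger.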
Strategy 3: Domain-specific logic
def resolve_flood_risk_conflict(obs_units):
"""Domain-specific: be conservative on high risk."""
max_risk = max(o.payload["risk_score"] for o in obs_units)
if max_risk > 0.7:
# Conservative path: take highest risk
return max(obs_units, key=lambda o: o.payload["risk_score"])
# Otherwise, fuse normally
return ConfidenceWeightedFusion().fuse(obs_units)
Example:
fusion_scenario_flood:
scope: {city_id: "city-01", sector_id: 12}
time: "2028-04-15T10:00:00Z"
input_observations:
- source: "ground_sensor_A"
payload: {risk_score: 0.65}
confidence: 0.90
- source: "ground_sensor_B"
payload: {risk_score: 0.68}
confidence: 0.85
- source: "radar_nowcast"
payload: {risk_score: 0.72}
confidence: 0.75
- source: "physics_model"
payload: {risk_score: 0.80}
confidence: 0.60
fusion_output:
method: "ensemble_with_outlier_detection"
removed_outliers: ["physics_model"]
fused_payload: {risk_score: 0.68}
fused_confidence: 0.81
observation_status: "PARSED"
Fusion itself should be observable:
fusion_health_metrics:
conflicts_detected_per_hour: 3.2
outliers_removed_per_hour: 1.1
avg_confidence_before_fusion: 0.78
avg_confidence_after_fusion: 0.82
conflict_patterns:
- pair: ["ground_sensor_A", "radar_nowcast"]
conflict_rate: 0.15
typical_difference: 0.08
action: "Calibration check recommended"
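The `conflict_patterns` entries above can be derived offline from paired readings. A minimal sketch, assuming readings from each source have already been aligned by timestamp; the field names and tolerance are illustrative assumptions:

```python
from itertools import combinations

def conflict_patterns(readings_by_source, tolerance=0.05):
    """readings_by_source: {source_name: [values aligned by timestamp]}."""
    patterns = []
    for a, b in combinations(sorted(readings_by_source), 2):
        diffs = [abs(x - y)
                 for x, y in zip(readings_by_source[a], readings_by_source[b])]
        if not diffs:
            continue
        conflicts = [d for d in diffs if d > tolerance]
        patterns.append({
            "pair": [a, b],
            "conflict_rate": len(conflicts) / len(diffs),
            "typical_difference": sum(diffs) / len(diffs),
        })
    return patterns
```

A persistently high `conflict_rate` for one pair is exactly the signal that feeds the "Calibration check recommended" action above.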
15. Temporal patterns and predictive under-observation
Under-observation is often predictable:
- sensors degrade gradually,
- coverage is poor on weekends or nights,
- outages correlate with weather or load.
Instead of only reacting, SI-Core can predict where under-observation will emerge and schedule repairs proactively.
15.1 Gradual degradation
Illustrative detector:
import numpy as np

class GradualDegradationDetector:
    def detect(self, obs_history):
        """Detect a slow decline in observation confidence via a linear fit."""
        if len(obs_history) < 5:
            return None  # too few points for a meaningful trend
        # Convert timestamps to numeric (seconds since epoch)
        times = np.array([o.created_at.timestamp() for o in obs_history])
confidences = np.array([o.confidence for o in obs_history])
slope, intercept = np.polyfit(times, confidences, 1)
conf_threshold = 0.80
if slope >= 0:
return None
# Predict when confidence will cross threshold
# slope * t + intercept = conf_threshold → t*
t_cross = (conf_threshold - intercept) / slope
        now_ts = times[-1]  # treat the newest observation as "now"
time_to_threshold = t_cross - now_ts
if 0 < time_to_threshold < 7 * 24 * 3600: # within a week
return {
"warning": "gradual_degradation_detected",
"time_to_critical_sec": time_to_threshold,
"recommended_action": "proactive_sensor_maintenance",
}
return None
15.2 Periodic gaps
import collections

class PeriodicGapDetector:
    def detect(self, obs_history, window_days=30):
        """Detect recurring coverage gaps by hour of day (toy example).

        window_days: how far back obs_history is assumed to reach.
        """
        if not obs_history:
            return None
        by_hour = collections.defaultdict(list)
for o in obs_history:
hour = o.created_at.hour
by_hour[hour].append(o.observation_status)
problematic_hours = []
for hour, statuses in by_hour.items():
missing_rate = (
sum(1 for s in statuses if s in ["MISSING", "DEGRADED"])
/ len(statuses)
)
if missing_rate > 0.2:
problematic_hours.append({"hour": hour, "missing_rate": missing_rate})
if problematic_hours:
return {
"warning": "periodic_gaps_detected",
"hours": problematic_hours,
"recommended_action": "adjust_sampling_schedule_or_expectations",
}
return None
15.3 Correlation with external events
class ExternalEventCorrelation:
def detect(self, obs_history, external_events):
"""Correlate under-observation with external factors (e.g., weather, load)."""
results = []
for event_type in {"heavy_rain", "high_load", "power_outage"}:
event_times = [e.time for e in external_events if e.type == event_type]
if not event_times:
continue
under_obs_during = 0
under_obs_baseline = 0
for o in obs_history:
is_under_obs = o.observation_status in ["MISSING", "DEGRADED"]
near_event = any(
abs((o.created_at - et).total_seconds()) < 3600
for et in event_times
)
if is_under_obs and near_event:
under_obs_during += 1
elif is_under_obs:
under_obs_baseline += 1
if under_obs_during == 0:
continue
            # Crude ratio with +1 smoothing to avoid division by zero;
            # a real system would normalize by exposure time.
            correlation = under_obs_during / (under_obs_baseline + 1)
if correlation > 2.0:
results.append({
"event_type": event_type,
"correlation_factor": correlation,
"recommended_action": f"improve_resilience_to_{event_type}",
})
return results or None
15.4 Predictive alerts and proactive scheduling
predictive_alerts_2028_04_15:
alert_1:
type: "gradual_degradation"
obs_type: "city.flood_risk_state/v1"
sector: 12
current_confidence: 0.82
predicted_confidence_7d: 0.74
time_to_threshold: "4.2 days"
recommended_action: "Schedule sensor maintenance within 3 days"
alert_2:
type: "periodic_gap"
obs_type: "learner.session_event/v1"
pattern: "Low coverage on weekends"
recommended_action: "Add weekend-friendly observation patterns"
A simple scheduler can then allocate maintenance windows before under-observation becomes critical. (The same repair-budget machinery as in §12 can be reused.)
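Such a scheduler can be very simple: sort the predictive alerts by urgency and greedily assign them to a bounded number of maintenance slots per day. A sketch, assuming each alert carries an `id` and a numeric `time_to_threshold_days` (field names are illustrative):

```python
def schedule_maintenance(alerts, slots_per_day=2):
    """Return {day_index: [alert ids]}, most urgent alerts scheduled first."""
    ranked = sorted(alerts,
                    key=lambda a: a.get("time_to_threshold_days", float("inf")))
    schedule = {}
    for i, alert in enumerate(ranked):
        day = i // slots_per_day  # fill each day's slot budget before moving on
        schedule.setdefault(day, []).append(alert["id"])
    return schedule
```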
16. Observation governance and audit trails
Finally, changing what you observe is a first-class governance concern. Adding or removing observations changes what SI-Core can see, and therefore what it can justifiably do.
16.1 Proposal and approval process
Example proposal for adding an observation:
observation_proposal:
id: "OBS-PROP-2028-042"
type: "add_observation"
proposed_by: "city_ops_team"
date: "2028-04-15"
details:
sem_type: "city.air_quality/v1"
scope: "city-wide, per-district"
rationale: |
"Recent air quality incidents suggest we're under-observed.
Adding 15 sensors across districts."
cost_analysis:
capex: 75000
opex_annual: 15000
roi_analysis: "See attachment OBS-PROP-2028-042-ROI"
impact_analysis:
goals_enabled: ["city.health_risk_minimization"]
jumps_affected: ["traffic_control", "industrial_permits"]
scover_obs_improvement: "+0.25"
approval_workflow:
- step: "Technical review"
reviewers: ["SI-Core team", "Domain experts"]
status: "approved"
- step: "Cost approval"
reviewers: ["Finance", "City council"]
status: "approved"
- step: "Ethics review"
reviewers: ["Ethics board"]
status: "approved"
- step: "Deployment"
timeline: "2 months"
status: "in_progress"
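Deployment should be mechanically gated on the workflow above: every non-deployment step must be approved before rollout proceeds. A minimal sketch over the step shape shown in the proposal YAML (an assumption, not a normative schema):

```python
def deployment_allowed(approval_workflow):
    """True only if every step before Deployment has status 'approved'."""
    for step in approval_workflow:
        if step["step"] == "Deployment":
            continue  # the deployment step itself is what we're gating
        if step.get("status") != "approved":
            return False
    return True
```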
Example for deprecation:
observation_deprecation:
id: "OBS-DEPR-2028-013"
sem_type: "city.parking_meters/v1"
rationale: "Low ROI, better alternatives available"
impact_analysis:
jumps_affected: ["parking_pricing"]
alternative_obs: "city.parking_availability_app/v1"
scover_obs_change: "-0.05 (acceptable)"
deprecation_timeline:
announce: "2028-05-01"
grace_period: "6 months"
final_removal: "2028-11-01"
migration_plan:
- "Update jump contracts to use alternative obs"
- "Run parallel for 3 months"
- "Validate no regressions in GCS / safety metrics"
16.2 Observation inventory and audits
A living observation catalog:
observation_catalog:
city.flood_risk_state/v1:
status: "active"
added: "2026-03-15"
owner: "flood_management_team"
criticality: "CRITICAL"
current_coverage: 0.92
target_coverage: 0.95
cost_per_year: 50000
goals_served: ["flood_risk_min", "hospital_access"]
jumps_using: ["adjust_flood_gates", "issue_flood_alerts"]
quality_metrics:
avg_confidence: 0.87
integrity_violations_per_month: 0.2
last_audit: "2028-03-01"
audit_status: "passed"
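A living catalog is only useful if it is queried. A sketch of one such query, surfacing obs types whose measured coverage has fallen below target; the catalog is assumed to be a dict keyed by obs type with the fields shown above:

```python
def coverage_gaps(catalog, slack=0.0):
    """Return obs types below target coverage, largest gap first."""
    gaps = []
    for obs_type, entry in catalog.items():
        if entry["current_coverage"] < entry["target_coverage"] - slack:
            gaps.append({
                "obs_type": obs_type,
                "gap": round(entry["target_coverage"] - entry["current_coverage"], 3),
                "criticality": entry.get("criticality", "NORMAL"),
            })
    return sorted(gaps, key=lambda g: g["gap"], reverse=True)
```

The `slack` parameter lets an operator tolerate small shortfalls without raising findings; the §16.3 auditor below uses a similar 0.10 margin.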
Audit trail for observation changes:
observation_audit_log:
- timestamp: "2028-04-15T10:00:00Z"
event: "observation_added"
obs_type: "city.air_quality/v1"
proposal_id: "OBS-PROP-2028-042"
approvers: ["tech_lead", "finance", "ethics"]
justification: "Link to proposal document"
- timestamp: "2028-04-20T14:30:00Z"
event: "observation_contract_updated"
jump_name: "city.traffic_control"
change: "Added air_quality as optional obs"
approved_by: "jump_owner"
- timestamp: "2028-05-01T09:00:00Z"
event: "observation_deprecated_announced"
obs_type: "city.parking_meters/v1"
deprecation_id: "OBS-DEPR-2028-013"
16.3 Periodic observation audits
Non-normative example auditor:
class ObservationAuditor:
    # Assumes injected helpers: self.catalog, find_jumps_using,
    # get_coverage, compute_roi, and an AuditReport container.
    def quarterly_audit(self):
        """Review all observation types for continued relevance and quality."""
        findings = []
for obs_type in self.catalog.all_obs_types():
jumps_using = self.find_jumps_using(obs_type)
if len(jumps_using) == 0:
findings.append({
"obs_type": obs_type,
"issue": "unused_observation",
"recommendation": "Consider deprecation",
})
coverage = self.get_coverage(obs_type)
target = self.catalog.get_target_coverage(obs_type)
if coverage < target - 0.10:
findings.append({
"obs_type": obs_type,
"issue": "below_target_coverage",
"recommendation": "Invest in more sensors or adjust target",
})
roi = self.compute_roi(obs_type)
if roi is not None and roi < 0.5:
findings.append({
"obs_type": obs_type,
"issue": "low_roi",
"recommendation": "Review cost-effectiveness",
})
return AuditReport(findings=findings, date=now())
Observation governance then links back to [ETH], [EVAL], and [MEM]:
- ETH: ensures new observations don’t introduce unjustified surveillance or bias.
- EVAL: verifies that added/removed observations actually help or at least don’t harm goals.
- MEM: keeps the full audit trail of what we decided to observe, when, and why.