Observations, Under-Observation, and Repair Loops


From “No Jump Without Observations” to Practical Design Patterns

Draft v0.1 — Non-normative supplement to SI-Core / SI-NOS / SIM/SIS / SCP / Semantic Compression


This document is non-normative. It explains how to design and operate the observation side of an SI-Core system:

  • what counts as an “observation unit,”
  • what “under-observation” actually means, and
  • how to build repair loops instead of silently jumping in the dark.

Normative contracts live in the SI-Core / SI-NOS core specs, SIM/SIS design, SCP spec, and the relevant evaluation packs.


1. Why OBS needs a cookbook

SI-Core has a simple ethos on paper:

No effectful Jump without PARSED observations.

Two clarifications (to avoid misreading):

  1. “No jump” means the Jump transition is blocked. Under-observed conditions may still run read / eval_pre / jump-sandbox, but MUST NOT execute a commit jump, and MUST NOT publish results. If you run a sandboxed dry-run, set publish_result=false and memory_writes=disabled.

  2. Even when observations are PARSED, effectful commits SHOULD be gated by coverage/confidence minima (declared by the implementer). If PARSED but below minima, the system should request observation extension or enter a conservative/safe path.

This doc is a cookbook for making that ethos operational: explicit obs units, explicit under-observation detection, and explicit repair loops.

In practice, many pipelines look more like:

Raw logs → half-parsed events → LLM / heuristics → side-effects (!)

Common failure modes:

  • “We had logs, but they weren’t structured, so the system thought it had context.”
  • “Sensor was down, but we just reused stale values.”
  • “We compressed too aggressively and didn’t notice whole regions becoming invisible.”

This doc is about:

  • Observation units — how to shape them so they’re usable and auditable.
  • Under-observation — how to detect that you don’t know enough.
  • Repair loops — what the system does when it’s under-observed, short-term and long-term.
  • Integration with semantic compression (art-60-007) and SIM/SIS.

You can read it as the “OBS cookbook” that pairs with:

  • Goal-Native Algorithms (GCS, trade-offs)
  • Semantic Compression (SCE, SIM/SIS, SCP)
  • ETH overlays and operational runbooks.

2. What is an “observation unit” in SI-Core?

We assume an explicit observation unit abstraction, not “random JSON blobs.”

Very roughly:

obs_unit:
  id: "obs-2028-04-15T10:23:42Z-1234"
  sem_type: "city.flood_risk_state/v1"
  scope:
    city_id: "city-01"
    sector_id: 12
    horizon_min: 60
  payload:
    risk_score: 0.73
    expected_damage_eur: 1.9e6
  observation_status: "PARSED"   # see below
  confidence: 0.87               # 0.0–1.0
  source:
    channel: "sensor_grid"
    semantic_path: "sce://flood/v1"
  backing_refs:
    - "raw://sensors/canal-12@2028-04-15T10:20:00Z/10:23:00Z"
  ethics_tags:
    jurisdictions: ["EU", "DE"]
    gdpr_basis: ["art_6_1_e"]
  created_at: "2028-04-15T10:23:42Z"

Key properties:

  • sem_type — matches the semantic types catalog (SIM/SIS).
  • scope — where/when/who this observation applies to.
  • payload — goal-relevant content, syntactically validated.
  • observation_status — how trustworthy / usable this unit is.
  • confidence — continuous measure, separate from status.
  • backing_refs — path back to raw data when allowed.
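
Later sketches in this document pass such units around as Python objects. A minimal, non-normative shape for that (field names mirror the YAML above; nothing here is spec-mandated) could be:

from dataclasses import dataclass, field
from typing import Any

@dataclass
class ObsUnit:
    # Mirrors the YAML sketch above; all field names are illustrative.
    sem_type: str
    scope: dict
    payload: dict
    observation_status: str = "PARSED"
    confidence: float = 1.0
    source: dict = field(default_factory=dict)
    backing_refs: list = field(default_factory=list)
    ethics_tags: dict = field(default_factory=dict)
    created_at: Any = None   # datetime in real code
    id: str = ""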

2.1 Observation status taxonomy

We’ll use a small, explicit enumeration:

PARSED         — syntactically valid, semantically well-typed, within spec.
DEGRADED       — usable, but below normal quality (e.g., partial data, lower res).
STUB           — placeholder; syntactically valid but missing key payload.
ESTIMATED      — forward-filled / model-filled; not directly observed.
MISSING        — we know we wanted this, but we don’t have it.
REDACTED       — intentionally removed (privacy, policy).
INVALID        — failed parsing / integrity; must not be used.

SI-Core policy (non-normative but recommended):

  • Jumps that materially affect a goal must not proceed if their required observation bundle has:

    • any INVALID, or
    • MISSING where no fallback is defined, or
    • REDACTED without appropriate ETH escalation.
  • DEGRADED / ESTIMATED are allowed only under explicit degradation policies (see §5).
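
As a sketch of that policy, assuming a bundle is a list of obs units and fallbacks is a per-sem_type fallback table (both names hypothetical):

def may_proceed_effectful(bundle, fallbacks, eth_escalated):
    """Apply the recommended bundle-level gate (non-normative sketch)."""
    for unit in bundle:
        if unit.observation_status == "INVALID":
            return False                      # never act on integrity failures
        if unit.observation_status == "MISSING" and unit.sem_type not in fallbacks:
            return False                      # no defined fallback -> block
        if unit.observation_status == "REDACTED" and not eth_escalated:
            return False                      # policy/consent state -> ETH first
    return True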

2.2 Compatibility note: mapping to SI-Core core status

Core SI-Core interfaces often expose a smaller status set (e.g. PARSED | PARTIAL | PENDING). This cookbook uses a more detailed taxonomy for operational clarity.

A practical mapping is:

  • PARSED → core PARSED
  • DEGRADED | ESTIMATED | STUB → core PARTIAL (or PARSED_BELOW_MINIMA as a label)
  • MISSING | REDACTED → core PENDING (decision must not assume presence)
  • INVALID → core PENDING + integrity-fail flag (must not be used)

Recommendation: keep both fields:

  • observation_status (detailed, cookbook)
  • status_core (minimal, spec-facing)
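
A pure-function sketch of the mapping (status strings as in §2.1; the integrity flag is this cookbook's convention, not a core spec name):

CORE_STATUS_MAP = {
    "PARSED":    "PARSED",
    "DEGRADED":  "PARTIAL",
    "ESTIMATED": "PARTIAL",
    "STUB":      "PARTIAL",
    "MISSING":   "PENDING",
    "REDACTED":  "PENDING",
    "INVALID":   "PENDING",   # plus integrity-fail flag, see below
}

def to_core_status(observation_status):
    core = CORE_STATUS_MAP[observation_status]
    integrity_fail = observation_status == "INVALID"
    return core, integrity_fail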

3. What does “under-observation” mean?

Under-observation is not just “sensor offline.” It is any condition where the observation layer is structurally insufficient for the goals at stake.

We can classify three main forms:

  1. Missing observations

    • No obs unit at all where the goal contract says one is required.
    • Example: no recent canal_segment_state for a sector in a high-risk flood area.
  2. Coarse or degraded observations

    • We have something, but:

      • resolution is too low,
      • sampling is too sparse,
      • semantics were over-compressed (semantic compression too aggressive).
    • Example: only city-wide average water level, no per-sector breakdown, during a storm.

  3. Biased or structurally skewed observations

    • Observations systematically leave out certain regions / groups / states.
    • Example: traffic sensors mostly in wealthy districts; ND learners less observed because they avoid the system.

In SI-Core terms, “under-observation” typically shows up via:

  • Coverage metrics (SCover_obs) dropping.
  • Integrity metrics (SInt_obs) reporting semantic violations.
  • Observation-status maps showing DEGRADED / ESTIMATED / STUB / MISSING / REDACTED where contracts require PARSED.

4. Coverage and integrity: SCover / SInt for OBS

You can think in terms of two basic metrics families:

  • Coverage — “have we observed enough of the world we care about?”
  • Integrity — “are those observations structurally consistent?”

4.1 Observation coverage (SCover_obs)

Non-normative sketch:

scover_obs:
  window: "2028-04-15T10:00:00Z/11:00:00Z"
  domain: "city.flood"
  required_scopes:
    - {city_id: "city-01", sector_id: 1}
    - {city_id: "city-01", sector_id: 2}
    - ...

  observed:
    parsed_pct:    0.86      # observation_status=PARSED
    degraded_pct:  0.06      # DEGRADED
    estimated_pct: 0.02      # ESTIMATED
    stub_pct:      0.01      # STUB
    missing_pct:   0.03      # MISSING (wanted but not present)
    redacted_pct:  0.01      # REDACTED (policy/consent; ETH-significant)
    invalid_pct:   0.01      # INVALID (must not be used)

  thresholds:
    parsed_min:   0.90
    invalid_max:  0.00
    missing_max:  0.02
    redacted_max: 0.02       # optional; depends on domain + ETH posture

  status: "warning"
  notes:
    - "Do not treat REDACTED as 'sensor outage'—it is a policy/consent state."
    - "INVALID is an integrity failure; requires parser/pipeline repair."

You can compute scover_obs at different levels:

  • per goal (e.g., flood risk vs traffic vs learning),
  • per region or population,
  • per time window.
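
A minimal computation for one level, assuming each required scope resolves to at most one latest obs unit (units_by_scope and its keying are assumptions):

from collections import Counter

def compute_scover_obs(required_scopes, units_by_scope):
    """Fraction of required scopes in each observation status (sketch)."""
    counts = Counter()
    for scope_key in required_scopes:
        # scope_key is assumed hashable, e.g. a (city_id, sector_id) tuple.
        unit = units_by_scope.get(scope_key)
        status = unit.observation_status if unit is not None else "MISSING"
        counts[status] += 1
    total = len(required_scopes) or 1
    return {status.lower() + "_pct": n / total for status, n in counts.items()}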

4.2 Observation integrity (SInt)

We also want a sense of contract violations:

  • sem_type / payload mismatch,
  • impossible values (negative water depth),
  • inconsistent scopes (two obs units claiming exclusive states for same scope/time).

Sketch:

sint_obs:
  window: "2028-04-15T10:00:00Z/11:00:00Z"
  violations:
    type_mismatch: 3            # bad payload vs sem_type schema
    impossible_values: 1        # negative water level
    duplicate_conflicts: 2      # two mutually exclusive states
  violations_per_1e4_units: 1.7
  status: "ok"  # or "warning"/"critical"

Together, SCover_obs + SInt serve as:

  • “OBS health” metrics,
  • early warning for under-observation,
  • gates for safe mode / degraded mode decisions (§5).

5. Degradation and safe-mode patterns when under-observed

When scover_obs drops or observation_status degrades, what does the system actually do? You want explicit patterns, not ad-hoc improvisation.

5.1 Intra-jump behavior: per-decision handling

For each jump type, define an observation contract:

jump_contract:
  name: "city.adjust_flood_gates"

  # Bundle-level minima (align with spec-style "declared minimums"):
  bundle_minima:
    coverage_min: 0.50
    confidence_min: 0.65

  required_obs:
    - sem_type: "city.flood_risk_state/v1"
      scope: ["city_id", "sector_id"]
      status_allowed: ["PARSED"]
      min_confidence: 0.80
      max_age_sec: 300

    - sem_type: "city.traffic_state/v1"
      scope: ["city_id", "sector_id"]
      # Define “degraded band” explicitly instead of a single label.
      status_allowed: ["PARSED", "DEGRADED", "ESTIMATED"]
      min_confidence: 0.50
      max_age_sec: 900

  effect_gating:
    # If contract not satisfied, allow reasoning but block effectful ops.
    # Escalation produces an operator-facing ticket/request; it does not execute external effects.
    on_contract_fail: "no_effectful_ops_and_escalate"
   
  fallback_policy:
    if_missing_or_invalid_flood_risk_state:
      action: "safe_mode"          # no gate moves, raise alert
    if_degraded_traffic_state:
      action: "conservative"       # assume worst plausible congestion
    if_redacted_required_obs:
      action: "eth_escalation"     # treat as policy/consent state, not outage

Pseudo-code:

def prepare_jump(request):
    obs_bundle = load_obs_bundle(request.scope, request.time)

    if not obs_bundle.satisfies_contract(jump_contract):
        return handle_under_observation(obs_bundle, jump_contract)

    return proceed_with_normal_decision(obs_bundle)
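
satisfies_contract can be sketched directly from the contract fields above; obs_bundle.find and the bundle-level coverage / confidence attributes are assumed helpers:

def satisfies_contract(obs_bundle, contract, now_ts):
    # Bundle-level minima first.
    if obs_bundle.coverage < contract["bundle_minima"]["coverage_min"]:
        return False
    if obs_bundle.confidence < contract["bundle_minima"]["confidence_min"]:
        return False
    # Then per-type requirements.
    for req in contract["required_obs"]:
        unit = obs_bundle.find(req["sem_type"])   # assumed lookup helper
        if unit is None:
            return False
        if unit.observation_status not in req["status_allowed"]:
            return False
        if unit.confidence < req["min_confidence"]:
            return False
        if (now_ts - unit.created_at).total_seconds() > req["max_age_sec"]:
            return False
    return True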

Typical fallback actions:

  1. Safe-mode

    • No effectful ops.
    • Only diagnostics / alerts / logging.
    • “We don’t know enough to move actuators.”
  2. Conservative default

    • Choose action that dominates on safety / ETH goals under uncertainty.
    • Example: hold floodgates in safer position; slow traffic lights.
  3. Sandbox-only

    • Run the jump through evaluation / simulation.
    • publish_result=false and memory_writes=disabled.
    • No effectful ops (no external effects). Output is evaluation-only.
  4. Human-in-loop escalation

    • Forward to operator dashboard with “under-observation” banner.
    • Require manual override to proceed.

5.2 Inter-jump behavior: repair loops over time

In addition to per-jump fallback, SI-Core should repair its observation regime over time.

Repair loop outline:

under_observation_event:
  detected_at: "2028-04-15T10:23:42Z"
  domain: "city.flood"
  scope: {city_id: "city-01"}
  scover_obs_delta: -0.12
  root_causes:
    - "sensor_cluster_central_offline"
    - "compression_policy_too_aggressive"
  immediate_actions:
    - "enable_safe_mode_for_flood_jumps"
    - "switch_SCE_to_high_fidelity_profile"
  long_term_actions:
    - "PLB proposal: add backup sensors"
    - "update_jump_contracts to handle degraded obs"

This is where PLB / self-modification comes in:

  • PLB can propose changes to:

    • observation contracts (what is required),
    • SCE policies (how aggressively to compress),
    • SIM indexing strategies,
    • sampling periods.
  • Those proposals run through:

    • simulation / sandbox,
    • ETH + governance review,
    • staged rollout.

6. How this fits semantic compression (art-60-007)

Observation and semantic compression are deeply intertwined:

  • SCE decides what to emit as obs units.
  • OBS decides whether that’s enough to jump.

A few alignment patterns:

6.1 OBS-aware SCE policies

In high-risk domains, SCE should be goal- and OBS-aware:

sce_policy:
  domain: "city.flood"
  risk_profile: "HIGH"
  target_semantic_ratio_Rs: 8.0
  epsilon_max: 0.05      # max goal utility loss
  obs_requirements:
    require_parsed_for:
      - "city.flood_risk_state/v1"
    allow_degraded_for:
      - "city.traffic_state/v1"

The SCE:

  • must ensure that required obs types hit their parsed_min targets,
  • can be more aggressive on low-stakes channels.
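
A minimal check that a proposed SCE emission plan respects those requirements might look like this (the projected-coverage input is an assumption):

def sce_plan_respects_obs(policy, projected_parsed_pct, parsed_min=0.90):
    """Reject compression plans that would starve required obs types (sketch)."""
    for sem_type in policy["obs_requirements"]["require_parsed_for"]:
        if projected_parsed_pct.get(sem_type, 0.0) < parsed_min:
            return False, f"{sem_type} would fall below parsed_min"
    return True, None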

6.2 Fallback to raw when OBS says “not enough”

Pattern:

If the semantic view is too uncertain or degraded, request raw data for this scope.

Implementation sketch:

def ensure_obs_for_jump(scope, time):
    obs_bundle = fetch_semantic_obs(scope, time)

    if obs_bundle.confidence < CONF_THRESHOLD:
        raw_window = fetch_raw(scope, time_window="5m")
        sem_units = sce.recompress(raw_window, context=HIGH_FIDELITY_CTX)
        write_to_sim(sem_units)
        obs_bundle = fetch_semantic_obs(scope, time)

    return obs_bundle

This creates a closed loop:

  • OBS detects low confidence / under-observation.
  • SCE temporarily raises fidelity for the affected region.
  • SIM/SIS gets higher-quality obs units.

6.3 Semantic compression vs under-observation

Under-observation can come from:

  • missing raw (sensor down), or
  • over-aggressive semantic compression (SCE dropped too much).

You want metrics to distinguish these:

under_observation_breakdown:
  window: "2028-04-15T10:00:00Z/11:00:00Z"
  missing_raw_pct: 0.03
  compression_loss_pct: 0.07
  compression_policies:
    - policy_id: "flood_low_power_v2"
      epsilon_est: 0.09
      epsilon_max: 0.05
      status: "review_required"

PLB proposals can then be targeted:

  • add sensors vs adjust SCE vs adjust goals.

7. Under-observation patterns by domain

Concrete patterns help keep this grounded.

7.1 CityOS: flood + traffic

  • Obs units:

    • city.flood_risk_state/v1 (per sector, horizon).
    • city.traffic_state/v1 (per segment, time of day).
  • Typical under-observation:

    • upstream sensor cluster offline → entire canal invisible.
    • radar feed down → rainfall nowcast missing.
    • semantic compression dropped per-segment detail; only city-level average remains.
  • Repair loops:

    • safe-mode for floodgates; hold positions.
    • emergency rule: assume worst plausible flood risk where unobserved.
    • PLB proposal: reallocate sensors; adjust SCE hierarchy; add redundancy.

7.2 Learning / developmental support

  • Obs units:

    • learner.session_event/v1 (exercise attempts, hints).
    • learner.self_report.simple/v1 (I liked this / too hard / too easy).
    • learner.affect.estimate/v1 (stress proxies; see ETH gating).
  • Typical under-observation:

    • learner rarely uses the system → very sparse session_event units.
    • some ND learners avoid self-reports → affect mostly inferred, low confidence.
    • consent withdrawn for affect monitoring → entire affect channel REDACTED.
  • Repair loops:

    • system must not over-interpret absence; treat as “unknown”, not “fine”.
    • teacher dashboards flag “low observation coverage” learner-by-learner.
    • PLB proposals: adjust cadence of short check-ins; non-intrusive self-report patterns; redesign for better accessibility.

7.3 OSS / CI pipelines

  • Obs units:

    • ci.pipeline_run/v1 (tests run, duration, outcome).
    • ci.commit_diff_summary/v1 (files, subsystems touched).
  • Typical under-observation:

    • test logs missing for certain branches.
    • CI skipped on some PRs (“trivial change” heuristics too aggressive).
    • flaky network results in partial logs, marked DEGRADED.
  • Repair loops:

    • mark coverage holes as explicit under-observation; do not over-trust success rates.
    • PLB proposals: adjust “trivial change” rules; enforce periodic full runs; better log pipelines.

8. Testing the observation layer

OBS needs its own test discipline, not just “if code compiles, we’re fine.”

8.1 Property tests for observation contracts

Sketch:

from hypothesis import given
from pytest import raises

@given(random_jump_request())
def test_no_jump_without_required_obs(request):
    # random_jump_request / synthesize_obs_bundle are project-specific helpers.
    obs_bundle = synthesize_obs_bundle(request, missing_required=True)

    with raises(UnderObservationError):
        si_core.process_jump(request, obs_bundle)

Other properties:

  • “If obs units violate schema → jump rejected, not silently coerced.”
  • “If obs bundle has INVALID units → no jump, ETH alert raised.”
  • “Degraded obs only used when contract explicitly allows it.”

8.2 Chaos for observations: sensor & SCE failure drills

  • Randomly drop sensor streams in a sandbox:

    • verify scover_obs signals under-observation,
    • verify safe-mode / fallback behavior.
  • Randomly break semantic compression:

    • produce STUB / DEGRADED units,
    • verify system doesn’t treat them as PARSED.
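
A drill of the first kind could be scripted roughly as follows; sandbox and its methods are hypothetical harness pieces:

import random

def sensor_dropout_drill(sandbox, sensor_ids, drop_fraction=0.3):
    """Drop random sensor streams and check under-observation handling (sketch)."""
    dropped = random.sample(sensor_ids, int(len(sensor_ids) * drop_fraction))
    sandbox.drop_streams(dropped)
    sandbox.advance_time(minutes=10)

    # Coverage metrics must signal the problem.
    scover = sandbox.compute_scover_obs()
    assert scover["status"] in ("warning", "critical"), "under-observation not signaled"

    # Fallback behavior must hold: no actuator moves, operator ticket raised.
    result = sandbox.attempt_jump("city.adjust_flood_gates")
    assert result.effectful_ops == []
    assert result.escalation_raised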

8.3 sirrev / golden-diff for observation code

  • Capture golden observation traces for known scenarios.

  • When you refactor SCE or observation contracts:

    • re-run scenarios; compare semantic obs streams (si-golden-diff).
    • any difference beyond tolerance must be justified (better semantics or deliberate design).

9. Repair loops as first-class citizens

Under-observation repair should be visible in SI-Core, not a hidden implementation detail.

9.1 OBS repair events in the Effect Ledger

Every significant under-observation episode should produce effect ledger entries like:

effect:
  type: "obs.repair_request/v1"
  scope: {city_id: "city-01", sector_id: 12}
  reason: "scover_obs_below_threshold"
  requested_actions:
    - "increase_sampling_rate"
    - "enable_redundant_sensor_path"
    - "lower_compression_Rs_target"
  created_by: "[OBS]"
  jump_id: "jump-042"

This gives PLB, EVAL, and governance something concrete to work with.

9.2 PLB proposals: observation-side improvements

PLB can:

  • mine the effect ledger for recurring under-observation patterns,

  • propose:

    • new sem_types (better abstractions),
    • better feature engineering for SCE,
    • updated observation contracts for jumps,
    • instrumented logging in under-observed domains.

Governance should treat “change to observation contract” as significant:

  • it effectively changes where SI-Core is allowed to act confidently.

10. Anti-patterns to avoid

A few recurring traps:

  1. “Logs are observations, right?”

    • Treating arbitrary text logs as structured obs without sem_types, scope, status, or schemas.
    • Fix: always wrap into obs units with explicit sem_type and contracts.
  2. “If we don’t see a problem, there is no problem.”

    • Absence of evidence ≠ evidence of absence.
    • Fix: treat lack of obs as state (MISSING), not as “OK.”
  3. “Just measure more things.”

    • Adding sensors / events without thinking about goal relevance or semantics.
    • Fix: start from goal surface, derive required obs surface.
  4. “Semantic compression is invisible to OBS.”

    • Letting SCE drop detail without OBS adjusting contracts or SCover thresholds.
    • Fix: tie SCE policies and OBS contracts together; monitor ε and SCover.
  5. “Degraded obs treated as full fidelity.”

    • Using DEGRADED obs as if they were PARSED.
    • Fix: enforce status-aware logic in jump contracts; ETH should monitor misuse.

11. Summary checklist

When you say “we implemented [OBS],” you ideally mean:

  • Observation units are explicit (sem_type, scope, payload, observation_status, confidence).

  • Observation status taxonomy is implemented (PARSED, DEGRADED, STUB, ESTIMATED, MISSING, REDACTED, INVALID).

  • Each jump type has a written observation contract:

    • required sem_types,
    • allowed statuses,
    • fallback / safe-mode / escalation behavior.
  • You compute SCover_obs and SInt per goal / region / window.

  • Under-observation triggers repair loops:

    • immediate: safe-mode, conservative decisions, sandbox only.
    • long-term: PLB proposals to improve observation and compression design.
  • OBS and SCE / semantic compression are linked:

    • SCE policies respect OBS requirements,
    • OBS can trigger high-fidelity recompression on demand.
  • The observation layer has its own tests:

    • property tests (“no jump without required obs”),
    • chaos drills (sensor failures, compression failures),
    • sirrev / golden-diff for observation code.
  • Under-observation episodes are logged as first-class effects, visible to PLB, ETH, and governance.

If you can tick most of these boxes, you are no longer “just reading logs.” You have a structured observation regime that knows when it doesn’t know enough, and can repair itself over time instead of quietly failing in the dark.

12. Observation quality ranking and repair prioritization

Not all under-observation is equally critical. We need a principled way to:

  • rank observations by criticality, and
  • allocate limited repair effort to the things that actually matter.

12.1 Observation criticality tiers

A simple non-normative pattern is to define tiers per semantic type:

safety_critical_obs:
  city.flood_risk_state/v1:
    criticality: "CRITICAL"
    max_missing_time_sec: 300        # 5 min
    repair_priority: 1
    fallback: "safe_mode_immediately"

  learner.stress_indicators/v1:
    criticality: "HIGH"
    max_missing_time_sec: 1800       # 30 min
    repair_priority: 2
    fallback: "conservative_load"

efficiency_obs:
  city.traffic_state/v1:
    criticality: "MEDIUM"
    max_degraded_time_sec: 3600
    repair_priority: 3
    fallback: "use_historical_average"

auxiliary_obs:
  city.parking_availability/v1:
    criticality: "LOW"
    max_missing_time_sec: 7200
    repair_priority: 4
    fallback: "skip_feature"

  • Criticality drives:

    • online fallback behavior (what to do now), and
    • offline repair priority (what to fix first).

12.2 Repair prioritization

Given a set of under-observation events, SI-Core needs to decide what to repair first.

Illustrative prioritizer:

import numpy as np
from datetime import datetime, timezone

def now():
    return datetime.now(timezone.utc)

class ObservationRepairPrioritizer:
    def __init__(self, obs_catalog, goal_index):
        # Assumed dependencies: obs_catalog provides criticality and
        # max-missing-time lookups (see §12.1); goal_index maps a sem_type
        # to the goals that consume that observation.
        self.obs_catalog = obs_catalog
        self.goal_index = goal_index

    def prioritize_repairs(self, under_obs_events):
        """Rank repair actions by impact, urgency, and feasibility."""
        scored = []
        for event in under_obs_events:
            impact = self._compute_impact(event)
            feasibility = self._compute_feasibility(event)
            urgency = self._compute_urgency(event)

            score = (
                0.5 * impact +
                0.3 * urgency +
                0.2 * feasibility
            )
            scored.append((event, score))

        return sorted(scored, key=lambda x: x[1], reverse=True)

    def _compute_impact(self, event):
        """How many goals / jumps / people are affected?"""
        affected_goals = self._get_affected_goals(event.sem_type)
        goal_weights = sum(g.weight for g in affected_goals)

        population_affected = event.scope.get("population_size", 1)
        return goal_weights * np.log1p(population_affected)

    def _compute_feasibility(self, event):
        """Can we fix this quickly and cheaply?"""
        rc = event.root_cause
        if rc == "sensor_offline":
            return 0.8        # usually fixable
        if rc == "compression_too_aggressive":
            return 0.9        # policy change
        if rc == "structural_bias":
            return 0.3        # hard, redesign
        return 0.5

    def _compute_urgency(self, event):
        """How close we are to critical missing-time thresholds."""
        criticality = self.obs_catalog.get_criticality(event.sem_type)
        time_missing = (now() - event.last_good_obs_time).total_seconds()
        max_allowed = self.obs_catalog.get_max_missing_time(event.sem_type)

        if criticality == "CRITICAL" and time_missing > max_allowed:
            return 1.0  # already beyond allowed window

        ratio = time_missing / max_allowed if max_allowed > 0 else 1.0
        return min(1.0, ratio ** 2)    # super-linear urgency

    def _get_affected_goals(self, sem_type):
        # Resolve which goals consume this observation type via the injected index.
        return self.goal_index.get(sem_type, [])

This is illustrative; real systems will tune weights and heuristics.

12.3 Repair budgets and multi-objective planning

You can treat observation repair as a budgeted optimization problem:

repair_budget:
  daily_budget_hours: 40
  allocations:
    tier_1_safety_critical: "50% of budget"
    tier_2_efficiency: "30% of budget"
    tier_3_auxiliary: "20% of budget"

  escalation_rules:
    - "If tier_1 needs > 50%, borrow from tier_3"
    - "Never reduce tier_1 below 40%"

Non-normative multi-objective sketch:

def optimize_repair_plan(under_obs_events, budget_hours):
    """
    Solve a simple multi-objective repair plan:

    - Maximize safety goal coverage
    - Maximize efficiency goal coverage
    - Minimize total repair cost

    Subject to: total effort <= budget_hours
    """
    problem = RepairOptimizationProblem()

    for event in under_obs_events:
        problem.add_repair_action(
            event=event,
            safety_impact=event.safety_impact,
            efficiency_impact=event.efficiency_impact,
            cost_hours=event.estimated_repair_hours,
        )

    solution = problem.solve(budget=budget_hours)
    return solution.selected_repairs

Example result:

repair_queue_2028_04_15:
  rank_1:
    event: "flood_risk_state missing, sector_12"
    impact: 0.95
    urgency: 0.98
    feasibility: 0.80
    action: "Deploy backup sensor immediately"

  rank_2:
    event: "learner_affect degraded, 15 students"
    impact: 0.70
    urgency: 0.65
    feasibility: 0.90
    action: "Switch to high-fidelity affect model"

  rank_3:
    event: "traffic_state coarse, district_A"
    impact: 0.45
    urgency: 0.30
    feasibility: 0.85
    action: "Adjust SCE compression policy"

13. Observation cost-benefit analysis

More observation has real costs:

  • sensors and deployment,
  • bandwidth and storage,
  • compute for SCE / analytics,
  • governance overhead.

We want a simple way to ask:

“Is this extra observation worth it, in terms of reduced risk or improved goals?”

13.1 Cost model

Non-normative example:

observation_costs:
  city.flood_risk_state/v1:
    sensor_capex_per_unit: 5000      # EUR
    sensor_opex_per_year: 500
    bandwidth_gb_per_day: 2.5
    storage_tb_per_year: 0.9
    compute_for_sce_cpu_hours_per_day: 0.5

    total_annual_cost_per_sensor: 1200

  learner.session_event/v1:
    storage_cost_per_learner_per_year: 10
    compute_for_analytics_per_learner: 5
    total_annual_cost_per_learner: 15

13.2 Benefit model (risk reduction)

We can approximate the benefit of improving observation coverage using historical incidents linked to under-observation:

import numpy as np

class ObservationBenefitEstimator:
    def estimate_benefit(self, obs_type, coverage_increase):
        """Estimate annual risk reduction from better observation (very approximate)."""
        # get_incidents_with_root_cause and years_of_history are assumed to be
        # provided by the surrounding incident-history store.
        incidents = self.get_incidents_with_root_cause("under_observation")
        relevant = [i for i in incidents if i.missing_obs_type == obs_type]

        if not relevant:
            return 0.0

        avg_incident_cost = np.mean([i.cost_eur for i in relevant])
        incident_rate_per_year = len(relevant) / self.years_of_history

        # Assume linear reduction with coverage (illustrative only)
        risk_reduction = coverage_increase * incident_rate_per_year * avg_incident_cost
        return risk_reduction

13.3 ROI calculation

def compute_observation_roi(obs_type, proposed_coverage_increase):
    """
    Compute a rough ROI for increasing observation coverage.

    ROI = (Benefit - Cost) / Cost
    """
    current_sensors = get_current_sensor_count(obs_type)
    needed_sensors = estimate_sensors_for_coverage(
        obs_type, proposed_coverage_increase
    )
    additional_sensors = max(0, needed_sensors - current_sensors)

    annual_cost = (
        additional_sensors * SENSOR_CAPEX_AMORTIZED +
        additional_sensors * SENSOR_OPEX_PER_YEAR +
        additional_sensors * STORAGE_COST_PER_YEAR +
        additional_sensors * COMPUTE_COST_PER_YEAR
    )

    risk_reduction_eur = estimate_benefit(obs_type, proposed_coverage_increase)

    if annual_cost <= 0:
        return {
            "roi": None,
            "annual_cost": annual_cost,
            "risk_reduction": risk_reduction_eur,
            "payback_period_years": None,
        }

    roi = (risk_reduction_eur - annual_cost) / annual_cost
    payback_period = (
        annual_cost / risk_reduction_eur if risk_reduction_eur > 0 else float("inf")
    )

    return {
        "roi": roi,
        "annual_cost": annual_cost,
        "risk_reduction": risk_reduction_eur,
        "payback_period_years": payback_period,
    }

Example:

observation_roi_analysis_2028:
  proposal_1:
    obs_type: "city.flood_risk_state/v1"
    current_coverage: 0.75
    proposed_coverage: 0.95
    additional_sensors: 8
    annual_cost_eur: 9600
    risk_reduction_eur: 45000
    roi: 3.69
    payback_period_years: 0.21
    recommendation: "APPROVE — high ROI"

  proposal_2:
    obs_type: "city.parking_availability/v1"
    current_coverage: 0.60
    proposed_coverage: 0.90
    additional_sensors: 20
    annual_cost_eur: 24000
    risk_reduction_eur: 5000
    roi: -0.79
    payback_period_years: 4.8
    recommendation: "DEFER — low ROI"

A simple cost-benefit dashboard can visualize proposals:

cost_benefit_dashboard:
  x_axis: "Annual cost (EUR)"
  y_axis: "Estimated risk reduction (EUR)"
  points: "Observation proposals"
  quadrants:
    high_benefit_low_cost: "Approve immediately"
    high_benefit_high_cost: "Evaluate carefully"
    low_benefit_low_cost: "Nice to have"
    low_benefit_high_cost: "Reject"

Again, all numbers are illustrative; real deployments need domain-specific calibration.


14. Multi-modal observation fusion and conflict resolution

Real systems rarely have a single “truth sensor.” Instead, they have multiple, imperfect views:

  • ground sensors + radar + model forecasts for flood,
  • several affect estimators for learners,
  • redundant power grid telemetry, etc.

We need clear patterns for:

  • fusing multiple observations into one semantic unit, and
  • handling conflicts between sources.

14.1 Fusion patterns

Pattern 1: Confidence-weighted averaging

STATUS_SEVERITY = {
    "INVALID": 0,
    "MISSING": 1,
    "REDACTED": 2,
    "STUB": 3,
    "ESTIMATED": 4,
    "DEGRADED": 5,
    "PARSED": 6,
}

def worst_status(statuses):
    # Lower severity value means worse status, so min() picks the worst.
    return min(statuses, key=lambda s: STATUS_SEVERITY.get(s, 0))

class ConfidenceWeightedFusion:
    def fuse(self, obs_units):
        """Combine observations weighted by confidence (illustrative)."""
        if not obs_units:
            raise ValueError("No observations to fuse")

        # If any INVALID is present, fused unit must not be promoted.
        input_status = worst_status([o.observation_status for o in obs_units])
        if input_status == "INVALID":
            return ObsUnit(
                sem_type=obs_units[0].sem_type,
                scope=obs_units[0].scope,
                payload={},
                confidence=0.0,
                observation_status="INVALID",
                source={"fusion": [o.source for o in obs_units], "kind": "fused"},
            )

        total_weight = sum(o.confidence for o in obs_units) or 1.0
        fused_payload = {}
        keys = obs_units[0].payload.keys()
        for key in keys:
            weighted_sum = sum(o.payload[key] * o.confidence for o in obs_units)
            fused_payload[key] = weighted_sum / total_weight

        fused_confidence = max(o.confidence for o in obs_units) * 0.9

        # Promotion rule: only call it PARSED if all inputs are PARSED and confidence is healthy.
        promoted = (
            all(o.observation_status == "PARSED" for o in obs_units)
            and fused_confidence >= 0.65
        )
        fused_status = "PARSED" if promoted else "DEGRADED"

        return ObsUnit(
            sem_type=obs_units[0].sem_type,
            scope=obs_units[0].scope,
            payload=fused_payload,
            confidence=fused_confidence,
            observation_status=fused_status,
            source={"fusion": [o.source for o in obs_units], "kind": "fused"},
        )

Pattern 2: Kalman-style fusion

Toy 1D example:

class KalmanFusion:
    def fuse(self, prior_obs, new_obs):
        """
        Bayesian-style update:

        posterior = (prior_precision * prior + new_precision * new)
                    / (prior_precision + new_precision)
        """
        prior_var = max(1e-6, 1.0 - prior_obs.confidence)
        new_var = max(1e-6, 1.0 - new_obs.confidence)

        prior_precision = 1.0 / prior_var
        new_precision = 1.0 / new_var

        posterior_precision = prior_precision + new_precision
        posterior_mean = (
            prior_precision * prior_obs.payload["value"] +
            new_precision * new_obs.payload["value"]
        ) / posterior_precision

        posterior_var = 1.0 / posterior_precision
        posterior_confidence = 1.0 - min(0.99, posterior_var)

        return ObsUnit(
            sem_type=prior_obs.sem_type,
            scope=prior_obs.scope,
            payload={"value": posterior_mean},
            confidence=posterior_confidence,
            observation_status="PARSED",
            source={"fusion": [prior_obs.source, new_obs.source]},
        )

Pattern 3: Ensemble with outlier detection

import numpy as np

class EnsembleFusionWithOutlierDetection:
    def fuse(self, obs_units):
        """Remove outliers before fusion."""
        values = np.array([o.payload["value"] for o in obs_units])
        median = np.median(values)
        mad = np.median(np.abs(values - median)) or 1e-6

        outliers = [
            i for i, v in enumerate(values)
            if np.abs(v - median) > 3 * mad
        ]
        filtered_obs = [o for i, o in enumerate(obs_units) if i not in outliers]

        if not filtered_obs:
            # Everything looks inconsistent; mark as degraded
            return ObsUnit(
                sem_type=obs_units[0].sem_type,
                scope=obs_units[0].scope,
                payload={"value": float(median)},
                confidence=0.3,
                observation_status="DEGRADED",
                source={"fusion": [o.source for o in obs_units]},
            )

        return ConfidenceWeightedFusion().fuse(filtered_obs)

14.2 Conflict resolution strategies

Strategy 1: Hierarchical source priority

source_priority:
  city.flood_risk_state/v1:
    priority_order:
      1: "ground_sensors"   # most trusted
      2: "radar_nowcast"
      3: "physics_model"
      4: "ml_model"

    conflict_resolution: "use_highest_priority_available"
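
A sketch of use_highest_priority_available, assuming the priority table is loaded as an ordered list of channel names:

def resolve_by_priority(obs_units, priority_order):
    """Pick the unit from the most trusted available source (sketch)."""
    rank = {channel: i for i, channel in enumerate(priority_order)}
    usable = [
        o for o in obs_units
        if o.observation_status not in ("INVALID", "MISSING")
    ]
    if not usable:
        return None
    # Unknown channels sort last; lowest rank wins.
    return min(usable, key=lambda o: rank.get(o.source.get("channel"), len(rank)))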

Strategy 2: Temporal precedence

temporal_rules:
  fresh_data_window_sec: 300

  resolution:
    if_multiple_within_window: "fuse_with_confidence_weighting"
    if_one_fresh_one_stale:   "use_fresh"
    if_all_stale:             "mark_DEGRADED"

Strategy 3: Domain-specific logic

def resolve_flood_risk_conflict(obs_units):
    """Domain-specific: be conservative on high risk."""
    max_risk = max(o.payload["risk_score"] for o in obs_units)

    if max_risk > 0.7:
        # Conservative path: take highest risk
        return max(obs_units, key=lambda o: o.payload["risk_score"])

    # Otherwise, fuse normally
    return ConfidenceWeightedFusion().fuse(obs_units)

Example:

fusion_scenario_flood:
  scope: {city_id: "city-01", sector_id: 12}
  time: "2028-04-15T10:00:00Z"

  input_observations:
    - source: "ground_sensor_A"
      payload: {risk_score: 0.65}
      confidence: 0.90

    - source: "ground_sensor_B"
      payload: {risk_score: 0.68}
      confidence: 0.85

    - source: "radar_nowcast"
      payload: {risk_score: 0.72}
      confidence: 0.75

    - source: "physics_model"
      payload: {risk_score: 0.80}
      confidence: 0.60

  fusion_output:
    method: "ensemble_with_outlier_detection"
    removed_outliers: ["physics_model"]
    fused_payload: {risk_score: 0.67}
    fused_confidence: 0.81
    observation_status: "PARSED"

Fusion itself should be observable:

fusion_health_metrics:
  conflicts_detected_per_hour: 3.2
  outliers_removed_per_hour: 1.1
  avg_confidence_before_fusion: 0.78
  avg_confidence_after_fusion: 0.82

  conflict_patterns:
    - pair: ["ground_sensor_A", "radar_nowcast"]
      conflict_rate: 0.15
      typical_difference: 0.08
      action: "Calibration check recommended"

15. Temporal patterns and predictive under-observation

Under-observation is often predictable:

  • sensors degrade gradually,
  • coverage is poor on weekends or nights,
  • outages correlate with weather or load.

Instead of only reacting, SI-Core can predict where under-observation will emerge and schedule repairs proactively.

15.1 Gradual degradation

Illustrative detector:

import numpy as np

class GradualDegradationDetector:
    def detect(self, obs_history):
        """Detect slow decline in observation confidence."""
        if len(obs_history) < 5:
            return None

        # Convert times to numeric (seconds since epoch)
        times = np.array([o.created_at.timestamp() for o in obs_history])
        confidences = np.array([o.confidence for o in obs_history])

        slope, intercept = np.polyfit(times, confidences, 1)

        conf_threshold = 0.80
        if slope >= 0:
            return None

        # Predict when confidence will cross threshold
        # slope * t + intercept = conf_threshold → t*
        t_cross = (conf_threshold - intercept) / slope
        now_ts = times[-1]
        time_to_threshold = t_cross - now_ts

        if 0 < time_to_threshold < 7 * 24 * 3600:  # within a week
            return {
                "warning": "gradual_degradation_detected",
                "time_to_critical_sec": time_to_threshold,
                "recommended_action": "proactive_sensor_maintenance",
            }
        return None

15.2 Periodic gaps

import collections

class PeriodicGapDetector:
    def detect(self, obs_history, window_days=30):
        """Detect recurring coverage gaps by hour of day (toy example)."""
        # window_days: the caller is assumed to pre-filter obs_history to this window.
        if not obs_history:
            return None

        by_hour = collections.defaultdict(list)
        for o in obs_history:
            hour = o.created_at.hour
            by_hour[hour].append(o.observation_status)

        problematic_hours = []
        for hour, statuses in by_hour.items():
            missing_rate = (
                sum(1 for s in statuses if s in ["MISSING", "DEGRADED"])
                / len(statuses)
            )
            if missing_rate > 0.2:
                problematic_hours.append({"hour": hour, "missing_rate": missing_rate})

        if problematic_hours:
            return {
                "warning": "periodic_gaps_detected",
                "hours": problematic_hours,
                "recommended_action": "adjust_sampling_schedule_or_expectations",
            }
        return None

15.3 Correlation with external events

class ExternalEventCorrelation:
    def detect(self, obs_history, external_events):
        """Correlate under-observation with external factors (e.g., weather, load)."""
        results = []

        for event_type in {"heavy_rain", "high_load", "power_outage"}:
            event_times = [e.time for e in external_events if e.type == event_type]
            if not event_times:
                continue

            under_obs_during = 0
            under_obs_baseline = 0

            for o in obs_history:
                is_under_obs = o.observation_status in ["MISSING", "DEGRADED"]
                near_event = any(
                    abs((o.created_at - et).total_seconds()) < 3600
                    for et in event_times
                )

                if is_under_obs and near_event:
                    under_obs_during += 1
                elif is_under_obs:
                    under_obs_baseline += 1

            if under_obs_during == 0:
                continue

            correlation = under_obs_during / (under_obs_baseline + 1)
            if correlation > 2.0:
                results.append({
                    "event_type": event_type,
                    "correlation_factor": correlation,
                    "recommended_action": f"improve_resilience_to_{event_type}",
                })

        return results or None

15.4 Predictive alerts and proactive scheduling

predictive_alerts_2028_04_15:
  alert_1:
    type: "gradual_degradation"
    obs_type: "city.flood_risk_state/v1"
    sector: 12
    current_confidence: 0.82
    predicted_confidence_7d: 0.74
    time_to_threshold: "4.2 days"
    recommended_action: "Schedule sensor maintenance within 3 days"

  alert_2:
    type: "periodic_gap"
    obs_type: "learner.session_event/v1"
    pattern: "Low coverage on weekends"
    recommended_action: "Add weekend-friendly observation patterns"

A simple scheduler can then allocate maintenance windows before under-observation becomes critical. (Same repair budget machinery as §12 can be reused.)
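
A toy version of such a scheduler, assigning the most urgent predicted gaps to the earliest free maintenance slots (maintenance_slots and the event fields are assumptions):

def schedule_proactive_maintenance(predicted_events, maintenance_slots):
    """Match predicted under-observation to maintenance windows (toy sketch)."""
    # Most urgent first: least time remaining before the threshold is crossed.
    ordered = sorted(predicted_events, key=lambda e: e["time_to_critical_sec"])
    plan = []
    for event, slot in zip(ordered, sorted(maintenance_slots)):
        plan.append({
            "slot": slot,
            "obs_type": event["obs_type"],
            "action": event["recommended_action"],
        })
    return plan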


16. Observation governance and audit trails

Finally, changing what you observe is a first-class governance concern. Adding or removing observations changes what SI-Core can see, and therefore what it can justifiably do.

16.1 Proposal and approval process

Example proposal for adding an observation:

observation_proposal:
  id: "OBS-PROP-2028-042"
  type: "add_observation"
  proposed_by: "city_ops_team"
  date: "2028-04-15"

  details:
    sem_type: "city.air_quality/v1"
    scope: "city-wide, per-district"
    rationale: |
      "Recent air quality incidents suggest we're under-observed.
       Adding 15 sensors across districts."

    cost_analysis:
      capex: 75000
      opex_annual: 15000
      roi_analysis: "See attachment OBS-PROP-2028-042-ROI"

    impact_analysis:
      goals_enabled: ["city.health_risk_minimization"]
      jumps_affected: ["traffic_control", "industrial_permits"]
      scover_obs_improvement: "+0.25"

  approval_workflow:
    - step: "Technical review"
      reviewers: ["SI-Core team", "Domain experts"]
      status: "approved"

    - step: "Cost approval"
      reviewers: ["Finance", "City council"]
      status: "approved"

    - step: "Ethics review"
      reviewers: ["Ethics board"]
      status: "approved"

    - step: "Deployment"
      timeline: "2 months"
      status: "in_progress"

Example for deprecation:

observation_deprecation:
  id: "OBS-DEPR-2028-013"
  sem_type: "city.parking_meters/v1"
  rationale: "Low ROI, better alternatives available"

  impact_analysis:
    jumps_affected: ["parking_pricing"]
    alternative_obs: "city.parking_availability_app/v1"
    scover_obs_change: "-0.05 (acceptable)"

  deprecation_timeline:
    announce: "2028-05-01"
    grace_period: "6 months"
    final_removal: "2028-11-01"

  migration_plan:
    - "Update jump contracts to use alternative obs"
    - "Run parallel for 3 months"
    - "Validate no regressions in GCS / safety metrics"

16.2 Observation inventory and audits

A living observation catalog:

observation_catalog:
  city.flood_risk_state/v1:
    status: "active"
    added: "2026-03-15"
    owner: "flood_management_team"
    criticality: "CRITICAL"
    current_coverage: 0.92
    target_coverage: 0.95
    cost_per_year: 50000
    goals_served: ["flood_risk_min", "hospital_access"]
    jumps_using: ["adjust_flood_gates", "issue_flood_alerts"]

    quality_metrics:
      avg_confidence: 0.87
      integrity_violations_per_month: 0.2
      last_audit: "2028-03-01"
      audit_status: "passed"

Audit trail for observation changes:

observation_audit_log:
  - timestamp: "2028-04-15T10:00:00Z"
    event: "observation_added"
    obs_type: "city.air_quality/v1"
    proposal_id: "OBS-PROP-2028-042"
    approvers: ["tech_lead", "finance", "ethics"]
    justification: "Link to proposal document"

  - timestamp: "2028-04-20T14:30:00Z"
    event: "observation_contract_updated"
    jump_name: "city.traffic_control"
    change: "Added air_quality as optional obs"
    approved_by: "jump_owner"

  - timestamp: "2028-05-01T09:00:00Z"
    event: "observation_deprecated_announced"
    obs_type: "city.parking_meters/v1"
    deprecation_id: "OBS-DEPR-2028-013"

16.3 Periodic observation audits

Non-normative example auditor:

class ObservationAuditor:
    def quarterly_audit(self):
        """Review all observation types for continued relevance and quality."""
        findings = []

        for obs_type in self.catalog.all_obs_types():
            jumps_using = self.find_jumps_using(obs_type)
            if len(jumps_using) == 0:
                findings.append({
                    "obs_type": obs_type,
                    "issue": "unused_observation",
                    "recommendation": "Consider deprecation",
                })

            coverage = self.get_coverage(obs_type)
            target = self.catalog.get_target_coverage(obs_type)
            if coverage < target - 0.10:
                findings.append({
                    "obs_type": obs_type,
                    "issue": "below_target_coverage",
                    "recommendation": "Invest in more sensors or adjust target",
                })

            roi = self.compute_roi(obs_type)
            if roi is not None and roi < 0.5:
                findings.append({
                    "obs_type": obs_type,
                    "issue": "low_roi",
                    "recommendation": "Review cost-effectiveness",
                })

        return AuditReport(findings=findings, date=now())

Observation governance then links back to [ETH], [EVAL], and [MEM]:

  • ETH: ensures new observations don’t introduce unjustified surveillance or bias.
  • EVAL: verifies that added/removed observations actually help or at least don’t harm goals.
  • MEM: keeps the full audit trail of what we decided to observe, when, and why.
