Spaces:
Running
Running
| # Validation Map β `replicalab/utils/validation.py` | |
| > Deterministic protocol validation against scenario constraints. | |
| > Pure functions β no LLM calls, no side effects. | |
| > | |
| > **Tasks implemented:** MOD 05 | |
| ## Public API | |
| ### `validate_protocol(protocol: Protocol, scenario: NormalizedScenarioPack) -> ValidationResult` | |
| Main entry point. Never raises β always returns a `ValidationResult`. | |
| **Checks run (in order):** | |
| 1. `_check_obvious_impossibilities` β sample_size < 1, no controls, duration < 1 | |
| 2. `_check_duration_vs_time_limit` β protocol days vs lab time_limit_days | |
| 3. `_check_equipment_vocabulary` β items vs available/booked/substitutable | |
| 4. `_check_reagent_vocabulary` β items vs in-stock/out-of-stock/substitutable | |
| 5. `_check_required_element_coverage` β protocol text vs hidden_reference_spec.required_elements | |
| **Result:** `valid=True` only if zero ERROR-level issues. | |
| ## Data Classes | |
| ### `IssueSeverity(str, Enum)` | |
| | Value | Meaning | | |
| |-------|---------| | |
| | `error` | Hard failure β protocol cannot proceed | | |
| | `warning` | Advisory β protocol is suboptimal but possible | | |
| ### `ValidationIssue(BaseModel)` β `extra="forbid"` | |
| | Field | Type | Example | | |
| |-------|------|---------| | |
| | `severity` | `IssueSeverity` | `ERROR` | | |
| | `category` | `str` | `"equipment"`, `"duration"`, `"sample_size"` | | |
| | `message` | `str` | `"Equipment 'X' is booked and has no substitution."` | | |
| ### `ValidationResult(BaseModel)` β `extra="forbid"` | |
| | Field | Type | | |
| |-------|------| | |
| | `valid` | `bool` | | |
| | `issues` | `list[ValidationIssue]` | | |
| **Properties:** | |
| - `errors` β `list[ValidationIssue]` (severity=ERROR only) | |
| - `warnings` β `list[ValidationIssue]` (severity=WARNING only) | |
| ## Check Details | |
| ### `_check_obvious_impossibilities` | |
| | Condition | Severity | Category | | |
| |-----------|----------|----------| | |
| | `sample_size < 1` | ERROR | `sample_size` | | |
| | `controls` empty | WARNING | `controls` | | |
| | `duration_days < 1` | ERROR | `duration` | | |
| ### `_check_duration_vs_time_limit` | |
| | Condition | Severity | Category | | |
| |-----------|----------|----------| | |
| | `duration_days > time_limit_days` | ERROR | `duration` | | |
| ### `_check_equipment_vocabulary` | |
| | Condition | Severity | Category | | |
| |-----------|----------|----------| | |
| | Item available | β (pass) | β | | |
| | Item booked + has substitution | WARNING | `equipment` | | |
| | Item booked + no substitution | ERROR | `equipment` | | |
| | Item unknown (not in inventory) | WARNING | `equipment` | | |
| ### `_check_reagent_vocabulary` | |
| | Condition | Severity | Category | | |
| |-----------|----------|----------| | |
| | Item in stock | β (pass) | β | | |
| | Item out of stock + has substitution | WARNING | `reagent` | | |
| | Item out of stock + no substitution | ERROR | `reagent` | | |
| | Item unknown (not in inventory) | WARNING | `reagent` | | |
| ### `_check_required_element_coverage` | |
| Checks each `hidden_reference_spec.required_elements` against protocol text fields using token matching. | |
| **Protocol text searched:** technique, rationale, controls, equipment, reagents (joined, lowercased). | |
| **Token extraction:** `_element_tokens(element)` splits on spaces, keeps tokens with 3+ chars. | |
| **Match:** any token from element found in protocol text β covered. | |
| | Condition | Severity | Category | | |
| |-----------|----------|----------| | |
| | Element not addressed | WARNING | `required_element` | | |
| ## Internal Helpers | |
| | Function | Purpose | | |
| |----------|---------| | |
| | `_normalize(label)` | Lowercase, strip, collapse whitespace | | |
| | `_element_tokens(element)` | Split element string into searchable tokens (3+ chars) | | |
| | `_substitution_alternatives(scenario)` | Set of normalized original items from `allowed_substitutions` | | |
| ## Who Consumes This | |
| - **`lab_manager_policy.py`** β `check_feasibility()` calls `validate_protocol()` and wraps result in `protocol` DimensionCheck | |
| - **`scoring/`** (future) β JDG 01 rigor score will reuse `_element_tokens` for required element matching | |