lifeops / ARCHITECTURE_REVIEW.md
avlukas's picture
Add EpisodeTrace import and enhance travel issue checks
5bc5483

LifeOps Architecture & Logic Review

Executive Summary

The LifeOps environment is well-structured and mostly correct. Several bugs, edge cases, and design inconsistencies were identified. The most critical issues are: (1) baseline agent creates double-booked focus blocks, (2) dead reward code for conflict resolution, (3) no travel feasibility check for the first event of the day.


1. Environment Logic

1.1 reset()

Status: Generally correct.

  • Loads scenario, persona, calendar, tasks, pending requests, travel times.
  • max_steps = max(5, len(pending) + 5) is reasonable.
  • Edge case: reset("invalid_id") raises KeyError β€” consider catching and raising a clearer error.

1.2 step()

Status: Correct flow.

  • Validates action against valid_actions() via _action_key.
  • Applies request or focus action, increments step count, computes reward and done.

Potential issue: _action_key does not include all fields that distinguish actions. For block_focus_time, two actions with same (new_start_min, duration_min) but different new_end_min (one None, one computed) could theoretically collide β€” in practice new_end_min is None for focus blocks, so this is fine.

1.3 State Transitions

Status: Correct.

  • Request actions: accept/reschedule add to calendar; reject/propose do not; all pop the current request.
  • Focus actions: add focus event, progress highest-priority unfinished task.

1.4 Termination Conditions (_is_done)

Status: Correct.

  • Done when: step_count >= max_steps OR (no pending requests AND all tasks complete).
  • Edge case: If valid_actions() returns empty (e.g., hypothetical scenario with no request and no unfinished tasks but _is_done False), the demo runner would crash on valid[0]. Current scenarios do not hit this.

2. Reward Calculation

2.1 Correctness

Overlap penalty: Correct. -5.0 * len(next_overlaps).

Travel penalty: Correct. -4.0 * len(issues) * travel_aversion_weight.

Rejected important penalty: Correct. -4.0 when rejecting importance β‰₯ 3.

Preference penalty: Correct. Applied to accept/reschedule/propose for meeting-like events.

Focus reward: Correct. (1.0 + 0.02 * progress) * focus_time_weight.

Wasted focus penalty: -0.5 when block_focus_time with progress == 0. In practice this is rare because focus blocks are only generated when has_unfinished is true, and progress is always made when there is an unfinished task. Defensive.

2.2 Dead Code: conflict_resolved_bonus

Bug: The reward includes:

if prev_overlaps and len(next_overlaps) < len(prev_overlaps):
    reward += 3.0
    breakdown["conflict_resolved_bonus"] = 3.0

The calendar is append-only β€” events are never removed. Therefore next_overlaps can never have fewer pairs than prev_overlaps. This branch is never executed.

Fix: Remove this block, or redesign if you later add event-removal/cancellation.

2.3 Missing Penalties / Rewards

  • No reward for accepting important requests β€” only penalty for rejecting. Consider a small positive reward for accepting high-importance requests.
  • No explicit penalty for propose_new_time that suggests an infeasible time β€” the preference penalty applies, but overlap/travel of the proposed time are not penalized (since it is not added to the calendar). This may be intentional (proposal quality is soft).

2.4 Unintended Reward Loops

  • None identified. The reward structure is straightforward.

3. State Consistency

3.1 Calendar Updates

Status: Correct. Events are appended; no removal or modification.

3.2 Task Tracking

Status: Correct. remaining_minutes is decremented in-place for the highest-priority unfinished task during focus blocks.

3.3 Message/Request Handling

Status: Correct. FIFO via pending_requests.pop(0). current_request is always the first pending.


4. Travel Feasibility

4.1 Detection of Impossible Travel

Status: Correct for consecutive events. travel_issues() sorts by start time and checks each pair.

4.2 Missing: Travel to First Event

Bug: travel_issues() only checks prev β†’ next for consecutive events. It never checks whether the user can reach the first event of the day. The model assumes the user is already at the location of the first event at its start time.

Example: First event at 8:00 at Office, persona at Home, travel 25 min. User would need to leave by 7:35. This is not validated.

Fix: Add an optional start_location (e.g., Home) to the persona/state and check travel from that location to the first event.

4.3 Overlap Logic

Status: Correct. _overlap(a_start, a_end, b_start, b_end) uses a_start < b_end and b_start < a_end. Touching events (a_end == b_start) do not overlap.

4.4 Rescheduling Edge Cases

  • Reschedule/propose options use fixed deltas (-30, 30, 60). At day boundaries, new_start can clamp to the same value for different deltas, producing duplicate actions. This is harmless (same key).
  • No validation that rescheduled time is free β€” overlaps are penalized by reward. Acceptable for RL.

5. Action Handling

5.1 All Actions Update State Correctly

Action Calendar Pending Requests Tasks
accept_event +1 event pop β€”
reject_event β€” pop β€”
reschedule_event +1 event pop β€”
propose_new_time β€” pop β€”
block_focus_time +1 focus event β€” progress

Status: Correct.

5.2 Invalid Action Handling

Status: Correct. step() raises ValueError if action key is not in valid_keys.

5.3 Action Constraints

Issue: generate_valid_actions() does not filter out:

  • Focus blocks that overlap with existing calendar events.
  • Reschedule/propose times that would overlap or cause travel issues.

This is acceptable for RL (agent learns from penalties) but means the baseline can choose β€œvalid” actions that create overlaps.


6. Demo Runner Correctness

Note: There is no separate play_episode.py; the demo lives in env/lifeops_env.py under if __name__ == "__main__".

6.1 Reflects Real Environment Behavior

Status: Yes. Uses env.reset(), env.observation(), env.valid_actions(), env.step().

6.2 Trajectories Exercise Key Logic

Status: Partially. Tests cover accept, reject, propose, focus, overlap penalty, travel penalty. However:

Bug: The baseline agent can create double-booked focus blocks. Observed in a run:

  • After handling the request, it scheduled focus blocks at 9:00, 11:00, 14:00, then again at 9:00 and again at 11:00, causing overlaps.

Cause: _choose_simple_action scores focus blocks by simulating each option against the current calendar. Once a slot is used (e.g., 9:00), the next time it considers 9:00 vs 11:00 vs 14:00 vs 16:00, they may all overlap with existing focus blocks. When scores tie, it picks the first (9:00). So it reuses occupied slots.

Fix: Filter focus blocks to exclude slots that overlap with the current calendar, or improve the baseline to prefer slots with zero overlaps (and handle ties by picking a free slot).


7. Baseline Agent Logic

7.1 Avoids Double Booking?

No. As above, the baseline can schedule overlapping focus blocks. For request actions it minimizes overlaps when choosing accept/reschedule/propose, so it tends to avoid double-booking requests. But for focus blocks it does not.

7.2 Respects Travel Constraints?

Yes. For request actions, it scores by (overlaps, travel_issues) and picks the action with the fewest. For focus blocks, it also minimizes travel issues. So it prefers feasible travel.

7.3 Prioritizes High-Priority Obligations?

Partially. It strongly prefers scheduling over rejecting (reject scores (999, 999)), so it rarely rejects important requests. But it does not explicitly prioritize by importance. It only minimizes overlaps and travel. For optional low-importance meetings it may still accept if that minimizes violations, instead of rejecting to free time for high-priority tasks.


8. Summary of Issues

Severity Issue Location Fix Status
High Baseline creates overlapping focus blocks lifeops_env.py _choose_simple_action Filter or re-score focus blocks to avoid already-used slots Fixed – prefer non-overlapping slots; fall back to least-bad when all overlap
Medium conflict_resolved_bonus never triggers reward.py Remove dead code or add event removal to enable it Fixed – removed dead code
Medium No travel check to first event of day reward.py travel_issues Add optional check from start_location to first event Fixed – added start_location param, uses home_location
Low reset("bad_id") raises raw KeyError lifeops_env.py Catch and re-raise with clearer message Not applied (minor)
Low Duplicate reschedule actions at boundaries actions.py Optional: deduplicate by (new_start, new_end) Not applied (harmless)
Low Baseline never rejects (scores reject as 999,999) lifeops_env.py Consider allowing reject when all scheduling options are bad Fixed – reject now scores (0, 0) so it wins when scheduling causes issues

9. Suggested Fixes (Minimal Changes)

Fix 1: Baseline focus block selection

In _choose_simple_action, when scoring focus blocks, prefer actions that result in zero overlaps. If all have overlaps, pick the one with the smallest overlap count, and among those prefer the one that overlaps with the fewest events (e.g., break ties by total overlap duration or event count).

A simpler approach: filter focus_actions to exclude those whose (new_start_min, duration_min) would overlap with any existing calendar event. Use detect_overlaps with a simulated calendar including the candidate focus block.

Fix 2: Remove dead conflict_resolved_bonus

Delete or comment out lines 99–101 in reward.py until the environment supports event removal.

Fix 3: Travel to first event (optional)

Add a parameter start_location (default None) to the scenario or persona. If set, prepend a synthetic β€œstart” event at start_location with end_min=0 before the first real event, so travel_issues checks the first leg.


10. Edge Cases Not Handled

  1. Empty valid_actions: If both current_request is None and has_unfinished is False, valid_actions is empty. The demo would crash on valid[0]. Current scenarios avoid this.
  2. Event at midnight (0) or end of day (1440): Logic uses <= 1440; should be verified for boundary events.
  3. Zero-duration events: _overlap would treat (100, 100) and (100, 100) as overlapping (100 < 100 is false, so no overlap). Zero-duration events are not generated.
  4. Multiple events at same start/end: Sorting by (start_min, end_min) is deterministic; travel_issues order is stable.
  5. Unknown locations in travel_times: Default 30 minutes is conservative; no explicit handling for missing keys.