Search-UI: Aggressive Refactor Plan
This document tracks the autonomous refactor begun on claude/refactor-aggressive.
It is the canonical source for what the refactor is doing, the order of work,
and the gates between phases.
Goal
Cut LOC by ~50%, separate ingestion from serving, replace hand-rolled infra with proven libraries, modernize the frontend. App must stay functional at every phase boundary.
North-star metrics
| Metric | Before | Target |
|---|---|---|
| Backend Python LOC | ~35,850 | ~10,000 |
| Frontend JS/JSX LOC | ~11,820 | ~5,000 |
| *_routes.py files | 13 | ≤ 5 |
| SQLite DB files | 5 | 1 |
| Web-process cold-start RSS | 450 MB → 76 MB (Session 4) | < 500 MB ✓ (cold) |
| Tests collecting cleanly | 85/443 → 443/443 after Phase 0 | 443/443 |
| Tests passing (sandbox, no model network) | 440/443 | 443/443 in user env |
Phases & status
| Phase | Title | Status |
|---|---|---|
| 0 | Safety net (conftest, golden tests, smoke script) | DONE |
| 1 | Subtraction: dead code, redundant routers, rerankers, dedup | DONE |
| 2 | Single database + merge tool + Alembic | MERGE TOOL VALIDATED on real DBs (Session 3): found+fixed a vec0 INTEGER-PRIMARY-KEY-alias copy bug; merged 5→1 (3.2 GB), all 15 key row counts exact, golden diff identical except one benign tie-swap of two equal-score (0.71191) semantic hits. Flipped to the single DB (Session 3). Alembic SCOPED FOUNDATION added (opt-in); full cutover deliberately deferred (see "Phase 2 Alembic foundation" below). |
| 3a | ML out of web request path: CLI extraction | Cutover step 1 (lazy torch imports) DONE & runtime-verified (Session 4): cold-start web RSS 450→76 MB, torch absent at boot, models load lazily on first query, results byte-identical (golden + eval). CLI scaffold from Session 3 stands. Steps 3-4 (enqueue ingestion / relocate modules) still open. |
| 3b | Query-time embedding sidecar (optional but recommended) | DROPPED for current deployments (Session 4 decision). On the single-container HF Docker Space, web+sidecar share one container's RAM, so a sidecar does NOT lower container memory; its only real benefit is a true multi-machine split (web on HF, model on a GPU box), which isn't the deployment and would add per-query network latency + an always-on paid service. Revisit only if/when web and ML run on separate machines. |
| 3c | Video lifecycle: prune source MP4s after ingestion | DONE |
| 4 | Route consolidation: 13 → 6 coherent routers | DONE at 6 (Session 3 decision). 13→6 via cloud PRs #34-39 (search←content+analysis, face←speaker, catalog←status, workflow←AD+settings+sync). Stopped at 6 deliberately: the last two merges would chase a counter at the cost of multi-concern files; see "Phase 4 progress" below. |
| 5 | Frontend: TypeScript + React Router + TanStack Query + Tailwind | DEFERRED: needs runtime verification |
| 6 | Face: DeepFace → insightface (ONNX) | DEFERRED: needs runtime verification |
| 7 | Desktop story (drop Electron or → Tauri) | DEFERRED: needs decision |
| 8 | Smart-search reckoning (instrument & decide) | DEFERRED: needs production data |
Phase 0: Safety net (1-2 days)
Build the regression detector before changing anything.
Deliverables:
- conftest.py at repo root that fixes the two-import-conventions test mess
- backend/__init__.py so tests can import either flat or namespaced
- tests/golden/ query fixture set (run against user's real DB to populate)
- scripts/snapshot_golden.py: capture top-N results per query
- scripts/diff_golden.py: re-run and diff
- scripts/smoke_test.sh: boot backend, hit 20 critical endpoints
- All collection errors fixed (target: 131/131 tests collect, even if some skip)
- Tag pre-refactor state for rollback
Gate: all tests collect cleanly. Golden snapshot scripts run end-to-end against a dummy DB. Smoke script returns exit 0.
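The snapshot-then-diff idea behind the golden tests can be sketched as follows. This is a minimal illustration, not the contents of scripts/diff_golden.py: the snapshot shape (query mapped to an ordered list of result keys) and the name diff_snapshots are assumptions made for the example.

```python
from typing import Dict, List

# Assumed snapshot shape: {query: [result_key, ...]} in ranked order.
Snapshot = Dict[str, List[str]]


def diff_snapshots(baseline: Snapshot, current: Snapshot, top_n: int = 10) -> Dict[str, dict]:
    """Return per-query differences between two top-N snapshots."""
    report: Dict[str, dict] = {}
    for query in sorted(set(baseline) | set(current)):
        old = baseline.get(query, [])[:top_n]
        new = current.get(query, [])[:top_n]
        if old == new:
            continue  # identical ranking, nothing to report
        report[query] = {
            "missing": [k for k in old if k not in new],
            "added": [k for k in new if k not in old],
            # Same members in a different order: a pure ranking change.
            "reordered": set(old) == set(new),
        }
    return report


# Hypothetical data: one query keeps its ranking, one swaps two hits.
baseline = {"hope": ["a", "b", "c"], "faith": ["x", "y"]}
current = {"hope": ["a", "c", "b"], "faith": ["x", "y"]}
report = diff_snapshots(baseline, current)
```

Separating "reordered" from "missing"/"added" lets a reviewer tell a benign order change apart from a result actually dropping out of the top N.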
Phase 1: Subtraction (2-3 days)
Pure deletion. No new deps, no architecture change.
Delete outright:
- frontend/src/designMockups/ (13 files, ~485 LOC): DONE
- MockupReview.jsx + ?view=mockups routing: DONE
- main.py: /api/hello, /api/data placeholders: DONE
- llm_router.py + llm_client.py (orphaned, ~896 LOC): DONE
- MiniLM fallback in ai_features.py: DONE
- Mixedbread legacy embedding code: DEFERRED to Phase 2 (still used by search_semantic.py for whole-document index; removing it now breaks search against legacy DB rows. Will be retired via Alembic backfill migration during DB consolidation.)
- pywebview: DEFERRED to Phase 7 (desktop decision)
Consolidate:
- 3 rerankers → search_visual_rerank_rules.py: DONE (common+event+rules merged; bug found and fixed in _best_label_score case-sensitivity)
- 8 face files → defer to Phase 4. On inspection, the small ones (face_search_common 40 LOC, face_person_index 87 LOC, face_route_common 48 LOC) sit at the bottom of the dep tree and can't be merged upward without circularity. The 5 mixin files map to real concerns (db, storage, people, recognition, review); collapsing them touches a 2000+ line class and isn't worth the regression risk in this phase.
- Duplicate _get_vod_categories() → one helper in media_metadata.py: TODO

Policy change: drop the 600-line hard limit from CLAUDE.md. Replace with
guidance: "files should do one thing; never split a coherent concept just to
satisfy a counter."
Gate: golden tests pass. Tests still collect. LOC down ≥ 5k.
Phase 2: Single DB + Alembic (4-5 days)
Deliverables:
- backend/schema_version.py: one connection factory, one DB file
- backend/migrations/: Alembic with baseline migration
- scripts/migrate_to_single_db.py: merge 5 source DBs into 1, verify row counts
- Remove every CREATE TABLE IF NOT EXISTS from app boot code
- Replace _schema_metadata per-db with single alembic_version

Gate: migration script runs cleanly on a copy of real DBs. Golden tests pass against single DB. Row counts match.
Risk mitigation: original 5 DBs untouched until 2 weeks of normal use.
Phase 3: ML out of web process
3a: CLI extraction
Deliverables:
- backend/cli.py with subcommands: jws ingest vod, jws ingest subtitles, jws ingest video, jws ingest faces, jws ingest images, jws reindex embeddings
- All ML imports (torch, deepface, transformers, whisper, transnetv2) moved into backend/jwsearch/ingest/
- Web process imports only sentence-transformers for query embeddings (or none if 3b ships)
- Endpoints that previously kicked off processing now enqueue jobs

3b: Query-time embedding sidecar (optional)
Deliverables:
- backend/jwsearch/embed_service.py: 100-line FastAPI process holding Qwen3
- Main web process makes HTTP calls to it
- Web process imports zero ML libraries

3c: Video lifecycle
Deliverables:
- jws prune videos --keep-thumbnails --keep-embeddings CLI command
- New column source_deleted_at on the videos table
- New env flag SEARCH_UI_KEEP_SOURCE_VIDEOS (default false)
- Ingestion workflow deletes MP4 after extraction if flag is unset
- content_status treats videos with source_deleted_at set as "complete"

Rationale: thumbnails (50 MB/video) are about 7× smaller than source MP4s
(350 MB/video). JW.org streams playback via progressiveDownloadURL already.
Re-extraction only needs re-download (bandwidth, not storage).
Gate: web process cold-start under 5s, RSS under 500 MB. Background
ingest produces identical indexed data (golden tests pass).
A pruned video still plays via ClipPlayer (poster + streaming URL).
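The flag semantics above ("delete the MP4 unless SEARCH_UI_KEEP_SOURCE_VIDEOS is set") can be sketched like this. The helper names and the set of accepted truthy strings are assumptions for illustration; the None-means-defer-to-env convention mirrors the sentinel described in the Session 2 notes.

```python
import os
from typing import Mapping, Optional

# Assumption: which strings count as "true" is not specified in this plan.
_TRUTHY = {"1", "true", "yes", "on"}


def keep_source_videos(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read SEARCH_UI_KEEP_SOURCE_VIDEOS; unset means False (sources get pruned)."""
    env = os.environ if env is None else env
    return env.get("SEARCH_UI_KEEP_SOURCE_VIDEOS", "false").strip().lower() in _TRUTHY


def should_delete_source(delete_video_after: Optional[bool] = None,
                         env: Optional[Mapping[str, str]] = None) -> bool:
    """None defers to the env flag; an explicit True/False from the caller wins."""
    if delete_video_after is not None:
        return delete_video_after
    return not keep_source_videos(env)


default_behaviour = should_delete_source(env={})
with_flag = should_delete_source(env={"SEARCH_UI_KEEP_SOURCE_VIDEOS": "true"})
```

Routing every call site through one helper with a None default avoids the failure mode where a batch path hard-codes the argument and silently bypasses the flag.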
Phase 4: Route consolidation (3-5 days)
13 routers → 4:
| New router | Replaces |
|---|---|
| search.py | search_routes, content_routes, analysis_routes (scripture) |
| catalog.py | catalog_routes, status_routes, publication_routes (read) |
| people.py | face_routes, face_route_persons, speaker_routes |
| jobs.py | workflow_routes, processing_routes, sync_routes, audio_description_routes, settings_routes |

Plus: services.py (centralized service factory), errors.py (global handlers),
schemas.py (Pydantic DTOs).
Gate: golden tests pass. Frontend works without changes (URLs preserved).
Phases 5-8: Deferred (need runtime verification or production data)
5: Frontend modernization, 6: Face re-platform, 7: Desktop story, 8: Router decision. Documented in chat; not started in this autonomous session.
Session 1 actual outcome (2026-05-26)
Phases 0, 1, 3c shipped. Phase 2 scaffolded (merge tool only). Phases 3a, 3b, 4, 5, 6, 7, 8 deferred: they need runtime verification, production data, or are best sequenced after the user flips to the single DB.
Net change: ~5,000 LOC removed. 18 new tests added. Test suite:
455/458 passing (3 pre-existing HuggingFace-network failures unchanged).
Branch: claude/refactor-aggressive.
What the user needs to do next to continue the refactor:
Run the golden snapshot against current backend with real data:

    python scripts/snapshot_golden.py --base-url http://localhost:8001 \
      --output tests/golden/snapshot.json
    git add tests/golden/snapshot.json && git commit

Without a baseline snapshot, Phase 4+ can't detect ranking regressions.

Merge the databases. The merge tool now reconstructs regular tables, vec0 (sqlite-vec) embeddings, AND FTS5 full-text indices, preserving rowids so embedding-to-metadata joins survive. Steps:

    # Dry run first: reports per-source row counts, writes nothing
    python scripts/merge_databases.py --output ~/searchui-merged.db --dry-run
    # Then for real (needs: pip install sqlite-vec)
    python scripts/merge_databases.py --output ~/searchui-merged.db

Verify the merged DB serves searches correctly, THEN flip the app by pointing all DB env vars at it:

    export SEARCH_UI_SEARCH_DB_PATH=~/searchui-merged.db
    export SEARCH_UI_IMAGE_DB_PATH=~/searchui-merged.db
    export SEARCH_UI_FACE_DB_PATH=~/searchui-merged.db
    export SEARCH_UI_SPEAKER_DB_PATH=~/searchui-merged.db
    export SEARCH_UI_PUBLICATIONS_DB_PATH=~/searchui-merged.db

Run the golden diff (scripts/diff_golden.py) against the flipped app to confirm no ranking drift. Keep the original 5 DBs untouched for two weeks as the rollback path before archiving.

Still TODO in Phase 2: replace the per-DB CREATE TABLE IF NOT EXISTS bootstrap + _schema_metadata with Alembic migrations so the single DB has real schema versioning. The merge tool keeps only the first source's _schema_metadata row; Alembic's alembic_version will supersede it.

Try the video prune in dry-run mode first:

    python scripts/prune_source_videos.py --dry-run

Expect ~1 TB of disk reclaimed across 3,713 videos.
Session 2 actual outcome (2026-05-28)
Completed the Phase 2 merge tool (vec0 + FTS5 reconstruction with rowid
preservation), then ran full pre-merge QC on the whole claude/refactor-aggressive
branch (18 commits, net −2,952 LOC):
- Safety review (subagent): zero dangling references. Every deleted module (llm_client, llm_router, the two visual rerankers) and removed endpoint (/api/hello, /api/data, designMockups) has zero remaining referents.
- Standards review (subagent): zero MUST-FIX. Fixed SHOULD-FIX items: SEARCH_UI_KEEP_SOURCE_VIDEOS was silently ignored on the batch path (process_all_local_videos hard-coded delete_video_after=True); defaulted to the None sentinel + added a regression test. Removed a phantom --keep-largest doc line and two dead imports.
- Security review (skill): no vulnerabilities. The SQL-building merge tool and file-deleting prune script are operator CLIs whose only external inputs are trusted env/CLI values and the app's own schema.

Test suite: 460 passed, 3 pre-existing HuggingFace-network failures. Branch
merged to main via PR. Continuation handoff for the remaining phases lives in
CONTINUATION_PROMPT.md.
Next (see CONTINUATION_PROMPT.md): Phase 4 route consolidation → Phase 2 Alembic → Phase 3a CLI-ingestion scaffold, each as an atomic-commit branch with subagent QC and a PR for Glenn to merge after an app smoke-test.
Phase 4 progress (Session 2, branch claude/phase4-route-consolidation)
Done & verified (13 → 10 route modules):
- Added backend/tests/test_app_boot.py: assembles the real create_app() (startup checks off via SEARCH_UI_STARTUP_CHECKS=false) and asserts the full /api surface + no duplicate (path, method) registrations. This is the regression guard that makes consolidation verifiable without the live app.
- search_routes.py ← absorbed content_routes.py + analysis_routes.py (deduped the byte-identical _get_default_*_service helpers; dropped a dead import json).
- face_routes.py ← absorbed speaker_routes.py (kept the module-level speaker router singleton).
- Each step verified by a byte-identical 128-route manifest + full suite green (462 passed). Old files deleted; importing tests use module aliases.

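The two checks the boot test relies on, a stable route manifest and no duplicate (path, method) pairs, reduce to a few lines once the routes are available as tuples. This is a sketch under that assumption, not the contents of test_app_boot.py; the function names and sample routes are illustrative.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Route = Tuple[str, str]  # (path, HTTP method)


def manifest(routes: Iterable[Route]) -> List[Route]:
    """Order-independent, de-duplicated view of the route surface."""
    return sorted({(path, method.upper()) for path, method in routes})


def duplicate_routes(routes: Iterable[Route]) -> List[Route]:
    """Return every (path, method) pair that is registered more than once."""
    counts = Counter((path, method.upper()) for path, method in routes)
    return sorted(route for route, n in counts.items() if n > 1)


# With FastAPI the tuples could be gathered from app.routes, roughly:
#   [(r.path, m) for r in app.routes for m in (getattr(r, "methods", None) or [])]
# Here the lists are hard-coded (hypothetical paths) so the sketch runs standalone.
before = [("/api/search", "GET"), ("/api/content", "GET"), ("/api/speakers", "POST")]
after = [("/api/speakers", "post"), ("/api/search", "GET"), ("/api/content", "GET")]

same_surface = manifest(before) == manifest(after)
dupes = duplicate_routes(after + [("/api/search", "get")])
```

Comparing manifests before and after a router merge shows that moving handlers between files changed no URL, which is what keeps the frontend untouched.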
Naming note: search.py is the search ENGINE module, so the consolidated
router keeps the *_routes.py convention rather than the plan's search.py.
Targets are now: search_routes, catalog_routes, face_routes,
workflow_routes (4 ≤ 5).
Final state (Session 3): 13 → 6 routers. Phase 4 closed here.
Cloud PRs #34-39 carried it past the Session-2 "13→10" note: catalog_routes
absorbed status_routes, and workflow_routes absorbed audio_description +
settings + sync. Current routers (6): search_routes, catalog_routes
(+status), face_routes (+speaker), workflow_routes (+AD/settings/sync),
processing_routes, publication_routes.
Decision: stop at 6. Do NOT force the last two merges. A deep-dive review (Session 3) found both remaining merges trade maintainability for a smaller count, exactly what CLAUDE.md's file-size guidance warns against:
- processing → workflow: a ~1,790-LOC file with five unrelated concerns (orchestration, AD, settings, sync, batch-processing) wired by two independent runtimes (WorkflowRuntime, ProcessingRuntime) that share no code. Strictly worse. Rejected.
- publication → catalog: a ~1,158-LOC, three-concern catalog_routes, and test_publication_routes.py's importlib.reload would pollute sibling catalog/status tests. Publications is its own concern (separate JW publications API, DB, and image search). Marginal count gain, real risk. Rejected.

The ≤5 North-star was a ceiling, not a quota. 6 coherent routers is the
maintainable resting place; the churn budget is better spent on search quality.
If a future session wants 5, the publication→catalog merge is the only
defensible one and must first replace that test's reload with attribute-patching.
Session 3: DB consolidation flipped to single DB (2026-05-29)
Validated and flipped the live app onto the merged single DB. Steps taken and what's needed to keep/rollback it:
- Merged DB: /Users/avsadmin/searchui-merged.db (3.2 GB; built with scripts/merge_databases.py after fixing the vec0 PK-alias bug). All 15 key row counts match the per-source dry-run exactly.
- Durable launcher: scripts/run-backend-merged.sh sets the five env vars (override location via SEARCH_UI_MERGED_DB); see CLAUDE.md Development Workflow. Merged to main 2026-05-29.
- Flip is LIVE in the running backends (:8001 and :8002) via these env vars on the uvicorn launch (settings.db stays separate; it is NOT merged):

      SEARCH_UI_SEARCH_DB_PATH=/Users/avsadmin/searchui-merged.db
      SEARCH_UI_IMAGE_DB_PATH=/Users/avsadmin/searchui-merged.db
      SEARCH_UI_FACE_DB_PATH=/Users/avsadmin/searchui-merged.db
      SEARCH_UI_SPEAKER_DB_PATH=/Users/avsadmin/searchui-merged.db
      SEARCH_UI_PUBLICATIONS_DB_PATH=/Users/avsadmin/searchui-merged.db

- To make it durable (survive a manual restart): start the backend with scripts/run-backend-merged.sh (it sets those env vars for you), per CLAUDE.md's Development Workflow. The app reads os.environ directly (there is no .env loader), so starting it the plain way (without the launcher) reverts to the 5 source DBs.
- Rollback: unset the five env vars and restart to return to the original 5 DBs, which are left untouched. RETENTION: do NOT archive/delete the original 5 source DBs (database.db, images_database.db, faces_database.db, speakers_database.db, publications_database.db) before ~2026-06-12 (≈2 weeks of normal single-DB use). They are the rollback path until then.
- Golden baseline re-captured on the merged DB (tests/golden/snapshot.json). Smoke-tested keyword/semantic/hybrid/image-content/title/scripture/publication-image on the flipped :8001; all return sensible results.
- Known cosmetic diff vs the 5-DB world: one semantic query's two equal-score (0.71191) hits swap tie-order (vec0 index rebuild). Same set, same scores. A deterministic tie-break (ORDER BY distance, natural_key) is a good follow-up under search-quality.

Phase 2 Alembic foundation (Session 3): SCOPED, opt-in
Added a safe foundation for schema versioning without the risky full cutover (the Plan agent showed a full cutover is currently unsafe; see "Deferred" below).
Delivered:
- backend/schema_version.py: ensure_alembic_version() idempotently brings a DB under Alembic: existing DB → stamp baseline; empty file → upgrade; already versioned → no-op. Plus get_primary_db_path().
- Wired into app_runtime.create_app_runtime() behind SEARCH_UI_ALEMBIC_MANAGE (OFF by default), so merging changes nothing until opted in. When enabled, the live merged DB gets stamped at baseline (alembic_version row; safe, reversible, no data change). init_db() still owns the schema; failure is logged, not fatal.
- Tests: stamp/upgrade/idempotent detection + flag-gated boot wiring.

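The three-way decision described for ensure_alembic_version() (stamp, upgrade, or no-op) can be illustrated with plain sqlite3. This is a sketch of the detection step only, with a hypothetical name plan_alembic_action; it does not call Alembic and is not the code in backend/schema_version.py.

```python
import sqlite3


def plan_alembic_action(conn: sqlite3.Connection) -> str:
    """Classify a DB as 'noop' (already versioned), 'upgrade' (empty), or 'stamp'."""
    tables = {
        name for (name,) in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    }
    if "alembic_version" in tables:
        row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
        if row is not None:
            return "noop"  # a revision is recorded; nothing to do
        tables.discard("alembic_version")  # table exists but holds no revision
    if not tables:
        return "upgrade"  # fresh file: let migrations build the schema
    return "stamp"  # existing schema: record the baseline without running DDL


empty_db = sqlite3.connect(":memory:")

existing_db = sqlite3.connect(":memory:")
existing_db.execute("CREATE TABLE videos (id INTEGER PRIMARY KEY)")  # hypothetical table

versioned_db = sqlite3.connect(":memory:")
versioned_db.execute("CREATE TABLE videos (id INTEGER PRIMARY KEY)")
versioned_db.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
versioned_db.execute("INSERT INTO alembic_version VALUES ('baseline')")

actions = [plan_alembic_action(db) for db in (empty_db, existing_db, versioned_db)]
```

Because the classification only reads sqlite_master and alembic_version, running it repeatedly is harmless, which is the property the idempotency tests exercise.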
Deferred (do NOT do without a separate, carefully-verified PR):
- Replacing the no-op baseline with a verbatim live-schema migration, and removing the per-subsystem init_db() CREATE TABLE IF NOT EXISTS. Blockers: Alembic autogenerate can't model the 12 vec0/FTS5 virtual tables (baseline must be hand-authored raw DDL with sqlite-vec loaded at migrate time), and _schema_metadata carries embedding model/recipe info that alembic_version does not replace.

Finding: orphan tables not in source. video_concepts, video_concepts_fts,
video_concept_embeddings are READ by search_semantic.py but have no
CREATE statement in backend/; they exist only because some ingestion path
(outside the web backend) created them in the live DB. A fresh web-only install
would lack them. Their live DDL (captured for the future baseline):
CREATE TABLE video_concepts (natural_key TEXT NOT NULL, language TEXT NOT NULL,
summary TEXT NOT NULL, topics_json TEXT NOT NULL, keywords_json TEXT NOT NULL,
concept_text TEXT NOT NULL, content_hash TEXT, recipe_version INTEGER NOT NULL,
recipe_payload TEXT NOT NULL, indexed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
visual_cues_json TEXT NOT NULL DEFAULT '[]', PRIMARY KEY (natural_key, language));
CREATE VIRTUAL TABLE video_concepts_fts USING fts5(natural_key, language, concept_text);
CREATE VIRTUAL TABLE video_concept_embeddings USING vec0(natural_key TEXT, language TEXT, embedding float[1024]);
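A fresh install could detect this gap at startup instead of failing on the first semantic query. The sketch below checks sqlite_master for the three table names from the finding; the function name missing_tables is hypothetical. Virtual tables are listed in sqlite_master with type 'table', so the same query covers the FTS5 and vec0 tables.

```python
import sqlite3
from typing import Iterable, List

# Table names taken from the finding above.
CONCEPT_TABLES = ("video_concepts", "video_concepts_fts", "video_concept_embeddings")


def missing_tables(conn: sqlite3.Connection,
                   required: Iterable[str] = CONCEPT_TABLES) -> List[str]:
    """Return the required tables that are absent from this database."""
    present = {
        name for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    return [table for table in required if table not in present]


# Demo on a throwaway DB holding only the plain table (simplified columns).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE video_concepts (natural_key TEXT NOT NULL, language TEXT NOT NULL, "
    "PRIMARY KEY (natural_key, language))"
)
absent = missing_tables(demo)
```

A non-empty result on a web-only install would confirm that some path outside backend/ is responsible for creating these tables.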
Phase 3a CLI scaffold (Session 3): additive, ML untouched
Added backend/cli.py: a stdlib-argparse ingestion entrypoint so ingestion can
run OUTSIDE the web process. Purely additive: it calls the existing
processing functions unchanged and does NOT move ML off the web request path
(that flip needs Glenn's cold-start RSS verification; see cutover below).
Subcommands wired (each calls an existing function; heavy imports are
function-local so cli.py --help and the parser load no torch):
ingest vod · ingest subtitles · ingest video [--all] · ingest images --source {publications|web} · process subtitles · reindex embeddings ·
reindex subtitles. Tested via parser-routing + dispatch (faked modules) + a
subprocess guard asserting torch/process_video are absent after import.
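A subprocess guard of the kind mentioned above can be sketched as follows: import the module in a fresh interpreter and report which forbidden modules ended up in sys.modules. The helper name is hypothetical, and the demo imports the stdlib json package as a stand-in so it runs anywhere; for the real check the arguments would be the CLI module and names such as torch.

```python
import subprocess
import sys
from typing import Iterable, List


def loaded_after_import(module: str, forbidden: Iterable[str]) -> List[str]:
    """Import `module` in a fresh interpreter; return forbidden modules it pulled in."""
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print(','.join(m for m in {list(forbidden)!r} if m in sys.modules))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        check=True, capture_output=True, text=True,
    )
    # Use the last stdout line in case the imported module prints on import.
    lines = result.stdout.splitlines()
    last = lines[-1].strip() if lines else ""
    return last.split(",") if last else []


# Stand-in demo: importing json loads json.decoder but never torch.
leaked = loaded_after_import("json", ["json.decoder", "torch"])
```

A fresh interpreter matters here: inside a long-running test process another test may already have imported the heavy module, which would make an in-process sys.modules check meaningless.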
Deferred: ingest faces. Faces have no standalone ingestion function; they
run inside process_video's thumbnail step. A standalone face re-index is
net-new orchestration (pair it with Phase 6).
ML import boundary (what the cutover must move): the web process imports
torch eagerly today via search.py (from sentence_transformers import ...,
top-level) and search_images.py (import torch / from transformers import ..., top-level), both pulled in at boot through app_runtime. DeepFace/TF
(face_search), Whisper (transcription), TransNetV2 (scene detect), CLAP/VLM
(scene_processing) are lazy with respect to web boot: the boot path never
imports them, so they stay out of web RSS. (Note: video_scene_detect itself
imports search_images at module level, so importing it directly still pulls
torch; only the TransNet model load is lazy. Relevant for step 4's relocation.)
Cutover plan (separate PR, gated on Glenn's runtime check; do NOT do blind):
1. Make torch import-lazy in search.py + search_images.py (move the imports into get_embedding_model() / get_siglip_model(); TYPE_CHECKING for annotations). Web boot then imports no torch.
2. Decide query-time embeddings (the real fork): keyword/title/scripture/image-category need NO model; semantic/hybrid + image-content text→embedding DO. Either ship the 3b embedding sidecar (web imports zero ML; only way to hold <500 MB steady-state) or keep the model lazy in-process (cold-start is lean but the first semantic query loads torch into web RSS).
3. Make heavy-ingestion endpoints enqueue a job / shell out to cli.py instead of running ML in the request worker.
4. (optional) relocate ML modules under backend/jwsearch/ingest/.

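The import-lazy pattern from the first cutover step looks roughly like the sketch below. To keep it runnable without any ML library, the stdlib module colorsys stands in for the heavy dependency; the constant and function names are illustrative, not the ones in search.py.

```python
import functools
import importlib
import sys
from types import ModuleType
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers; never executed at runtime, so it costs no import.
    import colorsys  # noqa: F401

# Stand-in for a heavy dependency such as torch or sentence_transformers.
HEAVY_MODULE = "colorsys"


@functools.lru_cache(maxsize=1)
def get_backend() -> ModuleType:
    """Import the heavy dependency on first use and cache the module object."""
    return importlib.import_module(HEAVY_MODULE)


def backend_loaded() -> bool:
    """True once something has imported the heavy module into this process."""
    return HEAVY_MODULE in sys.modules


def to_hsv(red: float, green: float, blue: float):
    """First caller pays the import cost; later calls reuse the cached module."""
    return get_backend().rgb_to_hsv(red, green, blue)


hsv = to_hsv(1.0, 0.0, 0.0)
```

Importing this file loads nothing heavy; the cost moves to the first call of get_backend(), which is the trade the plan describes (lean cold start, slower first semantic query).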
Gate (Glenn verifies in his runtime, not a sandbox): boot uvicorn main:app
with the single-DB env vars, no warm queries; ps -o rss= -p <pid> after
/api/health 200 → target < 512000 KB; assert torch absent from the web
process; re-run scripts/diff_golden.py (no ranking drift) and confirm
background-ingest output is byte-identical to in-process.
Session 4: Tie-break + Phase 3a cold-start cutover (2026-06-01)
Baseline confirmed green first: main==origin/main (50ef139), 628 tests
passing, merged-DB flip live, all six search families sane, golden diff clean.
1. Deterministic semantic tie-break (search-quality follow-up from Session 3):
committed. sqlite-vec rejects a secondary ORDER BY on KNN queries, so
equal-distance hits arrived in index-dependent order (the 0.71191 tie-swap noted
in Session 3). Fixed in the three Python ranking sites (search_semantic,
search_video_concepts, search_hybrid) by breaking score ties on
natural_key. Verified on the real merged DB: the two previously-drifting golden
queries now have identical key sets + identical (key,score) multisets; only
equal-score hits reorder, now deterministically. search-eval (n=150):
title/keyword unchanged; hybrid recall@1 24.67→24.00% (1 sample, tie-ambiguity
noise, now stable vs rebuild-dependent). The hybrid wobble traces to the
semantic tie-break propagating into hybrid RRF ranks (hybrid reuses
search_semantic's order), not the concept sort. Golden re-baselined deterministic.
Added test_search_tie_break.py.
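Breaking score ties on natural_key in Python amounts to a two-part sort key. The sketch below assumes hits are (natural_key, score) pairs where a higher score is better; the real ranking sites may carry richer rows or sort on distance instead.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (natural_key, score); higher score ranks first


def rank_hits(hits: List[Hit]) -> List[Hit]:
    """Sort by score descending, then natural_key ascending for exact ties."""
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Two arrival orders for the same hits, as an index rebuild might produce
# (hypothetical keys; the tied score is the one noted in Session 3).
arrival_a = [("vid_b", 0.71191), ("vid_a", 0.71191), ("vid_c", 0.9)]
arrival_b = [("vid_a", 0.71191), ("vid_c", 0.9), ("vid_b", 0.71191)]

ranked_a = rank_hits(arrival_a)
ranked_b = rank_hits(arrival_b)
```

Only exactly equal scores are affected; hits whose scores differ in any digit keep their score order, so the change cannot reshuffle genuinely distinct results.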
2. Phase 3a cutover step 1 (lazy torch imports): committed & runtime-verified.
Moved torch/transformers/sentence-transformers out of module scope into
function-local imports in search.py, search_images.py,
image_siglip_inference.py (the three boot-path torch importers). Cold-start
web RSS 450 MB → 76 MB; torch absent at boot (new subprocess guard
test_web_boot_ml_free.py). Models still load lazily on first query (first
semantic → 828 MB, +visual → 1071 MB); results byte-identical (golden: all 12
match; eval unchanged). 632 tests pass.
Decision (Glenn): do Option A (lazy in-process), NOT the 3b sidecar. Rationale is the HF deployment shape: the Space is a single Docker container (free tier, one uvicorn). A sidecar in the same container doesn't lower container RAM (the model sits in one process either way); it only helps a true multi-machine split, which isn't the deployment and would add per-query network latency + an always-on paid service. Steady-state ~1.07 GB fits the 16 GB Space fine. The cold-start win (faster Space wake; word/title/scripture searches ready instantly without loading ML) is the real benefit and is now banked. Adjacent known issue, NOT addressed here: CPU semantic/visual latency on the free Space (inherent to no-GPU; would need caching / lighter model / precompute, which is a separate conversation).
3. Single-video ingestion moved off the web worker: committed & runtime-verified.
POST /api/process-video now shells out to python -m cli ingest video (new
--result-json gives a clean JSON artifact; the helper inherits env so the
subprocess targets the same DBs). Verified: endpoint returns HTTP 200 with the
identical result dict while the web worker stays 76 MB / 0 torch and a
separate subprocess (~1.3 GB) does all the ML. Response contract + delete
semantics unchanged. 640 tests pass.
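The shell-out pattern can be sketched as below: run the job in a child process that inherits the environment, have it write its result to a JSON file, and read that file back. That --result-json takes a file path is an assumption here, run_cli_job is a hypothetical helper, and the demo uses an inline child script as a stand-in for the real CLI.

```python
import json
import os
import subprocess
import sys
import tempfile
from typing import Any, List, Optional


def run_cli_job(argv: List[str], result_flag: str = "--result-json",
                timeout: Optional[float] = None) -> Any:
    """Run argv in a child process and return the JSON it writes to a temp file."""
    with tempfile.TemporaryDirectory() as tmp:
        result_path = os.path.join(tmp, "result.json")
        proc = subprocess.run(
            [*argv, result_flag, result_path],
            env=os.environ.copy(),  # child targets the same DB env vars as the parent
            capture_output=True, text=True, timeout=timeout,
        )
        if proc.returncode != 0:
            raise RuntimeError(f"job failed ({proc.returncode}): {proc.stderr.strip()}")
        with open(result_path, encoding="utf-8") as handle:
            return json.load(handle)


# Stand-in child: writes a small result dict to the path given as the last argument.
CHILD = (
    "import json, sys\n"
    "with open(sys.argv[-1], 'w', encoding='utf-8') as fh:\n"
    "    json.dump({'status': 'ok'}, fh)\n"
)
result = run_cli_job([sys.executable, "-c", CHILD])

# A real call might resemble (arguments are illustrative):
#   run_cli_job([sys.executable, "-m", "cli", "ingest", "video", video_path])
```

Passing the result through a file rather than stdout keeps the payload clean even when the ML libraries in the child log progress or warnings to the console.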
Still open in 3a (not started):
- The streaming bulk endpoints (process-all-videos, -v2, retry-failed) still run process_video inline. They stream live SSE progress, so moving them out-of-process needs a progress-bridge design (subprocess → file/queue → SSE). Deliberately deferred as a separate, designed piece.
- Composite workflows (/api/update-content, reprocess-existing, nuclear-rebuild): no single CLI subcommand yet.
- index-publication-images / crawl-web-images → cli ingest images.
- Optionally relocate ML modules under backend/jwsearch/ingest/.

Guardrails
- Golden tests pass at every commit.
- No phase ships without a rollback path.
- CLAUDE.md updated as part of the phase that invalidates a rule.
- Atomic commits. Subagent code review at each phase boundary.
- No model swaps and dependency cleanup in the same commit.