Commit 4a9e84e · committed by Copilot
Parent(s): 2c5cd27

Add portfolio pages and overlay playback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Files changed (3):
- README.md (+8, -3)
- app.py (+351, -58)
- video_utils.py (+237, -0)
README.md
CHANGED

@@ -11,7 +11,7 @@ app_port: 7860
 
 This folder is an isolated Hugging Face Space scaffold for the phase-recognition models in this repository.
 It is intentionally separate from the existing FastAPI webapp and is designed to expose **DINO-Endo, AI-Endo, and V-JEPA2** on paid GPU hardware such as **1x A10G (24 GB VRAM)**.
-The public UI now behaves like a small **project portfolio** …
+The public UI now behaves like a small **project portfolio** with explicit page navigation: a landing home page, a dedicated **DINO-Endo Surgery** workspace page, and a portfolio/projects page for future demos.
 The default featured model remains **DINO-Endo**, but the same Space can load and unload all three model families one at a time.
 
 ## Supported model families
@@ -68,12 +68,16 @@ If a required checkpoint is missing locally, it will try to download it from the
 
 ### Upload and dashboard behavior
 
-- The …
+- The Space now routes across multiple portfolio pages instead of stacking everything into a single long screen.
+- The **Home** page acts as the landing site for **Ryukijano's Project Portfolio**.
+- The **DINO-Endo Surgery** page is the dedicated hosted workspace for image/video inference.
+- The **Projects** page lists the current live workspace plus the next planned portfolio pages.
 - The active model family is selected through a visible **model slider** in the workspace rather than a hidden picker.
 - The Space now keeps a single active predictor loaded at a time and unloads the previous model when the model slider changes.
 - MP4 is the primary video upload format, while `mov`, `avi`, `mkv`, `webm`, and `m4v` remain enabled as fallback containers.
 - `.streamlit/config.toml` raises the default Streamlit single-file upload ceiling to **4096 MB** and disables file watching / usage telemetry so runtime cache writes do not trigger restart loops.
 - Uploaded videos are immediately spooled to local disk for metadata probing and analysis, instead of repeatedly reading the in-memory upload object on every rerun.
+- Video analysis now produces an **annotated playback clip** with the predicted phase HUD burned directly onto the video frames, echoing the overlay style from the main `webapp/` dashboard.
 - The UI shows file size, duration, fps, frame count, resolution, working-storage headroom, and suppresses inline preview for very large uploads to keep the browser path lighter.
 - V-JEPA2 is labeled as a slower first load so users understand the cold-cache cost of its very large encoder checkpoint.
 
@@ -150,11 +154,12 @@ python scripts/smoke_test.py --model vjepa2 --model-dir /path/to/model
 ## Scope of v1
 
 - Streamlit UI
-- project-portfolio landing page with DINO-Endo Surgery …
+- project-portfolio landing page with separate Home, DINO-Endo Surgery, and Projects pages
 - three-model slider for DINO-Endo, AI-Endo, and V-JEPA2, with DINO-Endo selected by default
 - image upload and video upload
 - dashboard-style model/runtime status
 - robust video metadata probing with OpenCV + ffprobe fallback
+- annotated overlay playback for analysed videos
 - large single-file uploads up to the configured Streamlit cap
 - per-frame phase timeline output for video
 - optional live encoder/decoder explainability sidebar with true attention where available and labeled proxies elsewhere
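Two bullets above carry the main reliability story: the upload is spooled to local disk once, and every Streamlit rerun then works from that file instead of re-reading the in-memory upload object. A minimal sketch of that spooling pattern, assuming a Streamlit UploadedFile; spool_to_disk is illustrative, standing in for the repository's spool_uploaded_video:

# Minimal sketch of the spool-once pattern described in the README bullets.
# spool_to_disk is illustrative; the real helper is video_utils.spool_uploaded_video.
import tempfile
from pathlib import Path

COPY_CHUNK_BYTES = 8 * 1024 * 1024  # 8 MB chunks keep memory bounded for 4 GB uploads


def spool_to_disk(uploaded_file, suffix: str = ".mp4") -> Path:
    fd, name = tempfile.mkstemp(suffix=suffix)
    uploaded_file.seek(0)
    with open(fd, "wb") as sink:  # open() takes ownership of the descriptor
        while chunk := uploaded_file.read(COPY_CHUNK_BYTES):
            sink.write(chunk)
    return Path(name)  # later reruns probe and analyse this path, not the upload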
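The scope list also names "robust video metadata probing with OpenCV + ffprobe fallback". A hedged sketch of that probing order; probe_fps_and_frames is illustrative (the real helper is video_utils.probe_video_info) and it assumes ffprobe may legitimately be absent from PATH:

# Hedged sketch of the OpenCV-first / ffprobe-fallback probe named in the scope list.
import json
import shutil
import subprocess

import cv2


def probe_fps_and_frames(video_path: str) -> tuple[float, int]:
    capture = cv2.VideoCapture(video_path)
    try:
        fps = float(capture.get(cv2.CAP_PROP_FPS) or 0.0)
        frames = int(capture.get(cv2.CAP_PROP_FRAME_COUNT) or 0)
    finally:
        capture.release()
    if fps > 0 and frames > 0:
        return fps, frames  # OpenCV read the stream headers; done

    # Some containers hide headers from OpenCV; fall back to ffprobe.
    ffprobe = shutil.which("ffprobe")
    if ffprobe is None:
        return fps, frames
    result = subprocess.run(
        [
            ffprobe, "-v", "error", "-select_streams", "v:0",
            "-show_entries", "stream=r_frame_rate,nb_frames",
            "-of", "json", video_path,
        ],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return fps, frames
    streams = json.loads(result.stdout or "{}").get("streams") or [{}]
    stream = streams[0]
    num, _, den = stream.get("r_frame_rate", "0/1").partition("/")
    if fps <= 0 and float(den or 1) > 0:
        fps = float(num or 0) / float(den or 1)
    if frames <= 0:
        frames = int(stream.get("nb_frames") or 0)
    return fps, frames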
app.py
CHANGED

@@ -21,6 +21,8 @@ from predictor import MODEL_LABELS, PHASE_LABELS, normalize_model_key
 from video_utils import (
     STREAMLIT_SERVER_MAX_UPLOAD_MB,
     SUPPORTED_VIDEO_TYPES,
+    create_overlay_video_writer,
+    draw_prediction_overlay,
     format_bytes,
     get_upload_size_bytes,
     get_workspace_free_bytes,
@@ -28,6 +30,7 @@ from video_utils import (
     recommended_frame_stride,
     should_show_inline_preview,
     spool_uploaded_video,
+    transcode_video_for_streamlit,
 )
 
 st.set_page_config(page_title="Ryukijano's Project Portfolio", layout="wide")
@@ -74,6 +77,18 @@ SPACE_TITLE = "Ryukijano's Project Portfolio"
 FEATURED_PROJECT_TITLE = "DINO-Endo Surgery Workspace"
 MODEL_SLIDER_KEY = "workspace-model-slider"
 SELECTED_MODEL_STATE_KEY = "selected_model_key"
+PORTFOLIO_PAGE_STATE_KEY = "portfolio-page"
+VIDEO_ANALYSIS_STATE_KEY = "video-analysis-state"
+PORTFOLIO_PAGE_LABELS = {
+    "home": "Home",
+    "workspace": "DINO-Endo Surgery",
+    "projects": "Projects",
+}
+PORTFOLIO_PAGE_SUMMARIES = {
+    "home": "Landing page for Ryukijano's Project Portfolio and the featured hosted work.",
+    "workspace": "Dedicated Dino-Endo Surgery workspace with model controls, explainability, and annotated video playback.",
+    "projects": "Project index for the current live workspace and the next portfolio pages to add.",
+}
 
 
 @dataclass(frozen=True)
@@ -105,6 +120,39 @@ HOSTED_PROJECTS = (
     ),
 )
 
+PORTFOLIO_PROJECTS = HOSTED_PROJECTS + (
+    HostedProject(
+        key="jetson-runtime-notes",
+        title="Jetson Deployment Notes",
+        status="Coming soon",
+        summary=(
+            "A future portfolio page for the clinic-side Jetson deployment story: upload reliability, local retention, "
+            "and report generation in the full webapp stack."
+        ),
+        highlights=(
+            "Cloudflare-safe upload architecture",
+            "Retention-aware clinical workflows",
+            "Single-worker queue constraints on Jetson",
+        ),
+        tags=("Edge deployment", "Clinical workflow", "FastAPI"),
+    ),
+    HostedProject(
+        key="explainability-lab",
+        title="Explainability Lab",
+        status="Planned",
+        summary=(
+            "A follow-on page for comparing encoder attention, temporal decoder strips, and proxy saliency views "
+            "across the three phase-recognition model families."
+        ),
+        highlights=(
+            "Layer/head comparisons",
+            "Temporal decoder strip review",
+            "Side-by-side model introspection",
+        ),
+        tags=("Attention maps", "Interpretability", "Model analysis"),
+    ),
+)
+
 
 def _phase_index(phase: str) -> int:
     try:
@@ -293,10 +341,10 @@ def _render_project_hub(enabled_model_keys: list[str]) -> None:
     )
 
     metrics = st.columns(4)
-    metrics[0].metric("…
+    metrics[0].metric("Portfolio pages", len(PORTFOLIO_PAGE_LABELS))
     metrics[1].metric("Model families", len(enabled_model_keys))
     metrics[2].metric("Explainability", "Opt-in")
-    metrics[3].metric("…
+    metrics[3].metric("Overlay playback", "Built in")
 
     left_col, right_col = st.columns([1.8, 1.2], gap="large")
     with left_col:
@@ -321,11 +369,11 @@ def _render_project_hub(enabled_model_keys: list[str]) -> None:
                 <span class="hub-status">Portfolio shell</span>
                 <h3>Ready for more demos</h3>
                 <p>
-                    The …
-                    …
+                    The landing site now routes between distinct pages instead of stacking everything into one long screen.
+                    Add more project pages here later while keeping one shared brand, layout, and deployment target.
                 </p>
                 <ul>
-                    <li>Keep each project's controls inside its own …
+                    <li>Keep each project's controls inside its own dedicated page.</li>
                     <li>Reuse the same landing-page hero, metrics, and project-card layout.</li>
                     <li>Preserve one-model-at-a-time loading so future demos stay GPU-friendly.</li>
                 </ul>
@@ -334,12 +382,79 @@ def _render_project_hub(enabled_model_keys: list[str]) -> None:
         unsafe_allow_html=True,
     )
 
+    action_cols = st.columns([1.2, 1.0, 1.0], gap="medium")
+    if action_cols[0].button("Open DINO-Endo Surgery workspace", key="open-featured-workspace", use_container_width=True, type="primary"):
+        _navigate_to_page("workspace")
+    if action_cols[1].button("Browse project pages", key="open-project-pages", use_container_width=True):
+        _navigate_to_page("projects")
+    action_cols[2].download_button(
+        "Export portfolio summary",
+        json.dumps(
+            {
+                "title": SPACE_TITLE,
+                "pages": list(PORTFOLIO_PAGE_LABELS.values()),
+                "projects": [
+                    {"title": project.title, "status": project.status, "tags": list(project.tags)}
+                    for project in PORTFOLIO_PROJECTS
+                ],
+            },
+            indent=2,
+        ).encode("utf-8"),
+        file_name="portfolio_summary.json",
+        mime="application/json",
+        use_container_width=True,
+    )
+
+
+def _render_projects_page() -> None:
+    st.markdown(
+        """
+        <section class="workspace-card">
+            <p class="hub-eyebrow">Project index</p>
+            <h2>Portfolio pages</h2>
+            <p class="workspace-copy">
+                The portfolio shell now has distinct pages. DINO-Endo Surgery is the live hosted workspace, while the
+                other cards mark the next pages to spin out from this shared Space foundation.
+            </p>
+        </section>
+        """,
+        unsafe_allow_html=True,
+    )
+
+    project_cols = st.columns(len(PORTFOLIO_PROJECTS), gap="large")
+    for col, project in zip(project_cols, PORTFOLIO_PROJECTS):
+        highlights_html = "".join(f"<li>{item}</li>" for item in project.highlights)
+        with col:
+            st.markdown(
+                f"""
+                <section class="hub-card">
+                    <span class="hub-status">{project.status}</span>
+                    <h3>{project.title}</h3>
+                    <p>{project.summary}</p>
+                    <div class="hub-chip-row">{_render_hub_chips(project.tags)}</div>
+                    <ul>{highlights_html}</ul>
+                </section>
+                """,
+                unsafe_allow_html=True,
+            )
+            if project.key == HOSTED_PROJECTS[0].key:
+                if st.button("Open workspace page", key=f"open-page-{project.key}", use_container_width=True, type="primary"):
+                    _navigate_to_page("workspace")
+            else:
+                st.button(
+                    "Page planned",
+                    key=f"planned-page-{project.key}",
+                    use_container_width=True,
+                    disabled=True,
+                )
+
 
 def _render_workspace_header(enabled_model_keys: list[str], model_key: str) -> None:
     selected_label = _model_option_label(model_key)
     selection_note = (
         "Use the model slider to move between DINO-Endo, AI-Endo, and V-JEPA2. "
-        "Only one model stays loaded at a time so the Space remains responsive on shared GPU hardware …
+        "Only one model stays loaded at a time so the Space remains responsive on shared GPU hardware, and video runs now "
+        "produce an annotated playback clip with the phase HUD burned directly onto the frames."
     )
     st.markdown(
         f"""
@@ -388,7 +503,49 @@ def _get_model_manager() -> SpaceModelManager:
     return manager
 
 
+def _current_portfolio_page() -> str:
+    page_key = st.session_state.get(PORTFOLIO_PAGE_STATE_KEY, "home")
+    if page_key not in PORTFOLIO_PAGE_LABELS:
+        page_key = "home"
+    st.session_state[PORTFOLIO_PAGE_STATE_KEY] = page_key
+    return page_key
+
+
+def _navigate_to_page(page_key: str) -> None:
+    if page_key not in PORTFOLIO_PAGE_LABELS:
+        raise RuntimeError(f"Unsupported portfolio page '{page_key}'")
+    st.session_state[PORTFOLIO_PAGE_STATE_KEY] = page_key
+    st.rerun()
+
+
+def _render_page_navigation() -> str:
+    current_page = _current_portfolio_page()
+    st.caption("Portfolio navigation")
+    nav_cols = st.columns(len(PORTFOLIO_PAGE_LABELS))
+    for col, (page_key, label) in zip(nav_cols, PORTFOLIO_PAGE_LABELS.items()):
+        if col.button(
+            label,
+            key=f"portfolio-nav-{page_key}",
+            use_container_width=True,
+            type="primary" if page_key == current_page else "secondary",
+        ) and page_key != current_page:
+            _navigate_to_page(page_key)
+    st.caption(PORTFOLIO_PAGE_SUMMARIES[current_page])
+    return current_page
+
+
+def _clear_video_analysis() -> None:
+    analysis_state = st.session_state.pop(VIDEO_ANALYSIS_STATE_KEY, None)
+    if not analysis_state:
+        return
+
+    annotated_video_path = analysis_state.get("annotated_video_path")
+    if annotated_video_path:
+        Path(annotated_video_path).unlink(missing_ok=True)
+
+
 def _clear_video_stage() -> None:
+    _clear_video_analysis()
     temp_path = st.session_state.pop("staged_video_path", None)
     if temp_path is not None:
         Path(temp_path).unlink(missing_ok=True)
@@ -421,6 +578,12 @@ def _prepare_staged_video(uploaded_file):
     return temp_path, meta
 
 
+def _video_analysis_signature(
+    staged_signature: tuple[str, int, str] | None, model_key: str, frame_stride: int, max_frames: int
+) -> tuple:
+    return staged_signature, model_key, frame_stride, max_frames
+
+
 def _records_to_frame(records):
     if not records:
         return pd.DataFrame(columns=["frame_index", "timestamp_sec", "phase", "confidence"])
@@ -512,6 +675,8 @@ def _analyse_video(
     frame_stride: int,
     max_frames: int,
     *,
+    video_info: dict | None = None,
+    model_label: str,
     explainability_config: dict | None = None,
     explainability_callback=None,
 ):
@@ -520,16 +685,27 @@ def _analyse_video(
     if not capture.isOpened():
         raise RuntimeError("Unable to open uploaded video")
 
-    total_frames = int(capture.get(cv2.CAP_PROP_FRAME_COUNT) or 0)
-    fps = float(capture.get(cv2.CAP_PROP_FPS) or 0.0)
+    total_frames = int(capture.get(cv2.CAP_PROP_FRAME_COUNT) or (video_info or {}).get("frame_count") or 0)
+    fps = float(capture.get(cv2.CAP_PROP_FPS) or (video_info or {}).get("fps") or 0.0)
+    output_fps = fps if fps > 0 else float((video_info or {}).get("fps") or 24.0)
    progress = st.progress(0)
     status = st.empty()
 
     predictor.reset_state()
     records = []
     processed = 0
+    rendered_frames = 0
     frame_index = 0
+    truncated = False
     explain_enabled = bool(explainability_config and explainability_config.get("enabled"))
+    latest_overlay_result = {
+        "phase": "unknown",
+        "confidence": 0.0,
+    }
+    overlay_writer = None
+    overlay_intermediate_path = None
+    overlay_warning = None
+    analysis_failed = False
 
     try:
         while True:
@@ -537,51 +713,107 @@ def _analyse_video(
             if not ok:
                 break
 
-            …
+            sampled_frame = frame_index % frame_stride == 0
+
+            if sampled_frame:
+                rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+                started = time.perf_counter()
+                result = predictor.predict(rgb, explainability=explainability_config if explain_enabled else None)
+                elapsed_ms = (time.perf_counter() - started) * 1000.0
+
+                probs = result.get("probs", [0.0, 0.0, 0.0, 0.0])
+                record = {
+                    "frame_index": frame_index,
+                    "timestamp_sec": round(frame_index / fps, 3) if fps > 0 else None,
+                    "phase": result.get("phase", "unknown"),
+                    "phase_id": _phase_index(result.get("phase", "unknown")),
+                    "confidence": float(result.get("confidence", 0.0)),
+                    "frames_used": int(result.get("frames_used", processed + 1)),
+                    "idle": float(probs[0]) if len(probs) > 0 else 0.0,
+                    "marking": float(probs[1]) if len(probs) > 1 else 0.0,
+                    "injection": float(probs[2]) if len(probs) > 2 else 0.0,
+                    "dissection": float(probs[3]) if len(probs) > 3 else 0.0,
+                    "inference_ms": round(elapsed_ms, 3),
+                }
+                records.append(record)
+                latest_overlay_result = {
+                    "phase": record["phase"],
+                    "confidence": record["confidence"],
+                }
+                processed += 1
+
+                if explain_enabled and explainability_callback is not None:
+                    explainability_callback(result.get("explainability"), processed, frame_index)
+
+            if overlay_writer is None:
+                overlay_writer, overlay_intermediate_path = create_overlay_video_writer(
+                    frame_size=(frame.shape[1], frame.shape[0]),
+                    fps=output_fps,
+                    temp_dir=temp_path.parent,
+                )
+
+            overlay_frame = draw_prediction_overlay(
+                frame,
+                phase=latest_overlay_result["phase"],
+                confidence=float(latest_overlay_result["confidence"]),
+                model_label=model_label,
+                frame_index=frame_index,
+                fps=output_fps,
+                total_frames=total_frames,
+                sampled_frame=sampled_frame,
+            )
+            overlay_writer.write(overlay_frame)
+            rendered_frames += 1
 
             if total_frames > 0:
                 progress.progress(min(frame_index + 1, total_frames) / total_frames)
             else:
                 progress.progress(min(processed / max_frames, 1.0))
-            status.caption(…
+            status.caption(
+                f"Processed {processed} sampled frames · rendered {rendered_frames} playback frames"
+            )
 
             frame_index += 1
-            if processed >= max_frames:
+            if sampled_frame and processed >= max_frames:
+                truncated = total_frames <= 0 or frame_index < total_frames
                 break
+    except Exception:
+        analysis_failed = True
+        raise
     finally:
         capture.release()
         predictor.reset_state()
-        …
+        if overlay_writer is not None:
+            overlay_writer.release()
+        progress.empty()
+        status.empty()
+        if analysis_failed and overlay_intermediate_path is not None:
+            overlay_intermediate_path.unlink(missing_ok=True)
+
+    annotated_video_path = None
+    if overlay_intermediate_path is not None and rendered_frames > 0:
+        transcoded_path, overlay_warning = transcode_video_for_streamlit(
+            overlay_intermediate_path,
+            temp_dir=temp_path.parent,
+        )
+        annotated_video_path = str(transcoded_path)
+
+    analysis_meta = {
+        "fps": fps,
+        "total_frames": total_frames,
+        "sampled_frames": processed,
+        "rendered_frames": rendered_frames,
+        "annotated_video_path": annotated_video_path,
+        "annotated_video_warning": overlay_warning,
+        "analysis_complete": not truncated,
+    }
+    if annotated_video_path is not None:
+        annotated_path = Path(annotated_video_path)
+        if annotated_path.exists():
+            analysis_meta["annotated_video_size_label"] = format_bytes(annotated_path.stat().st_size)
+        analysis_meta["annotated_video_duration_sec"] = rendered_frames / output_fps if output_fps > 0 else None
+
+    return records, analysis_meta
 
 
 def _render_single_result(result: dict):
@@ -607,6 +839,27 @@ def _render_video_results(records, meta):
         st.warning("No frames were processed from the uploaded video.")
         return
 
+    annotated_video_path = meta.get("annotated_video_path")
+    if annotated_video_path:
+        st.subheader("Annotated playback")
+        st.caption(
+            "The analysed clip now includes a HUD-style phase overlay directly on the video frames, mirroring the live webapp feel."
+        )
+        st.video(annotated_video_path)
+        overlay_details = []
+        if meta.get("annotated_video_size_label"):
+            overlay_details.append(f"overlay size {meta['annotated_video_size_label']}")
+        if meta.get("rendered_frames"):
+            overlay_details.append(f"{int(meta['rendered_frames'])} rendered frames")
+        if overlay_details:
+            st.caption(" · ".join(overlay_details))
+    if meta.get("annotated_video_warning"):
+        st.warning(meta["annotated_video_warning"])
+    if not meta.get("analysis_complete", True):
+        st.info(
+            "The playback clip and tables cover only the analysed portion of the video because the sampled-frame limit was reached."
+        )
+
     df = _records_to_frame(records)
     counts = Counter(df["phase"].tolist())
     dominant_phase, _ = counts.most_common(1)[0]
@@ -645,12 +898,9 @@
     right.download_button("Download CSV", csv_payload, file_name="phase_timeline.csv", mime="text/csv")
 
 
-def …
-    enabled_model_keys …
-    …
-    manager = _get_model_manager()
-    _inject_app_styles()
-    _render_project_hub(enabled_model_keys)
+def _render_workspace_page(
+    enabled_model_keys: list[str], default_model_key: str, manager: SpaceModelManager
+) -> None:
     previous_selected_model_key, model_key = _resolve_model_selection(enabled_model_keys, default_model_key)
 
     _render_workspace_header(enabled_model_keys, model_key)
@@ -659,6 +909,7 @@ def main():
     st.session_state[SELECTED_MODEL_STATE_KEY] = model_key
     if previous_selected_model_key is not None and previous_selected_model_key != model_key:
         manager.unload_model()
+        _clear_video_analysis()
 
     explainability_config, explainability_spec = _build_explainability_config(manager, model_key)
 
@@ -775,6 +1026,17 @@ def main():
     except Exception as exc:
         st.error(str(exc))
     else:
+        analysis_signature = _video_analysis_signature(
+            st.session_state.get("staged_video_signature"),
+            model_key,
+            frame_stride,
+            max_frames,
+        )
+        saved_analysis = st.session_state.get(VIDEO_ANALYSIS_STATE_KEY)
+        if saved_analysis and saved_analysis.get("signature") != analysis_signature:
+            _clear_video_analysis()
+            saved_analysis = None
+
         info_cols = st.columns(5)
         info_cols[0].metric("File size", video_meta["file_size_label"])
         info_cols[1].metric("Duration", video_meta["duration_label"])
@@ -812,6 +1074,7 @@ def main():
                     title=f"Live explainability · sampled frame {processed_count}",
                 )
 
+            _clear_video_analysis()
            try:
                 with st.spinner(f"Running {MODEL_LABELS[model_key]} on {uploaded_video.name}..."):
                     predictor = manager.get_predictor(model_key)
@@ -820,6 +1083,8 @@ def main():
                         predictor,
                         frame_stride=frame_stride,
                         max_frames=max_frames,
+                        video_info=video_meta,
+                        model_label=MODEL_LABELS[model_key],
                         explainability_config=explainability_config if explainability_config.get("enabled") else None,
                         explainability_callback=(
                             _video_explainability_callback
@@ -831,21 +1096,49 @@ def main():
                         **video_meta,
                         **analysis_meta,
                     }
+                    st.session_state[VIDEO_ANALYSIS_STATE_KEY] = {
+                        "signature": analysis_signature,
+                        "records": records,
+                        "meta": meta,
+                        "annotated_video_path": meta.get("annotated_video_path"),
+                        "latest_explainability": latest_payload["value"],
+                    }
+                    saved_analysis = st.session_state[VIDEO_ANALYSIS_STATE_KEY]
             except Exception as exc:
                 st.error(str(exc))
-            …
+
+        saved_analysis = st.session_state.get(VIDEO_ANALYSIS_STATE_KEY)
+        if saved_analysis and saved_analysis.get("signature") == analysis_signature:
+            _render_video_results(saved_analysis["records"], saved_analysis["meta"])
+            if explainability_config.get("enabled"):
+                _render_explainability_panel(
+                    video_explain_placeholder,
+                    saved_analysis.get("latest_explainability"),
+                    enabled=True,
+                    spec=explainability_spec,
+                    title="Explainability",
+                )
     else:
         _clear_video_stage()
 
 
+def main():
+    enabled_model_keys = _enabled_model_keys()
+    default_model_key = _default_model_key(enabled_model_keys)
+    manager = _get_model_manager()
+    _inject_app_styles()
+    current_page = _render_page_navigation()
+
+    if current_page == "home":
+        _render_project_hub(enabled_model_keys)
+        return
+
+    if current_page == "projects":
+        _render_projects_page()
+        return
+
+    _render_workspace_page(enabled_model_keys, default_model_key, manager)
+
+
 if __name__ == "__main__":
     main()
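The heart of the app.py change is the signature-gated cache: analysis results live in st.session_state under a tuple of every input that affects them, so a Streamlit rerun re-renders the saved results instead of re-running the model. A toy version of the same pattern, with illustrative names:

# Toy version of the signature-gated caching that app.py builds around
# VIDEO_ANALYSIS_STATE_KEY; names here are illustrative.
import streamlit as st

ANALYSIS_KEY = "demo-analysis-state"


def get_or_run(signature: tuple, run_analysis):
    saved = st.session_state.get(ANALYSIS_KEY)
    if saved and saved["signature"] == signature:
        return saved["result"]  # reuse across reruns: same upload, model, stride
    result = run_analysis()  # expensive path, runs once per distinct signature
    st.session_state[ANALYSIS_KEY] = {"signature": signature, "result": result}
    return result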
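The diff relies on SpaceModelManager.unload_model() to keep only one predictor resident on the shared A10G; the manager itself is outside this commit. A hedged sketch of what such an unload commonly looks like for a PyTorch-backed manager, where the torch calls are an assumption about the implementation rather than a quote of it:

# Hedged sketch of one-model-at-a-time unloading; SpaceModelManager's real
# implementation is not part of this commit, and the torch calls below are an
# assumed but common way to release VRAM between model families.
import gc

import torch


class TinyModelManager:
    def __init__(self):
        self._predictor = None

    def unload_model(self) -> None:
        self._predictor = None  # drop the only strong reference to the model
        gc.collect()  # reclaim Python-side objects first
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # hand cached VRAM blocks back to the driver

    def get_predictor(self, build):
        if self._predictor is None:
            self._predictor = build()  # lazy-load exactly one family at a time
        return self._predictor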
video_utils.py
CHANGED

@@ -8,12 +8,23 @@ import tempfile
 from pathlib import Path
 
 import cv2
+import numpy as np
 
 STREAMLIT_SERVER_MAX_UPLOAD_MB = 4096
 INLINE_VIDEO_PREVIEW_MAX_MB = 256
 MIN_FREE_WORKSPACE_BUFFER_MB = int(os.getenv("SPACE_MIN_FREE_SPACE_BUFFER_MB", "2048"))
 COPY_CHUNK_BYTES = 8 * 1024 * 1024
 SUPPORTED_VIDEO_TYPES = ["mp4", "mov", "avi", "mkv", "webm", "m4v"]
+PHASE_OVERLAY_BGR = {
+    "idle": (139, 125, 96),
+    "marking": (255, 140, 79),
+    "injection": (255, 198, 104),
+    "dissection": (107, 107, 255),
+    "unknown": (248, 250, 252),
+}
+HUD_BACKGROUND_BGR = (20, 10, 6)
+HUD_TEXT_BGR = (252, 250, 248)
+HUD_MUTED_BGR = (215, 200, 191)
 
 
 def format_bytes(num_bytes: int) -> str:
@@ -71,6 +82,51 @@ def get_workspace_free_bytes(temp_dir: Path | None = None) -> int:
     return shutil.disk_usage(target_dir).free
 
 
+def create_temp_video_path(
+    *, suffix: str = ".mp4", prefix: str = "portfolio-overlay-", temp_dir: Path | None = None
+) -> Path:
+    target_dir = Path(temp_dir or tempfile.gettempdir())
+    target_dir.mkdir(parents=True, exist_ok=True)
+    fd, temp_name = tempfile.mkstemp(prefix=prefix, suffix=suffix, dir=target_dir)
+    os.close(fd)
+    return Path(temp_name)
+
+
+def create_overlay_video_writer(
+    *, frame_size: tuple[int, int], fps: float, temp_dir: Path | None = None
+) -> tuple[cv2.VideoWriter, Path]:
+    effective_fps = fps if fps > 0 else 24.0
+    candidates = (
+        ("mp4v", ".mp4"),
+        ("avc1", ".mp4"),
+        ("MJPG", ".avi"),
+        ("XVID", ".avi"),
+    )
+    tried = []
+    for codec, suffix in candidates:
+        temp_path = create_temp_video_path(
+            prefix="portfolio-overlay-raw-",
+            suffix=suffix,
+            temp_dir=temp_dir,
+        )
+        writer = cv2.VideoWriter(
+            str(temp_path),
+            cv2.VideoWriter_fourcc(*codec),
+            effective_fps,
+            frame_size,
+        )
+        if writer.isOpened():
+            return writer, temp_path
+        writer.release()
+        temp_path.unlink(missing_ok=True)
+        tried.append(f"{codec}{suffix}")
+
+    raise RuntimeError(
+        "Unable to create annotated playback video for this environment. "
+        f"Tried {', '.join(tried)}."
+    )
+
+
 def spool_uploaded_video(uploaded_file, suffix: str | None = None, temp_dir: Path | None = None) -> Path:
     file_size_bytes = get_upload_size_bytes(uploaded_file)
     if file_size_bytes > STREAMLIT_SERVER_MAX_UPLOAD_MB * 1024 * 1024:
@@ -99,6 +155,150 @@ def spool_uploaded_video(uploaded_file, suffix: str | None = None, temp_dir: Pat
     return temp_path
 
 
+def draw_prediction_overlay(
+    frame: np.ndarray,
+    *,
+    phase: str,
+    confidence: float,
+    model_label: str,
+    frame_index: int,
+    fps: float,
+    total_frames: int | None = None,
+    sampled_frame: bool = True,
+) -> np.ndarray:
+    annotated = frame.copy()
+    overlay = annotated.copy()
+    height, width = annotated.shape[:2]
+    margin = max(16, width // 64)
+    available_width = max(120, width - (margin * 2))
+    available_height = max(72, height - (margin * 2))
+    panel_width = min(max(int(width * 0.44), 240), available_width)
+    panel_height = min(max(int(height * 0.16), 88), min(140, available_height))
+    accent = _phase_overlay_color_bgr(phase)
+
+    panel_x1 = margin
+    panel_y1 = max(margin, height - panel_height - margin)
+    panel_x2 = panel_x1 + panel_width
+    panel_y2 = panel_y1 + panel_height
+    cv2.rectangle(overlay, (panel_x1, panel_y1), (panel_x2, panel_y2), HUD_BACKGROUND_BGR, -1)
+    cv2.rectangle(overlay, (panel_x1, panel_y1), (panel_x2, panel_y2), accent, 2)
+    cv2.rectangle(overlay, (panel_x1 + 12, panel_y1 + 12), (panel_x1 + 18, panel_y2 - 12), accent, -1)
+    cv2.addWeighted(overlay, 0.68, annotated, 0.32, 0, annotated)
+
+    base_scale = max(0.52, min(1.0, width / 1200.0))
+    phase_label = phase.title() if phase and phase.lower() != "unknown" else "--"
+    status_label = "Sampled frame" if sampled_frame else "Carry-forward overlay"
+    _draw_shadowed_text(
+        annotated,
+        f"Phase: {phase_label}",
+        (panel_x1 + 34, panel_y1 + 32),
+        accent,
+        font_scale=base_scale * 0.92,
+        thickness=2,
+    )
+    _draw_shadowed_text(
+        annotated,
+        f"Confidence {confidence:.1%} • {status_label}",
+        (panel_x1 + 34, panel_y1 + 58),
+        HUD_TEXT_BGR,
+        font_scale=base_scale * 0.62,
+        thickness=1,
+    )
+
+    current_time = format_duration(frame_index / fps if fps > 0 else None)
+    total_time = format_duration(total_frames / fps if fps > 0 and total_frames else None)
+    header_text = model_label
+    if current_time != "Unknown" and total_time != "Unknown":
+        header_text = f"{header_text} • {current_time} / {total_time}"
+    elif current_time != "Unknown":
+        header_text = f"{header_text} • {current_time}"
+
+    header_scale = max(0.46, min(0.62, width / 1600.0))
+    text_width, text_height = cv2.getTextSize(
+        header_text, cv2.FONT_HERSHEY_SIMPLEX, header_scale, 1
+    )[0]
+    header_width = min(text_width + 28, available_width)
+    header_height = text_height + 18
+    header_x2 = width - margin
+    header_x1 = max(margin, header_x2 - header_width)
+    header_y1 = margin
+    header_y2 = header_y1 + header_height
+    header_overlay = annotated.copy()
+    cv2.rectangle(header_overlay, (header_x1, header_y1), (header_x2, header_y2), HUD_BACKGROUND_BGR, -1)
+    cv2.rectangle(header_overlay, (header_x1, header_y1), (header_x2, header_y2), accent, 1)
+    cv2.addWeighted(header_overlay, 0.62, annotated, 0.38, 0, annotated)
+    _draw_shadowed_text(
+        annotated,
+        header_text,
+        (header_x1 + 14, header_y1 + text_height + 5),
+        HUD_MUTED_BGR,
+        font_scale=header_scale,
+        thickness=1,
+    )
+
+    if total_frames and total_frames > 0:
+        progress_ratio = min(max((frame_index + 1) / total_frames, 0.0), 1.0)
+        bar_x1 = margin
+        bar_y1 = height - 8
+        bar_x2 = width - margin
+        bar_y2 = height - 4
+        cv2.rectangle(annotated, (bar_x1, bar_y1), (bar_x2, bar_y2), (60, 67, 82), -1)
+        cv2.rectangle(
+            annotated,
+            (bar_x1, bar_y1),
+            (bar_x1 + int((bar_x2 - bar_x1) * progress_ratio), bar_y2),
+            accent,
+            -1,
+        )
+
+    return annotated
+
+
+def transcode_video_for_streamlit(
+    video_path: str | Path, *, temp_dir: Path | None = None
+) -> tuple[Path, str | None]:
+    input_path = Path(video_path)
+    ffmpeg_path = shutil.which("ffmpeg")
+    if ffmpeg_path is None:
+        return input_path, "ffmpeg is unavailable, so the raw overlay clip is being used for playback."
+
+    output_path = create_temp_video_path(prefix="portfolio-overlay-final-", suffix=".mp4", temp_dir=temp_dir)
+    command = [
+        ffmpeg_path,
+        "-y",
+        "-loglevel",
+        "error",
+        "-i",
+        str(input_path),
+        "-an",
+        "-c:v",
+        "libx264",
+        "-preset",
+        "veryfast",
+        "-pix_fmt",
+        "yuv420p",
+        "-movflags",
+        "+faststart",
+        str(output_path),
+    ]
+    try:
+        result = subprocess.run(command, capture_output=True, text=True, timeout=600, check=False)
+    except subprocess.TimeoutExpired:
+        output_path.unlink(missing_ok=True)
+        return input_path, "ffmpeg timed out while transcoding the overlay clip, so the raw overlay video is being used."
+    if result.returncode != 0:
+        output_path.unlink(missing_ok=True)
+        error_line = result.stderr.strip().splitlines()[-1] if result.stderr.strip() else "unknown ffmpeg error"
+        return (
+            input_path,
+            f"ffmpeg could not transcode the overlay clip for browser-friendly playback ({error_line}). "
+            "Using the directly encoded clip instead.",
+        )
+
+    input_path.unlink(missing_ok=True)
+    return output_path, None
+
+
 def probe_video_info(video_path: str | Path) -> dict:
     path = Path(video_path)
     if not path.exists():
@@ -221,3 +421,40 @@ def _parse_fraction(value: str) -> float:
         return float(value)
     except ValueError:
         return 0.0
+
+
+def _phase_overlay_color_bgr(phase: str | None) -> tuple[int, int, int]:
+    phase_key = (phase or "").strip().lower()
+    return PHASE_OVERLAY_BGR.get(phase_key, PHASE_OVERLAY_BGR["unknown"])
+
+
+def _draw_shadowed_text(
+    frame: np.ndarray,
+    text: str,
+    origin: tuple[int, int],
+    color: tuple[int, int, int],
+    *,
+    font_scale: float,
+    thickness: int,
+) -> None:
+    shadow_origin = (origin[0] + 1, origin[1] + 1)
+    cv2.putText(
+        frame,
+        text,
+        shadow_origin,
+        cv2.FONT_HERSHEY_SIMPLEX,
+        font_scale,
+        (0, 0, 0),
+        thickness + 2,
+        cv2.LINE_AA,
+    )
+    cv2.putText(
+        frame,
+        text,
+        origin,
+        cv2.FONT_HERSHEY_SIMPLEX,
+        font_scale,
+        color,
+        thickness,
+        cv2.LINE_AA,
+    )
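Outside Streamlit, the three new helpers compose into a small offline pipeline: open a clip, burn the latest prediction onto every frame (carrying it forward between sampled frames), then transcode for browser playback. A usage sketch, assuming predictions maps a sampled frame index to (phase, confidence); it mirrors how _analyse_video wires the helpers together, trimmed down for offline use:

# Usage sketch for the new video_utils helpers; predictions is an assumed
# input format, and the wiring mirrors app.py's _analyse_video.
from pathlib import Path

import cv2

from video_utils import (
    create_overlay_video_writer,
    draw_prediction_overlay,
    transcode_video_for_streamlit,
)


def annotate_clip(src: str, predictions: dict[int, tuple[str, float]]) -> Path:
    capture = cv2.VideoCapture(src)
    fps = float(capture.get(cv2.CAP_PROP_FPS) or 24.0)
    writer, out_path = None, None
    phase, confidence = "unknown", 0.0
    frame_index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            if writer is None:  # lazily pick a codec that opens on this host
                writer, out_path = create_overlay_video_writer(
                    frame_size=(frame.shape[1], frame.shape[0]), fps=fps
                )
            # Carry the last prediction forward on unsampled frames.
            phase, confidence = predictions.get(frame_index, (phase, confidence))
            writer.write(
                draw_prediction_overlay(
                    frame,
                    phase=phase,
                    confidence=confidence,
                    model_label="DINO-Endo",
                    frame_index=frame_index,
                    fps=fps,
                    sampled_frame=frame_index in predictions,
                )
            )
            frame_index += 1
    finally:
        capture.release()
        if writer is not None:
            writer.release()
    if out_path is None:
        raise RuntimeError(f"no frames decoded from {src}")
    final_path, warning = transcode_video_for_streamlit(out_path)
    if warning:
        print(warning)  # fell back to the raw OpenCV encoding
    return final_path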
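transcode_video_for_streamlit targets H.264 with yuv420p and +faststart because OpenCV's mp4v output often will not play in browsers. A quick check, assuming ffprobe is on PATH, that an exported clip actually landed on those settings:

# Sanity-check that a clip uses the browser-friendly settings the transcode
# step targets (H.264 + yuv420p); assumes ffprobe is available on PATH.
import json
import shutil
import subprocess


def is_browser_friendly(path: str) -> bool:
    ffprobe = shutil.which("ffprobe")
    if ffprobe is None:
        return False  # cannot verify without ffprobe
    result = subprocess.run(
        [
            ffprobe, "-v", "error", "-select_streams", "v:0",
            "-show_entries", "stream=codec_name,pix_fmt",
            "-of", "json", path,
        ],
        capture_output=True, text=True, check=False,
    )
    streams = json.loads(result.stdout or "{}").get("streams") or [{}]
    stream = streams[0]
    return stream.get("codec_name") == "h264" and stream.get("pix_fmt") == "yuv420p"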