Spaces: AnhLee0/AI-Coach


Hoang Duc Hung committed 26 days ago

Commit 350d731 (1 parent: 56251f6)

feat: stabilize push-up feedback and VLM handling


Files changed (17)

.env.example +5 -2
README.md +65 -18
docs/LLM_FEEDBACK_NVIDIA_LANGGRAPH.md +57 -159
docs/Migrating Agent to OpenAI.md +0 -750
docs/README.md +47 -0
docs/SETUP_NOTES.md +73 -25
docs/UI_REDESIGN.md +70 -18
docs/implementation_plan.md +52 -75
docs/legacy_agent_notes.md +22 -0
docs/problem_statement.md +37 -4
push_up/analysis_service.py +381 -12
push_up/feedback_graph.py +563 -0
reflex_frontend/state.py +24 -3
reflex_frontend/ui.py +86 -7
requirements.txt +2 -1
scripts/log_hook.py +134 -6
scripts/run_pushup_eval_tests.py +255 -53

.env.example CHANGED

@@ -2,8 +2,11 @@
 # ANTHROPIC_API_KEY=sk-ant-...
 # OPENAI_API_KEY=sk-...
 GEMINI_API_KEY=...
-NVIDIA_API_KEY=nvapi-...
-NVIDIA_MODEL=nvidia/nvidia-nemotron-nano-9b-v2
 # Default model
 # DEFAULT_MODEL=claude-sonnet-4-20250514

 # ANTHROPIC_API_KEY=sk-ant-...
 # OPENAI_API_KEY=sk-...
 GEMINI_API_KEY=...
+NVIDIA_API_KEY=nvapi-your-real-nvidia-key
+NVIDIA_BASE_URL=https://integrate.api.nvidia.com/v1
+NVIDIA_MODEL=deepseek-ai/deepseek-v4-flash
+NVIDIA_VISION_MODEL=meta/llama-3.2-11b-vision-instruct
+NVIDIA_VISION_MAX_TOKENS=1200
 # Default model
 # DEFAULT_MODEL=claude-sonnet-4-20250514
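The new variables above could be read in one place at startup. The following is a minimal sketch, not the code from this commit: `load_nvidia_config`, `NvidiaConfig`, and `PLACEHOLDER_KEYS` are hypothetical names, and using the `.env.example` values as fallbacks is an assumption. The only behavior taken from the commit's docs is that a missing or placeholder `NVIDIA_API_KEY` disables the VLM.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Placeholder values seen in the old and new .env.example (treated as "no key").
PLACEHOLDER_KEYS = {"", "nvapi-...", "nvapi-your-real-nvidia-key"}


@dataclass(frozen=True)
class NvidiaConfig:
    api_key: str
    base_url: str
    text_model: str
    vision_model: str
    vision_max_tokens: int


def load_nvidia_config(env: Mapping[str, str] = os.environ) -> Optional[NvidiaConfig]:
    """Return the NVIDIA settings, or None when no usable API key is set."""
    api_key = env.get("NVIDIA_API_KEY", "").strip()
    if api_key in PLACEHOLDER_KEYS:
        return None  # caller keeps rule-based feedback only
    return NvidiaConfig(
        api_key=api_key,
        # Fallbacks below mirror .env.example; using them as defaults is an assumption.
        base_url=env.get("NVIDIA_BASE_URL", "https://integrate.api.nvidia.com/v1"),
        text_model=env.get("NVIDIA_MODEL", "deepseek-ai/deepseek-v4-flash"),
        vision_model=env.get("NVIDIA_VISION_MODEL", "meta/llama-3.2-11b-vision-instruct"),
        vision_max_tokens=int(env.get("NVIDIA_VISION_MAX_TOKENS", "1200")),
    )
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.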

README.md CHANGED

@@ -1,27 +1,74 @@
-# AI Sports Coach - Taekwondo Biomechanics Analysis
-An application that uses AI to analyze and coach Taekwondo training from video.
-## Features
-- Extracts a 3D skeleton (33 points) and hand landmarks with MediaPipe Holistic.
-- Synchronizes the user's and the expert's movements with the DTW algorithm.
-- Computes biomechanical joint angles to detect errors.
-- Video overlay: superimposes the videos and circles error locations in red.
-## Installation
-1. Install the required libraries:
-```bash
 pip install -r requirements.txt
 ```
-2. Run the Streamlit app:
-```bash
-streamlit run app.py
 ```
-## Project structure
-- `src/core/pose.py`: AI skeleton detection.
-- `src/core/compare.py`: Comparison and synchronization algorithms.
-- `src/core/overlay.py`: Drawing and video overlay engine.
-- `src/core/preprocess_reference.py`: Tool that builds reference data from expert videos.

+# AI Coach - Push-up Video Analysis
+A Reflex app that analyzes push-up exercises from video. The system uses MediaPipe Pose to extract landmarks, a rule engine to detect technique errors, DTW to compare against a template video, and an optional NVIDIA VLM to generate written feedback in Vietnamese.
+## Features
+- Upload a student video and analyze it rep by rep.
+- Compare against the template video `data/templates/push_up_template.mp4`.
+- Score the overall performance and each rep.
+- Detect errors with rules:
+  - Not lowering deep enough.
+  - Sagging back.
+  - Body not kept straight.
+  - Neck bent or head dropped too far.
+  - Hips raised too high.
+- Draw an arrow at the error location, computed deterministically from landmarks.
+- Show rule-based feedback and optional VLM feedback.
+- Batch-test every video in `data/tests` and export a summary JSON file.
+## Installation
+```powershell
+python -m venv .venv
+.\.venv\Scripts\Activate.ps1
 pip install -r requirements.txt
 ```
+## Running the App
+```powershell
+reflex run
+```
+Open the frontend at:
+```text
+http://localhost:3000/
+```
+## Optional VLM Configuration
+Copy `.env.example` to `.env` and fill in the NVIDIA API key:
+```env
+NVIDIA_API_KEY=nvapi-your-real-nvidia-key
+NVIDIA_BASE_URL=https://integrate.api.nvidia.com/v1
+NVIDIA_MODEL=deepseek-ai/deepseek-v4-flash
+NVIDIA_VISION_MODEL=meta/llama-3.2-11b-vision-instruct
+NVIDIA_VISION_MAX_TOKENS=1200
+```
+Without an API key, the app still runs the rule-based analysis normally.
+## Running the Batch Test
+```powershell
+python scripts\run_pushup_eval_tests.py
+```
+Save image and arrow artifacts:
+```powershell
+python scripts\run_pushup_eval_tests.py --save-artifacts
 ```
+Enable the VLM in the batch test:
+```powershell
+python scripts\run_pushup_eval_tests.py --save-artifacts --enable-vlm
+```
+## Documentation
+See `docs/README.md` for the doc map and a detailed explanation of the current workflow.
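The two batch-test flags shown in the README could be declared with `argparse` roughly as below. This is a sketch under assumptions: the actual contents of `scripts/run_pushup_eval_tests.py` are not shown in this part of the diff, so `build_parser`, `parse_args`, and the help strings are hypothetical. Only the flag names `--save-artifacts` and `--enable-vlm` come from the README.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two flags documented in the README."""
    parser = argparse.ArgumentParser(description="Batch-evaluate push-up test videos.")
    parser.add_argument(
        "--save-artifacts",
        action="store_true",
        help="Save annotated images and arrows for each analyzed video.",
    )
    parser.add_argument(
        "--enable-vlm",
        action="store_true",
        help="Call the NVIDIA VLM; off by default so the test runs without an API key.",
    )
    return parser


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    # argv=None makes argparse read sys.argv[1:], which is the normal CLI path.
    return build_parser().parse_args(argv)
```

Keeping both flags as `store_true` switches matches the documented default, where a plain run stays offline and rule-based.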

docs/LLM_FEEDBACK_NVIDIA_LANGGRAPH.md CHANGED

@@ -1,201 +1,99 @@
-# NVIDIA + LangGraph Feedback Plan
-## Goal
-Add personalized coaching feedback without changing the current computer-vision core.
-Current core stays deterministic:
 ```text
-video -> MediaPipe pose -> push-up reps -> rule errors + scores -> annotated frames
 ```
-LLM is added only after analysis:
-```text
-analysis result JSON -> LangGraph feedback graph -> Vietnamese coaching text
-```
-The LLM must not detect pose errors from scratch. It only explains validated errors that already come from `push_up/rules.py` and `push_up/evaluator.py`.
-## Recommended NVIDIA Model
-Default model:
-```text
-nvidia/nvidia-nemotron-nano-9b-v2
-```
-Reason:
-- The task is short Vietnamese coaching text generation from structured JSON, not long visual reasoning.
-- Low latency matters because feedback is generated inside the Reflex request flow.
-- It is strong enough for instruction following and agentic workflows.
-- Cost/latency is more appropriate than larger models for this use case.
-Upgrade option:
-```text
-nvidia/nemotron-3-nano-30b-a3b
-```
-Use this if the 9B model produces weak Vietnamese, ignores formatting, or needs stronger reasoning across many rep-level errors.
-Do not start with a vision-language model. The app already has pose landmarks and error labels; sending frames/video to a VLM would add cost, latency, and privacy risk without improving the core decision logic.
-## Environment
-Add these variables:
 ```env
-NVIDIA_API_KEY=nvapi-...
-NVIDIA_MODEL=nvidia/nvidia-nemotron-nano-9b-v2
-```
-Dependencies:
-```text
-langgraph
-langchain-nvidia-ai-endpoints
 ```
-## LangGraph Design
-Use a small graph, not a free-form agent loop.
-```text
-START
-  -> build_evidence
-  -> generate_feedback
-  -> validate_feedback
-  -> END
-```
-### State
-```python
-class FeedbackState(TypedDict):
-    analysis: dict
-    evidence: dict
-    feedback: str
-    warnings: list[str]
-```
-### Node: build_evidence
-Input: full result from `analyze_pushup()`.
-Output: compact evidence object for the LLM.
-Keep only:
-- overall score
-- student rep count
-- expert rep count
-- main errors with count/severity/guidance
-- top 3 weakest reps
-- rep-level score and error labels
-Do not send image bytes or video paths to the text LLM.
-### Node: generate_feedback
-Use `ChatNVIDIA` and a strict prompt.
-Prompt rules:
-- Vietnamese only.
-- Maximum 120-150 words.
-- Do not invent errors outside `evidence`.
-- Mention the main repeated error first.
-- Include one correction cue for the next set.
-- Include one drill.
-- If no errors, encourage maintenance and tempo control.
-### Node: validate_feedback
-Fast deterministic validation:
-- If empty, use fallback summary.
-- If too long, trim or ask model to rewrite shorter.
-- If it mentions an error not in `main_errors` or `rep_results`, discard and use fallback.
-## Integration Points
-Add a new file:
 ```text
-push_up/feedback_graph.py
 ```
-Expose:
-```python
-def generate_coach_feedback(analysis_result: dict) -> str:
-    ...
-```
-Then in `push_up/analysis_service.py`, after `payload` is built:
-```python
-payload["coach_feedback"] = generate_coach_feedback(payload)
 ```
-In `reflex_frontend/state.py`, add:
-```python
-coach_feedback: str = ""
-```
-Set it after analysis:
-```python
-self.coach_feedback = result.get("coach_feedback", "")
-```
-In `reflex_frontend/ui.py`, show it in `results_panel()`.
-## Why Not Let LLM Decide The Error Location?
-The arrow location should remain deterministic.
-Correct responsibility split:
-```text
-Rule engine -> decides error type
-Landmark mapping -> decides arrow target
-OpenCV -> draws arrow
-LLM -> explains how to fix
-```
-This avoids hallucinated arrows and keeps the UI explainable.
-## Test Plan
-Run:
 ```powershell
-python scripts/run_pushup_eval_tests.py
 ```
-Expected behavior:
-- `template_vs_template`: near-perfect score, no major errors.
-- `hv01_cuoi_dau_thap`: main error should be head misalignment.
-- `hv01_hit_nua_rep`: main error should include not deep enough.
-- `hv01_khong_gong_bung`: main error should include body alignment or hip sag.
-- `hv01_mong_cao`: main error should be hip pike.
-- `hv01_rep_sai_rep_dung`: mixed good and bad reps.
-- `hv01_tap_dung`: no major errors.
-- `hv02_tap_dung`: should auto-flip orientation and produce no major errors.
-- `vo_teakwondo`: should be rejected as not push-up.
-Log should include:
-- compared video names
-- detected orientation
-- overall score
-- student/expert rep count
-- per-rep score
-- per-rep errors

+# NVIDIA VLM Feedback
+## Role
+The VLM is a supporting component that runs after computer vision and the rule engine have finished their analysis. It does not replace the rule engine.
 ```text
+rule engine -> main errors + scores + arrow target
+VLM         -> short written feedback in Vietnamese
 ```
+In the UI, this section is displayed as `Feedback tu VLM` ("Feedback from VLM").
+## Provider and Model
+The app uses NVIDIA's OpenAI-compatible endpoint:
 ```env
+NVIDIA_API_KEY=nvapi-your-real-nvidia-key
+NVIDIA_BASE_URL=https://integrate.api.nvidia.com/v1
+NVIDIA_MODEL=deepseek-ai/deepseek-v4-flash
+NVIDIA_VISION_MODEL=meta/llama-3.2-11b-vision-instruct
+NVIDIA_VISION_MAX_TOKENS=1200
 ```
+The text model `NVIDIA_MODEL` provides overall feedback when needed. The vision model `NVIDIA_VISION_MODEL` receives the student image plus the template image and generates per-rep feedback.
+If `NVIDIA_API_KEY` is missing or still equals the placeholder, the app skips the VLM and uses an empty fallback for `llm_feedback`.
+## Input to the VLM
+For each rep with an error, the app can send:
+- The student image with pose landmarks drawn.
+- The template image with pose landmarks drawn.
+- Rule context: rep number, score, rule errors, severity, and guidance.
+The images are joined side by side in `push_up.feedback_graph._make_comparison_image()`:
 ```text
+left  = student rep
+right = mentor/template rep
 ```
+## Expected Output
+The VLM must return a short JSON object:
+```json
+{
+  "is_error": true,
+  "visual_error_label": "Nhô mông quá cao",
+  "diagnosis": "Học viên đẩy hông lên cao trong rep này.",
+  "correction": "Giữ vai và hông di chuyển cùng nhau.",
+  "feedback": "Học viên đẩy hông lên cao trong rep này. Giữ vai và hông di chuyển cùng nhau."
+}
 ```
+The VLM does not return an arrow. The arrow is computed deterministically from landmarks in `analysis_service`.
+## Why the VLM Is Not Used for Arrows
+Real-world testing showed that the VLM can return valid `x/y` values that nevertheless land on the image background, a chair, the floor, or another region that is not the body. For a coaching app, a misplaced arrow is more confusing than no arrow at all.
+Therefore:
+- The rule engine decides `primary_error`.
+- The landmarks of the faulty frame decide the arrow target.
+- The VLM only adds a feedback sentence that is easier for the user to understand.
+## Validation and Fallback
+`feedback_graph.py` has several safeguards:
+- Skip the VLM if the API key is missing.
+- Retry once if the response cannot be parsed as JSON.
+- Detect a truncated response via `finish_reason == "length"`.
+- Trim feedback that is too long.
+- Fall back to empty feedback if the VLM fails, so the rule-based feedback is still shown.
+## Payload Field Names
+The payload still uses the names `llm_feedback`, `llm_feedback_source`, and `llm_feedback_error` to avoid a broad refactor. In product terms, this is VLM feedback. The UI already displays it as `Feedback tu VLM`.
+## How to Test
+By default, the batch test runs with the VLM disabled:
 ```powershell
+python scripts\run_pushup_eval_tests.py
 ```
+Enable the real VLM:
+```powershell
+python scripts\run_pushup_eval_tests.py --save-artifacts --enable-vlm
+```
+When the VLM is enabled, make sure `.env` contains a valid `NVIDIA_API_KEY`.
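The validation and fallback rules in this document (retry once on unparseable JSON, detect `finish_reason == "length"`, trim long feedback, fall back to empty feedback on failure) can be sketched as follows. This is an illustration, not the code in `push_up/feedback_graph.py`: `get_vlm_feedback`, `parse_vlm_json`, the `call_vlm` callable, the error strings, and the 400-character limit are all hypothetical. Treating a truncated response as retryable is also an assumption.

```python
import json
from typing import Callable, Optional, Tuple

MAX_FEEDBACK_CHARS = 400  # assumed limit; the real threshold is not shown in the diff

# Caller-supplied function that performs one VLM request and
# returns (raw_text, finish_reason).
VlmCall = Callable[[], Tuple[str, Optional[str]]]


def parse_vlm_json(raw: str) -> Optional[dict]:
    """Extract the outermost JSON object from raw text, or None if unparseable."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


def get_vlm_feedback(call_vlm: VlmCall, retries: int = 1) -> dict:
    """Return llm_feedback / llm_feedback_error; feedback is '' on any failure."""
    error = ""
    for _ in range(retries + 1):
        try:
            raw, finish_reason = call_vlm()
        except Exception as exc:  # provider or network failure: no retry
            return {"llm_feedback": "", "llm_feedback_error": f"vlm_call_failed: {exc}"}
        if finish_reason == "length":
            error = "response_truncated"  # output was cut off by the token limit
            continue
        data = parse_vlm_json(raw)
        if data is None:
            error = "invalid_json"
            continue
        feedback = str(data.get("feedback", "")).strip()
        return {"llm_feedback": feedback[:MAX_FEEDBACK_CHARS], "llm_feedback_error": ""}
    # All attempts failed: empty feedback lets the rule-based text stand alone.
    return {"llm_feedback": "", "llm_feedback_error": error}
```

Returning an empty string instead of raising keeps the UI path identical whether or not the VLM is available.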

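The document above states that the arrow target comes from landmarks rather than from the VLM. A minimal sketch of that idea, assuming normalized `(x, y)` landmarks in MediaPipe Pose order, might look like this. The error keys, the error-to-landmark mapping, and `arrow_target` are hypothetical; only the landmark indices are the standard MediaPipe Pose ones.

```python
from typing import Dict, Optional, Sequence, Tuple

# Standard MediaPipe Pose landmark indices (33-point model).
NOSE, L_SHOULDER, R_SHOULDER, L_HIP, R_HIP = 0, 11, 12, 23, 24

# Hypothetical error keys mapped to the landmarks whose midpoint gets the arrow.
ARROW_LANDMARKS: Dict[str, Tuple[int, ...]] = {
    "hip_pike": (L_HIP, R_HIP),
    "hip_sag": (L_HIP, R_HIP),
    "head_misalignment": (NOSE,),
    "not_deep_enough": (L_SHOULDER, R_SHOULDER),
}


def arrow_target(
    primary_error: str,
    landmarks: Sequence[Tuple[float, float]],
    width: int,
    height: int,
) -> Optional[Tuple[int, int]]:
    """Return the pixel (x, y) for the arrow tip, or None for unknown errors."""
    indices = ARROW_LANDMARKS.get(primary_error)
    if not indices:
        return None
    # Average the normalized coordinates of the chosen landmarks.
    x = sum(landmarks[i][0] for i in indices) / len(indices)
    y = sum(landmarks[i][1] for i in indices) / len(indices)
    # Clamp to the frame so the arrow never points outside the image.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    return round(x * (width - 1)), round(y * (height - 1))
```

Because the target is always a body landmark, the arrow cannot land on the background, which is the failure mode the document reports for VLM-provided coordinates.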
docs/Migrating Agent to OpenAI.md DELETED

@@ -1,750 +0,0 @@
-# Chat Conversation
-Note: _This is purely the output of the chat conversation and does not contain any raw data, codebase snippets, etc. used to generate the output._
-### User Input
-update src/agent.py to use OpenAI API
-*Viewed [agent.py](file:///home/fed/ai/projec/A20-App-032/src/agent.py) *
-*Viewed [config.py](file:///home/fed/ai/projec/A20-App-032/src/config.py) *
-*Viewed [tools.py](file:///home/fed/ai/projec/A20-App-032/src/tools.py) *
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-I have updated `src/agent.py` to use the OpenAI API instead of the Anthropic API.
-In addition to `src/agent.py`, I proactively updated the following files to align with the new OpenAI integration:
-1. `src/tools.py` – Updated the `get_tool_schemas` function to successfully construct the tool payloads in OpenAI's precise `{"type": "function", ...}` JSON schema format.
-2. `src/config.py` – Adjusted the app's default model to use `gpt-4o` rather than `claude-sonnet-4-20250514`.
-### Key differences handled:
-- Switched the Anthropic client pattern over to use `OpenAI()` instances.
-- Re-architected `run_agent_loop` to use `response.choices[0].message.tool_calls` for tracking required tool runs, decoding the arguments correctly, and mapping those tool results back to the messages list properly (`{"role": "tool", "tool_call_id": id, "content": ...}`).
-- Pushed the `SYSTEM_PROMPT` inside the `messages` array under the `{"role": "system"}` property as expected by OpenAI APIs.
-Make sure you have `openai` installed (`pip install openai`) and that you add `OPENAI_API_KEY` to your local `.env`! Let me know if you want me to write tests or interact with the loop itself.
-### User Input
-read the requirements in problem_statement.md and plan the software build
-*Viewed [problem_statement.md](file:///home/fed/ai/projec/A20-App-032/problem_statement.md) *
-*Edited relevant file*
-### Planner Response
-Hi, I have read the project requirements in `problem_statement.md` carefully and drafted a detailed implementation plan for the "AI Sports Coach" software.
-Please see the detailed implementation plan (including the data-flow architecture and development roadmap) in the artifact attached below:
-[implementation_plan.md](file:///home/fed/.gemini/antigravity/brain/3368571a-8f93-438f-bc63-09a787991718/implementation_plan.md)
-### Some important design decisions:
-1. **Video synchronization**: Because the student and the professional reference rarely perform a move at the same speed, we need a frame-alignment algorithm (for example Dynamic Time Warping, DTW) rather than a plain `Frame-to-Frame` comparison, to avoid reporting wrong angles.
-2. **Overlay Video vs Draw Landmark**: Instead of only drawing a skeleton (stick figure), we will use **Alpha Blending** to overlay the two video streams and highlight the specific faulty part (such as an arm striking incorrectly), which is more visual and practical for Taekwondo users.
-3. **Streamlit (Mobile Responsive)**: Although Streamlit is built for web apps, some internal CSS hacks (via `st.markdown`) will be applied so the layout stays compact on phone screens, where students will actually watch and practice on their own.
-Would you like me to start creating the directory structure and writing the script for phase 1, **Reference Data Extraction**, right away?
-### User Input
-*Viewed [implementation_plan.md](file:///home/fed/.gemini/antigravity/brain/3368571a-8f93-438f-bc63-09a787991718/implementation_plan.md) *
-*Edited relevant file*
-### Planner Response
-Agreed. When analyzing Taekwondo poomsae techniques, the hand configuration (for example a fist versus an open palm or knife hand) is extremely important.
-I have updated the `implementation_plan.md` artifact to add **hand** processing:
-- Switched from using only BlazePose to a combination of **MediaPipe Holistic (or Pose + Hand Tracking)**. This captures finger and hand coordinates instead of stopping at the wrist.
-- Extended the biomechanics analysis to handle not only the large angles (elbow, shoulder, hip) but also to compare the range and **fine-grained configuration of the finger joints**, so that the user's hand technique is verified as correct.
-Please check the update in [implementation_plan.md](file:///home/fed/.gemini/antigravity/brain/3368571a-8f93-438f-bc63-09a787991718/implementation_plan.md). Which part should I move on to next?
-### User Input
-*Viewed [implementation_plan.md](file:///home/fed/.gemini/antigravity/brain/3368571a-8f93-438f-bc63-09a787991718/implementation_plan.md) *
-*Edited relevant file*
-### Planner Response
-Great, both points are very practical ways to improve the experience and accuracy of this AI tool. I have updated the plan according to your feedback:
-1. **Handling a late start:** Added a *Start Frame Detection / Keypose Matching* algorithm. We will scan the first frames to check whether the user has actually entered the poomsae's ready stance, and only then trigger DTW synchronization for comparison. This fully covers the case where a student starts recording but waits a while before performing.
-2. **Circling the faulty part in red:** Updated the Video Overlay module requirements. Instead of printing plain text, when the angle deviation of a body part (such as an arm or leg) exceeds the allowed threshold, OpenCV will draw a red circle directly on that joint (based on the joint's XY coordinates in the corresponding frame). This lets students see immediately where they need to correct.
-Please review the refreshed artifact here: [implementation_plan.md](file:///home/fed/.gemini/antigravity/brain/3368571a-8f93-438f-bc63-09a787991718/implementation_plan.md)
-Do you agree to finalize this plan so we can start installing libraries and setting up **Phase 1**?
-### User Input
-### User Input
-continue
-*User accepted the command `python3 --version`*
-*User accepted the command `bash scripts/setup_hooks.sh`*
-*Edited relevant file*
-*User accepted the command `mkdir -p src/core src/data/reference_vids`*
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*User accepted the command `ls -R src/`*
-### Planner Response
-I have finished setting up the foundation for the **AI Sports Coach (Taekwondo)** project based on the approved plan.
-### Work completed:
-1.  **Directory structure**: Set up the core modules (`core`) and the reference data directory (`data/reference_vids`).
-2.  **Core AI (`src/core/pose.py`)**: Uses **MediaPipe Holistic** to extract the 33 Pose skeleton points together with detailed hand and finger joints (per your additional request).
-3.  **Biomechanics Engine (`src/core/compare.py`)**:
-    - Implemented a function that computes angles between joints (shoulder, elbow, hip, knee).
-    - Integrated **DTW (Dynamic Time Warping)** to synchronize the student's and the reference performer's movement sequences even when they move at different speeds.
-4.  **Overlay Engine (`src/core/overlay.py`)**:
-    - Supports blending two video streams (alpha blending).
-    - Provides `draw_error_highlight` to **mark faulty joints in red** (per your additional request).
-5.  **User interface (`src/app.py`)**: Built the UI with **Streamlit** and custom CSS optimized for mobile, with video upload and display of analysis results.
-6.  **Docs & setup**: Created `requirements.txt` and a `README.md` with detailed instructions for running the app.
-### Next steps:
-- You can prepare a reference poomsae video (Taegeuk) and place it in `src/data/reference_vids/`.
-- After that, I will help you run the `src/core/preprocess_reference.py` script to extract the reference "motion genes" the app compares against.
-To get started, install the libraries with:
-```bash
-pip install -r requirements.txt
-```
-And run the app with:
-```bash
-streamlit run src/app.py
-```
-Would you like me to demo adding hand-configuration detection logic (for example, whether a punch uses a tight or loose fist) to the `compare.py` module?
-### User Input
-2026-04-12 18:09:02.276 Uncaught app execution
-Traceback (most recent call last):
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 129, in exec_func_with_error_handling
-    result = func()
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 689, in code_to_exec
-    exec(code, module.__dict__)  # noqa: S102
-    ~~~~^^^^^^^^^^^^^^^^^^^^^^^
-  File "/home/fed/ai/projec/A20-App-032/src/app.py", line 7, in <module>
-    from src.core.pose import PoseEstimator
-ModuleNotFoundError: No module named 'src'
-*Viewed [app.py](file:///home/fed/ai/projec/A20-App-032/src/app.py) *
-*User accepted the command `ls -F src/core/`*
-*User accepted the command `mv src/app.py ./app.py && touch src/__init__.py src/core/__init__.py src/data/__init__.py`*
-*Viewed [app.py](file:///home/fed/ai/projec/A20-App-032/app.py) *
-*Edited relevant file*
-### Planner Response
-This error occurs because running `streamlit run src/app.py` makes Streamlit add the `src/` directory to the module search path (PYTHONPATH), so it cannot find the parent module named `src`.
-I fixed it by:
-1.  **Adjusting the project structure**: Moved `app.py` from `src/` to the project **root directory**. This is the standard way to let Streamlit resolve the packages inside `src/`.
-2.  **Initializing packages**: Added the necessary `__init__.py` files to the directories so Python treats them as importable modules.
-3.  **Updating the docs**: Corrected the instructions in `README.md`.
-Now run it again from the root directory with the following command:
-```bash
-streamlit run app.py
-```
-The `ModuleNotFoundError` will disappear and the app will work normally!
-### User Input
-2026-04-12 18:14:43.096 Uncaught app execution
-Traceback (most recent call last):
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 129, in exec_func_with_error_handling
-    result = func()
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 689, in code_to_exec
-    exec(code, module.__dict__)  # noqa: S102
-    ~~~~^^^^^^^^^^^^^^^^^^^^^^^
-  File "/home/fed/ai/projec/A20-App-032/app.py", line 50, in <module>
-    estimator = PoseEstimator()
-  File "/home/fed/ai/projec/A20-App-032/src/core/pose.py", line 10, in __init__
-    self.mp_holistic = mp.solutions.holistic
-                       ^^^^^^^^^^^^
-AttributeError: module 'mediapipe' has no attribute 'solutions'
-*User accepted the command `ls -F && pip show mediapipe`*
-*Checked command status*
-*User accepted the command `python3 -c "import mediapipe as mp; print(f'Version: {mp.__version__}'); print(f'Has solutions: {hasattr(mp, \"solutions\")}')"`*
-*Checked command status*
-*Edited relevant file*
-*Viewed [pose.py](file:///home/fed/ai/projec/A20-App-032/src/core/pose.py) *
-*Edited relevant file*
-*Grep searched codebase*
-*User accepted the command `./venv/bin/pip install -r requirements.txt`*
-### Planner Response
-The error `AttributeError: module 'mediapipe' has no attribute 'solutions'` usually occurs because MediaPipe does not automatically load its submodules into the main namespace in certain environments or versions.
-I fixed it by updating `src/core/pose.py`:
-- Instead of calling `mp.solutions` directly, I switched to **explicitly importing** the required modules (`mediapipe.python.solutions.holistic`, `drawing_utils`, etc.). This is a safer, more explicit approach that avoids the AttributeError above.
-Try running the app again:
-```bash
-streamlit run app.py
-```
-The system can now initialize the MediaPipe skeleton model without hitting this error!
-### User Input
-2026-04-12 18:16:30.203 Uncaught app execution
-Traceback (most recent call last):
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 129, in exec_func_with_error_handling
-    result = func()
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 689, in code_to_exec
-    exec(code, module.__dict__)  # noqa: S102
-    ~~~~^^^^^^^^^^^^^^^^^^^^^^^
-  File "/home/fed/ai/projec/A20-App-032/app.py", line 7, in <module>
-    from src.core.pose import PoseEstimator
-  File "/home/fed/ai/projec/A20-App-032/src/core/pose.py", line 4, in <module>
-    import mediapipe.python.solutions.holistic as mp_holistic
-ModuleNotFoundError: No module named 'mediapipe.python'
-*User accepted the command `ls -R /home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/mediapipe/ | grep solutions`*
-*User accepted the command `python3 -c "import site; print(site.getsitepackages())"`*
-*User accepted the command `./venv/bin/python3 -c "import site; print(site.getsitepackages())"`*
-*User accepted the command `ls -F /home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/mediapipe/`*
-*User accepted the command `ls -F /home/fed/ai/projec/A20-App-032/venv/lib/python3.14/site-packages/mediapipe/`*
-*Viewed [__init__.py](file:///home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/mediapipe/__init__.py) *
-*User accepted the command `find /home/fed/ai/projec/A20-App-032/venv/ -name holistic.py`*
-*User accepted the command `./venv/bin/python3 -c "import mediapipe.solutions"`*
-*Checked command status*
-*User accepted the command `find /home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/mediapipe/ -maxdepth 2`*
-*User accepted the command `ls -F /home/fed/.local/lib/python3.14/site-packages/mediapipe/`*
-*User accepted the command `find /home/fed/ai/projec/A20-App-032/venv/ -name "*.task" -o -name "*.tflite"`*
-*User accepted the command `./venv/bin/python3 -c "import mediapipe; print(dir(mediapipe))"`*
-*Checked command status*
-*User accepted the command `./venv/bin/pip install mediapipe --force-reinstall`*
-*Checked command status*
-*User accepted the command `./venv/bin/python3 -c "import mediapipe; print(f'Has solutions: {hasattr(mediapipe, \"solutions\")}')"`*
-*Checked command status*
-*User accepted the command `ls -F /home/fed/ai/projec/A20-App-032/venv/lib/python3.14/site-packages/mediapipe/`*
-*User accepted the command `mkdir -p src/data/models && wget -O src/data/models/pose_landmarker.task https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_heavy/float16/1/pose_landmarker_heavy.task && wget -O src/data/models/hand_landmarker.task https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task`*
-*Checked command status*
-*Edited relevant file*
-*User accepted the command `ls -R src/data/models/`*
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-The MediaPipe version installed for Python 3.14 (a very new Python release) appears to have removed the old `solutions` API (legacy API) and moved entirely to the new **MediaPipe Tasks API**. This explains why you hit `ModuleNotFoundError` even though the library installed successfully.
-I carried out a system upgrade to make it fully compatible with this new API:
-1.  **API migration (`src/core/pose.py`)**: I rewrote the entire `PoseEstimator` class to use `mediapipe.tasks`.
-    - Uses `PoseLandmarker` for the skeleton.
-    - Uses `HandLandmarker` for the hands.
-    - Keeps the same output structure (`Pose`, `Left_hand`, `Right_hand`) so the comparison and overlay code written earlier does not break.
-2.  **Downloading the AI models**: Because the new API requires physical model files (it does not download them automatically like the old one), I downloaded the standard model files from Google:
-    - `pose_landmarker.task` (the Heavy variant, for the highest accuracy on Taekwondo).
-    - `hand_landmarker.task`.
-    - These files are stored in `src/data/models/`.
-3.  **Custom drawing**: Because `mp.solutions.drawing_utils` is also no longer available, I added logic that draws the skeleton and hands directly with OpenCV inside the `PoseEstimator` class.
-You can now run the app again:
-```bash
-streamlit run app.py
-```
-The code will run stably on your current Python 3.14 environment! If you notice any small coordinate errors, let me know.
-### User Input
-remove the reference-routine selector; require the user to upload one reference video and one video to compare
-*Viewed [app.py](file:///home/fed/ai/projec/A20-App-032/app.py) *
-*Edited relevant file*
-### Planner Response
-I have finished updating `app.py`. You can check the latest changes as follows:
-1.  **Flexible interface (Dynamic Upload)**:
-    - I removed the fixed list of reference poomsae from the sidebar.
-    - In its place are two dedicated upload areas: one for the **Reference Video (Expert)** and one for **Your Video**. This lets the app compare any poomsae or strike you want, without being limited to built-in data.
-2.  **Premium UI**:
-    - Switched to a dark-mode-first design with polished upload cards.
-    - The "BẮT ĐẦU PHÂN TÍCH" ("Start analysis") button uses a gradient effect and a lively hover interaction.
-3.  **Parallel processing pipeline**:
-    - The app extracts the skeletons of both videos sequentially and uses **BiomechanicsMatcher** combined with **DTW** to align them in time.
-    - Processing status is shown through `st.status`.
-You can now open the app and upload two videos to try it out!
-```bash
-streamlit run app.py
-```
-### User Input
-proceed with phase 2 in implementation_plan.md
-*Viewed [compare.py](file:///home/fed/ai/projec/A20-App-032/src/core/compare.py) *
-*Edited relevant file*
-*Edited relevant file*
-*Viewed [compare.py](file:///home/fed/ai/projec/A20-App-032/src/core/compare.py) *
-*Edited relevant file*
-### Planner Response
-I have completed the **Phase 2** upgrade in the implementation plan, focused on building an in-depth biomechanics engine for Taekwondo.
-### Key improvements made in `src/core/compare.py`:
-1.  **Hand state analysis (`get_hand_state`)**:
-    - Beyond the skeleton, I added logic that computes how open or closed the hand is. This lets the system tell whether the student is performing a punch (closed fist) or a spear-hand/palm strike (open hand) and compare it accurately against the reference.
-2.  **Skeleton normalization (`normalize_pose`)**:
-    - Implemented a normalization function based on torso height and an origin point (mid-hip). This keeps the comparison unaffected by distance from the camera or by height differences between the user and the expert.
-3.  **Start-point detection (`find_start_frame`)**:
-    - The system now automatically scans the first seconds of the user's video to find the moment their stance best matches the reference video, and only then triggers DTW synchronization. This addresses the "waiting" period before the poomsae begins.
-4.  **Extended feature vector**:
-    - Added important angles such as hip angles and shoulder rotation to the feature vector for more detailed scoring.
-The analysis core is now ready to process more complex data and return quantitative results (Errors/Score) in a principled way.
-Would you like me to move on to **Phase 3: Overlay & Visual Correction** (drawing red circles on faulty regions in the video)?
-### User Input
-proceed to Phase 3
-*Viewed [overlay.py](file:///home/fed/ai/projec/A20-App-032/src/core/overlay.py) *
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-I have completed **Phase 3: Error analysis and video overlay (Correction & Result Generation)**. All the logic for visualizing errors is now fully integrated:
-### New features in Phase 3:
-1.  **Error highlighting (Visual Highlighting)**:
-    - In `src/core/overlay.py`, I built `create_combined_overlay`, which automatically maps biomechanical error metrics to the coordinates of the corresponding MediaPipe joints.
-    - When a deviation is detected (for example, an elbow angle off by more than 20 degrees), the system **draws a bright red circle** right at that joint so the user notices immediately.
-2.  **Smart HUD (Heads-Up Display)**:
-    - The result video has a HUD panel in the top-left corner that lists specific warnings in real time (for example: `! Tay trai: Sai lech 25 deg`).
-3.  **Result video export (`generate_result_video`)**:
-    - I added a video renderer. It uses the user's video as the background, alpha-blends a translucent copy of the reference video on top (a ghost effect), and draws the skeleton plus error warnings.
-    - The final video is exported as `.mp4` and displayed directly in the Streamlit interface once the analysis finishes.
-### How to verify:
-Now, when you upload two videos and press the analyze button, you will see:
-- A progress bar covering the AI extraction.
-- Then the overlay video rendering.
-- Finally, a result video in which errors are clearly circled in red.
-Try running:
-```bash
-streamlit run app.py
-```
-We have completed 3 of the 5 main phases. Shall I continue with **Phase 4: Advanced UI/UX optimization and WebRTC plugin integration** (if you want to record video directly from the camera)?
-### User Input
-2026-04-12 18:52:29.115 Uncaught app execution
-Traceback (most recent call last):
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 129, in exec_func_with_error_handling
-    result = func()
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 689, in code_to_exec
-    exec(code, module.__dict__)  # noqa: S102
-    ~~~~^^^^^^^^^^^^^^^^^^^^^^^
-  File "/home/fed/ai/projec/A20-App-032/app.py", line 8, in <module>
-    from src.core.compare import BiomechanicsMatcher, get_pose_angles
-ImportError: cannot import name 'get_pose_angles' from 'src.core.compare' (/home/fed/ai/projec/A20-App-032/src/core/compare.py)
-*Viewed [app.py](file:///home/fed/ai/projec/A20-App-032/app.py) *
-*Grep searched codebase*
-*Edited relevant file*
-### Planner Response
-This error arose because, during the Phase 2 library upgrade, I renamed `get_pose_angles` to `get_biomechanics_features` to reflect that it now covers hand analysis and other biomechanical metrics, but I did not update the import line in `app.py`.
-I fixed `app.py` by removing the import that no longer exists. The app mainly uses the `BiomechanicsMatcher` class for comparison, so removing it does not affect functionality.
-Please try running it again:
-```bash
-streamlit run app.py
-```
-### User Input
-Remove the static readouts (Text/HUD) from the output video, and use only the student's video in the output
-*Viewed [overlay.py](file:///home/fed/ai/projec/A20-App-032/src/core/overlay.py) *
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-I updated the result display per your latest request:
-1.  **Text/HUD removed**: The stat panels and warning text were removed entirely from the output video so they no longer clutter the frame or hide the movement.
-2.  **Single video output**: The result video now shows only the **student's actual footage**. I removed the ghost effect (alpha blending) of the expert video.
-3.  **Visual error indicators**: The red circles (and orange ones for the hands) are kept at the joints that deviate. This lets students focus on watching themselves and spot errors directly on their own body.
-Run the app again to see the new result:
-```bash
-streamlit run app.py
-```
-### User Input
-The result video does not appear in the web browser; I suspect the browser cannot resolve the temp file path
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-As you suspected, passing a file path (`filepath`) directly sometimes makes it hard for the browser to reach the temp file, or the browser may not recognize the compression format that OpenCV produced.
-I made two changes to address this:
-1.  **Codec switch (`avc1`)**: In `src/core/overlay.py`, I changed the video encoder from `mp4v` to `avc1` (H.264). This is the most widely used video compression standard, and modern browsers (Chrome, Safari, Firefox) can decode and play it directly on the web.
-2.  **Binary stream**: In `app.py`, instead of handing Streamlit a file name, I now **read the file as bytes** and pass those to `st.video()`. The video is then embedded directly in the page, which removes errors related to paths or file permissions on the server.
-Please try again; the result video will now definitely show up smoothly in the browser!
-```bash
-streamlit run app.py
-```
-### User Input
-[h264_v4l2m2m @ 0x7f9461481b40] Could not find a valid device
-[h264_v4l2m2m @ 0x7f9461481b40] can't configure encoder
-[ERROR:0@360.131] global cap_ffmpeg_impl.hpp:3514 open Could not open codec h264_v4l2m2m, error: Unspecified error (-22)
-[ERROR:0@360.131] global cap_ffmpeg_impl.hpp:3531 open VIDEOIO/FFMPEG: Failed to initialize VideoWriter
-*Edited relevant file*
-### Planner Response
-The error you just hit (`h264_v4l2m2m`) occurs because the system tries to use a graphics card or hardware encoder chip to compress H.264 video but cannot find a suitable device, so OpenCV fails to initialize the video writer.
-To fix it, I:
-1.  **Reverted to the `mp4v` codec**: This is OpenCV's standard software encoder. It does not depend on specific hardware, so it works reliably on most Linux machines.
-2.  **Kept the binary stream**: Since `app.py` was already updated to read the file as binary in the previous step, the data is still pushed to the browser efficiently even with `mp4v`.
-Please try again; this time rendering will no longer be interrupted by the hardware error:
-```bash
-streamlit run app.py
-```
-### User Input
-The video still does not appear
-*User accepted the command `./venv/bin/python3 -c "import cv2; print(cv2.VideoWriter_fourcc(*'VP80'))"`*
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-It appears the web browser cannot play MP4 directly when it is compressed with OpenCV's older codec (`mp4v`), while your system also fails when trying to encode H.264 (no hardware driver found).
-So I switched the whole pipeline to the **WebM** format with the **VP8** codec:
-1.  **WebM format**: An open video format designed for the web. It is lightweight and highly compatible with browsers, without needing the hardware-specific support that the H.264 encoder required here.
-2.  **Codec update (`VP80`)**: In the overlay module, I changed the codec to `VP80` to write a `.webm` file. The browser can recognize and play it immediately through the binary stream we set up.
-Please try once more; the result will appear as soon as rendering completes:
-```bash
-streamlit run app.py
-```
-### User Input
-There are blue circles in the video; what do they mean?
-### Planner Response
-The blue circles you see in the video are the **hand landmarks**.
-Specifically, the color coding in the result video is as follows:
-1.  **Blue circles**: Show the detailed positions of your fingers and wrists. Since hand configuration (fist or open hand) matters a lot in Taekwondo, these points let you see fine hand movement compared with the reference video.
-2.  **Red circles**: These are **error warnings**. When a body part (such as a shoulder, elbow, or knee) deviates too far in angle from the reference video, the system circles it in red so you know exactly what to fix.
-3.  **Orange circles**: Error warnings specific to the **hands**. If the reference shows a fist but your hand is open (or vice versa), the system marks the wrist in orange.
-4.  **Green lines**: The body skeleton connecting the main joints (shoulders, hips, legs), giving you an overall view of your pose.
-If the blue circles are too small or hard to see, I can adjust their size or switch to a more prominent color.
-### User Input
-Comment the source code and explain the constants used in it
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*Edited relevant file*
-*Viewed [compare.py](file:///home/fed/ai/projec/A20-App-032/src/core/compare.py) *
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-I have finished commenting the source code in Vietnamese and explaining the important constants across the system.
-Below is a summary of the core constants and technical parameters that were annotated:
-### 1. Thresholds
-- **`error_threshold = 120.0`** (in `overlay.py`): The cumulative angle deviation threshold. If the total error at a joint exceeds 120 (degrees), the system triggers the red warning circle.
-- **`START_DETECTION_THRESHOLD = 15.0`** (in `compare.py`): The maximum deviation at which the system considers the student to have started the movement in a stance matching the reference.
-- **`major_errors_threshold > 60.0`** (in `app.py`): A frame counts as a major error if the mean error exceeds 60.
-- **`hand_error_threshold > 30.0`** (in `overlay.py`): Used for the orange marker on hand configuration errors (fist/open).
-### 2. Technical logic
-- **`timestamp += 33`** (in `pose.py`): The MediaPipe Tasks API in video mode requires each frame to carry a monotonically increasing timestamp. 33 ms corresponds to roughly 30 frames per second (FPS), the standard for most phone cameras.
-- **`FEATURE_TO_LANDMARK`**: A mapping table that tells the system that a deviation computed at index 0 (left elbow) must be drawn as a red circle at MediaPipe landmark ID 13.
-- **`Scale Hand (* 100)`** (in `compare.py`): Joint angles are in degrees (0-180) while the hand state is a ratio (0-1), so I multiply the hand value by 100 to give it a comparable "weight" in the total error.
-The source code is now clearer and easier to maintain for the next phases. You can check the files in `src/core/*.py` to see the new comments.
-### User Input
-Continue with Phase 4
-*Edited relevant file*
-### Planner Response
-I have finished **Phase 4: UI upgrade and results dashboard**.
-The app now not only shows the video but also provides an in-depth metrics panel, so the practitioner gets an overall view of their technique:
-### What's new in Phase 4:
-1.  **Performance metrics dashboard**:
-    - Shows the accuracy score along with a comparison against the target.
-    - Rates rhythm based on how well the DTW algorithm could synchronize the two sequences.
-2.  **Deviation chart over time**:
-    - Uses `st.line_chart` to plot the whole session. The practitioner can see exactly which second they performed worst (the chart peak) and go back to review that part of the video.
-3.  **Body part breakdown**:
-    - The system computes the mean error for each joint group (left/right arm, shoulders, hips, legs).
-    - Uses icons: ✅ (good), ⚠️ (needs attention), ❌ (large error), along with the mean deviation value.
-4.  **Mobile UI optimization**:
-    - The dashboard layout is split into small responsive columns, so students can read the report on a phone right at the training ground.
-With this update, the app has reached the **Production-Ready** bar in terms of features and interface.
-Are you ready to move on to the **final phase (Phase 5: Performance optimization and testing)**? In this step we will tune processing speed and resource limits so the app runs smoothly in a web environment.
-### User Input
-2026-04-12 20:53:36.452 Uncaught app execution
-Traceback (most recent call last):
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 129, in exec_func_with_error_handling
-    result = func()
-  File "/home/fed/ai/projec/A20-App-032/venv/lib64/python3.14/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 689, in code_to_exec
-    exec(code, module.__dict__)  # noqa: S102
-    ~~~~^^^^^^^^^^^^^^^^^^^^^^^
-  File "/home/fed/ai/projec/A20-App-032/app.py", line 142, in <module>
-    from src.core.overlay import FEATURE_LABELS
-ImportError: cannot import name 'FEATURE_LABELS' from 'src.core.overlay' (/home/fed/ai/projec/A20-App-032/src/core/overlay.py)
-*Viewed [overlay.py](file:///home/fed/ai/projec/A20-App-032/src/core/overlay.py) *
-*Edited relevant file*
-### Planner Response
-This error was my oversight when cleaning up the source code in an earlier step (when you asked to remove the HUD): I accidentally deleted the `FEATURE_LABELS` dictionary, so the results dashboard could not find the body part names to display.
-I restored the Vietnamese body part labels in `src/core/overlay.py`. The detailed analysis dashboard now works again.
-Please run the app again:
-```bash
-streamlit run app.py
-```
-### User Input
-Export a separate output video for each session, put those videos in a folder named outputs, and delete videos older than 5 minutes
-*User accepted the command `mkdir -p outputs`*
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-I updated output video file management per your request:
-1.  **Dedicated storage folder**: I created the `outputs/` folder to hold all result videos.
-2.  **Per-session identifier (Session ID)**: Instead of overwriting a single file, each time you press "Analyze" the app creates a new webm file with a timestamp (for example: `outputs/result_1712932726.webm`). This avoids data conflicts when several users or sessions run at once.
-3.  **Auto-cleanup**: I added a `cleanup_old_videos` function to the app's initialization. Whenever a new analysis session starts, the system scans `outputs/` and **deletes any video older than 5 minutes**. This saves disk space and protects user privacy.
-You can check the list of videos in the project's `outputs/` folder to see the change.
-```bash
-streamlit run app.py
-```
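The auto-cleanup described in that reply can be sketched roughly as follows. This is a minimal illustration, not the code from the removed transcript: only the function name `cleanup_old_videos`, the `outputs/` folder, the `.webm` extension, and the 5-minute limit come from the text. The parameters, the use of file modification time as the age, and the returned list are assumptions.

```python
import time
from pathlib import Path

# Assumption: "older than 5 minutes" is measured from the file's modification time.
MAX_AGE_SECONDS = 5 * 60


def cleanup_old_videos(output_dir="outputs", max_age_seconds=MAX_AGE_SECONDS, now=None):
    """Delete .webm files in output_dir that are older than max_age_seconds.

    Returns the deleted paths so the caller can log them (hypothetical behavior).
    """
    now = time.time() if now is None else now
    directory = Path(output_dir)
    removed = []
    if not directory.is_dir():
        return removed  # nothing to clean if the folder does not exist yet
    for path in directory.glob("*.webm"):
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink(missing_ok=True)  # tolerate a concurrent delete
            removed.append(path)
    return removed
```

Calling it at the start of each analysis run, as the reply describes, keeps the folder bounded without needing a background job.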
-### User Input
-Lower the accuracy requirements
-*Edited relevant file*
-*Edited relevant file*
-### Planner Response
-I adjusted the system to be more lenient, so beginners do not feel pressured to be perfectly accurate:
-1.  **Relaxed scoring formula**: In `src/core/compare.py`, I lowered the error penalty coefficient from **2.5** to **1.5**.
-    - *Example*: Previously a mean deviation of 20 degrees scored only 50 points; the same deviation now scores about **70 points**. This better reflects the effort of someone just starting out.
-2.  **Raised the "major error" threshold**: In `app.py`, I raised the threshold at which an error counts as "needs fixing" from 60 degrees to **90 degrees**. The system now counts only errors that are really large, which makes the dashboard look more positive.
-3.  **Warning circles (red circles)**: Kept at **120.0** as you set earlier, so only movements that are clearly off in direction get circled in red.
-The system is now more supportive than strict. You can try it with your next exercise.
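The example figures in that reply are consistent with a linear penalty, score = 100 - coefficient × mean error. The sketch below assumes that form; the function name and the clamping to the range 0 to 100 are illustrative assumptions, not the actual code in `src/core/compare.py`.

```python
def score_from_mean_error(mean_error_deg, penalty=1.5):
    """Linear scoring sketch: subtract penalty * mean angular error from 100.

    Assumption: the result is clamped to [0, 100]; the transcript does not say.
    """
    raw = 100.0 - penalty * mean_error_deg
    return max(0.0, min(100.0, raw))


# Reproduces the two figures quoted in the reply for a 20-degree mean deviation.
old_score = score_from_mean_error(20, penalty=2.5)  # 50.0
new_score = score_from_mean_error(20, penalty=1.5)  # 70.0
```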
-### User Input
-Stop drawing the blue circles on the video
-*Viewed [pose.py](file:///home/fed/ai/projec/A20-App-032/src/core/pose.py) *
-*Viewed [pose.py](file:///home/fed/ai/projec/A20-App-032/src/core/pose.py) *
-*Edited relevant file*
-### Planner Response
-I removed the blue circles (the hand landmark points) from the video entirely. The result video now shows only:
-- The main skeleton (green lines).
-- The red/orange warning circles when there is a large deviation.
-The video will look cleaner and focus on the main errors. You can check the result in your next analysis run.
-### User Input
-Name the output video in this format: reference video name_student video name_accuracy_timestamp
-*Edited relevant file*
-### Planner Response
-I updated the output video naming logic to match the format you requested.
-The new file name has the structure: `{Reference_Video_Name}_{Student_Video_Name}_{Score}_{Timestamp}.webm`
-**Example**: If you upload `mau_quyen_1.mp4` and `nguyen_van_a.mov` and score 85, the result video will be named:
-`outputs/mau_quyen_1_nguyen_van_a_85_1712932726.webm`
-The system also strips the original file extension and replaces spaces with underscores so the file name is safe for the filesystem. Please check the `outputs/` folder after running an analysis.
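A naming helper along these lines would produce the file name shown in that example. It is a sketch under assumptions: the function name `build_output_name`, its parameters, and rounding the score to an integer are hypothetical. Only the name pattern, the extension stripping, and the space-to-underscore rule come from the reply.

```python
import re
import time
from pathlib import Path


def build_output_name(template_name, student_name, score, timestamp=None, output_dir="outputs"):
    """Build '<template>_<student>_<score>_<timestamp>.webm' inside output_dir."""

    def clean(name):
        # Drop the original extension, then replace whitespace runs with underscores.
        return re.sub(r"\s+", "_", Path(name).stem.strip())

    ts = int(time.time()) if timestamp is None else int(timestamp)
    filename = f"{clean(template_name)}_{clean(student_name)}_{int(round(score))}_{ts}.webm"
    # as_posix() keeps forward slashes so the result matches the example on any OS.
    return (Path(output_dir) / filename).as_posix()


name = build_output_name("mau_quyen_1.mp4", "nguyen_van_a.mov", 85, timestamp=1712932726)
```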

docs/README.md ADDED Viewed

	@@ -0,0 +1,47 @@

+# AI Coach Docs
+This documentation set describes the current version of the project: a Reflex app that analyzes push-up exercises from video.
+## Doc Map
+- `problem_statement.md`: product goals and current scope.
+- `implementation_plan.md`: architecture, main modules, processing flow, and design decisions.
+- `SETUP_NOTES.md`: how to install, run the app, configure the VLM, and run batch tests.
+- `LLM_FEEDBACK_NVIDIA_LANGGRAPH.md`: the role of the NVIDIA VLM, input/output, fallback, and why arrows do not use the VLM.
+- `UI_REDESIGN.md`: the current UI structure and how rule-based feedback is distinguished from VLM feedback.
+- `legacy_agent_notes.md`: brief notes on the old agent/OpenAI docs that are no longer part of the production flow.
+## System Summary
+```text
+Reflex UI
+  -> upload student video
+  -> MediaPipe Pose local CPU
+  -> rep segmentation + DTW
+  -> rule-based error detection
+  -> annotated rep frames + deterministic arrows
+  -> optional NVIDIA VLM text feedback
+  -> per-rep result cards
+```
+## Source of Truth in the App
+- Main error: rule engine.
+- Score: combination of rule score and DTW score.
+- Arrow: rule-based landmark.
+- Rule-based feedback: `ERROR_GUIDANCE`.
+- VLM feedback: only a supplementary natural-language explanation.
+## Common Commands
+```powershell
+reflex run
+```
+```powershell
+python scripts\run_pushup_eval_tests.py
+```
+```powershell
+python scripts\run_pushup_eval_tests.py --save-artifacts --enable-vlm
+```

docs/SETUP_NOTES.md CHANGED Viewed

@@ -1,41 +1,89 @@
 # Setup Notes
-## MCP
-- Context7 MCP was added to `~/.codex/config.toml`:
-```toml
-[mcp_servers.context7]
-command = "npx"
-args = ["-y", "@upstash/context7-mcp"]
 ```
-- A dedicated Reflex MCP package was not found during npm search. Fallback is to use Context7 where available, inspect Reflex source in `.venv`, and keep local docs in this repo.
-## Repo Config
-- `.codex/config.toml` was not created because `.codex` in this repo is currently a file, not a directory.
-- No Stitch package was added. The design direction is captured in `docs/UI_REDESIGN.md` for later Stitch/Figma reference.
-## Running
-```bash
-source .venv/bin/activate
-reflex run
 ```
-If ports `3000` or `8000` are busy, Reflex will choose the next available ports.
-## Runtime Directories
-- `.web/` is generated by Reflex. It contains the compiled frontend app, copied static assets, and Node packages. It is not app data and can be deleted; `reflex run` recreates it.
-- `.states/` is generated by Reflex in development to persist small state snapshots. It is not user video storage and can be deleted; Reflex may recreate it.
-- `uploaded_files/` is the default Reflex upload directory. This project redirects Reflex upload temp files to `/tmp/pushupai_reflex_uploads`, so student videos are not stored in the repo.
-## Video Storage Policy
-- Template source video: `data/templates/push_up_template.mp4`.
-- Template preprocessed cache: `data/processed/pushup_template.pkl`.
-- Browser-facing template video: `assets/push_up_template.mp4`.
-- Student videos are written only to `/tmp/pushupai_reflex_uploads/students` so the browser can preview them. They are deleted when choosing another video, and stale temp previews are cleaned after 6 hours.
-- Per-rep comparison images are written to `/tmp/pushupai_reflex_uploads/analysis` so the result UI can show them. They are deleted when choosing another video, and stale temp artifacts are cleaned after 6 hours.

 # Setup Notes
+## Environment
+The project is developed and tested mainly on Windows with PowerShell.
+Recommended:
+- Python 3.11.
+- Virtual environment `.venv`.
+- Run from the repo root.
+```powershell
+python -m venv .venv
+.\.venv\Scripts\Activate.ps1
+pip install -r requirements.txt
 ```
+## Run the App
+```powershell
+reflex run
+```
+By default Reflex runs:
+- Frontend: `http://localhost:3000/`
+- Backend: `http://0.0.0.0:8000`
+If a port is busy, Reflex may choose another port.
+## VLM Configuration
+The VLM is optional. If the NVIDIA API is not configured, the app still runs rule-based analysis and rule-based feedback.
+Create `.env` from `.env.example` and fill in:
+```env
+NVIDIA_API_KEY=nvapi-your-real-nvidia-key
+NVIDIA_BASE_URL=https://integrate.api.nvidia.com/v1
+NVIDIA_MODEL=deepseek-ai/deepseek-v4-flash
+NVIDIA_VISION_MODEL=meta/llama-3.2-11b-vision-instruct
+NVIDIA_VISION_MAX_TOKENS=1200
 ```
+## Runtime Directories
+- `.web/`: Reflex build output; safe to delete.
+- `.states/`: Reflex state snapshots in dev; safe to delete.
+- `uploaded_files/`: Reflex upload directory when the default configuration is used.
+- `analysis_artifacts/`: debug/test results, image artifacts, JSON test output. This directory is gitignored.
+- `.ai-log/`: prompt/test logs from the hook; gitignored.
+## Main Data
+- Template video: `data/templates/push_up_template.mp4`.
+- Template cache: `data/processed/pushup_template.pkl`.
+- Browser asset template: `assets/push_up_template.mp4`.
+- Test videos: `data/tests/*.mp4`.
+## Logging Hook
+Per `AGENTS.md`, prompt logging is configured automatically through a hook. There is no need to log prompts manually.
+After cloning or switching OS, run:
+```powershell
+python scripts\setup_hooks.py
+```
+## Batch Tests
+Quick run, without calling the VLM:
+```powershell
+python scripts\run_pushup_eval_tests.py
+```
+Run and save image artifacts:
+```powershell
+python scripts\run_pushup_eval_tests.py --save-artifacts
+```
+Run with the VLM as well:
+```powershell
+python scripts\run_pushup_eval_tests.py --save-artifacts --enable-vlm
+```

docs/UI_REDESIGN.md CHANGED Viewed

@@ -1,27 +1,79 @@
-# UI Redesign
-## Direction
-The new UI should feel like a focused AI coaching tool, not a technical demo. Keep the experience bright, clean, and task-first.
-## Screens
-- Home: short introduction and one clear action to start analysis.
-- Exercise catalog: list supported exercises.
-- Exercise analysis: sample video, user video upload, and analysis result in the same workflow.
-## Visual Style
-- Brand name: AI Coach.
-- Font: Be Vietnam Pro or Inter.
-- Palette: light neutral background, white cards, dark text, green primary CTA, blue secondary accents.
-- Cards: moderate radius, subtle borders, soft shadows.
-- Text: short Vietnamese labels, no long marketing sections.
-## Asset Placeholders
-- Hero image placeholder: person doing push-up or home workout.
-- Exercise thumbnail placeholder: still frame from the push-up template video.
-- Analysis placeholder: simple visual showing video comparison or AI movement analysis.
-These placeholders are represented in UI as clean visual blocks until real assets are added.

+# UI Guide
+## UI Goals
+The UI should feel like a practical coaching tool, not a marketing landing page. The analysis screen is the main flow:
+```text
+select video -> analyze -> view per-rep results
+```
+## Pages
+- Home: short introduction and a CTA into the exercise.
+- Exercise catalog: shows the currently supported exercises.
+- Push-up analysis: reference video, student video upload, analyze button, results.
+## Push-up Analysis Page
+This page contains:
+- Reference video.
+- Uploaded student video.
+- `Phan tich` (Analyze) button.
+- Progress label while processing.
+- Overall summary.
+- Metrics: score, rep count, good reps, reps needing attention.
+- Per-rep cards.
+## Per-Rep Card
+Each rep card shows:
+- `Rep N`.
+- Status: `Tot` (good), `Can chu y` (needs attention), or `Co loi` (has errors).
+- Rep score.
+- Primary error, for example `Nho mong qua cao` (hips raised too high).
+- Student frame with landmarks drawn, plus an arrow if there is an error.
+- Reference frame from the template.
+- VLM feedback, if available.
+- Rule-based feedback, always present.
+## Distinguishing Feedback
+- `primary_error`: rule-based label, not from the VLM.
+- `Feedback rule-based`: deterministic guidance sentence from `ERROR_GUIDANCE`.
+- `Feedback tu VLM`: feedback sentence generated by the vision model; may be absent if the API fails or the key is missing.
+## Arrow
+The arrow is drawn on the student frame, not on the reference frame. The arrow target comes from rule-based landmarks:
+- `not_deep_enough`: chest/shoulder area.
+- `hip_sag`: hips.
+- `body_not_straight`: hips.
+- `hip_pike`: hips.
+- `head_misaligned`: head/neck.
+Arrow labels use Vietnamese with diacritics, for example:
+- `Ngực chưa đủ thấp` (chest not low enough)
+- `Hông bị võng xuống` (hips sagging)
+- `Hông lệch khỏi trục` (hips off the body line)
+- `Cổ cúi quá mức` (neck bent too far)
+- `Hông quá cao` (hips too high)
+## Style
+- Light background, dark text, white cards.
+- Green as the primary color for CTAs.
+- Blue for VLM/info.
+- Red for errors/arrows.
+- Compact cards, optimized for quick scanning.
+## Notes When Editing the UI
+- Do not rename `llm_feedback` to `vlm_feedback` in the UI alone without refactoring state and payload together.
+- Do not hide rule-based feedback, since it is the deterministic source of explanation.
+- Do not switch arrows to VLM coordinates.
+- Check mobile and desktop after changing the card layout.
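The arrow rules in the UI guide amount to a small lookup from error code to a target region and a label. The sketch below is one hypothetical way to hold that mapping. The error codes, target regions, and label strings come from the guide, but the pairing of each label with a code is inferred from meaning, and the names `ARROW_SPECS` and `arrow_for` are assumptions that do not appear in the repository.

```python
# Hypothetical lookup: error code -> (target region, Vietnamese arrow label).
# The label-to-code pairing is inferred from the wording, not confirmed by the source.
ARROW_SPECS = {
    "not_deep_enough": ("chest_shoulder", "Ngực chưa đủ thấp"),
    "hip_sag": ("hip", "Hông bị võng xuống"),
    "body_not_straight": ("hip", "Hông lệch khỏi trục"),
    "hip_pike": ("hip", "Hông quá cao"),
    "head_misaligned": ("head_neck", "Cổ cúi quá mức"),
}


def arrow_for(error_code):
    """Return (target_region, label) for a rule error, or None when no arrow applies."""
    return ARROW_SPECS.get(error_code)
```

Keeping the mapping in one deterministic table matches the guide's rule that arrow targets never come from VLM coordinates.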

docs/implementation_plan.md CHANGED Viewed

@@ -1,87 +1,64 @@
-# Project Implementation Plan: AI Sports Coach (Taekwondo)
-Based on the requirements in `problem_statement.md`, below is the analysis and a detailed plan for building the software.
-## 1. Project Overview
-- **Product:** A Web/Mobile Web app that helps Taekwondo students practice forms on their own by comparing against a reference video.
-- **Core Tech:**
-  - **AI / Computer Vision:** MediaPipe Holistic/Pose & Hand (detects the 33-point body skeleton and detailed hand/finger joint motion).
-  - **Comparison logic:** Biomechanics analysis (vectors and joint angles such as elbow, shoulder, knee, hip, plus detailed hand configuration).
-  - **Video Processing:** OpenCV / video overlay engine (skeleton alignment, drawing the skeleton over the frame or overlaying the two videos, exporting the result video).
-  - **Frontend / UI:** Streamlit (responsive layout with mobile support).
-  - **Language / Environment:** Python 3.14.
-## 2. System Architecture
-```mermaid
-graph TD
-    A[User - Mobile Web] -->|Upload Video| B(Streamlit App / Frontend)
-    B -->|Preprocessing| C[Core Video Analysis System]
-    subgraph AI/CV Core
-        C --> D{Landmark Extraction}
-        D -->|User Video| E[MediaPipe Pose]
-        D -->|Reference Video| F[MediaPipe Pose/Cache DB]
-        E --> G[Biomechanics Matcher]
-        F --> G
-        G -->|Dynamic Time Warping / Angle Compare| H[Compute Deviation %]
-    end
-    H --> I[Video Engine - Overlay]
-    I -->|Frames| J[Create MP4 Result Video]
-    J --> B
-    B -->|Show Video & Metrics| A
 ```
-## 3. Implementation Phases
-### Phase 1: Environment Setup & Data Pre-processing
-- Set up the Python 3.14 environment and install the libraries: `mediapipe`, `opencv-python`, `numpy`, `streamlit`, `scipy`.
-- **Reference video data:** Prepare at least a few videos of Taekwondo forms (stable resolution, standard FPS).
-- **Reference parameter extraction:** Write a script that runs MediaPipe to extract all joint coordinates per frame from the reference video and stores them as JSON/CSV or a NumPy array for fast loading in later comparisons.
-### Phase 2: Core AI & Biomechanics Engine
-- **Skeleton detection module:** Wrap BlazePose to extract 33 3D landmarks.
-- **Biomechanics computation:**
-  - A function that computes the angle between 3 landmarks (e.g. shoulder - elbow - wrist).
-  - A frame normalization function (padding/scaling), since the user and the reference performer may differ in height in the video.
-- **Video synchronization algorithm:**
-  - Since the student may start the form late (for example, standing still in preparation during the first seconds) and may strike at a different speed, apply start-frame detection (keypose matching) combined with **DTW (Dynamic Time Warping)** to align the two frame sequences.
-### Phase 3: Error Analysis and Video Overlay (Correction & Result Generation)
-- **Scoring (error percentage):**
-  - Evaluate joint angle differences against the reference video.
-  - Average the differences to produce an overall deviation percentage.
-- **Video Overlay:**
-  - Alpha-blend the student's video with the expert's video.
-  - Show static readouts (Text/HUD) and **red highlight regions** directly at the faulty joints (for example, a red circle on the arm for "arm raised too high") so the point to fix is visually obvious in the frame.
-  - Export the frames to an `.mp4` file returned to the UI.
-### Phase 4: Streamlit User Interface (Frontend & Mobile Mode)
-- **Main UI (UI/UX):**
-  - Section 1: Usage instructions & Taekwondo form selection.
-  - Section 2: Video upload (or direct recording with the mobile camera via the Streamlit webrtc plugin, if feasible).
-  - Section 3: Processing progress bar.
-  - Section 4: Results screen (dashboard) including: video overlay (press Play), skeleton deviation chart, final score.
-- Optimize the UI with vanilla CSS injected into Streamlit (via `st.markdown(unsafe_allow_html=True)`) so layouts suit portrait mobile screens.
-### Phase 5: Testing & Optimization
-- Handle memory/RAM issues when parsing large videos in Streamlit. Cap uploaded videos (e.g. at most 60 seconds / 50MB).
-- UI/UX polish: apply skeleton loading and smooth UX.
-- Test cases with low-quality video and low-light environments that affect BlazePose.
-## 4. Planned Directory Structure
-```text
-/
-├── src/
-│   ├── app.py              # Main entry file for the Streamlit page
-│   ├── core/
-│   │   ├── pose.py         # MediaPipe BlazePose logic
-│   │   ├── compare.py      # DTW logic & biomechanic vectors
-│   │   └── overlay.py      # OpenCV video rendering
-│   └── data/
-│       └── reference_vids/ # Professional videos & JSON frames
-├── requirements.txt
-├── README.md
-└── problem_statement.md
-```

+# Implementation Plan
+## Current Status
+The project has moved from a Taekwondo/Streamlit prototype to a Reflex app that analyzes push-ups. The current production flow lives in the `push_up/*` and `reflex_frontend/*` modules.
+## Main Architecture
+```text
+Reflex UI
+  -> upload student video
+  -> push_up.analysis_service.analyze_pushup()
+  -> VideoProcessor + MediaPipe Pose
+  -> PushUpEvaluator
+  -> PushUpRuleEngine
+  -> frame artifacts + rule-based arrows
+  -> optional NVIDIA VLM feedback
+  -> Reflex result cards
 ```
+## Main Modules
+- `reflex_frontend/state.py`: manages uploads, processing progress, and result state.
+- `reflex_frontend/ui.py`: home page UI, exercise list, push-up analysis page, and per-rep result cards.
+- `push_up/processor.py`: reads the video, detects which way the athlete is facing, flips if needed, and extracts landmarks per frame.
+- `push_up/engine.py`: wraps MediaPipe Pose and computes kinematic metrics such as elbow angle, hip angle, body line angle, and depth signal.
+- `push_up/evaluator.py`: segments reps, aligns them with the template using DTW, and combines the rule score with the DTW score.
+- `push_up/rules.py`: detects technique errors with deterministic thresholds.
+- `push_up/analysis_service.py`: orchestrates the whole flow, builds the payload for the UI, saves artifacts, and draws arrows.
+- `push_up/feedback_graph.py`: calls the text LLM/VLM through NVIDIA's OpenAI-compatible API and validates the JSON feedback.
+- `scripts/run_pushup_eval_tests.py`: batch-tests every video in `data/tests` and writes an aggregated JSON file.
+## Video Analysis Flow
+1. The UI saves the uploaded video to Reflex's temporary directory.
+2. `prepare_template_cache()` ensures the template video has been processed and its landmarks are cached at `data/processed/pushup_template.pkl`.
+3. `VideoProcessor` reads the student video, samples frames, and auto-flips if the athlete faces the opposite direction from the template.
+4. `PoseEngine` extracts 33 pose landmarks and computes kinematics.
+5. `PushUpEvaluator` segments reps based on the depth signal and elbow angle.
+6. Each rep is compared with the golden template using DTW and pose similarity.
+7. `PushUpRuleEngine` runs rule-based checks:
+   - `not_deep_enough`
+   - `hip_sag`
+   - `body_not_straight`
+   - `head_misaligned`
+   - `hip_pike`
+8. `analysis_service` builds the per-rep result cards, main error, rule-based feedback, and summary.
+9. When `save_artifacts=True`, the app saves the student/template frames and draws an arrow on the student frame.
+10. If `NVIDIA_API_KEY` is set, the VLM is called to generate text feedback for reps with errors.
+## Design Decisions
+- The rule engine is the source of truth for the main error, rule score, rule-based feedback, and arrow target.
+- The VLM is not used to decide arrow coordinates because it tends to return wrong positions on the image background.
+- The VLM only generates short Vietnamese feedback based on the student image, the template image, and the rule context.
+- Arrow labels use Vietnamese with diacritics and are rendered with Pillow to avoid OpenCV's Unicode problems.
+- Batch test runs disable the VLM by default so they are repeatable, faster, and do not consume API calls.
+## Next Steps
+- Add expected assertions to the batch test, e.g. `hv01_mong_cao` must contain `hip_pike`.
+- Save image artifacts for test runs selectively instead of for every rep, if storage needs to be reduced.
+- Rename the `llm_*` fields to `vlm_*` in the payload/UI so the names match their current meaning.
+- Add an exercise registry if the app expands to other exercises.
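
Review note: the "expected assertions" item in the plan could look roughly like the sketch below. This is only an illustration, not the repository's code. The `EXPECTED_ERRORS` table, both function names, and the assumption that the batch runner can hand over a set of detected error types per video are all hypothetical; only the `hv01_mong_cao` / `hip_pike` pair comes from the plan itself.

```python
# Hypothetical expectation table: video stem -> error types that must appear.
# Only the hv01_mong_cao / hip_pike pair is taken from the implementation plan.
EXPECTED_ERRORS = {"hv01_mong_cao": {"hip_pike"}}


def missing_expected_errors(video_stem: str, detected_types: set[str]) -> set[str]:
    """Return the expected error types that the analysis did not report."""
    return EXPECTED_ERRORS.get(video_stem, set()) - detected_types


def check_batch(results: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each failing video stem to the error types it is missing."""
    failures = {}
    for stem, detected in results.items():
        missing = missing_expected_errors(stem, detected)
        if missing:
            failures[stem] = missing
    return failures
```

Videos without an entry in the table pass trivially, so expectations can be added one video at a time.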

docs/legacy_agent_notes.md ADDED

@@ -0,0 +1,22 @@

+# Legacy Agent Notes
+This file replaces the historical document `Migrating Agent to OpenAI.md`.
+The old content was a conversation transcript about the agent/OpenAI prototype and Taekwondo. It no longer describes the application's current production flow.
+## Current Status
+- The current production application is a Reflex app for push-ups.
+- The current LLM/VLM integration lives in `push_up/feedback_graph.py`.
+- The current provider uses NVIDIA's OpenAI-compatible API, not a separate agent loop in `src/agent.py`.
+- The rule engine remains the source of truth for errors and scores.
+## When to Revisit the Legacy Material
+It is only relevant if the project returns to:
+- a separate tool-calling agent;
+- the OpenAI API directly instead of the NVIDIA endpoint;
+- analysis for the old Taekwondo prototype.
+For the current roadmap, read `implementation_plan.md` and `LLM_FEEDBACK_NVIDIA_LANGGRAPH.md` instead of this legacy file.

docs/problem_statement.md CHANGED

@@ -1,4 +1,37 @@
-- **Project name:** AI sports coach through video analysis.
-- **Goal:** Address the lack of personalized guidance for sports students in identifying and correcting Taekwondo poomsae movements that deviate from a professional reference video.
-- **Business requirements:** A mobile app that uses AI/Computer Vision to analyze video, compare it with professional reference data, and provide intuitive corrective feedback, including the percentage deviation between the user's video and the reference video, plus a video overlay so users can easily spot errors.
-- **Tech stack:** mediapipe, blazepose, biomechanics analysis, video overlay engine, streamlit, Python 3.14

+# Problem Statement
+## Project Name
+AI Coach - an application that analyzes push-up form from video.
+## Goal
+People training alone often struggle to spot their own technique errors, especially insufficient depth, sagging hips, hips raised too high, a body that is not held straight, or a head dropped too far. This project builds a web application that lets users upload a workout video, compares it with a template video, and returns visual feedback for each rep.
+## Current Scope
+The current version focuses on push-ups:
+- Automatically detect the number of reps in the video.
+- Compare the student's movement sequence with the template video at `data/templates/push_up_template.mp4`.
+- Produce an overall score and a per-rep score.
+- Detect errors with a rule engine based on landmarks and joint angles.
+- Show the student frame and the template frame for each rep.
+- Draw a red arrow at the error location based on rule-based landmarks.
+- Generate concise Vietnamese feedback with a VLM if the NVIDIA API is configured.
+- Save video test results as JSON files for later review.
+## Out of Current Scope
+- No support yet for exercises other than push-ups in the production UI.
+- No real-time video analysis from a camera yet.
+- The VLM is not used to decide the main error or the arrow coordinates. It only supports text feedback.
+- VLM results are not used to replace the rule engine.
+## Success Criteria
+- The app runs with `reflex run`.
+- Users upload a video and receive results for each rep.
+- The main error must come from rule-based analysis and be explainable and testable.
+- Arrows must point at the student's body, not at the image background.
+- The videos in `data/tests` can be run as a batch test that writes a single aggregated JSON file.
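
Review note: since results are saved as JSON for later review, a small helper along these lines might make that review quicker. It is a sketch under assumptions: the function name `weakest_reps` is hypothetical, and it relies only on the `rep_results`, `rep_num`, `score_pct`, and `primary_error` fields that appear in the payload elsewhere in this diff.

```python
import json


def weakest_reps(payload: dict, limit: int = 3) -> list[dict]:
    """Return the lowest-scoring reps from a saved analysis payload."""
    reps = payload.get("rep_results", [])
    # Reps without a score sort last, mirroring the default used in the diff.
    ranked = sorted(reps, key=lambda rep: rep.get("score_pct", 100))
    return [
        {
            "rep_num": rep.get("rep_num"),
            "score_pct": rep.get("score_pct"),
            "primary_error": rep.get("primary_error"),
        }
        for rep in ranked[:limit]
    ]


# Example with an inline payload; a real run would load result.json instead.
sample = json.loads(
    '{"rep_results": ['
    '{"rep_num": 1, "score_pct": 91.0, "primary_error": "none"},'
    '{"rep_num": 2, "score_pct": 62.5, "primary_error": "hip_pike"}]}'
)
print(weakest_reps(sample, limit=1))
```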

push_up/analysis_service.py CHANGED

@@ -9,8 +9,10 @@ import uuid
 import cv2
 import mediapipe as mp
 from push_up.evaluator import PushUpEvaluator
 from push_up.processor import VideoProcessor
@@ -18,6 +20,7 @@ BASE_DIR = Path(__file__).resolve().parent.parent
 TEMPLATE_SOURCE = BASE_DIR / "data" / "templates" / "push_up_template.mp4"
 TEMPLATE_CACHE = BASE_DIR / "data" / "processed" / "pushup_template.pkl"
 DEBUG_ARTIFACTS_ROOT = BASE_DIR / "analysis_artifacts"
 ANALYSIS_DIR = Path("analysis")
 ERROR_GUIDANCE = {
@@ -36,6 +39,14 @@ ERROR_LABELS = {
     "hip_pike": "Nhô mông quá cao",
 }
 SEVERITY_RANK = {"high": 3, "medium": 2, "low": 1}
@@ -96,11 +107,6 @@ def analyze_pushup(
     if not template_video_path.exists():
         template_video_path = TEMPLATE_SOURCE
-    student_data, _ = processor.process_video_from_path(str(student_video_path))
-    result = evaluator.evaluate(expert_data, student_data)
-    if result.get("error"):
-        return {"error": result["error"]}
     run_id = ""
     run_dir: Path | None = None
     if save_artifacts:
@@ -108,39 +114,75 @@ def analyze_pushup(
         run_dir = upload_root / ANALYSIS_DIR / run_id
         run_dir.mkdir(parents=True, exist_ok=True)
     rep_cards = []
     error_counter: Counter[str] = Counter()
     high_severity_reps = 0
     good_reps = 0
     for rep in result["rep_results"]:
-        student_frame = student_data[rep["w_pair"][0]]
         expert_frame = expert_data[rep["w_pair"][1]]
         student_rel = ""
         expert_rel = ""
         if save_artifacts:
             student_artifact = ANALYSIS_DIR / run_id / f"student_rep_{rep['rep_num']}.jpg"
             expert_artifact = ANALYSIS_DIR / run_id / f"expert_rep_{rep['rep_num']}.jpg"
-            _save_pose_frame(
                 processor=processor,
                 video_path=student_video_path,
                 frame_idx=student_frame["frame_idx"],
-                destination=upload_root / student_artifact,
                 flip=student_frame.get("flipped", False),
             )
             _save_pose_frame(
                 processor=processor,
                 video_path=template_video_path,
                 frame_idx=expert_frame["frame_idx"],
-                destination=upload_root / expert_artifact,
                 flip=expert_frame.get("flipped", False),
             )
             student_rel = str(student_artifact)
             expert_rel = str(expert_artifact)
-        rep_errors = [_serialize_error(error) for error in rep["errors"]]
         if rep["score"] >= 0.88 and not rep_errors:
             good_reps += 1
         if any(error["severity"] == "high" for error in rep_errors):
@@ -158,6 +200,12 @@ def analyze_pushup(
                 "error_labels": [error["label"] for error in rep_errors],
                 "primary_error": rep_errors[0]["label"] if rep_errors else "Không có lỗi nghiêm trọng",
                 "feedback": _rep_feedback(rep_errors),
                 "student_frame_path": student_rel,
                 "expert_frame_path": expert_rel,
                 "rule_score_pct": round(float(rep["rule_score"]) * 100, 1),
@@ -199,28 +247,103 @@ def analyze_pushup(
         "rep_results": rep_cards,
         "student_video_path": "",
     }
     if save_artifacts and run_dir is not None:
         payload["student_video_path"] = _safe_relative_path(student_video_path, upload_root)
         with (run_dir / "result.json").open("w", encoding="utf-8") as file:
             json.dump(payload, file, ensure_ascii=False, indent=2)
     return payload
 def _save_pose_frame(
     processor: VideoProcessor,
     video_path: Path,
     frame_idx: int,
     destination: Path,
     flip: bool,
-) -> None:
     destination.parent.mkdir(parents=True, exist_ok=True)
     frame = processor.get_frame(str(video_path), frame_idx, flip=flip)
     if frame is None:
-        return
     _, landmarks = processor.engine.extract_kinematics(frame.copy(), is_static=True)
     if landmarks:
         mp.solutions.drawing_utils.draw_landmarks(
             frame,
@@ -228,6 +351,245 @@ def _save_pose_frame(
             mp.solutions.pose.POSE_CONNECTIONS,
         )
     cv2.imwrite(str(destination), frame)
 def _serialize_error(error: dict) -> dict:
@@ -319,3 +681,10 @@ def _safe_relative_path(path: Path, base: Path) -> str:
         return path.relative_to(base).as_posix()
     except ValueError:
         return path.as_posix()

 import cv2
 import mediapipe as mp
+import numpy as np
 from push_up.evaluator import PushUpEvaluator
+from push_up.feedback_graph import generate_rep_visual_feedback
 from push_up.processor import VideoProcessor
 TEMPLATE_SOURCE = BASE_DIR / "data" / "templates" / "push_up_template.mp4"
 TEMPLATE_CACHE = BASE_DIR / "data" / "processed" / "pushup_template.pkl"
 DEBUG_ARTIFACTS_ROOT = BASE_DIR / "analysis_artifacts"
+VIDEO_TEST_LOG_FILE = BASE_DIR / ".ai-log" / "video_test_runs.jsonl"
 ANALYSIS_DIR = Path("analysis")
 ERROR_GUIDANCE = {
     "hip_pike": "Nhô mông quá cao",
 }
+ARROW_LABELS = {
+    "not_deep_enough": "Ngực chưa đủ thấp",
+    "hip_sag": "Hông bị võng xuống",
+    "body_not_straight": "Hông lệch khỏi trục",
+    "head_misaligned": "Cổ cúi quá mức",
+    "hip_pike": "Hông quá cao",
+}
 SEVERITY_RANK = {"high": 3, "medium": 2, "low": 1}
     if not template_video_path.exists():
         template_video_path = TEMPLATE_SOURCE
     run_id = ""
     run_dir: Path | None = None
     if save_artifacts:
         run_dir = upload_root / ANALYSIS_DIR / run_id
         run_dir.mkdir(parents=True, exist_ok=True)
+    student_data, _ = processor.process_video_from_path(str(student_video_path))
+    result = evaluator.evaluate(expert_data, student_data)
+    if result.get("error"):
+        payload = {
+            "error": result["error"],
+            "exercise": "pushup",
+            "exercise_label": "Hít đất / Push-up",
+            "student_video_path": _safe_relative_path(student_video_path, upload_root),
+        }
+        if save_artifacts and run_dir is not None:
+            with (run_dir / "result.json").open("w", encoding="utf-8") as file:
+                json.dump(payload, file, ensure_ascii=False, indent=2)
+            _write_video_test_run_log(
+                payload=payload,
+                run_id=run_id,
+                run_dir=run_dir,
+                student_video_path=student_video_path,
+                template_video_path=template_video_path,
+            )
+        return payload
     rep_cards = []
     error_counter: Counter[str] = Counter()
     high_severity_reps = 0
     good_reps = 0
     for rep in result["rep_results"]:
+        rep_errors = [_serialize_error(error) for error in rep["errors"]]
+        student_frame = _select_student_frame_for_rep(rep, student_data)
         expert_frame = expert_data[rep["w_pair"][1]]
         student_rel = ""
         expert_rel = ""
+        llm_visual = _empty_rep_visual_feedback()
         if save_artifacts:
             student_artifact = ANALYSIS_DIR / run_id / f"student_rep_{rep['rep_num']}.jpg"
             expert_artifact = ANALYSIS_DIR / run_id / f"expert_rep_{rep['rep_num']}.jpg"
+            student_abs = upload_root / student_artifact
+            expert_abs = upload_root / expert_artifact
+            student_landmarks = _save_pose_frame(
                 processor=processor,
                 video_path=student_video_path,
                 frame_idx=student_frame["frame_idx"],
+                destination=student_abs,
                 flip=student_frame.get("flipped", False),
             )
             _save_pose_frame(
                 processor=processor,
                 video_path=template_video_path,
                 frame_idx=expert_frame["frame_idx"],
+                destination=expert_abs,
                 flip=expert_frame.get("flipped", False),
             )
+            if _should_request_rep_visual_feedback(rep, rep_errors):
+                llm_visual = generate_rep_visual_feedback(
+                    student_image_path=str(student_abs),
+                    expert_image_path=str(expert_abs),
+                    rep_context=_build_rep_visual_context(rep, rep_errors),
+                )
+                rule_arrow = _build_rule_arrow(rep_errors, student_landmarks)
+                if rule_arrow:
+                    llm_visual["arrow"] = rule_arrow
+                    _draw_llm_arrow_on_image(student_abs, rule_arrow)
+                elif llm_visual.get("arrow"):
+                    _draw_llm_arrow_on_image(student_abs, llm_visual["arrow"])
             student_rel = str(student_artifact)
             expert_rel = str(expert_artifact)
         if rep["score"] >= 0.88 and not rep_errors:
             good_reps += 1
         if any(error["severity"] == "high" for error in rep_errors):
                 "error_labels": [error["label"] for error in rep_errors],
                 "primary_error": rep_errors[0]["label"] if rep_errors else "Không có lỗi nghiêm trọng",
                 "feedback": _rep_feedback(rep_errors),
+                "rule_feedback": _rep_feedback(rep_errors),
+                "llm_feedback": llm_visual.get("feedback", ""),
+                "llm_feedback_source": llm_visual.get("source", ""),
+                "llm_feedback_error": llm_visual.get("error", ""),
+                "llm_visual_error_label": llm_visual.get("visual_error_label", ""),
+                "llm_arrow": llm_visual.get("arrow"),
                 "student_frame_path": student_rel,
                 "expert_frame_path": expert_rel,
                 "rule_score_pct": round(float(rep["rule_score"]) * 100, 1),
         "rep_results": rep_cards,
         "student_video_path": "",
     }
+    payload["coach_feedback"] = ""
     if save_artifacts and run_dir is not None:
         payload["student_video_path"] = _safe_relative_path(student_video_path, upload_root)
         with (run_dir / "result.json").open("w", encoding="utf-8") as file:
             json.dump(payload, file, ensure_ascii=False, indent=2)
+        _write_video_test_run_log(
+            payload=payload,
+            run_id=run_id,
+            run_dir=run_dir,
+            student_video_path=student_video_path,
+            template_video_path=template_video_path,
+        )
     return payload
+def _write_video_test_run_log(
+    *,
+    payload: dict,
+    run_id: str,
+    run_dir: Path,
+    student_video_path: Path,
+    template_video_path: Path,
+) -> None:
+    """Append a lightweight per-video test log without copying image artifacts."""
+    try:
+        index_entry = _build_video_test_index_entry(
+            payload=payload,
+            run_id=run_id,
+            run_dir=run_dir,
+            student_video_path=student_video_path,
+            template_video_path=template_video_path,
+        )
+        VIDEO_TEST_LOG_FILE.parent.mkdir(parents=True, exist_ok=True)
+        with VIDEO_TEST_LOG_FILE.open("a", encoding="utf-8") as file:
+            file.write(json.dumps(index_entry, ensure_ascii=False) + "\n")
+    except OSError as exc:
+        print(f"Video test logging failed: {exc}")
+def _build_video_test_index_entry(
+    *,
+    payload: dict,
+    run_id: str,
+    run_dir: Path,
+    student_video_path: Path,
+    template_video_path: Path,
+) -> dict:
+    rep_results = payload.get("rep_results", [])
+    created_at = datetime.now().isoformat(timespec="seconds")
+    return {
+        "ts": created_at,
+        "run_id": run_id,
+        "exercise": payload.get("exercise", "pushup"),
+        "student_video_name": student_video_path.name,
+        "student_video_path": student_video_path.as_posix(),
+        "template_video_path": template_video_path.as_posix(),
+        "overall_score_pct": payload.get("overall_score_pct"),
+        "student_reps": payload.get("student_reps"),
+        "expert_reps": payload.get("expert_reps"),
+        "error": payload.get("error"),
+        "summary": payload.get("summary", ""),
+        "main_errors": payload.get("main_errors", []),
+        "result_json": _project_relative_path(run_dir / "result.json"),
+        "artifact_dir": _project_relative_path(run_dir),
+        "rep_logs": [
+            {
+                "rep_num": rep.get("rep_num"),
+                "score_pct": rep.get("score_pct"),
+                "primary_error": rep.get("primary_error"),
+                "rule_feedback": rep.get("rule_feedback"),
+                "llm_feedback": rep.get("llm_feedback"),
+                "llm_feedback_source": rep.get("llm_feedback_source"),
+                "llm_feedback_error": rep.get("llm_feedback_error"),
+                "student_frame_path": rep.get("student_frame_path", ""),
+                "expert_frame_path": rep.get("expert_frame_path", ""),
+            }
+            for rep in rep_results
+        ],
+    }
 def _save_pose_frame(
     processor: VideoProcessor,
     video_path: Path,
     frame_idx: int,
     destination: Path,
     flip: bool,
+) -> list[dict]:
     destination.parent.mkdir(parents=True, exist_ok=True)
     frame = processor.get_frame(str(video_path), frame_idx, flip=flip)
     if frame is None:
+        return []
     _, landmarks = processor.engine.extract_kinematics(frame.copy(), is_static=True)
+    landmark_points = _landmarks_to_points(landmarks) if landmarks else []
     if landmarks:
         mp.solutions.drawing_utils.draw_landmarks(
             frame,
             mp.solutions.pose.POSE_CONNECTIONS,
         )
     cv2.imwrite(str(destination), frame)
+    return landmark_points
+def _landmarks_to_points(landmarks) -> list[dict]:
+    return [
+        {
+            "x": float(landmark.x),
+            "y": float(landmark.y),
+            "visibility": float(getattr(landmark, "visibility", 1.0)),
+        }
+        for landmark in landmarks.landmark
+    ]
+def _select_student_frame_for_rep(rep: dict, student_data: list[dict]) -> dict:
+    error_frame_indices = []
+    for error in rep.get("errors", []):
+        error_frame_indices.extend(error.get("frames", []))
+    if not error_frame_indices:
+        return student_data[rep["w_pair"][0]]
+    target_frame = error_frame_indices[0]
+    start, end = rep.get("range", (0, len(student_data)))
+    candidates = student_data[start:end] or student_data
+    return min(candidates, key=lambda frame: abs(frame["frame_idx"] - target_frame))
+def _should_request_rep_visual_feedback(rep: dict, rep_errors: list[dict]) -> bool:
+    return bool(rep_errors) or float(rep.get("score", 1.0)) < 0.88
+def _empty_rep_visual_feedback() -> dict:
+    return {
+        "source": "",
+        "is_error": False,
+        "visual_error_label": "",
+        "feedback": "",
+        "arrow": None,
+    }
+def _build_rep_visual_context(rep: dict, rep_errors: list[dict]) -> dict:
+    return {
+        "rep_num": int(rep["rep_num"]),
+        "score_pct": round(float(rep["score"]) * 100, 1),
+        "rule_score_pct": round(float(rep["rule_score"]) * 100, 1),
+        "dtw_score_pct": round(float(rep["dtw_score"]) * 100, 1),
+        "rule_errors": [
+            {
+                "type": error["type"],
+                "label": error["label"],
+                "severity": error["severity"],
+                "guidance": error["guidance"],
+            }
+            for error in rep_errors
+        ],
+        "rule_feedback": _rep_feedback(rep_errors),
+    }
+def _build_rule_arrow(rep_errors: list[dict], landmark_points: list[dict]) -> dict | None:
+    if not rep_errors or not landmark_points:
+        return None
+    for error in rep_errors:
+        error_type = error.get("type", "")
+        target = _arrow_target_for_error(error_type, landmark_points)
+        if target:
+            return {
+                "x": target["x"],
+                "y": target["y"],
+                "label": ARROW_LABELS.get(error_type, error.get("label") or "Vị trí cần sửa"),
+                "source": "rule_landmark",
+                "error_type": error_type,
+            }
+    return None
+def _arrow_target_for_error(error_type: str, landmark_points: list[dict]) -> dict | None:
+    if error_type in {"hip_sag", "hip_pike", "body_not_straight"}:
+        return (
+            _average_pose_points(landmark_points, [23, 24])
+            or _average_pose_points(landmark_points, [11, 12, 23, 24])
+        )
+    if error_type == "not_deep_enough":
+        shoulder = _average_pose_points(landmark_points, [11, 12])
+        hip = _average_pose_points(landmark_points, [23, 24])
+        if shoulder and hip:
+            return _interpolate_pose_points(shoulder, hip, 0.30)
+        return shoulder or _average_pose_points(landmark_points, [13, 14])
+    if error_type == "head_misaligned":
+        return (
+            _average_pose_points(landmark_points, [0, 7, 8])
+            or _average_pose_points(landmark_points, [0, 11, 12])
+        )
+    return _average_pose_points(landmark_points, [11, 12, 23, 24])
+def _average_pose_points(landmark_points: list[dict], indices: list[int]) -> dict | None:
+    valid_points = [
+        point
+        for index in indices
+        if (point := _pose_point(landmark_points, index)) is not None
+        and point["visibility"] >= 0.10
+    ]
+    if not valid_points:
+        valid_points = [
+            point
+            for index in indices
+            if (point := _pose_point(landmark_points, index)) is not None
+        ]
+    if not valid_points:
+        return None
+    return {
+        "x": _clamp_float(sum(point["x"] for point in valid_points) / len(valid_points), 0.02, 0.98),
+        "y": _clamp_float(sum(point["y"] for point in valid_points) / len(valid_points), 0.02, 0.98),
+    }
+def _pose_point(landmark_points: list[dict], index: int) -> dict | None:
+    if index >= len(landmark_points):
+        return None
+    point = landmark_points[index]
+    x = float(point.get("x", -1.0))
+    y = float(point.get("y", -1.0))
+    if not np.isfinite(x) or not np.isfinite(y) or x < -0.05 or x > 1.05 or y < -0.05 or y > 1.05:
+        return None
+    return {
+        "x": _clamp_float(x, 0.02, 0.98),
+        "y": _clamp_float(y, 0.02, 0.98),
+        "visibility": float(point.get("visibility", 1.0)),
+    }
+def _interpolate_pose_points(start: dict, end: dict, ratio: float) -> dict:
+    ratio = _clamp_float(ratio, 0.0, 1.0)
+    return {
+        "x": _clamp_float(start["x"] + (end["x"] - start["x"]) * ratio, 0.02, 0.98),
+        "y": _clamp_float(start["y"] + (end["y"] - start["y"]) * ratio, 0.02, 0.98),
+    }
+def _clamp_float(value: float, lower: float, upper: float) -> float:
+    return max(lower, min(value, upper))
+def _draw_llm_arrow_on_image(image_path: Path, arrow: dict) -> None:
+    frame = cv2.imread(str(image_path))
+    if frame is None:
+        return
+    height, width = frame.shape[:2]
+    target_x = int(float(arrow["x"]) * width)
+    target_y = int(float(arrow["y"]) * height)
+    target_x = _clamp(target_x, 0, width - 1)
+    target_y = _clamp(target_y, 0, height - 1)
+    start_x, start_y = _arrow_start_point(target_x, target_y, width, height)
+    label = str(arrow.get("label") or "Vị trí cần sửa")
+    cv2.arrowedLine(
+        frame,
+        (start_x, start_y),
+        (target_x, target_y),
+        (0, 0, 255),
+        5,
+        tipLength=0.22,
+    )
+    cv2.circle(frame, (target_x, target_y), 22, (0, 0, 255), 5)
+    _draw_label(frame, label, start_x, start_y - 42)
+    cv2.imwrite(str(image_path), frame)
+def _arrow_start_point(target_x: int, target_y: int, width: int, height: int) -> tuple[int, int]:
+    x_offset = _clamp(int(width * 0.24), 150, 240)
+    y_offset = _clamp(int(height * 0.20), 90, 150)
+    start_x = target_x - x_offset
+    if start_x < 20:
+        start_x = target_x + x_offset
+    start_y = target_y - y_offset
+    if start_y < 48:
+        start_y = target_y + y_offset
+    return (
+        _clamp(start_x, 20, width - 40),
+        _clamp(start_y, 48, height - 32),
+    )
+def _draw_label(frame, label: str, x: int, y: int) -> None:
+    height, width = frame.shape[:2]
+    try:
+        from PIL import Image, ImageDraw, ImageFont
+    except ImportError:
+        return
+    font = _load_label_font(ImageFont, _clamp(width // 44, 18, 26))
+    image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
+    draw = ImageDraw.Draw(image)
+    text_bbox = draw.textbbox((0, 0), label, font=font)
+    text_w = text_bbox[2] - text_bbox[0]
+    text_h = text_bbox[3] - text_bbox[1]
+    padding_x = 9
+    padding_y = 7
+    x = _clamp(x, 8, width - text_w - padding_x * 2 - 8)
+    y = _clamp(y, 8, height - text_h - padding_y * 2 - 8)
+    draw.rounded_rectangle(
+        (x, y, x + text_w + padding_x * 2, y + text_h + padding_y * 2),
+        radius=4,
+        fill=(220, 0, 0),
+    )
+    draw.text((x + padding_x, y + padding_y), label, font=font, fill=(255, 255, 255))
+    frame[:, :] = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
+def _load_label_font(image_font, size: int):
+    font_candidates = [
+        Path("C:/Windows/Fonts/arial.ttf"),
+        Path("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf"),
+        Path("/System/Library/Fonts/Supplemental/Arial Unicode.ttf"),
+    ]
+    for font_path in font_candidates:
+        if font_path.exists():
+            return image_font.truetype(str(font_path), size=size)
+    return image_font.load_default()
+def _clamp(value: int, lower: int, upper: int) -> int:
+    if upper < lower:
+        return lower
+    return max(lower, min(value, upper))
 def _serialize_error(error: dict) -> dict:
         return path.relative_to(base).as_posix()
     except ValueError:
         return path.as_posix()
+def _project_relative_path(path: Path) -> str:
+    try:
+        return path.relative_to(BASE_DIR).as_posix()
+    except ValueError:
+        return path.as_posix()
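
Review note: the arrow-target logic for `not_deep_enough` is easier to follow in isolation. The sketch below is a simplified restatement, not the code in this commit: it omits the clamping to `[0.02, 0.98]`, the fallback to low-visibility points, and the elbow fallback, and the function names are hypothetical. The landmark indices (11/12 shoulders, 23/24 hips) and the 0.30 ratio are taken from the diff.

```python
# MediaPipe Pose indices used in the diff: 11/12 = shoulders, 23/24 = hips.
SHOULDERS = [11, 12]
HIPS = [23, 24]


def average_point(points, indices, min_visibility=0.10):
    """Average the visible landmarks at `indices`; None if none qualify."""
    chosen = [
        points[i]
        for i in indices
        if i < len(points) and points[i]["visibility"] >= min_visibility
    ]
    if not chosen:
        return None
    return {
        "x": sum(p["x"] for p in chosen) / len(chosen),
        "y": sum(p["y"] for p in chosen) / len(chosen),
    }


def chest_target(points, ratio=0.30):
    """Point `ratio` of the way from the shoulder midpoint to the hip midpoint."""
    shoulder = average_point(points, SHOULDERS)
    hip = average_point(points, HIPS)
    if shoulder is None or hip is None:
        return shoulder  # simplified fallback; the diff also tries the elbows
    return {
        "x": shoulder["x"] + (hip["x"] - shoulder["x"]) * ratio,
        "y": shoulder["y"] + (hip["y"] - shoulder["y"]) * ratio,
    }
```

Because the target is interpolated between two body landmarks, it stays on the torso, which is the property the design notes ask for.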

push_up/feedback_graph.py ADDED

@@ -0,0 +1,563 @@

+from __future__ import annotations
+import base64
+from pathlib import Path
+from functools import lru_cache
+import ast
+import json
+import logging
+import os
+import re
+import unicodedata
+from typing import Any, TypedDict
+from dotenv import load_dotenv
+load_dotenv()
+logger = logging.getLogger(__name__)
+NVIDIA_BASE_URL = "https://integrate.api.nvidia.com/v1"
+DEFAULT_NVIDIA_MODEL = "deepseek-ai/deepseek-v4-flash"
+DEFAULT_NVIDIA_VISION_MODEL = "meta/llama-3.2-11b-vision-instruct"
+DEFAULT_NVIDIA_VISION_MAX_TOKENS = 1200
+MAX_FEEDBACK_CHARS = 850
+class FeedbackState(TypedDict, total=False):
+    analysis: dict[str, Any]
+    evidence: dict[str, Any]
+    feedback: str
+    warnings: list[str]
+def _env_int(name: str, default: int) -> int:
+    raw_value = os.getenv(name, "").strip()
+    if not raw_value:
+        return default
+    try:
+        return int(raw_value)
+    except ValueError:
+        return default
+def generate_coach_feedback(analysis_result: dict[str, Any]) -> str:
+    """Generate a short Vietnamese coaching note from deterministic analysis.
+    The graph is intentionally narrow: MediaPipe/rules decide the errors, the
+    LLM only rewrites those facts into actionable coaching language.
+    """
+    fallback = _fallback_feedback(analysis_result)
+    api_key = os.getenv("NVIDIA_API_KEY", "").strip()
+    if not api_key or api_key in {"nvapi-...", "nvapi-your-real-nvidia-key"}:
+        return fallback
+    try:
+        graph = _compiled_graph()
+        final_state = graph.invoke(
+            {
+                "analysis": analysis_result,
+                "feedback": fallback,
+                "warnings": [],
+            }
+        )
+        return final_state.get("feedback") or fallback
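+    except Exception as exc:
+        # Log instead of failing silently, matching generate_rep_visual_feedback.
+        logger.warning("Coach feedback graph failed: %s", exc)
+        return fallback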
+def generate_rep_visual_feedback(
+    *,
+    student_image_path: str,
+    expert_image_path: str,
+    rep_context: dict[str, Any],
+) -> dict[str, Any]:
+    """Use a VLM to inspect one student/expert frame pair.
+    Returns compact text feedback. Arrow placement is handled downstream from
+    rule-based landmarks so it stays anchored to the student's body.
+    """
+    api_key = os.getenv("NVIDIA_API_KEY", "").strip()
+    if not api_key or api_key in {"nvapi-...", "nvapi-your-real-nvidia-key"}:
+        return _rep_visual_fallback(rep_context, "missing NVIDIA_API_KEY")
+    try:
+        comparison_image_path = _make_comparison_image(student_image_path, expert_image_path)
+        parsed = _call_and_parse_vlm_for_rep(
+            comparison_image_path=str(comparison_image_path),
+            rep_context=rep_context,
+            api_key=api_key,
+        )
+        return _validate_rep_visual_feedback(parsed, rep_context)
+    except Exception as exc:
+        logger.warning("VLM rep feedback failed: %s", exc)
+        return _rep_visual_fallback(rep_context, str(exc))
+@lru_cache(maxsize=1)
+def _compiled_graph():
+    from langgraph.graph import END, START, StateGraph
+    graph = StateGraph(FeedbackState)
+    graph.add_node("build_evidence", _build_evidence)
+    graph.add_node("generate_feedback", _generate_feedback)
+    graph.add_node("validate_feedback", _validate_feedback)
+    graph.add_edge(START, "build_evidence")
+    graph.add_edge("build_evidence", "generate_feedback")
+    graph.add_edge("generate_feedback", "validate_feedback")
+    graph.add_edge("validate_feedback", END)
+    return graph.compile()
+def _build_evidence(state: FeedbackState) -> FeedbackState:
+    analysis = state["analysis"]
+    rep_results = analysis.get("rep_results", [])
+    weakest_reps = sorted(rep_results, key=lambda rep: rep.get("score_pct", 100))[:3]
+    evidence = {
+        "exercise": analysis.get("exercise_label", "Hít đất"),
+        "overall_score_pct": analysis.get("overall_score_pct", 0),
+        "student_reps": analysis.get("student_reps", 0),
+        "expert_reps": analysis.get("expert_reps", 0),
+        "good_reps": analysis.get("good_reps", 0),
+        "serious_reps": analysis.get("serious_reps", 0),
+        "summary": analysis.get("summary", ""),
+        "main_errors": [
+            {
+                "label": err.get("label", ""),
+                "count": err.get("count", 0),
+                "severity": err.get("severity", ""),
+                "guidance": err.get("guidance", ""),
+            }
+            for err in analysis.get("main_errors", [])[:3]
+        ],
+        "weakest_reps": [
+            {
+                "rep_num": rep.get("rep_num"),
+                "score_pct": rep.get("score_pct"),
+                "status": rep.get("status"),
+                "errors": rep.get("error_labels", []),
+                "feedback": rep.get("feedback", ""),
+            }
+            for rep in weakest_reps
+        ],
+    }
+    return {**state, "evidence": evidence}
+def _generate_feedback(state: FeedbackState) -> FeedbackState:
+    from openai import OpenAI
+    client = OpenAI(
+        base_url=os.getenv("NVIDIA_BASE_URL", NVIDIA_BASE_URL),
+        api_key=os.getenv("NVIDIA_API_KEY"),
+    )
+    model = os.getenv("NVIDIA_MODEL", DEFAULT_NVIDIA_MODEL)
+    prompt = _feedback_prompt(state["evidence"])
+    completion = client.chat.completions.create(
+        model=model,
+        messages=[
+            {
+                "role": "system",
+                "content": (
+                    "Bạn là huấn luyện viên thể hình. Chỉ dùng dữ liệu đã cung cấp, "
+                    "không bịa lỗi mới. Trả lời tiếng Việt, ngắn gọn, tối đa 150 chữ."
+                ),
+            },
+            {"role": "user", "content": prompt},
+        ],
+        temperature=0.25,
+        top_p=0.9,
+        max_tokens=500,
+    )
+    feedback = (completion.choices[0].message.content or "").strip()
+    return {**state, "feedback": feedback}
+def _call_vlm_for_rep(
+    *,
+    comparison_image_path: str,
+    rep_context: dict[str, Any],
+    api_key: str,
+    retry: bool = False,
+) -> str:
+    from openai import OpenAI
+    client = OpenAI(
+        base_url=os.getenv("NVIDIA_BASE_URL", NVIDIA_BASE_URL),
+        api_key=api_key,
+    )
+    model = os.getenv("NVIDIA_VISION_MODEL") or DEFAULT_NVIDIA_VISION_MODEL
+    completion = client.chat.completions.create(
+        model=model,
+        messages=[
+            {
+                "role": "system",
+                "content": (
+                    "Bạn là huấn luyện viên hít đất. Bạn nhận 2 nguồn thông tin: "
+                    "1) ảnh rep của học viên và ảnh mẫu mentor, 2) kết quả rule-based có thể sai. "
+                    "Ưu tiên đánh giá trực tiếp từ ảnh. Nếu rule-based mâu thuẫn với ảnh, hãy theo ảnh. "
+                    "Chỉ trả về JSON hợp lệ, không markdown. "
+                    "Giữ mọi string value thật ngắn để JSON không bị cắt giữa chừng."
+                ),
+            },
+            {
+                "role": "user",
+                "content": [
+                    {
+                        "type": "text",
+                        "text": _rep_vlm_prompt(rep_context, retry=retry),
+                    },
+                    {
+                        "type": "image_url",
+                        "image_url": {"url": _image_as_data_url(comparison_image_path)},
+                    },
+                ],
+            },
+        ],
+        temperature=0.15,
+        top_p=0.9,
+        max_tokens=_env_int("NVIDIA_VISION_MAX_TOKENS", DEFAULT_NVIDIA_VISION_MAX_TOKENS),
+    )
+    if not completion.choices:
+        raise ValueError("empty VLM choices")
+    choice = completion.choices[0]
+    finish_reason = getattr(choice, "finish_reason", "")
+    # Strip first so a whitespace-only reply is treated as empty.
+    content = _extract_message_text(choice.message).strip()
+    if not content:
+        raise ValueError(f"empty VLM content; finish_reason={finish_reason}; model={model}")
+    if finish_reason == "length":
+        raise ValueError(
+            "VLM response was truncated before the JSON completed; "
+            "increase NVIDIA_VISION_MAX_TOKENS or shorten the requested output"
+        )
+    return content
+def _call_and_parse_vlm_for_rep(
+    *,
+    comparison_image_path: str,
+    rep_context: dict[str, Any],
+    api_key: str,
+) -> dict[str, Any]:
+    first_error = ""
+    for retry in (False, True):
+        try:
+            response_text = _call_vlm_for_rep(
+                comparison_image_path=comparison_image_path,
+                rep_context=rep_context,
+                api_key=api_key,
+                retry=retry,
+            )
+            parsed = _parse_json_object(response_text)
+            if not retry and _needs_vlm_retry(parsed):
+                raise ValueError("VLM returned incomplete feedback")
+            return parsed
+        except Exception as exc:
+            first_error = first_error or str(exc)
+            if retry:
+                raise ValueError(f"invalid VLM JSON after retry: {exc}") from exc
+            logger.info("Retrying VLM rep feedback after failed first attempt: %s", exc)
+    raise ValueError(first_error or "unknown VLM parse failure")
+def _extract_message_text(message: Any) -> str:
+    content = getattr(message, "content", "")
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        parts = []
+        for item in content:
+            if isinstance(item, dict):
+                text = item.get("text") or item.get("content")
+                if isinstance(text, str):
+                    parts.append(text)
+            else:
+                text = getattr(item, "text", None)
+                if isinstance(text, str):
+                    parts.append(text)
+        return "\n".join(parts)
+    return str(content or "")
+def _validate_feedback(state: FeedbackState) -> FeedbackState:
+    feedback = (state.get("feedback") or "").strip()
+    fallback = _fallback_feedback(state["analysis"])
+    if not feedback:
+        return {**state, "feedback": fallback}
+    if len(feedback) > MAX_FEEDBACK_CHARS:
+        feedback = feedback[:MAX_FEEDBACK_CHARS].rsplit(" ", 1)[0].strip() + "..."
+    allowed_labels = {
+        err.get("label", "").lower()
+        for err in state["analysis"].get("main_errors", [])
+        if err.get("label")
+    }
+    if allowed_labels and _mentions_unknown_error(feedback, allowed_labels):
+        return {**state, "feedback": fallback}
+    return {**state, "feedback": feedback}
+def _feedback_prompt(evidence: dict[str, Any]) -> str:
+    return f"""
+Bạn là huấn luyện viên thể hình cho người mới tập.
+Dữ liệu phân tích bên dưới đã được tạo bởi hệ thống computer vision và rule engine.
+Yêu cầu:
+- Chỉ dùng lỗi có trong dữ liệu, không tự bịa lỗi mới.
+- Viết tiếng Việt, ngắn gọn, không quá 150 chữ.
+- Có 4 phần: Nhận xét, Lỗi chính, Cách sửa, Bài tập bổ trợ.
+- Nếu không có lỗi nghiêm trọng, tập trung khen form và nhắc giữ nhịp kiểm soát.
+- Không nhắc tới JSON, AI, model, MediaPipe, DTW.
+Dữ liệu:
+{json.dumps(evidence, ensure_ascii=False, indent=2)}
+""".strip()
+def _rep_vlm_prompt(rep_context: dict[str, Any], *, retry: bool = False) -> str:
+    retry_instruction = ""
+    if retry:
+        retry_instruction = """
+Lần gọi trước không trả về JSON hợp lệ. Lần này chỉ trả về đúng MỘT object JSON minified.
+Không dùng markdown, không giải thích, không thêm chữ trước/sau JSON.
+Tất cả key và string value phải dùng dấu nháy kép.
+""".strip()
+    return f"""
+Bạn nhận MỘT ảnh so sánh ghép ngang:
+- Nửa TRÁI là rep của học viên.
+- Nửa PHẢI là rep mẫu của mentor.
+Nguồn thông tin 1 - hình ảnh:
+- So sánh trực tiếp tư thế học viên với mẫu: đầu/cổ, vai, lưng, hông, khuỷu tay, độ sâu khi xuống.
+- Đây là nguồn ưu tiên cao nhất.
+Nguồn thông tin 2 - rule-based:
+- Có thể đúng hoặc sai do góc quay/landmark nhiễu.
+- Chỉ dùng như gợi ý phụ, không được chép lại nếu ảnh không ủng hộ.
+Rule context:
+{json.dumps(rep_context, ensure_ascii=False, indent=2)}
+Trả về JSON đúng schema:
+{{
+  "is_error": true,
+  "visual_error_label": "tên lỗi chính quan sát từ ảnh hoặc 'Không có lỗi rõ ràng'",
+  "diagnosis": "nhận xét ngắn về rep này dựa trên ảnh",
+  "correction": "lời khuyên sửa cụ thể cho lần tập tiếp theo, hoặc 'Tiếp tục giữ form hiện tại' nếu đã đúng",
+  "feedback": "ghép diagnosis + correction thành 1-2 câu tiếng Việt cụ thể cho rep này"
+}}
+Quy tắc:
+- Không trả về tọa độ arrow. Arrow sẽ được hệ thống đặt từ landmark rule-based.
+- diagnosis, correction, feedback đều phải ngắn. Feedback tối đa 2 câu.
+- Feedback không được chỉ là tên lỗi. Phải nói lỗi ảnh hưởng gì và sửa bằng hành động cụ thể.
+- Không trả lời ngoài JSON.
+{retry_instruction}
+""".strip()
+def _image_as_data_url(path: str) -> str:
+    with open(path, "rb") as file:
+        encoded = base64.b64encode(file.read()).decode("ascii")
+    return f"data:image/jpeg;base64,{encoded}"
+def _make_comparison_image(student_image_path: str, expert_image_path: str) -> Path:
+    import cv2
+    import numpy as np
+    student = cv2.imread(student_image_path)
+    expert = cv2.imread(expert_image_path)
+    if student is None:
+        raise ValueError(f"Cannot read student image: {student_image_path}")
+    if expert is None:
+        raise ValueError(f"Cannot read expert image: {expert_image_path}")
+    target_height = min(student.shape[0], expert.shape[0], 720)
+    def resize_to_height(image):
+        ratio = target_height / image.shape[0]
+        width = max(1, int(image.shape[1] * ratio))
+        return cv2.resize(image, (width, target_height), interpolation=cv2.INTER_AREA)
+    student_resized = resize_to_height(student)
+    expert_resized = resize_to_height(expert)
+    separator = np.full((target_height, 10, 3), 255, dtype=np.uint8)
+    combined = np.hstack([student_resized, separator, expert_resized])
+    output_path = Path(student_image_path).with_name(Path(student_image_path).stem + "_vlm_compare.jpg")
+    cv2.imwrite(str(output_path), combined, [int(cv2.IMWRITE_JPEG_QUALITY), 88])
+    return output_path
+def _parse_json_object(text: str) -> dict[str, Any]:
+    cleaned = _strip_json_wrappers(text)
+    candidates = [
+        cleaned,
+        _first_balanced_json_object(cleaned),
+    ]
+    for candidate in candidates:
+        if not candidate:
+            continue
+        parsed = _try_parse_json_object(candidate)
+        if parsed is not None:
+            return parsed
+    preview = cleaned.replace("\n", " ")[:220]
+    raise ValueError(f"VLM returned non-JSON content: {preview or '<empty>'}")
+def _strip_json_wrappers(text: str) -> str:
+    cleaned = (text or "").strip()
+    if cleaned.startswith("```"):
+        cleaned = re.sub(r"^```(?:json)?\s*", "", cleaned, flags=re.IGNORECASE)
+        cleaned = re.sub(r"\s*```$", "", cleaned)
+    return cleaned.strip()
+def _first_balanced_json_object(text: str) -> str:
+    start = text.find("{")
+    if start < 0:
+        return ""
+    depth = 0
+    in_string = False
+    escape = False
+    for index in range(start, len(text)):
+        char = text[index]
+        if in_string:
+            # Escapes only have meaning inside a string literal.
+            if escape:
+                escape = False
+            elif char == "\\":
+                escape = True
+            elif char == '"':
+                in_string = False
+            continue
+        if char == '"':
+            in_string = True
+        elif char == "{":
+            depth += 1
+        elif char == "}":
+            depth -= 1
+            if depth == 0:
+                return text[start : index + 1]
+    return text[start:]
+def _try_parse_json_object(text: str) -> dict[str, Any] | None:
+    for candidate in (text, _repair_loose_json(text)):
+        try:
+            parsed = json.loads(candidate)
+        except json.JSONDecodeError:
+            try:
+                parsed = ast.literal_eval(candidate)
+            except (ValueError, SyntaxError):
+                continue
+        if isinstance(parsed, dict):
+            return parsed
+    return None
+def _repair_loose_json(text: str) -> str:
+    repaired = text.strip()
+    repaired = re.sub(r"([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)\s*:", r'\1"\2":', repaired)
+    repaired = re.sub(r"\bTrue\b", "true", repaired)
+    repaired = re.sub(r"\bFalse\b", "false", repaired)
+    repaired = re.sub(r"\bNone\b", "null", repaired)
+    return repaired
+def _validate_rep_visual_feedback(
+    parsed: dict[str, Any],
+    rep_context: dict[str, Any],
+) -> dict[str, Any]:
+    diagnosis = str(parsed.get("diagnosis") or "").strip()
+    correction = str(parsed.get("correction") or "").strip()
+    feedback = str(parsed.get("feedback") or "").strip()
+    if len(feedback) < 35 and (diagnosis or correction):
+        feedback = " ".join(part for part in (diagnosis, correction) if part)
+    if not feedback:
+        return _rep_visual_fallback(rep_context, "empty VLM feedback")
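+    is_error = parsed.get("is_error", True)
+    if isinstance(is_error, str):
+        # bool("false") is True, so map quoted booleans from the model explicitly.
+        is_error = is_error.strip().lower() not in {"", "false", "0", "no"}
+    return {
+        "source": "vlm",
+        "is_error": bool(is_error),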
+        "visual_error_label": str(parsed.get("visual_error_label") or "").strip(),
+        "feedback": feedback[:MAX_FEEDBACK_CHARS],
+        "arrow": None,
+        "error": "",
+    }
+def _rep_visual_fallback(rep_context: dict[str, Any], reason: str = "") -> dict[str, Any]:
+    return {
+        "source": "fallback",
+        "is_error": bool(rep_context.get("rule_errors")),
+        "visual_error_label": "",
+        "feedback": "",
+        "arrow": None,
+        "error": reason,
+    }
+def _needs_vlm_retry(parsed: dict[str, Any]) -> bool:
+    feedback = str(parsed.get("feedback") or parsed.get("diagnosis") or "").strip()
+    return len(feedback) < 20
+def _is_usable_arrow_point(x: float, y: float) -> bool:
+    # Reject common placeholder/edge coordinates that point at image corners.
+    return 0.03 <= x <= 0.97 and 0.03 <= y <= 0.97
+def _safe_arrow_label(label: str) -> str:
+    ascii_label = unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode("ascii")
+    cleaned = re.sub(r"[^A-Za-z0-9 _-]+", "", ascii_label).strip()
+    return (cleaned or "Can sua")[:18]
+def _fallback_feedback(analysis: dict[str, Any]) -> str:
+    main_errors = analysis.get("main_errors", [])
+    if not main_errors:
+        return (
+            "Nhận xét: Form tổng thể ổn định, chưa phát hiện lỗi nghiêm trọng.\n"
+            "Lỗi chính: Chưa có lỗi nổi bật.\n"
+            "Cách sửa: Tiếp tục giữ thân người thẳng và kiểm soát nhịp xuống-lên.\n"
+            "Bài tập bổ trợ: Plank 2 hiệp, mỗi hiệp 30 giây."
+        )
+    top_error = main_errors[0]
+    return (
+        f"Nhận xét: Bạn hoàn thành {analysis.get('student_reps', 0)} rep với điểm "
+        f"{analysis.get('overall_score_pct', 0)}%.\n"
+        f"Lỗi chính: {top_error.get('label', 'Form chưa ổn định')} xuất hiện "
+        f"{top_error.get('count', 0)} lần.\n"
+        f"Cách sửa: {top_error.get('guidance', 'Giữ nhịp chậm và kiểm soát thân người tốt hơn.')}\n"
+        "Bài tập bổ trợ: Plank 2 hiệp, mỗi hiệp 30 giây trước khi tập lại."
+    )
+def _mentions_unknown_error(feedback: str, allowed_labels: set[str]) -> bool:
+    known_error_words = {
+        "võng lưng": "võng lưng",
+        "nhô mông": "nhô mông",
+        "mông quá cao": "nhô mông",
+        "cúi đầu": "cúi đầu",
+        "gập cổ": "cúi đầu",
+        "chưa hạ": "chưa hạ",
+        "chưa xuống": "chưa hạ",
+        "cơ thể chưa giữ thẳng": "cơ thể chưa giữ thẳng",
+    }
+    text = feedback.lower()
+    for phrase, normalized in known_error_words.items():
+        if phrase in text and not any(normalized in label for label in allowed_labels):
+            return True
+    return False
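
Review note on the JSON helpers in `push_up/feedback_graph.py`: the parsing path (`_strip_json_wrappers`, `_first_balanced_json_object`, `_try_parse_json_object`) can be checked without calling the VLM. The block below is a minimal standalone sketch of the same idea, not the committed code. The name `extract_json_object` is hypothetical, and unlike the committed helpers it returns `None` for unbalanced or unparseable input instead of attempting a repair.

```python
import json
import re
from typing import Any, Optional


def extract_json_object(text: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object found in noisy model output, or None."""
    cleaned = (text or "").strip()
    # Drop a leading ```json fence and a trailing ``` fence if present.
    cleaned = re.sub(r"^```(?:json)?\s*", "", cleaned, flags=re.IGNORECASE)
    cleaned = re.sub(r"\s*```$", "", cleaned).strip()

    start = cleaned.find("{")
    if start < 0:
        return None

    depth = 0
    in_string = False
    escape = False
    for index in range(start, len(cleaned)):
        char = cleaned[index]
        if in_string:
            # Braces and quotes inside a string literal do not affect nesting.
            if escape:
                escape = False
            elif char == "\\":
                escape = True
            elif char == '"':
                in_string = False
            continue
        if char == '"':
            in_string = True
        elif char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                try:
                    parsed = json.loads(cleaned[start : index + 1])
                except json.JSONDecodeError:
                    return None
                return parsed if isinstance(parsed, dict) else None
    return None  # Unbalanced braces, for example a truncated response.
```

A call such as `extract_json_object('Sure! {"a": {"b": 1}} trailing')` exercises the brace counting, while a truncated reply like `'{"a": "unterminated'` yields `None`, which is the case the retry prompt in this commit is meant to recover from.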

reflex_frontend/state.py CHANGED

@@ -44,6 +44,7 @@ class AnalysisState(rx.State):
     good_reps: int = 0
     serious_reps: int = 0
     summary: str = ""
     main_errors: list[dict] = []
     rep_results: list[dict] = []
@@ -110,6 +111,7 @@ class AnalysisState(rx.State):
         self.good_reps = 0
         self.serious_reps = 0
         self.summary = ""
         self.main_errors = []
         self.rep_results = []
         self.is_uploading = False
@@ -130,6 +132,7 @@ class AnalysisState(rx.State):
         self.good_reps = 0
         self.serious_reps = 0
         self.summary = ""
         self.main_errors = []
         self.rep_results = []
@@ -167,6 +170,7 @@ class AnalysisState(rx.State):
                 self.progress_value = 0
                 self.main_errors = []
                 self.rep_results = []
             return rx.toast(result["error"], level="error")
         rep_results = []
@@ -188,6 +192,7 @@ class AnalysisState(rx.State):
             self.good_reps = result["good_reps"]
             self.serious_reps = result["serious_reps"]
             self.summary = result["summary"]
             self.main_errors = result["main_errors"]
             self.rep_results = rep_results
             self.analysis_artifact_relpaths = artifact_paths
@@ -226,20 +231,36 @@ def _cleanup_stale_temp_files(max_age_seconds: int = 6 * 60 * 60) -> None:
     except OSError:
         return
     _cleanup_stale_files(upload_root / "students", max_age_seconds)
     _cleanup_stale_files(upload_root / "analysis", max_age_seconds)
 def _cleanup_stale_files(directory: Path, max_age_seconds: int) -> None:
-    if not directory.exists():
         return
     cutoff = time.time() - max_age_seconds
-    for path in directory.rglob("*"):
-        if path.is_file() and path.stat().st_mtime < cutoff:
             parent = path.parent
             _delete_path(str(path))
             _delete_empty_dir(parent)
 def _delete_empty_dir(path: Path) -> None:

     good_reps: int = 0
     serious_reps: int = 0
     summary: str = ""
+    coach_feedback: str = ""
     main_errors: list[dict] = []
     rep_results: list[dict] = []
         self.good_reps = 0
         self.serious_reps = 0
         self.summary = ""
+        self.coach_feedback = ""
         self.main_errors = []
         self.rep_results = []
         self.is_uploading = False
         self.good_reps = 0
         self.serious_reps = 0
         self.summary = ""
+        self.coach_feedback = ""
         self.main_errors = []
         self.rep_results = []
                 self.progress_value = 0
                 self.main_errors = []
                 self.rep_results = []
+                self.coach_feedback = ""
             return rx.toast(result["error"], level="error")
         rep_results = []
             self.good_reps = result["good_reps"]
             self.serious_reps = result["serious_reps"]
             self.summary = result["summary"]
+            self.coach_feedback = result.get("coach_feedback", "")
             self.main_errors = result["main_errors"]
             self.rep_results = rep_results
             self.analysis_artifact_relpaths = artifact_paths
     except OSError:
         return
+    try:
+        upload_root.mkdir(parents=True, exist_ok=True)
+    except OSError:
+        return
     _cleanup_stale_files(upload_root / "students", max_age_seconds)
     _cleanup_stale_files(upload_root / "analysis", max_age_seconds)
 def _cleanup_stale_files(directory: Path, max_age_seconds: int) -> None:
+    try:
+        directory.mkdir(parents=True, exist_ok=True)
+    except OSError:
         return
     cutoff = time.time() - max_age_seconds
+    try:
+        paths = list(directory.rglob("*"))
+    except OSError:
+        return
+    for path in paths:
+        try:
+            if not path.is_file() or path.stat().st_mtime >= cutoff:
+                continue
             parent = path.parent
             _delete_path(str(path))
             _delete_empty_dir(parent)
+        except OSError:
+            continue
 def _delete_empty_dir(path: Path) -> None:
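
Review note on the cleanup change in `reflex_frontend/state.py`: the new version wraps each filesystem call in `try/except OSError` so that one vanished or locked file no longer aborts the whole sweep. The block below is a small standalone sketch of that pattern under simplified assumptions. `remove_stale_files` is a hypothetical name, it calls `Path.unlink` directly instead of the project's `_delete_path`, and it does not remove empty parent directories.

```python
import os
import tempfile
import time
from pathlib import Path


def remove_stale_files(directory: Path, max_age_seconds: int) -> list[Path]:
    """Delete files older than max_age_seconds and return the removed paths."""
    removed: list[Path] = []
    cutoff = time.time() - max_age_seconds
    try:
        paths = list(directory.rglob("*"))
    except OSError:
        return removed
    for path in paths:
        try:
            if not path.is_file() or path.stat().st_mtime >= cutoff:
                continue
            path.unlink()
            removed.append(path)
        except OSError:
            # The file may have vanished or be locked; skip it and keep going.
            continue
    return removed


# Demo: one file aged two hours, one fresh file, one-hour retention.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old_file = root / "students" / "old.mp4"
    new_file = root / "students" / "new.mp4"
    old_file.parent.mkdir(parents=True)
    old_file.write_bytes(b"x")
    new_file.write_bytes(b"x")
    two_hours_ago = time.time() - 2 * 60 * 60
    os.utime(old_file, (two_hours_ago, two_hours_ago))
    removed = remove_stale_files(root, max_age_seconds=60 * 60)
    remaining = sorted(p.name for p in root.rglob("*") if p.is_file())
```

Listing the paths before the loop, as the commit does, avoids iterating a directory tree while deleting from it.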

reflex_frontend/ui.py CHANGED

@@ -589,6 +589,11 @@ def results_panel() -> rx.Component:
             padding="16px",
             width="100%",
         ),
         rx.vstack(
             rx.foreach(AnalysisState.rep_results, rep_result_item),
             spacing="3",
@@ -602,6 +607,47 @@ def results_panel() -> rx.Component:
     )
 def metric(title: str, value, label: str) -> rx.Component:
     return rx.box(
         rx.text(title, color=COLORS["muted"], font_size="0.84rem"),
@@ -639,13 +685,21 @@ def rep_result_item(rep: dict) -> rx.Component:
                 ),
                 rx.fragment(),
             ),
-            rx.box(
-                rx.text(rep["feedback"], color=COLORS["muted"],
-                        font_size="0.93rem", line_height="1.55"),
-                background=COLORS["surface_soft"],
-                border_radius="12px",
-                padding="12px",
-                width="100%",
             ),
             spacing="3",
             align="start",
@@ -658,6 +712,31 @@ def rep_result_item(rep: dict) -> rx.Component:
     )
 def frame_image(title: str, path) -> rx.Component:
     return rx.box(
         rx.text(title, font_weight="700",

             padding="16px",
             width="100%",
         ),
+        rx.cond(
+            AnalysisState.coach_feedback != "",
+            coach_feedback_card(),
+            rx.fragment(),
+        ),
         rx.vstack(
             rx.foreach(AnalysisState.rep_results, rep_result_item),
             spacing="3",
     )
+def coach_feedback_card() -> rx.Component:
+    return rx.box(
+        rx.vstack(
+            rx.hstack(
+                rx.box(
+                    rx.icon("sparkles", size=18, color="white"),
+                    width="34px",
+                    height="34px",
+                    border_radius="10px",
+                    display="flex",
+                    align_items="center",
+                    justify_content="center",
+                    background=COLORS["blue"],
+                ),
+                rx.vstack(
+                    rx.text("Nhận xét từ AI Coach", font_weight="800"),
+                    rx.text("Dựa trên lỗi từng rep và điểm tổng quan",
+                            color=COLORS["muted"], font_size="0.86rem"),
+                    spacing="1",
+                    align="start",
+                ),
+                spacing="3",
+                align="center",
+            ),
+            rx.text(
+                AnalysisState.coach_feedback,
+                color=COLORS["text"],
+                line_height="1.65",
+                white_space="pre-line",
+            ),
+            spacing="3",
+            align="start",
+        ),
+        background="linear-gradient(135deg, #eff6ff 0%, #ffffff 70%)",
+        border=f"1px solid {COLORS['blue_soft']}",
+        border_radius="16px",
+        padding="16px",
+        width="100%",
+    )
 def metric(title: str, value, label: str) -> rx.Component:
     return rx.box(
         rx.text(title, color=COLORS["muted"], font_size="0.84rem"),
                 ),
                 rx.fragment(),
             ),
+            rx.cond(
+                rep["llm_feedback"] != "",
+                feedback_note(
+                    "Feedback từ VLM",
+                    "Dựa trên ảnh học viên, ảnh mẫu và rule-based context",
+                    rep["llm_feedback"],
+                    "info",
+                ),
+                rx.fragment(),
+            ),
+            feedback_note(
+                "Feedback rule-based",
+                "Dựa trên landmark, góc khớp và ngưỡng kỹ thuật",
+                rep["rule_feedback"],
+                "neutral",
             ),
             spacing="3",
             align="start",
     )
+def feedback_note(title: str, subtitle: str, content, tone: str) -> rx.Component:
+    bg = COLORS["blue_soft"] if tone == "info" else COLORS["surface_soft"]
+    border = COLORS["blue_soft"] if tone == "info" else COLORS["border"]
+    return rx.box(
+        rx.vstack(
+            rx.hstack(
+                rx.text(title, font_weight="800", font_size="0.92rem"),
+                badge("VLM" if tone == "info" else "Rule", "info" if tone == "info" else "neutral"),
+                spacing="2",
+                align="center",
+            ),
+            rx.text(subtitle, color=COLORS["muted"], font_size="0.82rem"),
+            rx.text(content, color=COLORS["muted"],
+                    font_size="0.93rem", line_height="1.55"),
+            spacing="2",
+            align="start",
+        ),
+        background=bg,
+        border=f"1px solid {border}",
+        border_radius="12px",
+        padding="12px",
+        width="100%",
+    )
 def frame_image(title: str, path) -> rx.Component:
     return rx.box(
         rx.text(title, font_weight="700",

requirements.txt CHANGED

@@ -2,11 +2,12 @@
 fastdtw>=0.3.4
 google-genai>=0.1.0
 httpx>=0.28.0
-langchain-nvidia-ai-endpoints>=0.3.0
 langgraph>=1.0.0
 mediapipe==0.10.11
 numpy>=2.4.4
 opencv-python-headless>=4.13.0.92
 plotly>=6.0.0
 python-dotenv>=1.0.0
 reflex>=0.8.18

 fastdtw>=0.3.4
 google-genai>=0.1.0
 httpx>=0.28.0
 langgraph>=1.0.0
 mediapipe==0.10.11
 numpy>=2.4.4
+openai>=2.0.0
 opencv-python-headless>=4.13.0.92
+Pillow>=11.0.0
 plotly>=6.0.0
 python-dotenv>=1.0.0
 reflex>=0.8.18

scripts/log_hook.py CHANGED

@@ -16,6 +16,7 @@ from pathlib import Path
 # Vietnam timezone (used by existing logs)
 VN_TZ = timezone(timedelta(hours=7))
 # Typical artifacts when UTF-8 bytes are decoded with a legacy codepage first.
 MOJIBAKE_TOKENS = ('Ã', 'Â', 'Ä', 'Å', 'Æ', 'â€', 'á»')
@@ -97,6 +98,7 @@ def _get_git_metadata() -> dict:
         except Exception:
             pass
     return metadata
@@ -113,6 +115,130 @@ def _write_entry(entry: dict) -> None:
         raise
 def _make_entry(prompt: str, response: str, tool: str = 'manual') -> dict:
     meta = _get_git_metadata()
     return {
@@ -142,8 +268,16 @@ def main() -> None:
         return
     # 2️⃣ Otherwise read from stdin (piped JSON). If nothing comes in, exit silently.
     raw = _read_stdin_text().strip()
     if not raw:
         sys.exit(0)
     raw = _fix_mojibake_text(raw)
@@ -152,12 +286,6 @@ def main() -> None:
     except json.JSONDecodeError:
         sys.exit(0)
-    # Detect tool name – honour explicit flag if provided.
-    parser_tool = None
-    if '--tool' in sys.argv:
-        idx = sys.argv.index('--tool')
-        if idx + 1 < len(sys.argv):
-            parser_tool = sys.argv[idx + 1]
     tool = parser_tool or os.getenv('AI_TOOL_NAME', 'manual')
     meta = _get_git_metadata()

 # Vietnam timezone (used by existing logs)
 VN_TZ = timezone(timedelta(hours=7))
+DEFAULT_STUDENT_EMAIL = 'akirahoang617@gmail.com'
 # Typical artifacts when UTF-8 bytes are decoded with a legacy codepage first.
 MOJIBAKE_TOKENS = ('Ã', 'Â', 'Ä', 'Å', 'Æ', 'â€', 'á»')
         except Exception:
             pass
+    metadata['student'] = os.getenv('AI_LOG_STUDENT_EMAIL') or DEFAULT_STUDENT_EMAIL or metadata['student']
     return metadata
         raise
+def _read_existing_keys() -> set[tuple[str, str, str]]:
+    log_file = Path(os.getenv('AI_LOG_DIR', '.ai-log')) / 'session.jsonl'
+    keys: set[tuple[str, str, str]] = set()
+    if not log_file.exists():
+        return keys
+    try:
+        with open(log_file, 'r', encoding='utf-8') as f:
+            for line in f:
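+                try:
+                    item = json.loads(line)
+                except json.JSONDecodeError:
+                    continue
+                if not isinstance(item, dict):
+                    # Skip valid JSON that is not an object (e.g. a bare list or string).
+                    continue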
+                keys.add((
+                    str(item.get('tool', '')),
+                    str(item.get('session_id', '')),
+                    str(item.get('prompt', '')),
+                ))
+    except Exception:
+        pass
+    return keys
+def _extract_text_from_content(content) -> str:
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        parts = []
+        for item in content:
+            if isinstance(item, dict) and isinstance(item.get('text'), str):
+                parts.append(item['text'])
+        return '\n'.join(parts)
+    return ''
+def _latest_codex_session_file() -> Path | None:
+    root = Path(os.getenv('CODEX_HOME', Path.home() / '.codex')) / 'sessions'
+    if not root.exists():
+        return None
+    candidates = list(root.rglob('rollout-*.jsonl'))
+    if not candidates:
+        return None
+    return max(candidates, key=lambda path: path.stat().st_mtime)
+def _extract_codex_prompt_from_session(path: Path) -> dict | None:
+    cwd = str(Path.cwd()).lower()
+    session_id = path.stem
+    current_cwd = ''
+    last_prompt = ''
+    model = ''
+    try:
+        with open(path, 'r', encoding='utf-8') as f:
+            for line in f:
+                try:
+                    item = json.loads(line)
+                except json.JSONDecodeError:
+                    continue
+                payload = item.get('payload') or {}
+                if item.get('type') == 'turn_context':
+                    current_cwd = str(payload.get('cwd', '')).lower()
+                    model = str(payload.get('model') or model)
+                    continue
+                if current_cwd and current_cwd != cwd:
+                    continue
+                if payload.get('type') == 'user_message':
+                    message = str(payload.get('message', '')).strip()
+                    if message:
+                        last_prompt = message
+                    continue
+                if payload.get('type') == 'message' and payload.get('role') == 'user':
+                    message = _extract_text_from_content(payload.get('content')).strip()
+                    if message:
+                        last_prompt = message
+    except Exception:
+        return None
+    if not last_prompt:
+        return None
+    return {
+        'session_id': session_id,
+        'model': model,
+        'prompt': last_prompt,
+    }
+def _log_latest_codex_session() -> bool:
+    session_file = _latest_codex_session_file()
+    if not session_file:
+        return False
+    extracted = _extract_codex_prompt_from_session(session_file)
+    if not extracted:
+        return False
+    prompt = _fix_mojibake_text(extracted['prompt'])[:1000]
+    key = ('codex', extracted['session_id'], prompt)
+    if key in _read_existing_keys():
+        return True
+    meta = _get_git_metadata()
+    entry = {
+        'ts': datetime.now(VN_TZ).isoformat(),
+        'tool': 'codex',
+        'event': 'codex_session_fallback',
+        'session_id': extracted['session_id'],
+        'model': extracted['model'],
+        'repo': meta['repo'],
+        'branch': meta['branch'],
+        'commit': meta['commit'],
+        'student': meta['student'],
+        'prompt': prompt,
+        'response_summary': '',
+    }
+    _write_entry(entry)
+    return True
 def _make_entry(prompt: str, response: str, tool: str = 'manual') -> dict:
     meta = _get_git_metadata()
     return {
         return
     # 2️⃣ Otherwise read from stdin (piped JSON). If nothing comes in, exit silently.
+    parser_tool = None
+    if '--tool' in sys.argv:
+        idx = sys.argv.index('--tool')
+        if idx + 1 < len(sys.argv):
+            parser_tool = sys.argv[idx + 1]
     raw = _read_stdin_text().strip()
     if not raw:
+        if parser_tool == 'codex':
+            _log_latest_codex_session()
         sys.exit(0)
     raw = _fix_mojibake_text(raw)
     except json.JSONDecodeError:
         sys.exit(0)
     tool = parser_tool or os.getenv('AI_TOOL_NAME', 'manual')
     meta = _get_git_metadata()
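
Review note on the Codex fallback in `scripts/log_hook.py`: `_log_latest_codex_session` avoids duplicate entries by comparing a `(tool, session_id, prompt)` key against the keys already present in `session.jsonl`. The block below is a minimal sketch of that deduplication, assuming a plain JSONL file. The names `entry_key`, `read_existing_keys`, and `append_if_new` are hypothetical and only illustrate the approach.

```python
import json
import tempfile
from pathlib import Path


def entry_key(entry: dict) -> tuple[str, str, str]:
    """Build the deduplication key used to compare log entries."""
    return (
        str(entry.get("tool", "")),
        str(entry.get("session_id", "")),
        str(entry.get("prompt", "")),
    )


def read_existing_keys(log_file: Path) -> set[tuple[str, str, str]]:
    """Collect keys from a JSONL log, skipping malformed or non-object lines."""
    keys: set[tuple[str, str, str]] = set()
    if not log_file.exists():
        return keys
    with open(log_file, "r", encoding="utf-8") as handle:
        for line in handle:
            try:
                item = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(item, dict):
                keys.add(entry_key(item))
    return keys


def append_if_new(log_file: Path, entry: dict) -> bool:
    """Append the entry unless an entry with the same key is already logged."""
    if entry_key(entry) in read_existing_keys(log_file):
        return False
    with open(log_file, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return True


# Demo: the log starts with one malformed line, then the same entry is offered twice.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "session.jsonl"
    log_path.write_text("not json\n", encoding="utf-8")
    entry = {"tool": "codex", "session_id": "rollout-1", "prompt": "fix push-up scoring"}
    first = append_if_new(log_path, entry)
    second = append_if_new(log_path, entry)
    line_count = len(log_path.read_text(encoding="utf-8").splitlines())
```

Re-reading the whole log on every write is simple and adequate for small session logs; a larger log would probably call for a cached key set.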

scripts/run_pushup_eval_tests.py CHANGED Viewed

@@ -1,9 +1,16 @@
 #!/usr/bin/env python3
 from __future__ import annotations
 from pathlib import Path
 import sys
 import time
 ROOT = Path(__file__).resolve().parent.parent
 if str(ROOT) not in sys.path:
@@ -13,74 +20,269 @@ from push_up.analysis_service import TEMPLATE_SOURCE, analyze_pushup, prepare_te
 from push_up.processor import VideoProcessor
-TEST_CASES = [
-    ("template_vs_template", TEMPLATE_SOURCE),
-    ("hv01_cuoi_dau_thap", ROOT / "data" / "tests" / "hv01_cuoi_dau_thap.mp4"),
-    ("hv01_hit_nua_rep", ROOT / "data" / "tests" / "hv01_hit_nua_rep.mp4"),
-    ("hv01_khong_gong_bung", ROOT / "data" / "tests" / "hv01_khong_gong_bung.mp4"),
-    ("hv01_mong_cao", ROOT / "data" / "tests" / "hv01_mong_cao.mp4"),
-    ("hv01_rep_sai_rep_dung", ROOT / "data" / "tests" / "hv01_rep_sai_rep_dung.mp4"),
-    ("hv01_tap_dung", ROOT / "data" / "tests" / "hv01_tap_dung.mp4"),
-    ("hv02_tap_dung", ROOT / "data" / "tests" / "hv02_tap_dung.mp4"),
-    ("vo_teakwondo", ROOT / "data" / "tests" / "vo_teakwondo.mp4"),
-]
-def _orientation_label(video_path: Path) -> str:
-    processor = VideoProcessor()
-    needs_flip = processor._detect_orientation(str(video_path))
-    return "head-left -> flipped" if needs_flip else "head-right/no-flip"
 def main() -> int:
-    print(f"[setup] template: {TEMPLATE_SOURCE}")
-    prepare_template_cache()
-    missing = [path for _, path in TEST_CASES if not path.exists()]
-    if missing:
-        print("[error] Missing test videos:")
-        for path in missing:
-            print(f"  - {path}")
         return 1
-    for label, video_path in TEST_CASES:
         print("=" * 88)
-        print(f"[case] {label}")
-        print(f"[compare] template={TEMPLATE_SOURCE.name} vs student={video_path.name}")
-        print(f"[orientation] {video_path.name}: {_orientation_label(video_path)}")
         started = time.perf_counter()
-        result = analyze_pushup(video_path, save_artifacts=False)
-        elapsed = time.perf_counter() - started
-        if result.get("error"):
-            print(f"[result] ERROR: {result['error']}")
-            print(f"[time] {elapsed:.1f}s")
-            continue
         print(
-            "[result] "
-            f"overall={result['overall_score_pct']}%, "
-            f"student_reps={result['student_reps']}, "
-            f"expert_reps={result['expert_reps']}, "
-            f"good_reps={result['good_reps']}, "
-            f"serious_reps={result['serious_reps']}"
         )
-        print(f"[summary] {result['summary']}")
-        print("[rep_details]")
-        for rep in result["rep_results"]:
-            errors = ", ".join(rep["error_labels"]) if rep["error_labels"] else "none"
-            print(
-                "  "
-                f"rep={rep['rep_num']:02d} "
-                f"score={rep['score_pct']:>5}% "
-                f"rule={rep['rule_score_pct']:>5}% "
-                f"dtw={rep['dtw_score_pct']:>5}% "
-                f"status={rep['status']} "
-                f"errors={errors}"
-            )
-        print(f"[time] {elapsed:.1f}s")
-    return 0
 if __name__ == "__main__":

 #!/usr/bin/env python3
 from __future__ import annotations
+import argparse
+from collections import Counter
+from datetime import datetime
+import json
+import os
 from pathlib import Path
 import sys
 import time
+from typing import Any
 ROOT = Path(__file__).resolve().parent.parent
 if str(ROOT) not in sys.path:
 from push_up.processor import VideoProcessor
+VIDEO_EXTENSIONS = {".mp4", ".mov", ".m4v", ".avi", ".webm"}
+DEFAULT_TESTS_DIR = ROOT / "data" / "tests"
+DEFAULT_OUTPUT_DIR = ROOT / "analysis_artifacts" / "video_test_runs"
+EXPECTED_REJECTION_LABELS = {"vo_teakwondo"}
+def parse_args() -> argparse.Namespace:
+    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+    parser = argparse.ArgumentParser(
+        description="Run push-up analysis against every video in data/tests and write one JSON result file."
+    )
+    parser.add_argument(
+        "--tests-dir",
+        type=Path,
+        default=DEFAULT_TESTS_DIR,
+        help="Directory containing test videos. Default: data/tests",
+    )
+    parser.add_argument(
+        "--output",
+        type=Path,
+        default=DEFAULT_OUTPUT_DIR / f"pushup_eval_{timestamp}.json",
+        help="Single JSON file to write. Default: analysis_artifacts/video_test_runs/pushup_eval_<timestamp>.json",
+    )
+    parser.add_argument(
+        "--artifact-root",
+        type=Path,
+        default=DEFAULT_OUTPUT_DIR / f"artifacts_{timestamp}",
+        help="Root directory for annotated images when --save-artifacts is used.",
+    )
+    parser.add_argument(
+        "--save-artifacts",
+        action="store_true",
+        help="Save per-rep annotated student/expert frames. Also enables deterministic rule-based arrows.",
+    )
+    parser.add_argument(
+        "--enable-vlm",
+        action="store_true",
+        help="Allow NVIDIA VLM calls for per-rep text feedback. By default VLM is disabled for repeatable tests.",
+    )
+    parser.add_argument(
+        "--include-template",
+        action="store_true",
+        help="Also test template video against itself.",
+    )
+    return parser.parse_args()
 def main() -> int:
+    args = parse_args()
+    tests_dir = (ROOT / args.tests_dir).resolve() if not args.tests_dir.is_absolute() else args.tests_dir
+    output_path = (ROOT / args.output).resolve() if not args.output.is_absolute() else args.output
+    artifact_root = (
+        (ROOT / args.artifact_root).resolve() if not args.artifact_root.is_absolute() else args.artifact_root
+    )
+    if not tests_dir.exists():
+        print(f"[error] Test directory does not exist: {tests_dir}")
+        return 1
+    videos = discover_videos(tests_dir)
+    if args.include_template:
+        videos = [TEMPLATE_SOURCE] + videos
+    if not videos:
+        print(f"[error] No test videos found in: {tests_dir}")
         return 1
+    if not args.enable_vlm:
+        os.environ["NVIDIA_API_KEY"] = "nvapi-..."
+    print(f"[setup] template={TEMPLATE_SOURCE}")
+    print(f"[setup] tests_dir={tests_dir}")
+    print(f"[setup] output={output_path}")
+    print(f"[setup] save_artifacts={args.save_artifacts}")
+    print(f"[setup] vlm={'enabled' if args.enable_vlm else 'disabled'}")
+    prepare_template_cache()
+    batch_started = time.perf_counter()
+    results = []
+    for index, video_path in enumerate(videos, start=1):
+        label = video_path.stem if video_path != TEMPLATE_SOURCE else "template_vs_template"
         print("=" * 88)
+        print(f"[case {index}/{len(videos)}] {label}")
+        print(f"[video] {video_path}")
         started = time.perf_counter()
+        orientation = orientation_label(video_path)
+        try:
+            result = analyze_pushup(
+                video_path,
+                artifact_root if args.save_artifacts else None,
+                save_artifacts=args.save_artifacts,
+            )
+            elapsed = time.perf_counter() - started
+            entry = compact_result(
+                label=label,
+                video_path=video_path,
+                orientation=orientation,
+                elapsed_seconds=elapsed,
+                result=result,
+            )
+        except Exception as exc:
+            elapsed = time.perf_counter() - started
+            entry = {
+                "label": label,
+                "video_path": project_relative_path(video_path),
+                "orientation": orientation,
+                "elapsed_seconds": round(elapsed, 2),
+                "error": f"{type(exc).__name__}: {exc}",
+                "ok": False,
+            }
+        results.append(entry)
+        print_case_summary(entry)
+    payload = {
+        "created_at": datetime.now().isoformat(timespec="seconds"),
+        "tests_dir": project_relative_path(tests_dir),
+        "template_video_path": project_relative_path(TEMPLATE_SOURCE),
+        "save_artifacts": args.save_artifacts,
+        "artifact_root": project_relative_path(artifact_root) if args.save_artifacts else "",
+        "vlm_enabled": args.enable_vlm,
+        "total_videos": len(results),
+        "passed_videos": sum(1 for item in results if item.get("ok")),
+        "failed_videos": sum(1 for item in results if not item.get("ok")),
+        "elapsed_seconds": round(time.perf_counter() - batch_started, 2),
+        "error_distribution": error_distribution(results),
+        "results": results,
+    }
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    with output_path.open("w", encoding="utf-8") as file:
+        json.dump(payload, file, ensure_ascii=False, indent=2)
+    print("=" * 88)
+    print(f"[done] wrote {output_path}")
+    print(f"[done] passed={payload['passed_videos']} failed={payload['failed_videos']}")
+    return 0 if payload["failed_videos"] == 0 else 1
+def discover_videos(tests_dir: Path) -> list[Path]:
+    return sorted(
+        path
+        for path in tests_dir.rglob("*")
+        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS
+    )
+def orientation_label(video_path: Path) -> str:
+    try:
+        processor = VideoProcessor()
+        needs_flip = processor._detect_orientation(str(video_path))
+    except Exception as exc:
+        return f"unknown: {type(exc).__name__}: {exc}"
+    return "head-left -> flipped" if needs_flip else "head-right/no-flip"
+def compact_result(
+    *,
+    label: str,
+    video_path: Path,
+    orientation: str,
+    elapsed_seconds: float,
+    result: dict[str, Any],
+) -> dict[str, Any]:
+    if result.get("error"):
+        expected_rejection = label in EXPECTED_REJECTION_LABELS
+        return {
+            "label": label,
+            "video_path": project_relative_path(video_path),
+            "orientation": orientation,
+            "elapsed_seconds": round(elapsed_seconds, 2),
+            "ok": expected_rejection,
+            "expected_rejection": expected_rejection,
+            "error": result["error"],
+        }
+    return {
+        "label": label,
+        "video_path": project_relative_path(video_path),
+        "orientation": orientation,
+        "elapsed_seconds": round(elapsed_seconds, 2),
+        "ok": True,
+        "expected_rejection": False,
+        "error": None,
+        "overall_score_pct": result.get("overall_score_pct"),
+        "student_reps": result.get("student_reps"),
+        "expert_reps": result.get("expert_reps"),
+        "good_reps": result.get("good_reps"),
+        "serious_reps": result.get("serious_reps"),
+        "summary": result.get("summary", ""),
+        "main_errors": result.get("main_errors", []),
+        "student_video_path": result.get("student_video_path", ""),
+        "rep_results": [compact_rep(rep) for rep in result.get("rep_results", [])],
+    }
+def compact_rep(rep: dict[str, Any]) -> dict[str, Any]:
+    return {
+        "rep_num": rep.get("rep_num"),
+        "score_pct": rep.get("score_pct"),
+        "rule_score_pct": rep.get("rule_score_pct"),
+        "dtw_score_pct": rep.get("dtw_score_pct"),
+        "status": rep.get("status"),
+        "primary_error": rep.get("primary_error"),
+        "error_labels": rep.get("error_labels", []),
+        "rule_feedback": rep.get("rule_feedback") or rep.get("feedback", ""),
+        "llm_feedback": rep.get("llm_feedback", ""),
+        "llm_feedback_source": rep.get("llm_feedback_source", ""),
+        "llm_feedback_error": rep.get("llm_feedback_error", ""),
+        "llm_visual_error_label": rep.get("llm_visual_error_label", ""),
+        "llm_arrow": rep.get("llm_arrow"),
+        "student_frame_path": rep.get("student_frame_path", ""),
+        "expert_frame_path": rep.get("expert_frame_path", ""),
+    }
+def print_case_summary(entry: dict[str, Any]) -> None:
+    if not entry.get("ok"):
+        print(f"[result] ERROR: {entry.get('error')}")
+        print(f"[time] {entry.get('elapsed_seconds')}s")
+        return
+    if entry.get("expected_rejection"):
+        print(f"[result] EXPECTED_REJECTION: {entry.get('error')}")
+        print(f"[time] {entry.get('elapsed_seconds')}s")
+        return
+    print(
+        "[result] "
+        f"overall={entry.get('overall_score_pct')}%, "
+        f"student_reps={entry.get('student_reps')}, "
+        f"expert_reps={entry.get('expert_reps')}, "
+        f"good_reps={entry.get('good_reps')}, "
+        f"serious_reps={entry.get('serious_reps')}"
+    )
+    print(f"[summary] {entry.get('summary', '')}")
+    for rep in entry.get("rep_results", []):
+        errors = ", ".join(rep.get("error_labels") or []) or "none"
         print(
+            "  "
+            f"rep={int(rep['rep_num']):02d} "
+            f"score={rep.get('score_pct')}% "
+            f"status={rep.get('status')} "
+            f"errors={errors}"
         )
+    print(f"[time] {entry.get('elapsed_seconds')}s")
+def error_distribution(results: list[dict[str, Any]]) -> dict[str, int]:
+    counter: Counter[str] = Counter()
+    for result in results:
+        for error in result.get("main_errors", []):
+            label = error.get("label") or error.get("type") or "unknown"
+            counter[label] += int(error.get("count") or 0)
+    return dict(counter.most_common())
+def project_relative_path(path: Path) -> str:
+    try:
+        return path.resolve().relative_to(ROOT).as_posix()
+    except ValueError:
+        return path.resolve().as_posix()
 if __name__ == "__main__":