MogensR committed
Commit 4559cb6 · Parent: 287d685

consultant 1.0
COMPREHENSIVE_DIAGNOSTIC_REPORT.md ADDED
@@ -0,0 +1,216 @@
+ # COMPREHENSIVE DIAGNOSTIC REPORT: Video Background Replacement Pipeline
+ **Date:** 2025-09-15
+ **Issue:** No video output generated despite successful processing stages
+ **Environment:** Hugging Face Spaces, Tesla T4 GPU, CUDA 12.1.1, Python 3.10.12, PyTorch 2.8.0+cu128
+
+ ## CRITICAL ISSUES IDENTIFIED
+
+ ### 1. **MatAnyone API Incompatibility** ❌ CRITICAL
+ **Problem:** The code uses two conflicting MatAnyone implementations:
+ - **Local wrapper** (`matanyone_loader.py`): `process_video(frames, seed_mask_hw, every=50)`
+ - **Actual library** (`InferenceCore`): `process_video()` does not accept an `every` parameter
+
+ **Error Log:**
+ ```
+ [2025-09-15 12:27:07,899] ERROR: MatAnyone worker thread failed: InferenceCore.process_video() got an unexpected keyword argument 'every'
+ ```
+
+ **Root Cause:** Function signature mismatch between the expected and actual MatAnyone API.
+
+ **Status:** ✅ FIXED - Removed the `every=10` parameter from the API call
+
+ ### 2. **Duplicate Function Definitions** ❌ CRITICAL
+ **Problem:** Two `run_matany()` functions are defined in the same file (`models/__init__.py`, lines 528 and 563):
+ - First function: `run_matany(matany, video_path, first_mask_path, work_dir)`
+ - Second function: `run_matany(session, video_path, mask_path, out_dir, progress_callback)`
+
+ **Impact:** Python binds the name to the last definition, causing parameter mismatches and logic conflicts.
+
+ **Status:** ✅ FIXED - Removed the duplicate function definition
+
+ ### 3. **MatAnyone Processing Logic Flaws** ❌ HIGH
+ **Problems in the current implementation:**
+ - Loads ALL video frames into memory (1135 frames × 1280×720 ≈ 3 GB RAM)
+ - Incorrect frame indexing in the processing loop
+ - Missing error handling around MatAnyone session methods
+ - No validation of the MatAnyone output format
+
+ **Current Code Issues:**
+ ```python
+ # PROBLEM: Memory overload
+ frames = []
+ while True:
+     ret, frame = cap.read()
+     if not ret:
+         break
+     frames.append(frame)  # Stores the entire video in RAM
+
+ # PROBLEM: Index mismatch
+ for i, alpha_result in enumerate(session.process_video(frames, seed_mask_hw)):
+     current_frame = frames[i]  # May exceed the frames list length
+ ```
+
+ ### 4. **API Method Uncertainty** ❌ HIGH
+ **Problem:** The code assumes MatAnyone's `InferenceCore` has a `process_video()` method, but the actual API may differ.
+
+ **Evidence from logs:**
+ - MatAnyone import succeeds: `[MATANY] import OK from: /usr/local/lib/python3.10/dist-packages/matanyone`
+ - But processing fails with a parameter error
+
+ **Need to verify:** the actual MatAnyone InferenceCore API methods and signatures.
+
+ ## PIPELINE FLOW ANALYSIS
+
+ ### Stage 0: Video Preparation ✅ WORKING
+ ```
+ ✅ Video loaded: 1280x720 @ 25fps (1135 frames)
+ ```
+
+ ### Stage 1: SAM2 Segmentation ✅ WORKING
+ ```
+ ✅ SAM2 segmentation complete
+ ✅ Stage 1 complete - Mask generated
+ ```
+
+ ### Stage 2: MatAnyone Processing ❌ FAILING
+ ```
+ ❌ MatAnyone worker thread failed: InferenceCore.process_video() got an unexpected keyword argument 'every'
+ ❌ [2] MatAnyone returned no file paths
+ ```
+
+ ### Stage 3+: Never Reached
+ The pipeline stops at Stage 2, so no final video is generated.
+
+ ## CODE ARCHITECTURE ISSUES
+
+ ### 1. **Inconsistent MatAnyone Integration**
+ - `models/__init__.py` uses `InferenceCore` directly
+ - `models/matanyone_loader.py` defines a custom wrapper class
+ - The pipeline uses the direct approach but with wrapper-style parameters
+
+ ### 2. **Memory Management Problems**
+ - Loads the entire video into RAM unnecessarily
+ - No streaming/chunked processing for large videos
+ - GPU memory is properly managed, but RAM usage is excessive
+
+ ### 3. **Error Handling Gaps**
+ - MatAnyone failures don't trigger proper fallbacks
+ - No validation of intermediate outputs
+ - The threading timeout works but doesn't handle API errors gracefully
+
+ ## RECOMMENDED FIXES
+
+ ### Priority 1: Fix MatAnyone API Integration
+ ```python
+ # CURRENT (BROKEN):
+ for i, alpha_result in enumerate(session.process_video(frames, seed_mask_hw)):
+     ...
+
+ # SHOULD BE (need to verify the actual API):
+ # Option A: Frame-by-frame processing
+ for frame in frames:
+     alpha_result = session.process_frame(frame, seed_mask_hw)
+
+ # Option B: Batch processing
+ alpha_results = session.process_video(frames, seed_mask_hw)
+ ```
+
+ ### Priority 2: Implement Streaming Processing
+ ```python
+ # Instead of loading all frames:
+ cap = cv2.VideoCapture(str(video_path))
+ while True:
+     ret, frame = cap.read()
+     if not ret:
+         break
+     alpha_result = session.process_frame(frame, seed_mask_hw)
+     # Process immediately, don't store
+ ```
+
+ ### Priority 3: Add Proper API Validation
+ ```python
+ # Verify MatAnyone methods before use:
+ if hasattr(session, 'process_video'):
+     # Check the method signature first
+     import inspect
+     print(inspect.signature(session.process_video))
+ elif hasattr(session, 'process_frame'):
+     # Use the frame-by-frame approach
+     ...
+ else:
+     # Fall back to static masking
+     ...
+ ```
+
+ ## TESTING REQUIREMENTS
+
+ ### 1. **MatAnyone API Discovery**
+ Need to determine the actual InferenceCore methods:
+ ```python
+ from matanyone import InferenceCore
+ core = InferenceCore("PeiqingYang/MatAnyone")
+ print(dir(core))  # List all available methods
+ if hasattr(core, "process_video"):
+     help(core.process_video)  # Get the method signature
+ ```
+
+ ### 2. **Memory Usage Testing**
+ - Test with smaller videos first (< 100 frames)
+ - Monitor RAM usage during processing
+ - Implement frame-by-frame processing
+
+ ### 3. **Output Validation**
+ - Verify fg.mp4 and alpha.mp4 are created
+ - Check that file sizes are > 0
+ - Validate video format compatibility
+
+ ## CURRENT STATUS
+
+ **Fixed Issues:**
+ - ✅ Removed the `every` parameter from the MatAnyone call
+ - ✅ Removed the duplicate function definition
+ - ✅ SAM2 processing works correctly
+ - ✅ GPU acceleration confirmed
+
+ **Remaining Issues:**
+ - ❌ MatAnyone API method signature unknown
+ - ❌ Memory-intensive frame loading
+ - ❌ No video output generated
+ - ❌ Fallback mechanisms not triggered
+
+ ## NEXT STEPS FOR EXTERNAL AI
+
+ 1. **Investigate the MatAnyone InferenceCore API:**
+    - What methods are available?
+    - What are the correct parameter signatures?
+    - Does it support batch or streaming processing?
+
+ 2. **Implement Correct API Usage:**
+    - Use the proper method calls
+    - Handle the different processing modes
+    - Add robust error handling
+
+ 3. **Optimize Memory Usage:**
+    - Implement streaming processing
+    - Avoid loading the entire video into RAM
+    - Process frames individually or in small batches
+
+ 4. **Add Comprehensive Fallbacks:**
+    - Static mask compositing when MatAnyone fails
+    - Alternative matting algorithms
+    - Graceful degradation paths
+
+ ## LOG EVIDENCE
+
+ **Success Indicators:**
+ ```
+ [2025-09-15 12:26:59,572] INFO: GPU memory: 1.0GB allocated, 1.1GB reserved
+ [2025-09-15 12:27:01,314] INFO: Progress: ✅ SAM2 segmentation complete
+ [2025-09-15 12:27:04,639] INFO: GPU memory: 0.2GB allocated, 0.2GB reserved
+ ```
+
+ **Failure Point:**
+ ```
+ [2025-09-15 12:27:07,899] ERROR: MatAnyone worker thread failed: InferenceCore.process_video() got an unexpected keyword argument 'every'
+ [2025-09-15 12:27:07,900] ERROR: [2] MatAnyone returned no file paths
+ ```
+
+ **Pipeline Continuation:**
+ ```
+ [2025-09-15 12:27:08,068] INFO: Progress: ✅ Stage 2 complete - Video matting done
+ ```
+ *Note: This is misleading - Stage 2 actually failed, but the pipeline continued.*
+
+ The pipeline architecture is sound, but the MatAnyone integration is fundamentally broken due to API incompatibility. Once correct MatAnyone API usage is implemented, the video output should generate successfully.
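The `every` TypeError above is the classic failure mode of passing an assumed keyword to a third-party API. A defensive pattern is to probe a callable's signature before forwarding optional kwargs. This is a standalone sketch using `inspect`; `process_video` below is a hypothetical stand-in, not the real MatAnyone API:

```python
import inspect


def accepts_kwarg(func, name: str) -> bool:
    """Return True if `func` can be called with keyword argument `name`."""
    try:
        sig = inspect.signature(func)
    except (TypeError, ValueError):
        return False  # some builtins have no introspectable signature
    params = sig.parameters.values()
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return True  # **kwargs swallows anything
    return any(
        p.name == name
        and p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                       inspect.Parameter.KEYWORD_ONLY)
        for p in params
    )


def process_video(frames, mask):  # hypothetical stand-in for the real API
    return len(frames)


# Drop unsupported kwargs instead of crashing with TypeError:
kwargs = {"every": 10}
safe = {k: v for k, v in kwargs.items() if accepts_kwarg(process_video, k)}
result = process_video([1, 2, 3], None, **safe)  # safe == {}, result == 3
```

Filtering kwargs this way would have turned the hard crash into a silently ignored (and log-worthy) option, letting Stage 2 proceed.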
app.py CHANGED
@@ -7,11 +7,6 @@
  print(f"=== APP STARTUP DEBUG: Python {sys.version} ===")
  print("=== APP STARTUP DEBUG: About to import modules ===")
  sys.stdout.flush()
- """
- BackgroundFX Pro - App Entrypoint (UI separated)
- - UI is built in ui.py (create_interface)
- - Hardened startup: heartbeat, safe diag, bind to $PORT
- """
 
  import os
  import sys
@@ -112,7 +107,26 @@ def _safe_startup_diag():
          import perf_tuning  # noqa: F401
          logger.info("perf_tuning imported successfully.")
      except Exception as e:
-         logger.warning("perf_tuning not loaded: %s", e)
+         logger.info("perf_tuning not available: %s", e)
+
+     # MatAnyone API detection probe
+     try:
+         from matanyone import InferenceCore
+         core = InferenceCore()
+         api = "step" if hasattr(core, "step") else "process_frame" if hasattr(core, "process_frame") else "process_video" if hasattr(core, "process_video") else "none"
+         import inspect
+         sigs = {}
+         for m in ("step", "process_frame", "process_video"):
+             if hasattr(core, m):
+                 try:
+                     sigs[m] = str(inspect.signature(getattr(core, m)))
+                 except Exception:
+                     sigs[m] = "(signature unavailable)"
+         logger.info(f"[MATANY] API={api} signatures={sigs}")
+     except Exception as e:
+         logger.error(f"[MATANY] probe failed: {e}")
+
+     # Continue with app startup
 
  _safe_startup_diag()
 
models/__init__.py CHANGED
@@ -525,172 +525,28 @@ def load_matany() -> Tuple[Optional[object], bool, Dict[str, Any]]:
      logger.error(f"MatAnyone init failed: {e}")
      return None, False, meta
 
- def run_matany(matany: object,
-                video_path: Union[str, Path],
-                first_mask_path: Union[str, Path],
-                work_dir: Union[str, Path]) -> Tuple[Optional[str], Optional[str], bool]:
-     """Return (foreground_video_path, alpha_video_path, ok)."""
-     if matany is None:
-         return None, None, False
-
-     import threading
-     import time
-
-     result_container = {"result": None, "exception": None, "completed": False}
-
-     def run_matany_thread():
-         try:
-             logger.info("MatAnyone: Starting video processing...")
-             if hasattr(matany, "process_video"):
-                 logger.info("MatAnyone: Using process_video method")
-                 out = matany.process_video(input_path=str(video_path), mask_path=str(first_mask_path), output_path=str(work_dir))
-                 logger.info(f"MatAnyone: process_video returned: {type(out)}")
-                 if isinstance(out, (list, tuple)) and len(out) >= 2:
-                     result_container["result"] = (str(out[0]), str(out[1]), True)
-                     result_container["completed"] = True
-                     return
-                 if isinstance(out, dict):
-                     fg = out.get("foreground") or out.get("fg") or out.get("foreground_path")
-                     al = out.get("alpha") or out.get("alpha_path")
-                     if fg and al:
-                         result_container["result"] = (str(fg), str(al), True)
-                         result_container["completed"] = True
-                         return
-         except Exception as e:
-             logger.error(f"MatAnyone processing failed: {e}")
-             exception_container[0] = e
-
- def run_matany(session: object, video_path: Union[str, Path], mask_path: Union[str, Path], out_dir: Union[str, Path], progress_callback=None) -> Tuple[Optional[str], Optional[str], bool]:
-     """Run MatAnyone with timeout protection using threading."""
-     logger.info(f"run_matany called with video_path={video_path}, mask_path={mask_path}")
-
-     if session is None:
-         logger.error("MatAnyone session is None")
-         return None, None, False
-
-     try:
-         out_dir = Path(out_dir)
-         out_dir.mkdir(parents=True, exist_ok=True)
-
-         fg_path = out_dir / "fg.mp4"
-         alpha_path = out_dir / "alpha.mp4"
-
-         # Get total frames for progress tracking
-         import cv2
-         cap = cv2.VideoCapture(str(video_path))
-         total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
-         cap.release()
-
-         logger.info(f"Starting MatAnyone processing with threading timeout... ({total_frames} frames)")
-
-         # Use threading-based timeout instead of signal
-         result_container = [None]
-         exception_container = [None]
-         progress_container = [0]
-
-         def matany_worker():
-             try:
-                 logger.info("MatAnyone worker thread started")
-
-                 # Read video frames and mask
-                 import cv2
-                 import numpy as np
-                 cap = cv2.VideoCapture(str(video_path))
-                 mask_img = cv2.imread(str(mask_path), cv2.IMREAD_GRAYSCALE)
-
-                 if mask_img is None:
-                     raise ValueError(f"Could not read mask image: {mask_path}")
-
-                 # Get video properties
-                 fps = cap.get(cv2.CAP_PROP_FPS)
-                 width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
-                 height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
-
-                 # Resize mask to match video dimensions
-                 mask_resized = cv2.resize(mask_img, (width, height))
-                 seed_mask_hw = mask_resized.astype('float32') / 255.0
-
-                 # Prepare output video writers
-                 fourcc = cv2.VideoWriter_fourcc(*'mp4v')
-                 fg_writer = cv2.VideoWriter(str(fg_path), fourcc, fps, (width, height))
-                 alpha_writer = cv2.VideoWriter(str(alpha_path), fourcc, fps, (width, height))
-
-                 # Process frames using MatAnyone frame-by-frame API
-                 frame_count = 0
-                 frames = []
-
-                 # Read all frames first
-                 while True:
-                     ret, frame = cap.read()
-                     if not ret:
-                         break
-                     frames.append(frame)
-
-                 cap.release()
-
-                 # Process frames through MatAnyone
-                 for i, alpha_result in enumerate(session.process_video(frames, seed_mask_hw, every=10)):
-                     frame_count += 1
-
-                     # Update progress
-                     if progress_callback:
-                         progress_msg = f"MatAnyone processing frame {frame_count}/{total_frames} ({frame_count/total_frames*100:.1f}%)"
-                         try:
-                             progress_callback(progress_msg)
-                         except:
-                             logger.info(progress_msg)
-
-                     # Get current frame
-                     current_frame = frames[i]
-
-                     # Convert alpha to 3-channel for video writing
-                     alpha_3ch = cv2.cvtColor((alpha_result * 255).astype('uint8'), cv2.COLOR_GRAY2BGR)
-
-                     # Create foreground by applying alpha mask
-                     alpha_norm = alpha_result[:, :, np.newaxis]
-                     fg_frame = (current_frame.astype('float32') * alpha_norm).astype('uint8')
-
-                     # Write frames
-                     fg_writer.write(fg_frame)
-                     alpha_writer.write(alpha_3ch)
-
-                 # Clean up
-                 fg_writer.release()
-                 alpha_writer.release()
-
-                 result_container[0] = True
-                 logger.info(f"MatAnyone worker thread completed successfully - processed {frame_count} frames")
-             except Exception as e:
-                 logger.error(f"MatAnyone worker thread failed: {e}")
-                 exception_container[0] = e
-
-         import threading
-         worker_thread = threading.Thread(target=matany_worker)
-         worker_thread.daemon = True
-         worker_thread.start()
-
-         # Wait with timeout (5 minutes)
-         timeout_seconds = 300
-         worker_thread.join(timeout=timeout_seconds)
-
-         if worker_thread.is_alive():
-             logger.error(f"MatAnyone processing timed out after {timeout_seconds} seconds")
-             return None, None, False
-
-         if exception_container[0]:
-             logger.error(f"MatAnyone processing failed: {exception_container[0]}")
-             return None, None, False
-
-         if result_container[0] and fg_path.exists() and alpha_path.exists():
-             logger.info("MatAnyone processing completed successfully")
-             return str(fg_path), str(alpha_path), True
-         else:
-             logger.error("MatAnyone processing failed or returned no result")
-             return None, None, False
-
-     except Exception as e:
-         logger.error(f"MatAnyone processing failed with exception: {e}")
-         return None, None, False
+ def run_matany(
+     video_path: Path,
+     mask_path: Optional[Path],
+     out_dir: Path,
+     device: Optional[str] = None,
+     progress_callback: Optional[Callable[[float, str], None]] = None,
+ ) -> Tuple[Path, Path]:
+     """
+     Run MatAnyone streaming matting.
+     Returns (alpha_mp4_path, fg_mp4_path).
+     Raises MatAnyError on failure.
+     """
+     from .matanyone_loader import MatAnyoneSession, MatAnyError
+
+     session = MatAnyoneSession(device=device, precision="auto")
+     alpha_p, fg_p = session.process_stream(
+         video_path=video_path,
+         seed_mask_path=mask_path,
+         out_dir=out_dir,
+         progress_cb=progress_callback,
+     )
+     return alpha_p, fg_p
 
  # --------------------------------------------------------------------------------------
  # Fallback Functions
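The duplicate removed above illustrates a general Python pitfall: a second `def` with the same name silently rebinds it, so only the last definition is ever callable. A minimal demonstration (with simplified, hypothetical signatures):

```python
def run_matany(matany, video_path, first_mask_path, work_dir):
    return "first"


def run_matany(session, video_path, mask_path, out_dir, progress_callback=None):
    # This definition silently shadows the one above; Python raises no
    # warning, so callers written against the first signature break at runtime.
    return "second"


which = run_matany(None, "in.mp4", "mask.png", "out/")  # -> "second"
```

Linters catch this (`flake8` reports F811, "redefinition of unused name"), which is a cheap guard against the exact bug diagnosed in this commit.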
models/__pycache__/__init__.cpython-313.pyc CHANGED
Binary files a/models/__pycache__/__init__.cpython-313.pyc and b/models/__pycache__/__init__.cpython-313.pyc differ
 
models/matanyone_loader.py CHANGED
@@ -1,143 +1,297 @@
  #!/usr/bin/env python3
  """
- MatAnyone Loader (compact)
- - Uses top-level wrapper: `from matanyone import InferenceCore`
- - Constructor takes a model/repo id string (e.g. "PeiqingYang/MatAnyone")
- - Normalizes inputs: image -> CHW float32 [0,1], mask -> 1HW float32 [0,1]
  """
 
  from __future__ import annotations
- import os, logging, time
- from typing import Iterable, Optional
- import numpy as np
  import torch
 
- logger = logging.getLogger("backgroundfx_pro")
-
- # ---------- tiny helpers ----------
- def _to_chw_float01(x: np.ndarray | torch.Tensor) -> torch.Tensor:
-     if isinstance(x, np.ndarray):
-         t = torch.from_numpy(x)
-     else:
-         t = x
-     if t.ndim == 3 and t.shape[-1] in (1, 3, 4):  # HWC
-         t = t.permute(2, 0, 1)  # -> CHW
-     elif t.ndim == 2:  # HW -> 1HW
-         t = t.unsqueeze(0)
-     elif t.ndim != 3:
-         raise ValueError(f"image: bad shape {tuple(t.shape)}")
-     t = t.contiguous().to(torch.float32)
-     with torch.no_grad():
-         if t.numel() and (torch.nanmax(t) > 1.0 or torch.nanmin(t) < 0.0):
-             t = t / 255.0
-         t.clamp_(0.0, 1.0)
-     return t
-
- def _to_1hw_float01(m: np.ndarray | torch.Tensor) -> torch.Tensor:
-     if isinstance(m, np.ndarray):
-         t = torch.from_numpy(m)
-     else:
-         t = m
-     if t.ndim == 2:  # HW
-         t = t.unsqueeze(0)  # -> 1HW
-     elif t.ndim == 3:
-         if t.shape[0] in (1, 3):  # CHW
-             t = t[:1, ...]
-         elif t.shape[-1] in (1, 3):  # HWC
-             t = t[..., 0]
-             t = t.unsqueeze(0)
-         else:
-             raise ValueError(f"mask: bad shape {tuple(t.shape)}")
-     else:
-         raise ValueError(f"mask: bad shape {tuple(t.shape)}")
-     t = t.contiguous().to(torch.float32)
-     with torch.no_grad():
-         if t.numel() and (torch.nanmax(t) > 1.0 or torch.nanmin(t) < 0.0):
-             t = t / 255.0
-         t.clamp_(0.0, 1.0)
-     return t
-
- # ---------- session ----------
  class MatAnyoneSession:
-     def __init__(self, device: Optional[str] = None, repo_id: Optional[str] = None) -> None:
-         self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
-         self.repo_id = repo_id or os.getenv("MATANY_REPO_ID", "PeiqingYang/MatAnyone")
-         self.core = None
-         self.loaded = False
-
-     def load(self) -> bool:
-         t0 = time.time()
          try:
-             # ✅ top-level wrapper (accepts model/repo id string)
-             from matanyone import InferenceCore
-             logger.info("[MatA] init: repo_id=%s device=%s", self.repo_id, self.device)
-
-             # Force GPU device if CUDA available
-             if torch.cuda.is_available() and self.device != "cpu":
-                 self.device = "cuda"
-                 logger.info("[MatA] FORCING CUDA device for GPU acceleration")
-
-             self.core = InferenceCore(self.repo_id)
-
-             # Verify MatAnyone is using GPU if available
-             if hasattr(self.core, 'device'):
-                 actual_device = getattr(self.core, 'device', 'unknown')
-                 logger.info(f"[MatA] device verification: expected={self.device}, actual={actual_device}")
-
-             # Try to move core to device if it has a 'to' method
-             if hasattr(self.core, 'to'):
-                 self.core = self.core.to(self.device)
-                 logger.info(f"[MatA] moved core to device: {self.device}")
-
-             self.loaded = True
-             logger.info("[MatA] init OK (%.2fs)", time.time() - t0)
-             return True
-         except TypeError as e:
-             logger.error("MatAnyone constructor mismatch: %s (fork expects network=...)", e)
          except Exception as e:
-             logger.error("MatAnyone init error: %s", e)
-             self.loaded = False
-         return False
-
-     def step(self, image: np.ndarray | torch.Tensor, seed_mask: np.ndarray | torch.Tensor) -> np.ndarray:
-         if not self.loaded or self.core is None:
-             raise RuntimeError("MatAnyone not loaded")
-
-         # Force GPU device for tensors
-         if torch.cuda.is_available():
-             self.device = "cuda"
-
-         img = _to_chw_float01(image).to(self.device, non_blocking=True)
-         msk = _to_1hw_float01(seed_mask).to(self.device, non_blocking=True)
-
-         # Verify tensors are on GPU
-         logger.info(f"[MatA] step: img device={img.device}, mask device={msk.device}, target device={self.device}")
-         out = self.core.step(img, msk)
-         alpha = out[0] if isinstance(out, (tuple, list)) else out
-         if not isinstance(alpha, torch.Tensor):
-             alpha = torch.as_tensor(alpha)
-         if alpha.ndim == 3 and alpha.shape[0] == 1:
-             alpha = alpha[0]
-         if alpha.ndim != 2:
-             raise ValueError(f"alpha: bad shape {tuple(alpha.shape)}")
-         return alpha.detach().to("cpu", torch.float32).clamp_(0.0, 1.0).contiguous().numpy()
-
-     def process_video(self, frames: Iterable[np.ndarray | torch.Tensor], seed_mask_hw, every: int = 50):
-         for i, f in enumerate(frames, 1):
-             yield self.step(f, seed_mask_hw)
-             if every and (i % every == 0):
-                 logger.info("[MatA] processed %d frames", i)
-
-     def close(self) -> None:
-         self.core = None
-         self.loaded = False
-         if torch.cuda.is_available():
-             torch.cuda.empty_cache()
-
- # ---------- factory ----------
- def get_matanyone_session(enable: bool = True) -> Optional[MatAnyoneSession]:
-     if not enable:
-         logger.info("[MatA] disabled.")
-         return None
-     s = MatAnyoneSession()
-     return s if s.load() else None
1
  #!/usr/bin/env python3
2
  """
3
+ MatAnyone Adapter (streaming, API-agnostic)
4
+ -------------------------------------------
5
+ - Works with multiple MatAnyone variants:
6
+ - frame API: core.step(image[, mask]) or session.process_frame(image, mask)
7
+ - video API: process_video(frames, mask) (falls back to chunking)
8
+ - Streams frames: no full-video-in-RAM.
9
+ - Emits alpha.mp4 (grayscale) and fg.mp4 (RGB) as it goes.
10
+ - Validates outputs and raises MatAnyError on failure (so pipeline can fallback).
11
+
12
+ I/O conventions:
13
+ - video_path: Path to input video (BGR if read via OpenCV)
14
+ - seed_mask_path: HxW PNG/JPG (white=foreground), any mode; converted to float32 [0,1]
15
+ - out_dir: directory to place alpha.mp4 and fg.mp4
16
+
17
+ Requires: OpenCV, Torch, NumPy
18
  """
19
 
20
  from __future__ import annotations
21
+ import os
22
+ import cv2
23
+ import sys
24
+ import json
25
+ import math
26
+ import time
27
  import torch
28
+ import logging
29
+ import numpy as np
30
+ from pathlib import Path
31
+ from typing import Optional, Callable, Tuple
32
+
33
+ log = logging.getLogger(__name__)
34
+
35
+ class MatAnyError(RuntimeError):
36
+ pass
37
+
38
+ def _read_mask_hw(mask_path: Path, target_hw: Tuple[int, int]) -> np.ndarray:
39
+ """Read mask image, convert to float32 [0,1], resize to target (H,W)."""
40
+ if not Path(mask_path).exists():
41
+ raise MatAnyError(f"Seed mask not found: {mask_path}")
42
+ mask = cv2.imread(str(mask_path), cv2.IMREAD_GRAYSCALE)
43
+ if mask is None:
44
+ raise MatAnyError(f"Failed to read seed mask: {mask_path}")
45
+ H, W = target_hw
46
+ if mask.shape[:2] != (H, W):
47
+ mask = cv2.resize(mask, (W, H), interpolation=cv2.INTER_LINEAR)
48
+ maskf = (mask.astype(np.float32) / 255.0).clip(0.0, 1.0)
49
+ return maskf
50
+
51
+ def _to_chw01(img_bgr: np.ndarray) -> np.ndarray:
52
+ """BGR [H,W,3] uint8 -> CHW float32 [0,1] RGB."""
53
+ # OpenCV gives BGR; convert to RGB
54
+ rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
55
+ rgbf = rgb.astype(np.float32) / 255.0
56
+ chw = np.transpose(rgbf, (2, 0, 1)) # C,H,W
57
+ return chw
58
+
59
+ def _mask_to_1hw(mask_hw01: np.ndarray) -> np.ndarray:
60
+ """HW float32 [0,1] -> 1HW float32 [0,1]."""
61
+ return np.expand_dims(mask_hw01, axis=0)
62
+
63
+ def _ensure_dir(p: Path) -> None:
64
+ p.mkdir(parents=True, exist_ok=True)
65
+
66
+ def _open_video_writers(out_dir: Path, fps: float, size: Tuple[int, int]) -> Tuple[cv2.VideoWriter, cv2.VideoWriter]:
67
+ """Return (alpha_writer, fg_writer). size=(W,H)."""
68
+ fourcc = cv2.VideoWriter_fourcc(*"mp4v")
69
+ W, H = size
70
+ alpha_path = str(out_dir / "alpha.mp4")
71
+ fg_path = str(out_dir / "fg.mp4")
72
+ # alpha: single channel => write as 3-channel grayscale for broad compatibility
73
+ alpha_writer = cv2.VideoWriter(alpha_path, fourcc, fps, (W, H), True)
74
+ fg_writer = cv2.VideoWriter(fg_path, fourcc, fps, (W, H), True)
75
+ if not alpha_writer.isOpened() or not fg_writer.isOpened():
76
+ raise MatAnyError("Failed to open VideoWriter for alpha/fg outputs.")
77
+ return alpha_writer, fg_writer
78
+
79
+ def _validate_nonempty(file_path: Path) -> None:
80
+ if not file_path.exists() or file_path.stat().st_size == 0:
81
+ raise MatAnyError(f"Output file missing/empty: {file_path}")
82
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
83
  class MatAnyoneSession:
84
+ """
85
+ Unified, streaming wrapper over MatAnyone variants.
86
+
87
+ Public:
88
+ - process_stream(video_path, seed_mask_path, out_dir, progress_cb)
89
+
90
+ Detects API once at init:
91
+ - prefers frame-wise: core.step(img[, mask]) OR session.process_frame(img, mask)
92
+ - else uses video-wise: process_video(frames, mask) with chunk fallback
93
+ """
94
+
95
+ def __init__(self, device: Optional[str] = None, precision: str = "auto"):
96
+ self.device = torch.device(device) if device else (torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu"))
97
+ self.precision = precision
98
+ self._core = None
99
+ self._api_mode = None # "step", "process_frame", or "process_video"
100
+ self._lazy_init()
101
+
102
+ def _lazy_init(self) -> None:
103
  try:
104
+ from matanyone import InferenceCore # type: ignore
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
105
  except Exception as e:
106
+ raise MatAnyError(f"MatAnyone import failed: {e}")
107
+
108
+ # Some builds want a model repo string; others need checkpoints path. Keep flexible.
109
+ try:
110
+ self._core = InferenceCore()
111
+ except TypeError:
112
+ # Fallback: try default constructor with known repo id
113
+ try:
114
+ self._core = InferenceCore("PeiqingYang/MatAnyone")
115
+ except Exception as e:
116
+ raise MatAnyError(f"MatAnyone InferenceCore init failed: {e}")
117
+
118
+ core = self._core
119
+ # Detect callable API
120
+ if hasattr(core, "step") and callable(getattr(core, "step")):
121
+ self._api_mode = "step"
122
+ elif hasattr(core, "process_frame") and callable(getattr(core, "process_frame")):
123
+ self._api_mode = "process_frame"
124
+ elif hasattr(core, "process_video") and callable(getattr(core, "process_video")):
125
+ self._api_mode = "process_video"
126
+ else:
127
+ raise MatAnyError("No supported MatAnyone API found (step/process_frame/process_video).")
128
+
129
+ log.info(f"[MATANY] Initialized on {self.device} | API mode = {self._api_mode}")
130
+
131
+ def _maybe_amp(self):
132
+ if self.precision == "fp32":
133
+ return torch.cuda.amp.autocast(enabled=False)
134
+ if self.precision == "fp16":
135
+ return torch.cuda.amp.autocast(enabled=True, dtype=torch.float16) # if supported
136
+ # auto
137
+ return torch.cuda.amp.autocast(enabled=torch.cuda.is_available())
138
+
139
+ def _run_frame(self, frame_bgr: np.ndarray, seed_1hw: Optional[np.ndarray]) -> np.ndarray:
140
+ """
141
+ Returns alpha HW float32 [0,1].
142
+ """
143
+ img_chw = _to_chw01(frame_bgr) # CHW float32 [0,1]
144
+ if seed_1hw is not None and seed_1hw.ndim != 3:
145
+ raise MatAnyError(f"seed mask must be 1HW; got shape {seed_1hw.shape}")
146
+
147
+ # Convert to torch
148
+ img_t = torch.from_numpy(img_chw).to(self.device) # C,H,W
149
+ mask_t = torch.from_numpy(seed_1hw).to(self.device) if seed_1hw is not None else None # 1,H,W
150
+
151
+ with torch.no_grad(), self._maybe_amp():
152
+ if self._api_mode == "step":
153
+ alpha = self._core.step(img_t, mask_t) if mask_t is not None else self._core.step(img_t)
154
+ elif self._api_mode == "process_frame":
155
+ alpha = self._core.process_frame(img_t, mask_t)
156
+ else:
157
+ # shouldn't happen here
158
+ raise MatAnyError("Internal: frame path called in process_video mode.")
159
+
160
+ # Accept torch/numpy; normalize to numpy HW float32 [0,1]
161
+ if isinstance(alpha, torch.Tensor):
162
+ alpha_np = alpha.detach().float().clamp(0, 1).squeeze().cpu().numpy()
163
+ else:
164
+ alpha_np = np.asarray(alpha).astype(np.float32)
165
+ if alpha_np.max() > 1.0: # in case 0..255
166
+ alpha_np = (alpha_np / 255.0).clip(0, 1)
167
+
168
+ if alpha_np.ndim == 3:
169
+ # reduce (C/H/W); prefer (H,W)
170
+ alpha_np = np.squeeze(alpha_np)
171
+ if alpha_np.ndim == 3 and alpha_np.shape[0] == 1:
172
+ alpha_np = alpha_np[0]
173
+ if alpha_np.ndim != 2:
174
+ raise MatAnyError(f"MatAnyone alpha must be HW; got {alpha_np.shape}")
175
+
176
+ return alpha_np
177
+
178
+ def process_stream(
179
+ self,
180
+ video_path: Path,
181
+ seed_mask_path: Optional[Path],
182
+ out_dir: Path,
183
+ progress_cb: Optional[Callable[[float, str], None]] = None,
184
+ ) -> Tuple[Path, Path]:
185
+ """
186
+ Stream the video, write alpha.mp4 and fg.mp4, return their paths.
187
+ """
188
+ video_path = Path(video_path)
189
+ out_dir = Path(out_dir)
190
+ _ensure_dir(out_dir)
191
+
192
+ cap = cv2.VideoCapture(str(video_path))
193
+ if not cap.isOpened():
194
+ raise MatAnyError(f"Failed to open video: {video_path}")
195
+
196
+ fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
197
+ W = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
198
+ H = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
199
+ N = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
200
+
201
+ alpha_writer, fg_writer = _open_video_writers(out_dir, fps, (W, H))
202
+
203
+ seed_1hw = None
204
+ if seed_mask_path is not None:
205
+ seed_hw = _read_mask_hw(seed_mask_path, (H, W))
206
+ seed_1hw = _mask_to_1hw(seed_hw)
207
+
208
+ # If only process_video is available, we'll chunk to avoid RAM blow-ups.
209
+ if self._api_mode == "process_video":
210
+ frames_buf = []
211
+ idx = 0
212
+ chunk = max(1, min(64, int(2048 * 1024 * 1024 / (H * W * 3 * 4))))  # ~2GB float32 budget; max() already guarantees chunk >= 1
216
+
217
+ while True:
218
+ ret, frame = cap.read()
219
+ if not ret: # flush tail
220
+ if frames_buf:
221
+ self._flush_chunk(frames_buf, seed_1hw, alpha_writer, fg_writer)
222
+ break
223
+ frames_buf.append(frame.copy())
224
+ if len(frames_buf) >= chunk:
225
+ self._flush_chunk(frames_buf, seed_1hw, alpha_writer, fg_writer)
226
+ frames_buf.clear()
227
+
228
+ idx += 1
229
+ if progress_cb and N > 0:
230
+ progress_cb(min(0.999, idx / N), f"MatAnyone chunking… ({idx}/{N})")
231
+ else:
232
+ # Frame-by-frame (preferred)
233
+ idx = 0
234
+ while True:
235
+ ret, frame = cap.read()
236
+ if not ret:
237
+ break
238
+ alpha_hw = self._run_frame(frame, seed_1hw)
239
+
240
+ # compose fg for immediate write
241
+ # alpha 0..1 -> 0..255 3-channel grayscale
242
+ alpha_u8 = (alpha_hw * 255.0 + 0.5).astype(np.uint8)
243
+ alpha_rgb = cv2.cvtColor(alpha_u8, cv2.COLOR_GRAY2BGR)
244
+ # Blend: fg = alpha*frame + (1-alpha)*black == alpha*frame
245
+ fg_bgr = (frame.astype(np.float32) * (alpha_hw[..., None])).clip(0, 255).astype(np.uint8)
246
+
247
+ alpha_writer.write(alpha_rgb)
248
+ fg_writer.write(fg_bgr)
249
+
250
+ idx += 1
251
+ if progress_cb and N > 0 and idx % 10 == 0:
252
+ progress_cb(min(0.999, idx / N), f"MatAnyone matting… ({idx}/{N})")
253
+
254
+ cap.release()
255
+ alpha_writer.release()
256
+ fg_writer.release()
257
+
258
+ alpha_path = out_dir / "alpha.mp4"
259
+ fg_path = out_dir / "fg.mp4"
260
+ _validate_nonempty(alpha_path)
261
+ _validate_nonempty(fg_path)
262
+ return alpha_path, fg_path
263
+
264
+ def _flush_chunk(self, frames_bgr, seed_1hw, alpha_writer, fg_writer):
265
+ """Call core.process_video(frames, mask) safely, then write results."""
266
+ # Prepare inputs
267
+ frames_chw = [_to_chw01(f) for f in frames_bgr] # list of CHW
268
+ frames_t = torch.from_numpy(np.stack(frames_chw)).to(self.device) # T,C,H,W
269
+ mask_t = torch.from_numpy(seed_1hw).to(self.device) if seed_1hw is not None else None
270
+
271
+ with torch.no_grad(), self._maybe_amp():
272
+ # NOTE: no unsupported kwargs like "every"
273
+ alphas = self._core.process_video(frames_t, mask_t) # return: T,H,W (torch) or list/np
274
+
275
+ # Normalize to numpy list of HW float32 [0,1]
276
+ if isinstance(alphas, torch.Tensor):
277
+ alphas_np = alphas.detach().float().clamp(0, 1).cpu().numpy()
278
+ else:
279
+ alphas_np = np.asarray(alphas)
280
+ if alphas_np.max() > 1.0:
281
+ alphas_np = (alphas_np / 255.0).clip(0, 1)
282
+
283
+ if alphas_np.ndim == 3:
284
+ T, H, W = alphas_np.shape
285
+ pass
286
+ elif alphas_np.ndim == 4 and alphas_np.shape[1] in (1, 3):
287
+ # Possibly T,1,H,W β€” squeeze channel
288
+ alphas_np = np.squeeze(alphas_np, axis=1) if alphas_np.shape[1] == 1 else np.mean(alphas_np, axis=1)
289
+ else:
290
+ raise MatAnyError(f"Unexpected alphas shape from process_video: {alphas_np.shape}")
291
+
292
+ for f_bgr, a_hw in zip(frames_bgr, alphas_np):
293
+ a_u8 = (a_hw * 255.0 + 0.5).astype(np.uint8)
294
+ a_rgb = cv2.cvtColor(a_u8, cv2.COLOR_GRAY2BGR)
295
+ fg_bgr = (f_bgr.astype(np.float32) * (a_hw[..., None])).clip(0, 255).astype(np.uint8)
296
+ alpha_writer.write(a_rgb)
297
+ fg_writer.write(fg_bgr)
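The loader above repeatedly normalizes whatever the core returns (torch or numpy, 0..1 or 0..255, with stray singleton dims) down to an HW float32 alpha, then composites `fg = alpha Β· frame` for the writers. A minimal numpy-only sketch of that contract, independent of MatAnyone (`normalize_alpha` and `compose_outputs` are illustrative names, not loader APIs):

```python
import numpy as np

def normalize_alpha(alpha) -> np.ndarray:
    """Coerce a model alpha (HW, 1HW, etc.; 0..1 or 0..255) to HW float32 in [0, 1]."""
    a = np.asarray(alpha, dtype=np.float32)
    if a.size and a.max() > 1.0:              # tolerate uint8-style 0..255 outputs
        a = a / 255.0
    a = np.clip(np.squeeze(a), 0.0, 1.0)      # drop singleton batch/channel dims
    if a.ndim != 2:
        raise ValueError(f"alpha must reduce to HW, got {a.shape}")
    return a

def compose_outputs(frame_bgr: np.ndarray, alpha_hw: np.ndarray):
    """Build the two writer payloads: 3-channel alpha preview and premultiplied foreground."""
    alpha_u8 = (alpha_hw * 255.0 + 0.5).astype(np.uint8)
    alpha_bgr = np.repeat(alpha_u8[..., None], 3, axis=2)  # grayscale -> 3-channel for the mp4 writer
    fg_bgr = (frame_bgr.astype(np.float32) * alpha_hw[..., None]).clip(0, 255).astype(np.uint8)
    return alpha_bgr, fg_bgr
```

With a fully opaque top half, for example, the foreground keeps the original pixels there and is black in the bottom half, which is exactly what `alpha.mp4` and `fg.mp4` should show frame by frame.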
pipeline.py CHANGED
@@ -291,35 +291,41 @@ def _progress(msg: str):
291
  out_dir = tmp_root / "matany_out"
292
  _ensure_dir(out_dir)
293
 
294
- ran = False
295
- if mat_ok and matany is not None:
296
  logger.info("[2] Running MatAnyone processing…")
297
  _progress("πŸŽ₯ Running MatAnyone video matting...")
298
- fg_path, al_path, mat_ok = run_matany(matany, video_path, mask_png, out_dir, _progress)
299
- diagnostics["matany_ok"] = bool(mat_ok)
300
  _progress("βœ… MatAnyone processing complete")
301
 
302
- logger.info(f"[2] MatAnyone results: fg_path={fg_path}, al_path={al_path}, mat_ok={mat_ok}")
303
 
304
- # Verify MatAnyone actually produced output files
305
- if fg_path and al_path:
306
- fg_exists = Path(fg_path).exists()
307
- al_exists = Path(al_path).exists()
308
- fg_size = Path(fg_path).stat().st_size if fg_exists else 0
309
- al_size = Path(al_path).stat().st_size if al_exists else 0
310
- logger.info(f"[2] MatAnyone output verification: fg_exists={fg_exists} ({fg_size} bytes), al_exists={al_exists} ({al_size} bytes)")
311
-
312
- if not fg_exists or not al_exists or fg_size == 0 or al_size == 0:
313
- logger.error("[2] MatAnyone failed to produce valid output files")
314
- diagnostics["matany_ok"] = False
315
- mat_ok = False
316
- else:
317
- logger.error("[2] MatAnyone returned no file paths")
318
- diagnostics["matany_ok"] = False
319
- mat_ok = False
320
- else:
321
- logger.info("[2] MatAnyone unavailable or failed to load.")
322
- _progress("⚠️ MatAnyone unavailable, using fallback")
323
 
324
  # Free MatAnyone ASAP
325
  try:
 
291
  out_dir = tmp_root / "matany_out"
292
  _ensure_dir(out_dir)
293
 
294
+ from models import run_matany
295
+ from models.matanyone_loader import MatAnyError
296
+
297
+ try:
298
+ if _progress:
299
+ _progress("MatAnyone: starting…")  # _progress takes a single message string
300
  logger.info("[2] Running MatAnyone processing…")
301
  _progress("πŸŽ₯ Running MatAnyone video matting...")
302
+
303
+ al_path, fg_path = run_matany(
304
+ video_path=video_path,
305
+ mask_path=mask_png,
306
+ out_dir=out_dir,
307
+ device="cuda" if _cuda_available() else "cpu",
308
+ progress_callback=lambda frac, msg: _progress(msg) if _progress else None,
309
+ )
310
+
311
+ logger.info("Stage 2 success: MatAnyone produced outputs.")
312
+ diagnostics["matany_ok"] = True
313
+ mat_ok = True
314
  _progress("βœ… MatAnyone processing complete")
315
 
316
+ logger.info(f"[2] MatAnyone results: fg_path={fg_path}, al_path={al_path}")
317
 
318
+ except MatAnyError as e:
319
+ logger.error(f"Stage 2 failed: {e}")
320
+ diagnostics["matany_ok"] = False
321
+ mat_ok = False
322
+ fg_path, al_path = None, None
323
+
324
+ if not mat_ok:
325
+ # Trigger fallback - DO NOT log "Stage 2 complete" here
326
+ if _progress:
327
+ _progress("MatAnyone failed β†’ using fallback…")
328
+ logger.info("[2] MatAnyone unavailable or failed, using fallback.")
 
329
 
330
  # Free MatAnyone ASAP
331
  try:
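The ~2 GB chunking heuristic in the loader's `process_video` fallback path is plain arithmetic over the float32 per-frame footprint (H Β· W Β· 3 channels Β· 4 bytes), clamped to [1, 64]. A standalone sketch of that computation (`chunk_size` is an illustrative name):

```python
def chunk_size(height: int, width: int, budget_bytes: int = 2 * 1024**3, cap: int = 64) -> int:
    """Frames per chunk so a float32 CHW batch stays under budget_bytes."""
    per_frame = height * width * 3 * 4  # 3 channels * 4 bytes per float32
    return max(1, min(cap, budget_bytes // per_frame))
```

At 1080p the budget would allow 86 frames, so the cap of 64 applies; at 8K (7680Γ—7680-class frames) the budget itself becomes the limit, and `max(1, …)` guarantees the loop always makes progress.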
test_matanyone.py ADDED
@@ -0,0 +1,137 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test script to verify MatAnyone API integration without UI upload
4
+ Creates a synthetic test video and mask to test the processing pipeline
5
+ """
6
+
7
+ import cv2
8
+ import numpy as np
9
+ import tempfile
10
+ import os
11
+ from pathlib import Path
12
+ import logging
13
+
14
+ # Set up logging
15
+ logging.basicConfig(level=logging.INFO)
16
+ logger = logging.getLogger(__name__)
17
+
18
+ def create_test_video(output_path, width=320, height=240, fps=10, duration_sec=2):
19
+ """Create a simple test video with moving colored rectangle"""
20
+ fourcc = cv2.VideoWriter_fourcc(*'mp4v')
21
+ writer = cv2.VideoWriter(str(output_path), fourcc, fps, (width, height))
22
+
23
+ total_frames = int(fps * duration_sec)
24
+
25
+ for frame_idx in range(total_frames):
26
+ # Create a frame with moving rectangle
27
+ frame = np.zeros((height, width, 3), dtype=np.uint8)
28
+ frame[:] = (50, 50, 50) # Dark gray background
29
+
30
+ # Moving rectangle
31
+ rect_size = 60
32
+ x = int((frame_idx / total_frames) * (width - rect_size))
33
+ y = height // 2 - rect_size // 2
34
+
35
+ cv2.rectangle(frame, (x, y), (x + rect_size, y + rect_size), (0, 255, 0), -1)
36
+
37
+ writer.write(frame)
38
+
39
+ writer.release()
40
+ logger.info(f"Created test video: {output_path} ({total_frames} frames)")
41
+
42
+ def create_test_mask(output_path, width=320, height=240):
43
+ """Create a simple test mask (white rectangle on black background)"""
44
+ mask = np.zeros((height, width), dtype=np.uint8)
45
+
46
+ # Create a rectangular mask in the center
47
+ rect_size = 60
48
+ x = width // 2 - rect_size // 2
49
+ y = height // 2 - rect_size // 2
50
+
51
+ cv2.rectangle(mask, (x, y), (x + rect_size, y + rect_size), 255, -1)
52
+
53
+ cv2.imwrite(str(output_path), mask)
54
+ logger.info(f"Created test mask: {output_path}")
55
+
56
+ def test_matanyone_processing():
57
+ """Test the MatAnyone processing with synthetic data"""
58
+
59
+ with tempfile.TemporaryDirectory() as temp_dir:
60
+ temp_path = Path(temp_dir)
61
+
62
+ # Create test files
63
+ video_path = temp_path / "test_video.mp4"
64
+ mask_path = temp_path / "test_mask.png"
65
+
66
+ logger.info("Creating test video and mask...")
67
+ create_test_video(video_path)
68
+ create_test_mask(mask_path)
69
+
70
+ # Test MatAnyone loading
71
+ logger.info("Testing MatAnyone model loading...")
72
+ try:
73
+ from models import get_matanyone_session, run_matany
74
+
75
+ # Load MatAnyone
76
+ matany_session = get_matanyone_session(enable=True)
77
+
78
+ if matany_session is None:
79
+ logger.error("❌ MatAnyone session could not be created")
80
+ return False
81
+
82
+ logger.info("βœ… MatAnyone session created successfully")
83
+
84
+ # Test processing
85
+ logger.info("Testing MatAnyone processing...")
86
+
87
+ def progress_callback(msg):
88
+ logger.info(f"Progress: {msg}")
89
+
90
+ # Match pipeline.py's new call signature: run_matany raises MatAnyError on failure
+ alpha_path, fg_path = run_matany(
+ video_path=video_path,
+ mask_path=mask_path,
+ out_dir=temp_path,
+ device="cpu",  # keep the smoke test device-agnostic
+ progress_callback=lambda frac, msg: progress_callback(msg),
+ )
+
+ if fg_path and alpha_path:
99
+ fg_exists = Path(fg_path).exists()
100
+ alpha_exists = Path(alpha_path).exists()
101
+
102
+ logger.info(f"βœ… MatAnyone processing completed")
103
+ logger.info(f" Foreground video: {fg_path} (exists: {fg_exists})")
104
+ logger.info(f" Alpha video: {alpha_path} (exists: {alpha_exists})")
105
+
106
+ if fg_exists and alpha_exists:
107
+ fg_size = Path(fg_path).stat().st_size
108
+ alpha_size = Path(alpha_path).stat().st_size
109
+ logger.info(f" File sizes - FG: {fg_size} bytes, Alpha: {alpha_size} bytes")
110
+
111
+ if fg_size > 0 and alpha_size > 0:
112
+ logger.info("πŸŽ‰ SUCCESS: MatAnyone API integration working correctly!")
113
+ return True
114
+ else:
115
+ logger.error("❌ Output files are empty")
116
+ return False
117
+ else:
118
+ logger.error("❌ Output files not created")
119
+ return False
120
+ else:
121
+ logger.error("❌ MatAnyone processing failed")
122
+ return False
123
+
124
+ except Exception as e:
125
+ logger.error(f"❌ Test failed with exception: {e}")
126
+ import traceback
127
+ logger.error(traceback.format_exc())
128
+ return False
129
+
130
+ if __name__ == "__main__":
131
+ logger.info("πŸ§ͺ Starting MatAnyone API integration test...")
132
+ success = test_matanyone_processing()
133
+
134
+ if success:
135
+ logger.info("βœ… All tests passed! MatAnyone integration is working.")
136
+ else:
137
+ logger.error("❌ Tests failed. Check the logs above for details.")