Inference stuck at 50% (Phase 2) | RTX 5060 Ti 16GB

#10
by Liquidmind111 - opened

During a long-duration generation (235 seconds) with a Batch Size of 2, the process hangs indefinitely at 50.0% (Phase 2: Generating audio codes). The terminal shows a successful prefill but the decoding stage never progresses beyond 0/2.

System Specs:

GPU: NVIDIA GeForce RTX 5060 Ti (16GB VRAM)

Environment: Windows 10 Portable Package (python_embeded)

Models used: DiT acestep-v15-turbo + LM acestep-5Hz-lm-1.7B

Steps to Reproduce:

Set Audio Duration to 235s.

Set Batch Size to 2.

Click Generate Music.

Observe the UI hang at 50% and terminal showing Decode=83tok/s with no progress.

Missing Feature: There is currently no Stop/Cancel button in the Gradio UI. The only way to interrupt a hung process is to force-close the terminal, which can lead to corrupted checkpoints or temporary files.

Requested Improvements:

Implementation of an Interrupt/Stop button for inference.

Investigation into why long-duration batches (Batch Size > 1) cause decoding to hang on 16GB VRAM cards despite being within the "Tier 5" recommendation.
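For the Stop button, here is a minimal sketch of cooperative cancellation, assuming the decode stage runs as a Python-level loop. The names `generate_codes`, `step_fn`, and `GenerationCancelled` are hypothetical and are not taken from the ACE-Step code. A Stop button handler would only need to call `stop_event.set()`.

```python
import threading


class GenerationCancelled(Exception):
    """Raised when the user asks to stop an in-progress generation."""


def generate_codes(total_steps, stop_event, step_fn):
    """Run a decode loop, checking stop_event before every step.

    step_fn is a hypothetical stand-in for one decoding step.
    """
    outputs = []
    for step in range(total_steps):
        # Cooperative check: the loop exits cleanly between steps,
        # so no file is left half-written by a forced kill.
        if stop_event.is_set():
            raise GenerationCancelled(f"stopped at step {step}/{total_steps}")
        outputs.append(step_fn(step))
    return outputs


# Shared flag: the UI thread sets it, the worker loop reads it.
stop_event = threading.Event()
```

This only helps if the loop is still iterating. If the hang sits inside a single blocking call, the flag is never checked, and the hang itself has to be found first. Gradio event listeners also accept a `cancels=` argument that could be wired to such a button, though I have not tested that against this UI.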

Hardware inactivity: Task Manager shows the GPU (RTX 5060 Ti) idling at 19% usage and VRAM fully available, yet the process is stuck at 50% for 2400+ seconds.

No resource bottleneck: System RAM is only at 38% utilization.

Conclusion: This is a logical hang in the decoding phase, not an Out-of-Memory (OOM) or hardware limitation.
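To narrow down where the hang occurs, one option is Python's standard `faulthandler` module, which can dump the stack of every thread after a timeout. This is a sketch under the assumption that a call can be added near the start of the generation code. The helper names `arm_hang_watchdog` and `disarm_hang_watchdog` and the 300-second timeout are my own choices, not part of the project.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s=300.0, log_file=None):
    """Dump all thread stacks every timeout_s seconds until disarmed.

    log_file must be a real file with a file descriptor; defaults to stderr.
    """
    target = log_file if log_file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)


def disarm_hang_watchdog():
    """Cancel the pending dump once generation finishes normally."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `arm_hang_watchdog()` before generation and `disarm_hang_watchdog()` after it would print the Python frames the process is sitting in while stuck at 50%, which should show whether the decode loop is waiting on something or spinning.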

just use comfy ui

But Comfy does not have audio input for COVER and does not have IN-PAINTING for audio :(
