Inference stuck at 50% (Phase 2) | RTX 5060 Ti 16GB
During a long-duration generation (235 seconds) with a Batch Size of 2, the process hangs indefinitely at 50.0% (Phase 2: Generating audio codes). The terminal shows a successful prefill but the decoding stage never progresses beyond 0/2.
System Specs:
GPU: NVIDIA GeForce RTX 5060 Ti (16GB VRAM)
Environment: Windows 10 Portable Package (python_embeded)
Models used: DiT acestep-v15-turbo + LM acestep-5Hz-lm-1.7B
Steps to Reproduce:
Set Audio Duration to 235s.
Set Batch Size to 2.
Click Generate Music.
Observe that the UI hangs at 50% and the terminal shows Decode=83tok/s with no further progress.
Missing Feature: There is currently no Stop/Cancel button in the Gradio UI. The only way to interrupt a hung process is to force-close the terminal, which can lead to corrupted checkpoints or temporary files.
Requested Improvements:
Implementation of an Interrupt/Stop button for inference.
Investigation into why long-duration batches (Batch Size > 1) cause decoding to hang on 16GB VRAM cards despite being within the "Tier 5" recommendation.
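For the Stop button, one possible shape is a cooperative cancel flag that the decode loop checks between steps. This is only a minimal sketch, not how the project's code is actually structured: generate_codes, slow_step and GenerationCancelled are hypothetical names, and the timer stands in for a click on a Stop button in the UI.

```python
import threading
import time


class GenerationCancelled(Exception):
    """Raised when the user asks to stop a running generation."""


def generate_codes(total_steps, stop_event, step_fn):
    """Run a decode loop that checks a stop flag before every step.

    step_fn is a hypothetical stand-in for one real decoding step.
    """
    results = []
    for step in range(total_steps):
        if stop_event.is_set():
            raise GenerationCancelled(f"stopped at step {step}/{total_steps}")
        results.append(step_fn(step))
    return results


def slow_step(step):
    """Pretend to decode one token (hypothetical placeholder work)."""
    time.sleep(0.01)
    return step


stop_event = threading.Event()

# Simulate the user clicking Stop shortly after generation starts.
timer = threading.Timer(0.05, stop_event.set)
timer.start()

try:
    generate_codes(1000, stop_event, slow_step)
    outcome = "finished"
except GenerationCancelled:
    outcome = "cancelled"

timer.join()
print(outcome)
```

A flag like this only helps if the loop is still iterating. If the hang is inside a single step (for example a call that never returns), the flag is never checked and the process would still need to be killed from outside.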
Hardware inactivity: Task Manager shows the GPU (RTX 5060 Ti) idling at 19% usage and VRAM fully available, yet the process is stuck at 50% for 2400+ seconds.
No resource bottleneck: System RAM is only at 38% utilization.
Conclusion: This is a logical hang in the decoding phase, not an Out-of-Memory (OOM) or hardware limitation.
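To find out where the process is actually stuck, Python's standard faulthandler module can dump the stack of every thread once a timeout passes without the work finishing. The sketch below assumes the generation call can be wrapped like this: dump_stacks_if_stalled and stalled_decode are hypothetical names, and the short sleep stands in for the stuck decoding phase.

```python
import faulthandler
import tempfile
import time


def dump_stacks_if_stalled(work, timeout_s, log_file):
    """Run work(); if it is still running after timeout_s seconds,
    write the stack traces of all threads to log_file."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=log_file)
    try:
        return work()
    finally:
        # Disarm the watchdog so nothing is dumped after work() returns.
        faulthandler.cancel_dump_traceback_later()


def stalled_decode():
    """Hypothetical stand-in for a decoding phase that stops progressing."""
    time.sleep(0.5)
    return "done"


with tempfile.TemporaryFile(mode="w+") as log_file:
    result = dump_stacks_if_stalled(stalled_decode, 0.1, log_file)
    log_file.seek(0)
    dump = log_file.read()

print(dump)
```

With a real hang, the timeout would be set to something like a few minutes and the dump written to a file on disk, so the trace showing which call never returns can be attached here.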
just use comfy ui
But Comfy does not have audio input for COVER and does not have IN-PAINTING for audio :(

