multimodalart HF Staff Claude Opus 4.6 committed on
Commit a263769 · 1 Parent(s): 930ae2b

Disable NeMo CUDA Graphs to fix CUDA failure 35 on RNNT decoding

NeMo 2.6 RNNT decoding uses CUDA Graphs, which fail with error 35
on CUDA 12.8 + PyTorch 2.9 (NVIDIA-NeMo/NeMo#15145). Disable them
on the parakeet-tdt model after loading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

preprocess/tools/lyric_transcription.py CHANGED
@@ -150,6 +150,12 @@ class _ASREnModel:
             map_location=device,
         )
         self.model.eval()
+        # Disable CUDA Graphs to avoid "CUDA failure! 35" on CUDA 12.8 + PyTorch 2.9
+        # See: https://github.com/NVIDIA-NeMo/NeMo/issues/15145
+        if hasattr(self.model, 'decoding') and hasattr(self.model.decoding, 'decoding'):
+            comp = getattr(self.model.decoding.decoding, 'decoding_computer', None)
+            if comp is not None and hasattr(comp, 'disable_cuda_graphs'):
+                comp.disable_cuda_graphs()
 
     @staticmethod
     def _clean_word(word: str) -> str:
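
The guard added in this hunk could also be factored into a small helper, which makes the "only call it if every attribute exists" behaviour easy to check without loading NeMo. The following is a minimal sketch, not part of the commit: the helper name `disable_cuda_graphs_if_present` and the `_StubComputer` class are hypothetical, and only the attribute chain (`decoding.decoding.decoding_computer`) and the `disable_cuda_graphs()` method name are taken from the diff above.

```python
from types import SimpleNamespace
from typing import Any


def disable_cuda_graphs_if_present(model: Any) -> bool:
    """Hypothetical helper mirroring the guard in the diff.

    Walks model.decoding.decoding.decoding_computer and calls
    disable_cuda_graphs() only if every link in the chain exists.
    Returns True if the method was called, False otherwise.
    """
    decoding = getattr(model, "decoding", None)
    inner = getattr(decoding, "decoding", None)
    comp = getattr(inner, "decoding_computer", None)
    disable = getattr(comp, "disable_cuda_graphs", None)
    if callable(disable):
        disable()
        return True
    return False


class _StubComputer:
    """Stand-in for the decoding computer (an assumption, not NeMo code)."""

    def __init__(self) -> None:
        self.cuda_graphs_enabled = True

    def disable_cuda_graphs(self) -> None:
        self.cuda_graphs_enabled = False


# Stub object with the same attribute chain the diff checks for.
stub_computer = _StubComputer()
stub_model = SimpleNamespace(
    decoding=SimpleNamespace(
        decoding=SimpleNamespace(decoding_computer=stub_computer)
    )
)

called_on_stub = disable_cuda_graphs_if_present(stub_model)
# A model without the chain is left untouched and reports False.
called_on_bare = disable_cuda_graphs_if_present(SimpleNamespace())
```

Because each `getattr` falls back to `None`, a model from a NeMo version without this attribute chain passes through silently, which matches the defensive intent of the `hasattr` checks in the commit.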