Fix modeling_nemotron_h.py

#28
by rrs1616 - opened

modeling_nemotron_h.py is incompatible with the newer DynamicCache API in transformers, which changed key_cache to a read-only property. This patch fixes the incompatibility by switching to the new cache API (cache.layers[idx].keys instead of cache.key_cache).
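
For readers who need to support both cache layouts, the access pattern described above can be sketched roughly as follows. This is not the patch itself: get_layer_kv is a hypothetical helper, the .values attribute and the value_cache list are assumed counterparts to the .keys and key_cache names mentioned in this PR, and the objects at the bottom are plain stand-ins rather than real transformers classes.

```python
from types import SimpleNamespace


def get_layer_kv(cache, layer_idx):
    """Return (keys, values) for one layer from either cache layout.

    Hypothetical helper for illustration; not part of transformers.
    """
    layers = getattr(cache, "layers", None)
    if layers is not None:
        # Newer layout named in this PR: cache.layers[idx].keys
        # (.values is assumed to be the matching attribute for values).
        layer = layers[layer_idx]
        return layer.keys, layer.values
    # Older layout: cache.key_cache[idx]
    # (value_cache is assumed to be the matching list for values).
    return cache.key_cache[layer_idx], cache.value_cache[layer_idx]


# Stand-in objects that mimic the two layouts; real caches hold tensors.
new_style = SimpleNamespace(layers=[SimpleNamespace(keys="k0", values="v0")])
old_style = SimpleNamespace(key_cache=["k0"], value_cache=["v0"])

print(get_layer_kv(new_style, 0))
print(get_layer_kv(old_style, 0))
```

Reading through a helper like this keeps the layout check in one place, so the modeling code does not need to touch key_cache directly.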

