Fix KV cache compatibility with transformers 4.50+

#42

With transformers >= 4.50, use_cache=True fails with:
AttributeError: 'NoneType' object has no attribute 'shape'

Cause: transformers 4.50 changed empty caches from None to EncoderDecoderCache objects.
The code checks "if past_key_values is not None", which now passes, and then fails when it
accesses past_key_values[0][0].shape because the cache entries are still None.

Fix: Add null checks before accessing cache tensor shapes.
Backward compatible with all transformers versions.

Changes (8 locations):

  • Attention shape checks: add "past_key_value[0] is not None and"
  • Attention elif conditions: add "and past_key_value[0] is not None"
  • kv_seq_len update: wrap in null check
  • Decoder forward ternary: add full null check chain
  • prepare_inputs_for_generation: add full null check chain
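
The null check chain described above can be sketched roughly as follows. This is an illustration, not the code in this PR: the helper name cached_seq_len is hypothetical, the (batch, heads, seq_len, head_dim) layout is an assumption, and SimpleNamespace objects stand in for real tensors so the snippet runs without torch or transformers.

```python
from types import SimpleNamespace


def cached_seq_len(past_key_values, layer_idx=0):
    """Return the cached sequence length, or 0 if the cache is missing or empty.

    Hypothetical helper: checks every level before touching .shape.
    """
    # Older transformers versions: an empty cache is simply None.
    if past_key_values is None:
        return 0
    # Newer versions: the cache object exists but may hold no usable entry.
    try:
        layer = past_key_values[layer_idx]
    except (IndexError, KeyError, TypeError):
        return 0
    if layer is None or len(layer) == 0 or layer[0] is None:
        return 0
    # Assumed key layout: (batch, heads, seq_len, head_dim).
    return layer[0].shape[-2]


# Stand-ins for a populated cache and for a cache whose entries are still None.
fake_key = SimpleNamespace(shape=(1, 8, 5, 64))
fake_value = SimpleNamespace(shape=(1, 8, 5, 64))
populated_cache = ((fake_key, fake_value),)
empty_cache = ((None, None),)

print(cached_seq_len(None))
print(cached_seq_len(empty_cache))
print(cached_seq_len(populated_cache))
```

The old pattern corresponds to checking only the outermost level, which is why an existing but unpopulated cache object reached the .shape access.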

Tested on transformers 4.57.1 with both ROCm and CUDA.
Enables ~10-25% speedup from proper KV caching.
