Fix KV cache compatibility with transformers 4.50+
#42, opened by kebabman
With transformers >= 4.50, use_cache=True fails with:
AttributeError: 'NoneType' object has no attribute 'shape'
Cause: transformers 4.50 changed empty caches from None to EncoderDecoderCache objects.
The model code checks "if past_key_values is not None", which now passes, and then fails when accessing
past_key_values[0][0].shape because the cache entries themselves are still None.
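
A minimal reproduction of this failure mode, without importing transformers. EmptyCacheStandIn and old_style_past_length are hypothetical names made up for illustration; the stand-in only mimics "a non-None cache whose entries are None" and is not the real EncoderDecoderCache class. The sequence-length axis (index 2) is also an assumption.

```python
class EmptyCacheStandIn:
    """Hypothetical stand-in: a non-None cache object whose entries are still None."""

    def __getitem__(self, layer_idx):
        # Every layer reports an unpopulated (key, value) pair.
        return (None, None)


def old_style_past_length(past_key_values):
    # Pre-fix pattern: only the container itself is compared against None.
    if past_key_values is not None:
        # Assumes key tensors are laid out as (batch, heads, seq_len, head_dim).
        return past_key_values[0][0].shape[2]
    return 0


def demo():
    """Return the error message raised by the old check, or None if nothing was raised."""
    try:
        old_style_past_length(EmptyCacheStandIn())
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

With None as the cache the old check still works and returns 0; only the "non-None container, None entries" case raises.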
Fix: Add null checks before accessing cache tensor shapes.
Backward compatible with all transformers versions.
Changes (8 locations):
- Attention shape checks: add "past_key_value[0] is not None and"
- Attention elif conditions: add "and past_key_value[0] is not None"
- kv_seq_len update: wrap in null check
- Decoder forward ternary: add full null check chain
- prepare_inputs_for_generation: add full null check chain
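
The "full null check chain" listed above could look roughly like the sketch below. The helper names has_cached_kv and past_length are hypothetical and are not taken from the patched file; the actual change inlines the conditions at each location. The sequence-length axis (index 2) is an assumption about the tensor layout, and SimpleNamespace stands in for a real tensor.

```python
from types import SimpleNamespace


def has_cached_kv(past_key_values):
    """True only if the first layer's key tensor is actually populated."""
    return (
        past_key_values is not None
        and past_key_values[0] is not None
        and past_key_values[0][0] is not None
    )


def past_length(past_key_values):
    # Assumes key tensors are laid out as (batch, heads, seq_len, head_dim).
    if has_cached_kv(past_key_values):
        return past_key_values[0][0].shape[2]
    return 0


if __name__ == "__main__":
    # Stand-in for a key/value tensor with 5 cached positions.
    fake_tensor = SimpleNamespace(shape=(1, 8, 5, 64))
    print(past_length(None))                            # no cache at all
    print(past_length([(None, None)]))                  # cache object, empty entries
    print(past_length([(fake_tensor, fake_tensor)]))    # populated cache
```

Because each condition short-circuits, the same expression handles a None cache, a cache with unpopulated entries, and a populated cache, which is what keeps the check compatible with both the older and newer behavior.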
Tested on transformers 4.57.1 with both ROCm and CUDA.
Enables ~10-25% speedup from proper KV caching.